Previously I made a post about diagonalizing a positive definite symmetric matrix. One of the applications of this within statistics is to diagonalize the variance matrix of a multivariate normal in order to derive conditional distributions. Let

$$Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix} \right)$$

where the covariance matrix $V$ is partitioned conformably with $Y_1$ and $Y_2$.
Consider multiplying $Y$ by the following matrix

$$A = \begin{pmatrix} I & -V_{12}V_{22}^{-1} \\ 0 & I \end{pmatrix}$$

i.e.

$$AY = \begin{pmatrix} Y_1 - V_{12}V_{22}^{-1}Y_2 \\ Y_2 \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_1 - V_{12}V_{22}^{-1}\mu_2 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} V_{11} - V_{12}V_{22}^{-1}V_{21} & 0 \\ 0 & V_{22} \end{pmatrix} \right)$$

since $\operatorname{Var}(AY) = AVA^T$.
The covariance matrix has now been block diagonalized. This is useful because zero covariance implies independence for jointly normally distributed random variables, and so it follows that $Y_1 - V_{12}V_{22}^{-1}Y_2$ and $Y_2$ are independent, with

$$Y_1 - V_{12}V_{22}^{-1}Y_2 \sim N\left( \mu_1 - V_{12}V_{22}^{-1}\mu_2,\; V_{11} - V_{12}V_{22}^{-1}V_{21} \right)$$
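The block diagonalization is easy to check numerically. Below is a minimal sketch with made-up numbers, taking $Y_1$ and $Y_2$ to be scalars so that each block of the covariance matrix is $1 \times 1$; the transformed covariance $AVA^T$ should have zero off-diagonal blocks, with the Schur complement in the top-left.

```python
import numpy as np

# Hypothetical example: Y1 and Y2 are scalars here, so each block is 1x1.
V = np.array([[4.0, 1.2],
              [1.2, 2.0]])  # positive definite covariance matrix
V12 = V[:1, 1:]
V22 = V[1:, 1:]

# A = [[I, -V12 V22^{-1}], [0, I]]
A = np.block([[np.eye(1), -V12 @ np.linalg.inv(V22)],
              [np.zeros((1, 1)), np.eye(1)]])

D = A @ V @ A.T
print(np.round(D, 6))
# Off-diagonal entries vanish; the top-left entry is the Schur
# complement V11 - V12 V22^{-1} V21 = 4 - 1.2^2 / 2 = 3.28.
```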
Conditioning on $Y_2 = y_2$ leaves the distribution of $Y_1 - V_{12}V_{22}^{-1}Y_2$ unchanged by independence, so adding back the now-fixed term $V_{12}V_{22}^{-1}y_2$ shows that the conditional distribution is

$$Y_1 \mid Y_2 = y_2 \sim N\left( \mu_1 + V_{12}V_{22}^{-1}(y_2 - \mu_2),\; V_{11} - V_{12}V_{22}^{-1}V_{21} \right)$$
Notice how the variance of the conditional normal distribution is the marginal variance of $Y_1$ minus something. That is to say, the variance of $Y_1$ is reduced given knowledge of $Y_2$. The conditional variance can be recognized as the Schur complement of the covariance matrix with respect to $V_{22}$. A similar treatment, eliminating $Y_1$ instead of $Y_2$, yields

$$Y_2 \mid Y_1 = y_1 \sim N\left( \mu_2 + V_{21}V_{11}^{-1}(y_1 - \mu_1),\; V_{22} - V_{21}V_{11}^{-1}V_{12} \right)$$
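As a sanity check on the conditional formulas, one can verify numerically that the density they produce satisfies $f(y_1 \mid y_2) = f(y_1, y_2) / f(y_2)$. The sketch below uses a hypothetical bivariate example (all numbers made up for illustration) and a small hand-rolled normal density so that only numpy is needed.

```python
import numpy as np

def mvn_pdf(x, mu, V):
    """Density of N(mu, V) at x (plain numpy, no scipy needed)."""
    d = len(mu)
    diff = x - mu
    return np.exp(-0.5 * diff @ np.linalg.solve(V, diff)) / \
        np.sqrt((2 * np.pi) ** d * np.linalg.det(V))

# Hypothetical bivariate example
mu = np.array([1.0, -2.0])
V = np.array([[4.0, 1.2],
              [1.2, 2.0]])
y1, y2 = 0.3, 0.5

# Conditional parameters of Y1 | Y2 = y2 from the Schur complement formulas
cond_mean = mu[0] + V[0, 1] / V[1, 1] * (y2 - mu[1])
cond_var = V[0, 0] - V[0, 1] ** 2 / V[1, 1]

# The conditional density must equal joint density / marginal density of Y2
lhs = mvn_pdf(np.array([y1]), np.array([cond_mean]), np.array([[cond_var]]))
rhs = mvn_pdf(np.array([y1, y2]), mu, V) / \
    mvn_pdf(np.array([y2]), np.array([mu[1]]), np.array([[V[1, 1]]]))
print(abs(lhs - rhs) < 1e-10)  # True
```

The check works at any point $(y_1, y_2)$, since the factorization of the joint density is an identity, not an approximation.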