CTK Exchange
College math, Topic #568
Subject: "Linear MMSE"
Ignorant (guest), Apr-11-06, 03:38 PM (EST)
"Linear MMSE"
 
   Suppose
Y = AX + B

Y, B: M x 1 vectors
X: N x 1 vector
A: M x N matrix


X and B are random vectors with some given distribution.

The linear estimate of X given a sample of Y=y is

inverse(R_{YY})R_{XY} y.

I searched online for the proof, but in vain.

The problem is to minimise E(||TY - X||^2) with respect to T, where T is the linear MMSE estimator. For scalars the problem is easy to solve; however, for vectors one needs vector (matrix) calculus to approach the problem. Can anybody help me figure this out?
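For reference, the scalar version I have in mind is just (t, X, Y real-valued):

\frac{d}{dt} E[(tY - X)^2] = 2 E[Y(tY - X)] = 0
\quad\Longrightarrow\quad
t = \frac{E[XY]}{E[Y^2]} = R_{XY} R_{YY}^{-1} .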


mr_homm
Member since May-22-05
Apr-25-06, 04:18 PM (EST)
1. "RE: Linear MMSE"
In response to message #0
 
   Hi,

I've been intending to answer this post, but I have been very busy. In case there is still any interest in this question, here is a proof:

>Suppose
>Y=AX+B
>
>Y, B M x 1 vector
>X N x 1 vector
>A M x N Matrix
>

I think that this is best handled with index notation. Let the matrices A and T have entries
A^j{}_k and T^j{}_k ,
and the vectors X, Y, and B have entries
X^j, Y^j, B^j .
To keep the notation clean, let's use the Einstein summation convention (if any index is repeated in a product expression, once in the upper and once in the lower position, the expression is implicitly summed with the repeated index as the dummy variable of the sum). Then your equation becomes

Y^j = A^j{}_k X^k + B^j .

A note about the order of the indices: when turning indexed expressions back into matrix expressions, the upper index is the row index, and the lower is the column index. But for matrices, the first index is the row and the second is the column. Therefore, to keep the indexed expressions consistent with standard matrix notation, the index order should look like
(M_1)^i{}_j (M_2)^j{}_k = (M_1 M_2)^i{}_k ,
where the second (which must be lower) index of the left matrix matches the first (which must be upper) index of the right matrix. This is the index that is summed over in standard matrix multiplication.
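Incidentally, if a concrete check helps, numpy's einsum does exactly this bookkeeping; the little script below (array shapes and contents are mine, purely for illustration) confirms that summing over the repeated index j reproduces the ordinary matrix product.

import numpy as np

# (M1)^i_j (M2)^j_k = (M1 M2)^i_k : the repeated index j is summed over.
M1 = np.arange(6.0).reshape(2, 3)      # 2 x 3, arbitrary test data
M2 = np.arange(12.0).reshape(3, 4)     # 3 x 4, arbitrary test data

by_einsum = np.einsum('ij,jk->ik', M1, M2)
by_matmul = M1 @ M2                    # ordinary matrix multiplication

assert np.allclose(by_einsum, by_matmul)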
>

>X and B are random vectors of some distribution.
>
>The linear estimate of X given a sample of Y=y is
>
>inverse(R_{YY})R_{XY} y.
>
>I searched online for the proof, but in vain.
>

This notation R_{YY} is unfamiliar to me. It is some kind of matrix constructed from the distribution of Y, but how? If the derivation below works out, we'll have an estimator for X, and then I can try to see how that relates to R.

>The problem is to minimise E(||TY - X||^2) with respect
>to T, where T is the linear MMSE estimator. For scalars
>the problem is easy to solve; however, for vectors one
>needs vector (matrix) calculus to approach the problem.
>Can anybody help me figure this out?

The expectation you wish to minimize, written in index notation, is
E( (T_i{}^j Y_j - X_i)(T^i{}_k Y^k - X^i) ) .
Notice the position of the indices on the left factor. Upper and lower are reversed, because this corresponds to forming the transpose in the common matrix notation. However, the first index remains first, and the second index remains second. This is important, because swapping them would cause the same actual element of the array to now have different indices, confusing the whole calculation.
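Just to tie this back to ordinary matrix notation (this is only a restatement, nothing new): the quantity being minimized is the usual mean squared error,

E(\lVert TY - X\rVert^2)
= E[(TY - X)^{\mathsf T}(TY - X)]
= \sum_{i=1}^{N} E\Big[\Big(\sum_{j=1}^{M} T_{ij} Y_j - X_i\Big)^{2}\Big] ,

and the index expression above is exactly this double sum with the summation signs suppressed.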

Let \partial_{lm} stand for differentiation with respect to T^l{}_m. Then since E is linear, the differentiation comes inside the E operator and applies to the products of variables it finds there. Since these are just lots of products of scalars, the product rule applies (index notation makes this work very nicely, since matrix and vector operations are displayed as organized patterns of scalar multiplication, where the action of differentiation is obvious).

Also, since the entries of T are independent variables, the differentiation gives zero for every entry except the one containing the variable you are differentiating on. Therefore,
\partial_{lm} T^i{}_j = \delta^i{}_l \delta^m{}_j ,
or in other words, the derivative is zero unless i and j match l and m exactly. For a transposed matrix, you get
\partial_{lm} T_i{}^j = \delta_{il} \delta^{mj} .

Note that this is not a matrix product, but a set of derivative operations indexed by l,m operating on a set of variables indexed by i,j. As such, it is actually a tensor product of an operator and a matrix, and produces an array of values with 4 indices -- a 4-tensor. When both indices are upper or both lower, you get something that doesn't correspond well to the ordinary matrix notation. This is one of the ways in which index notation is more flexible. However, the final formula must always reduce to one upper and one lower index because this is a matrix computation to start with.

Recall also the following property of delta:
\delta^j{}_k X^k = X^j .
This means that your minimization equation reduces like this:
0 = \partial_{lm} E( (T_i{}^j Y_j - X_i)(T^i{}_k Y^k - X^i) )
= E( \partial_{lm} [ (T_i{}^j Y_j - X_i)(T^i{}_k Y^k - X^i) ] )
= E( (\delta_{li} \delta^{mj} Y_j)(T^i{}_k Y^k - X^i) + (T_i{}^j Y_j - X_i)(\delta^i{}_l \delta^m{}_k Y^k) )
= 2 E( (\delta_{li} \delta^{mj} Y_j)(T^i{}_k Y^k - X^i) )   (the two terms are equal: they differ only by trivially raising and lowering paired indices)
= 2 E( Y^m (T_{lk} Y^k - X_l) ) .
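Incidentally, the per-sample gradient we just obtained, 2 Y_m (TY - X)_l, is easy to confirm numerically. Here is a small finite-difference check (plain numpy; the sizes and data are arbitrary and only for illustration):

import numpy as np

rng = np.random.default_rng(0)
M, N = 4, 3                              # arbitrary sizes for the check
T = rng.normal(size=(N, M))
Y = rng.normal(size=M)
X = rng.normal(size=N)

def sq_err(T):
    r = T @ Y - X                        # residual TY - X for one sample
    return r @ r                         # ||TY - X||^2 (no expectation here)

grad_formula = 2.0 * np.outer(T @ Y - X, Y)   # entry (l, m): 2 (TY - X)_l Y_m

eps = 1e-6
grad_fd = np.zeros_like(T)
for l in range(N):
    for m in range(M):
        Tp, Tn = T.copy(), T.copy()
        Tp[l, m] += eps
        Tn[l, m] -= eps
        grad_fd[l, m] = (sq_err(Tp) - sq_err(Tn)) / (2 * eps)

assert np.allclose(grad_formula, grad_fd, atol=1e-5)

Returning to the derivation: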

Therefore
E( Y^m T_{lk} Y^k ) = E( Y^m X_l ) .
Now the indices are in funny places here, but we can raise and lower them in pairs, using the fact that a lowering delta and a raising delta cancel:
\delta_{ij} \delta^{jk} = \delta_i{}^k . Therefore
Y^m T_{lk} Y^k = Y^m (T_{lk} \delta^{kj}) (\delta_{jk'} Y^{k'}) = Y^m T_l{}^j Y_j .
Since E is linear (and T is a constant matrix), this gives
T_l{}^j E( Y^m Y_j ) = E( Y^m X_l ) .
Now this appears to tell me what the definitions of the R's must be:
(R_{XY})_{lm} = E( X_l Y_m ) ,
and putting Y in place of X,
(R_{YY})_{lm} = E( Y_l Y_m ) .
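Written as matrices (this is my reading of the notation, for concreteness), these are the familiar cross- and auto-correlation matrices:

R_{XY} = E( X Y^{\mathsf T} ) \in \mathbb{R}^{N\times M},
\qquad
R_{YY} = E( Y Y^{\mathsf T} ) \in \mathbb{R}^{M\times M},

which in practice one would estimate from a batch of samples, e.g. \hat R_{XY} = \tfrac{1}{n}\sum_{k=1}^{n} x_k y_k^{\mathsf T}.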

Therefore, the formula becomes
T_l{}^j (R_{YY})_{jm} = (R_{XY})_{lm} .
The index positions here only record the transposing we did along the way; numerically T_l{}^j is just the (l, j) entry of T, so in standard matrix order this reads
T R_{YY} = R_{XY} .
Since R_{YY} = E(YY^T) is symmetric and (assuming it is) invertible, it follows that
T = R_{XY} (R_{YY})^{-1} ,
so that
x_estimated = T y = R_{XY} (R_{YY})^{-1} y ,
which completes the proof. (This is the formula you quoted, with R_{XY} = E(XY^T) standing to the left of (R_{YY})^{-1}; with the transposed convention R_{YX} = E(YX^T), the same estimator reads T^T = (R_{YY})^{-1} R_{YX}.)
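Finally, if you want to see the formula behave as advertised, here is a small numpy experiment (everything in it, dimensions, distributions, sample size, is arbitrary and only for illustration): it simulates Y = AX + B, builds R_XY and R_YY from the samples, and checks that T = R_XY (R_YY)^{-1} is not beaten by nearby matrices.

import numpy as np

rng = np.random.default_rng(1)
M, N, n = 5, 3, 200_000                 # arbitrary dimensions and sample count

A = rng.normal(size=(M, N))
X = rng.normal(size=(n, N))             # n samples of X, one per row
B = 0.5 * rng.normal(size=(n, M))       # independent zero-mean "noise" B
Y = X @ A.T + B                         # Y = A X + B, sample by sample

# empirical correlation matrices (zero-mean case, so no centering needed)
R_XY = X.T @ Y / n                      # N x M, estimates E(X Y^T)
R_YY = Y.T @ Y / n                      # M x M, estimates E(Y Y^T)

T = R_XY @ np.linalg.inv(R_YY)          # N x M, the estimator derived above

def mse(T):
    resid = X - Y @ T.T                 # residuals X - T Y, sample by sample
    return np.mean(np.sum(resid**2, axis=1))

best = mse(T)
for _ in range(20):                     # random perturbations should not do better
    assert mse(T + 1e-2 * rng.normal(size=T.shape)) >= best

print("empirical MSE at the derived T:", best)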

As you can see, the main thing to remember with index notation is to be very, very careful to keep things in the right places. If you do that, then everything works out just as if you were working with scalars.

I hope this helps, and is still of interest.

--Stuart Anderson



Copyright © 1996-2018 Alexander Bogomolny
