# statistics


I'm trying to work through the proof for
SST = SSM + SSE

MEAN = ∑(X)/N

SST = ∑((x - MEAN)^2)
= ∑(x^2 - 2 * x * MEAN + MEAN^2)
= ∑(x^2) - 2 * MEAN * ∑(x) + N * MEAN^2
= ∑(x^2) - 2 * ∑(x)^2/N + ∑(x)^2/N
= ∑(x^2) - ∑(x)^2/N

SSM = ∑((MODEL - MEAN)^2)
= ∑(MODEL^2 - 2 * MODEL * MEAN + MEAN^2)
= ∑(MODEL^2) - 2 * MEAN * ∑(MODEL) + N * MEAN^2
= ∑(MODEL^2) - 2/N * ∑(x) * ∑(MODEL) + ∑(x)^2/N

SSE = ∑((x - MODEL)^2)
= ∑(x^2 - 2 * x * MODEL + MODEL^2)
= ∑(x^2) - 2 * ∑(x * MODEL) + ∑(MODEL^2)

SST = SSM + SSE
∑(x^2) - ∑(x)^2/N = ∑(MODEL^2) - 2/N * ∑(x) * ∑(MODEL) + ∑(x)^2/N + ∑(x^2) - 2 * ∑(x * MODEL) + ∑(MODEL^2)
2 * ∑(MODEL^2) - 2/N * ∑(x) * ∑(MODEL) + 2 * ∑(x)^2/N - 2 * ∑(x * MODEL) = 0
∑(MODEL^2) - 1/N * ∑(x) * ∑(MODEL) + ∑(x)^2/N - ∑(x * MODEL) = 0

I can't complete the proof. What am I missing? Thanks!

• statistics -

Divide the sum of squares by N and work with the averages. Let's use the notation:

<X> for the average of X. E.g.:

<X> = ∑(X)/N = Mean

And:

<(X-<X>)^2> =

<X^2 - 2X<X> + <X>^2> =

<X^2> - <X>^2
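
A quick numerical check of this identity can help before going on. This is only a minimal sketch: the sample values and the helper name `avg` are made up for illustration and do not come from the thread.

```python
def avg(values):
    """Plain average <.>: the sum divided by the number of values."""
    values = list(values)
    return sum(values) / len(values)

# Hypothetical sample data, chosen only for illustration.
X = [2.0, 4.0, 4.0, 6.0, 9.0]

mean_X = avg(X)                              # <X>
lhs = avg((x - mean_X) ** 2 for x in X)      # <(X - <X>)^2>
rhs = avg(x ** 2 for x in X) - mean_X ** 2   # <X^2> - <X>^2

# The two numbers should agree up to floating-point rounding.
print(lhs, rhs)
```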

Note that <a X> = a <X> for a constant factor a. In an average like <X <X>>, the inner <X> is a constant when carrying out the outer average, so you can take it out of the outer average sign. So, you have <X <X>> = <X>^2. The average of a constant is, of course, the same constant so e.g. <<X>^2> = <X^2> because once the inner average is carried out it is a constant w.r.t. the outer average.

If you work with averages and use these rules then you can derive the desired result in just one line. If you use summations, you'll tend to re-derive these rules in every step you make, so you'll get a complicated mess.

Derivation:

<(X - <X>)^2> =

<(X - m + m - <X>)^2> =

<(X-m)^2> + <(m - <X>)^2>

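+ 2 <(X-m)(m-<X>)>

The last term is zero if the average of X equals the average of the Model and the residuals X - m are uncorrelated with the Model, as is the case for a least-squares fit that includes an intercept.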
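
The decomposition can also be checked numerically. The sketch below assumes the Model is a least-squares straight line with an intercept, which the thread does not actually state; the data, the predictor `t`, and the helper `avg` are hypothetical.

```python
def avg(values):
    """Plain average <.>: the sum divided by the number of values."""
    values = list(values)
    return sum(values) / len(values)

# Hypothetical data: predictor t and observations X (illustration only).
t = [1.0, 2.0, 3.0, 4.0, 5.0]
X = [2.1, 3.9, 6.2, 7.8, 10.5]

# Assumed model: ordinary least-squares line m = intercept + slope * t.
mean_t, mean_X = avg(t), avg(X)
slope = (sum((ti - mean_t) * (xi - mean_X) for ti, xi in zip(t, X))
         / sum((ti - mean_t) ** 2 for ti in t))
intercept = mean_X - slope * mean_t
m = [intercept + slope * ti for ti in t]

sst = avg((xi - mean_X) ** 2 for xi in X)            # <(X - <X>)^2>
sse = avg((xi - mi) ** 2 for xi, mi in zip(X, m))    # <(X - m)^2>
ssm = avg((mi - mean_X) ** 2 for mi in m)            # <(m - <X>)^2>

# Cross term from expanding the square, averaged as a product.
cross = 2 * avg((xi - mi) * (mi - mean_X) for xi, mi in zip(X, m))

print(sst, ssm + sse, cross)
```

For this kind of fit the cross term should come out as zero up to rounding, so the first two printed numbers should match. With a Model that is not a least-squares fit, there is no reason to expect that.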

• statistics -

<<X>^2> = <X^2>

Of course, if k is constant and x is variable:

<kx> = k<x>
<k> = k
<k^2> = k^2

but...

<x^2> != <x>^2
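
A small numeric illustration of these rules, as a sketch with made-up numbers (the values of `x` and `k` and the helper `avg` are assumptions for the example only):

```python
def avg(values):
    """Plain average <.>: the sum divided by the number of values."""
    values = list(values)
    return sum(values) / len(values)

# Hypothetical values, chosen only for illustration.
x = [1.0, 2.0, 3.0]
k = 4.0

avg_kx = avg(k * xi for xi in x)           # <kx>
k_avg_x = k * avg(x)                       # k<x>
avg_k = avg(k for _ in x)                  # <k>

avg_of_square = avg(xi ** 2 for xi in x)   # <x^2>
square_of_avg = avg(x) ** 2                # <x>^2

print(avg_kx, k_avg_x, avg_k)
print(avg_of_square, square_of_avg)
```

The gap between `avg_of_square` and `square_of_avg` is the variance of x, so the two only coincide when every x is the same.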

• statistics -

Sorry, that was a typo.

I meant to write:

<<X>^2> = <X>^2

• statistics -

I don't follow this at all:
<(X - m + m - <X>)^2> = <(X-m)^2> + <(m - <X>)^2> + 2 <(X-m)(m-<X>)>

<(X - m + m - <X>)^2>
= <(X - <X>)^2>
= <x^2> - <x>^2
= SST/N

For (SSM + SSE)/N:

<(x - m)^2> + <(m - <x>)^2>
= <x^2 - 2xm + m^2> + <m^2 - 2m<x> + <x>^2>
= <x^2> - 2<xm> + <m^2> + <m^2> - 2<m><x> + <x>^2
= <x^2> + <x>^2 + 2<m^2> - 2<xm> - 2<m><x>

And I'm stuck...
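
One way to see what is still needed is to subtract the SST average from the sum of the other two and group what is left. The block below is a sketch in the same angle-bracket notation, starting from the three definitions; the assumption about how the model m is fitted is not stated anywhere in the thread.

```latex
\begin{aligned}
\langle (x-m)^2\rangle + \langle (m-\langle x\rangle)^2\rangle
  - \langle (x-\langle x\rangle)^2\rangle
&= 2\langle m^2\rangle - 2\langle xm\rangle
   - 2\langle m\rangle\langle x\rangle + 2\langle x\rangle^2 \\
&= -2\,\langle m\,(x-m)\rangle + 2\,\langle x\rangle\,\langle x-m\rangle
\end{aligned}
```

So SST = SSM + SSE holds exactly when the right-hand side vanishes. Assuming m comes from a least-squares fit with an intercept, the residuals x - m average to zero and are orthogonal to m, so both terms are zero. Multiplied by N/2, the first line is the same condition reached at the end of the original summation attempt.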