statistics

I'm trying to work through the proof for
SST = SSM + SSE

MEAN = ∑(x)/N

SST = ∑((x - MEAN)^2)
= ∑(x^2 - 2 * x * MEAN + MEAN^2)
= ∑(x^2) - 2 * MEAN * ∑(x) + N * MEAN^2
= ∑(x^2) - 2 * ∑(x)^2/N + ∑(x)^2/N
= ∑(x^2) - ∑(x)^2/N

SSM = ∑((MODEL - MEAN)^2)
= ∑(MODEL^2 - 2 * MODEL * MEAN + MEAN^2)
= ∑(MODEL^2) - 2 * MEAN * ∑(MODEL) + N * MEAN^2
= ∑(MODEL^2) - 2/N * ∑(x) * ∑(MODEL) + ∑(x)^2/N

SSE = ∑((x - MODEL)^2)
= ∑(x^2 - 2 * x * MODEL + MODEL^2)
= ∑(x^2) - 2 * ∑(x * MODEL) + ∑(MODEL^2)

SST = SSM + SSE
∑(x^2) - ∑(x)^2/N = ∑(MODEL^2) - 2/N * ∑(x) * ∑(MODEL) + ∑(x)^2/N + ∑(x^2) - 2 * ∑(x * MODEL) + ∑(MODEL^2)
2 * ∑(MODEL^2) - 2/N * ∑(x) * ∑(MODEL) + 2 * ∑(x)^2/N - 2 * ∑(x * MODEL) = 0
∑(MODEL^2) - 1/N * ∑(x) * ∑(MODEL) + ∑(x)^2/N - ∑(x * MODEL) = 0

I can't complete the proof. What am I missing? Thanks!
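
One way to see what is missing is to evaluate the last expression numerically. The sketch below uses made-up data and a hypothetical helper `fit_line` (an ordinary least-squares line with an intercept, which the question itself does not specify) and compares it with an arbitrary set of model values. The expression comes out at zero for the least-squares fit but not for the arbitrary model, which suggests the identity rests on properties of the fitted model rather than on algebra alone.

```python
def fit_line(t, x):
    """Least-squares line x ~ a + b*t in closed form (hypothetical helper)."""
    n = len(t)
    t_mean = sum(t) / n
    x_mean = sum(x) / n
    sxy = sum((ti - t_mean) * (xi - x_mean) for ti, xi in zip(t, x))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    b = sxy / sxx
    a = x_mean - b * t_mean
    return [a + b * ti for ti in t]


def leftover(x, model):
    """sum(M^2) - sum(x)*sum(M)/N + sum(x)^2/N - sum(x*M), the question's last line."""
    n = len(x)
    return (sum(m * m for m in model)
            - sum(x) * sum(model) / n
            + sum(x) ** 2 / n
            - sum(xi * mi for xi, mi in zip(x, model)))


def sums_of_squares(x, model):
    """Return (SST, SSM, SSE) straight from the definitions."""
    mean = sum(x) / len(x)
    sst = sum((xi - mean) ** 2 for xi in x)
    ssm = sum((mi - mean) ** 2 for mi in model)
    sse = sum((xi - mi) ** 2 for xi, mi in zip(x, model))
    return sst, ssm, sse


t = [1.0, 2.0, 3.0, 4.0, 5.0]          # made-up predictor values
x = [2.0, 4.0, 5.0, 4.0, 5.0]          # made-up observations

ols = fit_line(t, x)                   # least-squares model values
arbitrary = [3.0, 3.0, 4.0, 5.0, 6.0]  # model values not obtained by least squares

for name, model in (("least squares", ols), ("arbitrary", arbitrary)):
    sst, ssm, sse = sums_of_squares(x, model)
    print(f"{name}: SST={sst:.4f}  SSM+SSE={ssm + sse:.4f}  "
          f"leftover={leftover(x, model):.4f}")
```

In both cases SSM + SSE differs from SST by exactly twice the leftover expression, so the proof can only close when that expression is zero.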

  • statistics -

    Divide the sum of squares by N and work with the averages. Let's use the notation:

    <X> for the average of X. E.g.:

    <X> = ∑(X)/N = Mean

    And:

    <(X-<X>)^2> =

    <X^2 - 2X<X> + <X>^2> =

    <X^2> - <X>^2

    Note that <a X> = a <X> for a constant factor a. In an average like <X <X>>, the inner <X> is a constant when carrying out the outer average, so you can take it out of the outer average sign. So, you have <X <X>> = <X>^2. The average of a constant is, of course, the same constant so e.g. <<X>^2> = <X^2> because once the inner average is carried out it is a constant w.r.t. the outer average.

    If you work with averages and use these rules then you can derive the desired result in just one line. If you use summations, you'll tend to re-derive these rules in every step you make, so you'll get a complicated mess.

    Derivation:


    <(X - <X>)^2> =

    <(X - m + m - <X>)^2> =

    <(X-m)^2> + <(m - <X>)^2>

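    + 2 <(X-m)(m-<X>)>

    The last term is zero if the residual X - m averages to zero (that is, the average of X equals the average of the Model) and is uncorrelated with the Model values m. Both hold for a least-squares fit that includes an intercept.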
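
The expansion can be checked numerically. This is a minimal sketch with made-up numbers: `avg` is a hypothetical stand-in for the <...> notation, and `m` is an arbitrary list of model values (one per observation, not a fitted model). The cross term is computed as the average of the product (X - m)(m - <X>), which is what squaring the bracket produces when m varies from point to point.

```python
def avg(values):
    """The <...> operator: arithmetic mean of a list."""
    return sum(values) / len(values)


X = [2.0, 4.0, 5.0, 4.0, 5.0]   # made-up observations
m = [3.0, 3.0, 4.0, 5.0, 6.0]   # made-up model values, one per observation
X_bar = avg(X)                  # <X>, a constant for the outer averages

total = avg([(x - X_bar) ** 2 for x in X])                # <(X - <X>)^2>
error = avg([(x - mi) ** 2 for x, mi in zip(X, m)])       # <(X - m)^2>
model = avg([(mi - X_bar) ** 2 for mi in m])              # <(m - <X>)^2>
cross = 2 * avg([(x - mi) * (mi - X_bar)
                 for x, mi in zip(X, m)])                 # 2 <(X - m)(m - <X>)>

# With the full cross term the expansion holds for any m.
print(total, error + model + cross)

# The average of a product is generally not the product of the averages.
factored = 2 * avg([x - mi for x, mi in zip(X, m)]) * avg([mi - X_bar for mi in m])
print(cross, factored)
```

The first print shows the two sides agreeing. The second shows that the cross term cannot in general be split into a product of two separate averages, so equal means alone are not enough to make it vanish.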

  • statistics -

    I don't follow this:

    <<X>^2> = <X^2>

    Of course, if k is constant and x is variable:

    <kx> = k<x>
    <k> = k
    <k^2> = k^2

    but...

    <x^2> != <x>^2
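
A quick numeric illustration of that inequality, using made-up data and a hypothetical `avg` helper for the <...> notation. The gap between the two quantities is the variance, so they only coincide when every value is the same.

```python
def avg(values):
    """The <...> operator: arithmetic mean of a list."""
    return sum(values) / len(values)


x = [1.0, 2.0, 3.0, 4.0]                         # made-up data

mean_of_squares = avg([v * v for v in x])        # <x^2>
square_of_mean = avg(x) ** 2                     # <x>^2
variance = avg([(v - avg(x)) ** 2 for v in x])   # <(x - <x>)^2>

print(mean_of_squares, square_of_mean)
print(mean_of_squares - square_of_mean, variance)
```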

  • statistics -

    Sorry, that was a typo.

    I meant to write:

    <<X>^2> = <X>^2

  • statistics -

    I don't follow this at all:
    <(X - m + m - <X>)^2> = <(X-m)^2> + <(m - <X>)^2> + 2 <(X-m)(m-<X>)>

    Trying to follow your logic, for the left side:

    <(X - m + m - <X>)^2>
    = <(X - <X>)^2>
    = <x^2> - <x>^2
    = SST/N

    For (SSM + SSE)/N:

    <(x - m)^2> + <(m - <x>)^2>
    = <x^2 - 2xm + m^2> + <m^2 - 2m<x> + <x>^2>
    = <x^2> - 2<xm> + <m^2> + <m^2> - 2<m><x> + <x>^2
    = <x^2> + <x>^2 + 2<m^2> - 2<xm> - 2<m><x>

    And I'm stuck...
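
A possible way past this point, sketched in the thread's averaging notation with \langle \cdot \rangle standing for <...>. It assumes, which the thread does not state, that m comes from a least-squares fit with an intercept. For such a fit the normal equations give \langle m \rangle = \langle x \rangle (the residuals average to zero) and \langle m(x - m) \rangle = 0 (the residuals are uncorrelated with the fitted values).

```latex
\begin{align*}
\langle (x-m)^2 \rangle + \langle (m-\langle x\rangle)^2 \rangle
  &= \langle x^2\rangle - 2\langle xm\rangle + 2\langle m^2\rangle
     - 2\langle m\rangle\langle x\rangle + \langle x\rangle^2 \\
  &= \bigl(\langle x^2\rangle - \langle x\rangle^2\bigr)
     + 2\,\langle m(m-x)\rangle
     + 2\,\langle x\rangle\bigl(\langle x\rangle - \langle m\rangle\bigr)
\end{align*}
```

The first bracket is \langle (x - \langle x \rangle)^2 \rangle, that is SST/N. Under the two assumptions the remaining two terms are zero, and multiplying through by N gives SST = SSM + SSE. For a model that does not satisfy them, the two extra terms are exactly the amount by which SSM + SSE misses SST.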
