### Finish part of lecture about parameter estimation

parent 85ff8570
@@ -20,8 +20,43 @@
\begin{outline}
\1 From last time
\1 Today: How well does your data fit a line?
 \2 More complicated regressions exist, of course, but we'll stick with this one for now
 \2 Eyeballing is just not rigorous enough
\1 Basic model: $y_i = b_0 + b_1 x_i + e_i$
 \2 $y_i$ is the observed response
 \2 $b_0$ is the y-intercept
 \2 $b_1$ is the slope
 \2 $x_i$ is the predictor
 \2 $e_i$ is the error
 \2 \textit{Which of these are random variables?}
  \3 A: All but $x_i$: the $b$s are estimated from random variables, and $e_i$ is a difference of random variables
  \3 So, we can compute statistics on them
\1 Two criteria for getting the $b$s
 \2 Zero total error
 \2 Minimize SSE (sum of squared errors)
 \2 Example of why one criterion alone is not enough: with two points, infinitely many lines have zero total error
 \2 Squared errors are always positive, so the SSE criterion alone could systematically overshoot or undershoot
\1 Deriving $b_0$ is easy
 \2 Solve for the error: $e_i = y_i - (b_0 + b_1 x_i)$
 \2 Take the mean over all $i$: $\overline{e} = \overline{y} - b_0 - b_1 \overline{x}$
 \2 Set the mean error to 0 to get $b_0 = \overline{y} - b_1 \overline{x}$
 \2 Now we just need $b_1$
\1 Deriving $b_1$ is harder
 \2 SSE = sum of squared errors over all $i$
 \2 We want a minimum value for this
 \2 It's a function of $b_1$ with one local minimum
 \2 So we can differentiate and look for zero
 \2 Substitute to get $s_y^2 - 2b_1 s^2_{xy} + b_1^2 s_x^2$, then take the derivative with respect to $b_1$
 \2 $s^2_{xy}$ is the covariance of $x$ and $y$ (see p. 181)
 \2 In the end, this gives us $b_1 = \frac{s^2_{xy}}{s_x^2}$
  \3 Covariance of $x$ and $y$ divided by variance of $x$
  \3 $\frac{\sum{xy} - n \overline{x}\,\overline{y}}{\sum{x^2} - n(\overline{x})^2}$
\1 For next time
\end{outline}
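The closed-form estimators in the outline ($b_1$ from the covariance-over-variance ratio, then $b_0$ from the zero-mean-error condition) can be sketched in plain Python. This is an illustrative sketch, not part of the lecture materials; the function name `fit_line` and the example data are our own.

```python
# Least-squares fit of y = b0 + b1*x, following the outline's formulas:
#   b1 = (sum(x*y) - n*x̄*ȳ) / (sum(x^2) - n*x̄^2)
#   b0 = ȳ - b1*x̄   (from setting the mean error to zero)
# Function name and data are illustrative, not from the lecture.

def fit_line(xs, ys):
    """Return (b0, b1) minimizing the sum of squared errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: covariance of x and y divided by variance of x
    b1 = (sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar) / \
         (sum(x * x for x in xs) - n * x_bar ** 2)
    # Intercept: chosen so the mean error is exactly zero
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Points lying exactly on y = 2x + 1 are recovered exactly
b0, b1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)  # → 1.0 2.0
```

Note that the zero-total-error criterion is satisfied automatically: once $b_0 = \overline{y} - b_1\overline{x}$, the residuals sum to zero for any slope, which is exactly why a second criterion (minimum SSE) is needed to pin down $b_1$.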