Commit b9642f00 authored by Robert Ricci

Finish part of lecture about parameter estimation

parent 85ff8570
@@ -20,8 +20,43 @@
\begin{outline}
\1 From last time
\1 Today: How well does your data fit a line?
\2 More complicated regressions exist, of course, but we'll stick with this one for now
\2 Eyeballing is just not rigorous enough
\1 Basic model: $y_i = b_0 + b_1x_i + e_i$
\2 $y_i$ is the measured response (the predicted value is $b_0 + b_1 x_i$, without the error term)
\2 $b_0$ is the y-intercept
\2 $b_1$ is the slope
\2 $x_i$ is the predictor
\2 $e_i$ is the error
\2 \textit{Which of these are random variables?}
\3 A: All but $x_i$: the $b$s are estimated from random variables, and $e$ is a difference of random variables
\3 So, we can compute statistics on them
\1 Two criteria for getting $b$s
\2 Zero total error
\2 Minimize SSE (sum of squared errors)
\2 Example of why zero total error alone is not enough: with just two points, infinitely many lines have zero total error (see the worked example at the end of this list)
\2 Squared errors are always positive, so minimizing SSE alone says nothing about the sign of the errors; the fit could consistently overshoot or undershoot
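\2 A tiny worked example (points chosen here only for illustration): for the two points $(1,1)$ and $(3,3)$, both the line $y = x$ and the flat line $y = 2$ (in fact, any line through the midpoint $(2,2)$) have zero total error, but SSE tells them apart:
\[
y = x:\quad e_1 + e_2 = 0 + 0 = 0,\ \mathrm{SSE} = 0
\qquad\qquad
y = 2:\quad e_1 + e_2 = (-1) + 1 = 0,\ \mathrm{SSE} = (-1)^2 + 1^2 = 2
\]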
\1 Deriving $b_0$ is easy
\2 Solve for $e_i$: $e_i = y_i - (b_0 + b_1 x_i)$
\2 Take the mean over all $i$: $\overline{e} = \overline{y} - b_0 - b_1 \overline{x}$
\2 Set mean error to 0 to get $b_0 = \overline{y} - b_1 \overline{x}$
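\3 Spelling out the averaging step:
\[
\overline{e} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - b_0 - b_1 x_i\right)
             = \overline{y} - b_0 - b_1\overline{x},
\qquad
\overline{e} = 0 \;\Rightarrow\; b_0 = \overline{y} - b_1\overline{x}
\]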
\2 Now we just need $b_1$
\1 Deriving $b_1$ is harder
\2 SSE = sum of errors squared over all $i$
\2 We want a minimum value for this
\2 It's a function of $b_1$ with a single local minimum
\2 So we can differentiate and look for zero
\2 After substituting for $b_0$, SSE is proportional to $s_y^2 - 2b_1s^2_{xy} + b_1^2s_x^2$; take the derivative with respect to $b_1$ (written out at the end of this list)
\2 $s^2_{xy}$ is the covariance of $x$ and $y$ (see p. 181)
\2 In the end, gives us $b_1 = \frac{s^2_{xy}}{s_x^2}$
\3 Covariance of $x$ and $y$ divided by variance of $x$
\3 $\frac{\sum{xy} - n \overline{x} \overline{y}}{\sum{x^2} - n(\overline{x})^2}$
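\2 The differentiation step, written out (dropping a constant positive factor in front of SSE, which does not move the minimum):
\[
\frac{d}{d b_1}\left(s_y^2 - 2 b_1 s^2_{xy} + b_1^2 s_x^2\right)
  = -2 s^2_{xy} + 2 b_1 s_x^2 = 0
\;\Longrightarrow\;
b_1 = \frac{s^2_{xy}}{s_x^2}
\]
\3 The second derivative is $2 s_x^2 > 0$, so this critical point is the minimum we want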
\1 For next time
\end{outline}