\1 Today: How well does your data fit a line?
\2 More complicated regressions exist, of course, but we'll stick with this one for now
\2 Talk about linear regression in detail, look at some more complicated ones in R
\2 Eyeballing is just not rigorous enough
\1 Basic model: $y_i = b_0 + b_1x_i + e_i$
\3 For example: How sure are we that two slopes are actually different?
\2 \textit{When would we want to show that the confidence interval for $b_1$ includes zero?}
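A minimal sketch of the slope question, in Python: fit the basic model by least squares and build a 90\% confidence interval for $b_1$. The data are made up for illustration, and the t-quantile 2.015 (5 degrees of freedom) is hard-coded from a t-table rather than computed; both are assumptions, not values from the lecture.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1, s_e, s_b1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum(x * x for x in xs) - n * x_bar ** 2
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Sum of squared errors around the fitted line
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))   # stddev of errors, n - 2 degrees of freedom
    s_b1 = s_e / math.sqrt(sxx)      # standard error of the slope
    return b0, b1, s_e, s_b1

# Made-up sample data (assumption, not from the lecture)
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8]

b0, b1, s_e, s_b1 = fit_line(xs, ys)

# t[0.95; 5] read from a t-table: 90% two-sided interval, n - 2 = 5
T_95_5 = 2.015
lo, hi = b1 - T_95_5 * s_b1, b1 + T_95_5 * s_b1

# If the interval excludes zero, the slope is significantly nonzero
slope_differs_from_zero = not (lo <= 0.0 <= hi)
print(f"b1 = {b1:.3f}, 90% CI = ({lo:.3f}, {hi:.3f})")
```

If the interval did include zero, we could not claim at this confidence level that $y$ depends on $x$ at all.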
\1 Confidence intervals for predictions
\2 Confidence intervals tightest near middle of sample
\2 If we go far out, our confidence is low, which makes intuitive sense
\2 $s_e \big(\frac{1}{m} + \frac{1}{n} + \frac{(x_p - \overline{x})^2}{\sum x^2 - n \overline{x}^2}\big)^\frac{1}{2}$
\2 $s_e$ is the stddev of the errors
\2 $n$ is the sample size
\2 $m$ is how many future observations we are predicting (the interval is for their mean)
\2 $x_p$ is the value of $x$ at which we are predicting
\2 $x_p - \overline{x}$ is capturing difference from center of sample
\2 \textit{Why is it smaller for more $m$?}
\3 The $\frac{1}{m}$ term accounts for the variance of individual observations, which averages out over more of them (assuming normally distributed errors)
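A small sketch of the prediction standard deviation, in Python, to make the two effects concrete: it grows as $x_p$ moves away from $\overline{x}$, and it shrinks as $m$ grows. The sample $x$ values and the value of $s_e$ are assumed for illustration only. To get an actual interval, this quantity would still be multiplied by the appropriate t-quantile with $n - 2$ degrees of freedom.

```python
import math

def prediction_stddev(xs, s_e, x_p, m=1):
    """Stddev of the mean of m future observations predicted at x_p."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum(x * x for x in xs) - n * x_bar ** 2
    return s_e * math.sqrt(1 / m + 1 / n + (x_p - x_bar) ** 2 / sxx)

# Assumed inputs (not from the lecture)
xs = [1, 2, 3, 4, 5, 6, 7]   # sample x values, mean is 4
s_e = 0.5                    # stddev of errors from some earlier fit

at_center = prediction_stddev(xs, s_e, x_p=4)           # tightest: x_p equals the mean
far_out = prediction_stddev(xs, s_e, x_p=20)            # far outside the sample
mean_of_ten = prediction_stddev(xs, s_e, x_p=4, m=10)   # averaging 10 future observations

print(at_center, far_out, mean_of_ten)
```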
\1 Residuals
\2 AKA error values
\2 We can expect several things from them if our assumptions about regressions are correct
\2 Q-Q plot of error distribution vs. normal distribution
\2 Want the spread (stddev) of the residuals to be constant across the range of $x$
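The residual checks above can be sketched without plotting, in Python. The data are made up, and the way the spread is compared (stddev of residuals in the lower half of the $x$ range versus the upper half) is just one simple choice, not a standard test. The Q-Q points pair each sorted residual with the standard normal quantile at probability $(i - 0.5)/n$; if the errors are normal, those points fall roughly on a straight line.

```python
from statistics import NormalDist, pstdev

# Made-up sample data (assumption, not from the lecture)
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.3, 3.8, 6.1, 8.2, 9.7, 12.3, 13.9, 16.1]

# Least-squares fit of y = b0 + b1*x
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum(x * x for x in xs) - n * x_bar ** 2
sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residuals: observed minus fitted
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Q-Q points: theoretical normal quantile vs. sorted residual
normal_q = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
qq_points = list(zip(normal_q, sorted(residuals)))

# Crude constant-spread check: compare lower and upper half of the x range
half = n // 2
spread_low = pstdev(residuals[:half])
spread_high = pstdev(residuals[half:])

print(sum(residuals), spread_low, spread_high)
```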
\1 Switch to R
\2 Show example of linear fitting (good fit)
\2 Show example of linear fitting (bad fit)
\2 Show example of polynomial fit (intercept and 3 coefficients)
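The in-class demo uses R; as a language-neutral illustration of the good-fit versus bad-fit contrast, here is a rough Python sketch that fits a line to two made-up datasets and compares the coefficient of determination $R^2$. The datasets are assumptions chosen to make the contrast stark: one is a line plus small noise, the other is $y = x^2$ on a range symmetric about zero, where a straight line explains none of the variation. The polynomial fit is not covered here.

```python
def r_squared(xs, ys):
    """R^2 of the least-squares line y = b0 + b1*x fitted to (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum(x * x for x in xs) - n * x_bar ** 2
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - y_bar) ** 2 for y in ys)                      # total
    return 1 - sse / sst

xs = list(range(-5, 6))  # -5 .. 5

# Good fit: a true line plus small alternating noise (made-up data)
ys_good = [1 + 2 * x + (0.3 if i % 2 == 0 else -0.3) for i, x in enumerate(xs)]

# Bad fit: a parabola, which no straight line can follow
ys_bad = [x * x for x in xs]

r2_good = r_squared(xs, ys_good)
r2_bad = r_squared(xs, ys_bad)
print(f"good fit R^2 = {r2_good:.4f}, bad fit R^2 = {r2_bad:.4f}")
```

A high $R^2$ alone does not prove the model is right, which is why the residual checks still matter.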
\1 For next time
\2 I won't be here week after spring break
\2 papers3 due Tuesday of spring break week
\2 lab2 now due Friday after spring break
\3 I want some more from you now, so be sure to update your fork
\3 Mainly, I want to know how you will improve the graph you
are reproducing, and to actually look a bit at the code you
