\documentclass[12pt]{article}
\usepackage[no-math]{fontspec}
\usepackage{sectsty}
\usepackage[margin=1.25in]{geometry}
\usepackage{outlines}

\setmainfont[Numbers=OldStyle,Ligatures=TeX]{Equity Text A}
\setmonofont{Inconsolata}
\newfontfamily\titlefont[Numbers=OldStyle,Ligatures=TeX]{Equity Caps A}
\allsectionsfont{\titlefont}

\title{CS6963 Lecture \#1}
\author{Robert Ricci}
\date{March 4, 2014}

\begin{document}

\maketitle

\begin{outline}

\1 Today: How well does your data fit a line?
\2 More complicated regressions exist, of course, but we'll stick with this one for now
\2 Eyeballing is just not rigorous enough

\1 Basic model: $y_i = b_0 + b_1x_i + e_i$
\2 $y_i$ is the response (the thing we predict)
\2 $b_0$ is the y-intercept
\2 $b_1$ is the slope
\2 $x_i$ is the predictor
\2 $e_i$ is the error
\2 \textit{Which of these are random variables?}
\3 A: All but $x_i$: the $b$s are estimated from random variables, and $e$ is a difference between random variables
\3 So we can compute statistics on them

\1 Two criteria for getting the $b$s
\2 Zero total error
\2 Minimize SSE (sum of squared errors)
\2 Example of why one criterion alone is not enough: given two points, there are infinitely many lines with zero total error
\2 Squared errors are always positive, so minimizing them alone could systematically overshoot or undershoot

\1 Deriving $b_0$ is easy
\2 Solve for $e_i$: $e_i = y_i - (b_0 + b_1 x_i)$
\2 Take the mean over all $i$: $\overline{e} = \overline{y} - b_0 - b_1 \overline{x}$
\2 Set the mean error to 0 to get $b_0 = \overline{y} - b_1 \overline{x}$
\2 Now we just need $b_1$

\1 Deriving $b_1$ is harder
\2 SSE = sum of squared errors over all $i$
\2 We want a minimum value for this
\2 It's a function of $b_1$ with one local minimum
\2 So we can differentiate and look for zero
\2 Up to a constant factor, SSE is $s_y^2 - 2b_1s^2_{xy} + b_1^2s_x^2$; take the derivative with respect to $b_1$
\2 $s^2_{xy}$ is the covariance of $x$ and $y$ (see p.~181)
\2 Setting the derivative $-2s^2_{xy} + 2b_1s_x^2$ to zero gives $b_1 = \frac{s^2_{xy}}{s_x^2}$
\3 Covariance of $x$ and $y$ divided by variance of $x$
\3 $b_1 = \frac{\sum{xy} - n \overline{x}\,\overline{y}}{\sum{x^2} - n(\overline{x})^2}$ (checked numerically in the sketch below)

\1 SS*
\2 SSE = sum of squared errors
\2 SST = total sum of squares (TSS): squared differences from the mean, $\sum (y_i - \overline{y})^2$
\2 SS0 = $n\overline{y}^2$ ($\overline{y}$ squared, $n$ times)
\2 SSY = $\sum y_i^2$ (all the $y$s squared), so SST = SSY $-$ SS0
\2 SSR = variation explained by the regression: SST $-$ SSE

\1 Point of the above: total variation splits into two sources: the part explained by the regression (SSR) and the squared errors (SSE)
\2 $R^2 = \frac{SSR}{SST}$
\2 This ratio is the fraction of variation explained by the regression: close to 1 is good (1 is the max possible)
\2 If the regression sucks, SSR will be close to 0

\1 Remember, our error terms and $b$s are random variables
\2 We can calculate stddev, etc.\ on them
\2 Variance is $s_e^2 = \frac{SSE}{n-2}$, the MSE (mean squared error)
\2 Confidence intervals, too
\2 \textit{What do confidence intervals tell us in this case?}
\3 A: Our confidence in how close our estimate is to the true slope
\3 For example: how sure are we that two slopes are actually different?
\2 \textit{When would we want to show that the confidence interval for $b_1$ includes zero?}

\1 Confidence intervals for predictions
\2 Confidence intervals are tightest near the middle of the sample
\2 If we go far out, our confidence is low, which makes intuitive sense
\2 $s_e \big(\frac{1}{m} + \frac{1}{n} + \frac{(x_p - \overline{x})^2}{\sum{x^2} - n \overline{x}^2}\big)^\frac{1}{2}$ (see the sketch below)
\2 $s_e$ is the stddev of the errors
\2 $m$ is how many predictions we are making
\2 $x_p$ is the value of $x$ at which we are predicting
\2 $x_p - \overline{x}$ captures the distance from the center of the sample
\2 \textit{Why is it smaller for larger $m$?}
\3 A: We are predicting the mean of $m$ future observations, and the variance of a mean shrinks as $m$ grows (under the normality assumption)
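\1 Aside: the closed forms above are easy to sanity-check numerically. Below is a minimal Python sketch (not from the book); the data and the names \texttt{xs} and \texttt{ys} are made up for illustration.

\begin{verbatim}
# Least-squares fit of y = b0 + b1*x using the closed forms
# derived above; the data points are invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# b1 = (sum(xy) - n*mean_x*mean_y) / (sum(x^2) - n*mean_x^2)
b1 = (sum(x * y for x, y in zip(xs, ys)) - n * mean_x * mean_y) \
   / (sum(x * x for x in xs) - n * mean_x ** 2)
b0 = mean_y - b1 * mean_x      # from setting the mean error to zero

# Sums of squares and R^2
errors = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
sse = sum(e * e for e in errors)              # unexplained variation
sst = sum((y - mean_y) ** 2 for y in ys)      # total variation (SST)
r_squared = (sst - sse) / sst                 # R^2 = SSR / SST

print(b0, b1, r_squared)
\end{verbatim}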
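\1 Aside: the same sketch extended to the prediction-interval formula above. It reuses the variables from the previous block; \texttt{scipy} supplies the $t$ quantile. The prediction point and interval level are arbitrary choices for illustration.

\begin{verbatim}
from scipy.stats import t

s_e = (sse / (n - 2)) ** 0.5      # stddev of errors: sqrt(MSE)
x_p, m = 3.5, 1                   # predict at x = 3.5, one prediction

s_pred = s_e * (1/m + 1/n
                + (x_p - mean_x) ** 2
                / (sum(x * x for x in xs) - n * mean_x ** 2)) ** 0.5

y_hat = b0 + b1 * x_p
half = t.ppf(0.95, n - 2) * s_pred    # 90% two-sided CI, n-2 dof
print(y_hat - half, y_hat + half)
\end{verbatim}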
\1 Residuals
\2 AKA error values
\2 We can expect several things from them if our assumptions about regressions are correct
\2 They should not show trends: \textit{why would a trend be a problem?}
\3 It tells us that an assumption has been violated
\3 If they are not randomly distributed across $x$, there is a systematic error at high or low values: the error and the predictor are not independent
\2 Q-Q plot of the error distribution vs.\ a normal distribution
\2 We want the spread (stddev) to be constant across the range

\1 For next time
\2 Start filling out your section in the cs6963-lab1 repo
\2 Be careful to modify only the parts of the .tex file for your section
\3 Unless you want to suggest a broader change
\2 Fork it, give your partner access, and send me a merge request before the start of class Thursday
\2 Check in any notes you create and any reference papers
\2 You are empowered to make decisions
\2 The goal is to describe things in sufficient detail that people can start implementing
\2 We will try to finish up our plan on Thursday by deciding what experiments to run and how to present results
\2 We need the next two paper volunteers; let's get them out before spring break

\end{outline}

\end{document}