More details on covariance and Q-Q plots

\2 Mistakes with means: large range, skewness, multiplying when not
independent, ratio with different bases
\1 Covariance
\2 Measure whether two random variables vary together
\2 Sign shows the tendency (together, or opposite)
\2 Depends on the joint probability distribution of the two variables (e.g. how A behaves given B)
\2 Variance of a random var is covariance with itself
\2 $Cov(x,y) = E[xy] - E[x]E[y]$
\2 $Cov(x,y) = E[(x - E[x])(y-E[y])]$
\2 $Cov(x,y) = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})(y_i-\bar{y})$
\2 Sample covariance: same sum over all pairs, but divide by $n-1$ instead of $N$
\2 In other words: if both variables are far from their means ``at the same time'',
the covariance is large in magnitude
\2 Positive if they move together, negative if they move in opposite directions
\2 Correlation is covariance divided by the product of the standard deviations: $\rho_{xy} = Cov(x,y)/(\sigma_x \sigma_y)$
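
The covariance formulas above can be checked numerically. The following is a minimal sketch in plain Python; the function names and the five data points are made up for illustration. It computes the population covariance in both forms, the sample covariance, and the correlation (covariance divided by the product of the two standard deviations).

```python
import math

def mean(values):
    return sum(values) / len(values)

def population_cov(xs, ys):
    """Average product of deviations from the means (divide by N)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def population_cov_shortcut(xs, ys):
    """Equivalent form: E[xy] - E[x]E[y]."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

def sample_cov(xs, ys):
    """Same sum of products, but divide by n - 1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    std_x = math.sqrt(population_cov(xs, xs))  # variance = covariance with itself
    std_y = math.sqrt(population_cov(ys, ys))
    return population_cov(xs, ys) / (std_x * std_y)

# Hypothetical data, chosen only so the arithmetic is easy to follow.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

print(population_cov(xs, ys))           # divide-by-N form
print(population_cov_shortcut(xs, ys))  # E[xy] - E[x]E[y] form
print(sample_cov(xs, ys))               # divide-by-(n-1) form
print(correlation(xs, ys))              # always between -1 and 1
```

The two population forms agree up to floating-point rounding, and the correlation is unitless, which is why it is easier to interpret than the raw covariance.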
\1 Mistakes with means
\2 SIQR (semi-interquartile range): half the width of the middle 50\%, $(Q_3 - Q_1)/2$; very outlier-robust
\2 Mean absolute deviation (use least)
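
To illustrate the outlier robustness of the SIQR, here is a small sketch using Python's standard \texttt{statistics} module. The function names and the data are hypothetical, and quartiles are computed with the module's default (``exclusive'') method; other quartile conventions give slightly different values.

```python
import statistics

def siqr(data):
    """Semi-interquartile range: (Q3 - Q1) / 2."""
    q1, _median, q3 = statistics.quantiles(data, n=4)  # default 'exclusive' method
    return (q3 - q1) / 2

def mean_abs_dev(data):
    """Mean absolute deviation from the mean."""
    m = statistics.fmean(data)
    return sum(abs(x - m) for x in data) / len(data)

# Hypothetical samples: identical except for one extreme outlier.
clean = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
with_outlier = clean[:-1] + [1000.0]

print(siqr(clean), siqr(with_outlier))                  # unchanged by the outlier
print(mean_abs_dev(clean), mean_abs_dev(with_outlier))  # inflated by the outlier
```

Because the quartiles ignore how extreme the largest value is, the SIQR is the same for both samples, while the mean absolute deviation grows with the outlier.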
\1 Quantile-quantile plots
\2 For each quantile: plot pairs of what the theoretical distribution
says it should be, and what the empirical (sample) distribution actually is
\2 For example: take the 5th percentile from each
\2 $x$-axis: theoretical distribution
\2 $y$-axis: empirical distribution
\2 To get the theoretical quantiles, you have to invert the CDF
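
The construction above can be sketched as follows, assuming a standard normal as the theoretical distribution. The function name, the sample values, and the plotting positions $(i - 0.5)/n$ are assumptions for illustration; other plotting-position conventions exist. The inverse CDF comes from \texttt{NormalDist} in Python's standard \texttt{statistics} module (Python 3.8 or later).

```python
from statistics import NormalDist

def qq_points(sample, dist=NormalDist()):
    """Return (theoretical, empirical) quantile pairs for a Q-Q plot.

    The i-th smallest observation is treated as the empirical quantile at
    probability (i - 0.5) / n (an assumed convention), and the matching
    theoretical quantile is obtained by inverting the CDF of `dist`.
    """
    ordered = sorted(sample)
    n = len(ordered)
    points = []
    for i, observed in enumerate(ordered, start=1):
        p = (i - 0.5) / n           # strictly between 0 and 1
        theoretical = dist.inv_cdf(p)  # inverse CDF gives the x-axis value
        points.append((theoretical, observed))
    return points

# Hypothetical sample; in practice this would be the measured data.
sample = [2.1, -0.3, 0.4, 1.2, -1.5]
for theoretical, observed in qq_points(sample):
    print(f"{theoretical:8.3f}  {observed:8.3f}")
```

Plotting the first column on the $x$-axis and the second on the $y$-axis gives the Q-Q plot; points that fall close to a straight line suggest the sample is consistent with the assumed distribution.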
\1 For next time
\2 Read Chapter 13 on comparing systems
\2 HW \#5 due Friday
\2 Next time, we'll talk about confidence intervals and how to pick
winners and losers
