\documentclass[12pt]{article}
\usepackage[no-math]{fontspec}
\usepackage{sectsty}
\usepackage[margin=1.25in]{geometry}
\input{../../texstuff/fonts.sty}
\input{../../texstuff/notepaper.sty}
\usepackage{outlines}
\setmainfont[Numbers=OldStyle,Ligatures=TeX]{Equity Text A}
\setmonofont{Inconsolata}
\newfontfamily\titlefont[Numbers=OldStyle,Ligatures=TeX]{Equity Caps A}
\allsectionsfont{\titlefont}

\title{CS6963 Lecture \#11}
\author{Robert Ricci}
\date{February 19, 2014}

\begin{document}

% …

\1 Quick refresher: sample vs.\ population
  \2 Parameters of a probability distribution vs.\ statistics of the sample
\1 The value of hypothesis testing
  \2 State your goal, test whether or not you achieved it
  \2 ``On Bullshit''
  \2 e.g., a good thesis statement is a testable hypothesis
  \2 ``X is faster than Y''
  \2 ``Z has negligible overhead''
\1 We measure a sample mean, but it is really just an estimate of the population mean
  \2 We can get a confidence interval that the true mean is within some range: significance level / confidence level
  \2 \textit{What is the full population here, and what is the sample?}
    \3 Full population: all possible executions of our experiment
    \3 Sample: the ones we actually run
  \2 We can get a confidence interval that helps us understand what we would get if we ran the experiment more times
  \2 Book explanation of a way to get a confidence interval
    \3 Get multiple samples (multiple trials per sample), compute stats on the means, treat that as a sample set, and take confidence intervals
  \2 Again, iid comes up, and this is why you need to be careful in experiment design
    \3 \textit{When might you not meet the identically distributed criteria?} Caching, warmup, different conditions over the time of day, etc.
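% The confidence-interval machinery below can be sketched in a few lines of
% Python. This is an illustrative sketch only (not part of the lecture); the
% sample data is invented, and z = 1.96 is the hard-coded 95\% quantile.

```python
# Sketch: 95% confidence interval for a sample mean, using the normal
# approximation described in the notes. The sample values are made up
# purely for illustration.
import math
import statistics

sample = [10.2, 9.8, 10.5, 10.1, 9.9, 10.3, 10.0, 10.4]
n = len(sample)
xbar = statistics.mean(sample)   # sample mean (estimate of population mean)
s = statistics.stdev(sample)     # sample standard deviation
z = 1.96                         # z_{1-alpha/2} for alpha = 0.05 (95% CI)

# Standard error of the mean, scaled by the chosen quantile
half_width = z * s / math.sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(f"mean = {xbar:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

% Note the interval is symmetric about the sample mean, exactly because the
% sample means are being modeled with the (symmetric) normal distribution.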
  \2 Standard error --- not to be confused with standard deviation or \texttt{stderr}
    \3 Different samples drawn from the same population would have different means
    \3 The standard error of the sample mean is how close to that real mean you can expect to get
    \3 We are viewing the set of sample means as a distribution and basically looking at its variance
    \3 \ldots{} using the normal distribution, thanks to cool properties of that distribution with respect to iid variables
  \2 Don't confuse with credible intervals
    \3 Probability of the true values
    \3 Requires a priori knowledge/estimation of the distribution
\1 Confidence interval for the sample mean
  \2 Lower: $\overline{x} - \frac{z_{1-\alpha/2}\,s}{\sqrt{n}}$
  \2 Upper: $\overline{x} + \frac{z_{1-\alpha/2}\,s}{\sqrt{n}}$
  \2 \textit{Why is this symmetric?}
    \3 Because we're modeling the sample means using the normal distribution
  \2 $\overline{x}$ is the sample mean
  \2 $s$ is the sample standard deviation
  \2 $z_{1-\alpha/2}$ is the $(1-\alpha/2)$ quantile of the unit normal distribution ($\mu = 0$ and $\sigma = 1$) --- note, you are picking $\alpha$

% …

  \2 If the CI contains zero, the results are not statistically different: the hypothesis ``the two systems are the same'' is supported by the data
\1 Showing significance: visual check
  \2 Draw both confidence intervals and means
  \2 If the CIs don't overlap, one is clearly better

% …

  \2 If the mean of one is in the CI of the other, but this is not true for both, a t-test is required
\1 Showing significance: t-test (e.g., truly random samples)
  \2 Best to leave the implementation of this up to someone else
  \2 Degrees of freedom: the number of independent sources of data that go into the model --- the number of samples minus the steps that go into the estimation
  \2 e.g., R includes this as a module
  \2 Fun fact: the t-test was invented as a way of measuring the quality of beer (Guinness Stout)
\1 Picking CIs
  \2 As discussed before, the degree of confidence has to do with the gain/loss of being outside the range
  \2 Reiterate the plane example: you don't want to fly on a plane built with only 99\% confidence intervals
\1 Proportions
  \2 Similar, but for categorical outcomes (range, not domain)
  \2 What proportion of the population consists of category X?
  \2 Sample proportion: $\frac{n_1}{n}$
  \2 CI for the sample proportion: $p \mp z_{1-\alpha/2} \sqrt{\frac{p(1-p)}{n}}$
  \2 $np > 10$ required
  \2 \textit{Why is this symmetric?}
\1 Picking a sample size
  \2 \emph{What should our goal be when picking a sample size?}
    \3 Give us a high degree of confidence that our sample mean is close to the population mean
    \3 While taking a reasonable amount of time to run experiments
  \2 Data dependent --- on the variance, which is intuitive
  \2 Getting a good mean
    \3 Pick the confidence level we want (say, 95\%)
    \3 Pick the accuracy level (how far on either side)
    \3 Set the accuracy equal to the confidence bounds, solve for $n$
    \3 $n = \left(\frac{100zs}{r\overline{x}}\right)^2$
    \3 $s$ is the sample standard deviation
    \3 $\overline{x}$ is the sample mean
    \3 $z$ is as above ($z_{1-\alpha/2}$, the quantile of the unit normal distribution; note, it encodes the confidence level)
    \3 $r$ --- accuracy of plus or minus $r\%$
    %\3 $n = z^2\frac{p(1-p)}{r^2}$
  \2 Comparing two systems
    \3 The goal is to end up with non-overlapping intervals at some confidence level
    \3 The upper edge of the lower interval must be below the lower edge of the upper
    \3 Run some sample experiments to get an initial mean and stddev for each
    \3 Fill in all numbers for both confidence intervals except $n$
    \3 Set the upper bound of one to be lower than the lower bound of the other
    \3 Solve for $n$
\1 For next time
  \2 Bring your laptop

% …