From e2b924565f4b0594d63a619b998a4223612ed017 Mon Sep 17 00:00:00 2001
From: Robert Ricci
Date: Mon, 16 Feb 2015 16:01:05 -0800
Subject: [PATCH] More updating left some XXXes to look at later

---
 lectures/lecture10/lecturenotes.tex | 36 ++++++++++++++---------------
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/lectures/lecture10/lecturenotes.tex b/lectures/lecture10/lecturenotes.tex
index 53019cf..4152a70 100644
--- a/lectures/lecture10/lecturenotes.tex
+++ b/lectures/lecture10/lecturenotes.tex
@@ -25,9 +25,14 @@ of the statistics; eg. sample mean
     \2 \textit{When might your samples not be independent of each other?}
         \3 This is key because a lot of statistical tests require iid variables
+        \3 Throughput and latency
+        \3 Arrival times between events
+        \3 Two properties of an events (eg. read/write and latency)
     \2 You can multiply together probs. when independent, have to start using
        conditional probabilities when not
+        \3 Example: Sampling with replacement, sampling w/o replacement
     \2 \textit{Why do we consider our measurements random variables?}
+        \3 They are affected by underlying random processes
     \2 CDF vs. PDF vs. PMF
     \2 \textit{How do we calculate probability a value will be within a range?}
         \3 Integral (CDF) at point $b$, minus integral at point $a$
@@ -37,6 +42,7 @@
     \2 Probability must be in range 0 to 1
     \2 Independence
     \2 Adding: mostly used for mutually exclusive events in the same trial
+        \3 eg. prob. of a write is prob. of insert plus prob. of update
     \2 Multiplication: Used to calculate probability across multiple trials
     \2 Sampling w/ replacement vs. w/o replacement: relationship to independence
 
@@ -44,10 +50,9 @@
     \2 ``The value you can expect to get''
     \2 AKA the mean
     \2 PDF / PMF is balanced on the expected value
-    \2 Variance (sigma squared) is the expected deviation from the mean
-       (squared)
+    \2 Variance (sigma squared) is the expected deviation from the mean \\ (squared)
     \2 $E[X]E[Y]$ is expected value of $X$ times expected value of $Y$
-    \2 $E[XY]$ is expected value of $X * Y$
+    \2 $E[XY]$ is expected value of $X * Y$ (joint probability)
     \2 Linearity of expectation
 
 \1 Mean, median, mode
@@ -57,7 +62,7 @@
     \2 Mistakes with means: large range, skewness, multiplying when not
        independent, ratio with different bases
 
-\1 Covariance
+\1 Covariance XXX
     \2 Measure whether two random variables vary together
     \2 Sign shows the tendency (together, or opposite)
     \2 Joint probability distribution: eg. A given B
@@ -79,24 +84,19 @@
     \2 Don't look at skewness
     \2 Can only multiply means if independent
     \2 \textit{When to use arithmetic vs. Geometric vs. harmonic mean}
-
-\1 Means of ratios
-    \2 Case 1: Sum of numerators and denominators both have physical meanings
-        \3 eg. sum of CPU busy times over sum of experiment durations
-    \2 Case 1a: Arithmetic mean can be used if bases are constant
-    \2 Case 1b: Harmonic mean can be used if numerators are constant
-    \2 Case 2: If cases are ``expected'' to be $a_i = cb_i$, can estimate
-       $c$ by taking geometric mean
-
+        \3 Total is of interest (eg. time), product is of interest (eg.
+           speedup)
 \1 Picking index of dispersion
     \2 Range (when bounded)
+        \3 Use a variance based metric when using mean, using a percentile based
+           metric when using median
     \2 Var or stddev (sttdev is in the right units) --- see also CoV
     \2 Percentiles - 10 and 90, or 5 and 95 (want a sense of how long things will
        take in extreme case
-    \2 SIQR (semi interquanile range): middle 50% / 2: very outlier-robust
+    \2 SIQR (semi interquanile range): middle 50\% / 2: very outlier-robust
     \2 Mean absolute dev (use least)
 
-\1 Quantile-quantile plots
+\1 Quantile-quantile plots XXX
     \2 For each quartile: plot pairs of what the theoretical distribution should be,
        and what the empirical (sample) distribution actually is
     \2 $x$-axis: theoretical distribution
@@ -106,10 +106,8 @@
     \2 Heavy tail / light tail
 
 \1 For next time
-    \2 HW \#5 due tonight
-    \2 Reach Chapter 13 on comparing systems
-    \2 HW \#6 posted
-        \3 There is a part that you need to do \emph{before} class
+    \2 Read Chapter 13 on comparing systems
+    \2 HW \#5 due Friday
-- 
GitLab