Commit e2b92456 authored by Robert Ricci

More updating left some XXXes to look at later

parent d5c40ecb
......@@ -25,9 +25,14 @@
of the statistics; eg. sample mean
\2 \textit{When might your samples not be independent of each other?}
\3 This is key because a lot of statistical tests require iid variables
\3 Throughput and latency
\3 Arrival times between events
\3 Two properties of an event (eg. read/write and latency)
\2 You can multiply together probs. when independent, have to start using
conditional probabilities when not
\3 Example: Sampling with replacement, sampling w/o replacement
\2 \textit{Why do we consider our measurements random variables?}
\3 They are affected by underlying random processes
\2 CDF vs. PDF vs. PMF
\2 \textit{How do we calculate probability a value will be within a range?}
\3 Integral (CDF) at point $b$, minus integral at point $a$
......@@ -37,6 +42,7 @@
\2 Probability must be in range 0 to 1
\2 Independence
\2 Adding: mostly used for mutually exclusive events in the same trial
\3 eg. prob. of a write is prob. of insert plus prob. of update
\2 Multiplication: Used to calculate probability across multiple trials
\2 Sampling w/ replacement vs. w/o replacement: relationship to independence
......@@ -44,10 +50,9 @@
\2 ``The value you can expect to get''
\2 AKA the mean
\2 PDF / PMF is balanced on the expected value
\2 Variance (sigma squared) is the expected deviation from the mean
\2 Variance (sigma squared) is the expected deviation from the mean \\ (squared)
\2 $E[X]E[Y]$ is expected value of $X$ times expected value of $Y$
\2 $E[XY]$ is expected value of $X * Y$
\2 $E[XY]$ is expected value of $X * Y$ (joint probability)
\2 Linearity of expectation
\1 Mean, median, mode
......@@ -57,7 +62,7 @@
\2 Mistakes with means: large range, skewness, multiplying when not
independent, ratio with different bases
\1 Covariance
\1 Covariance XXX
\2 Measure whether two random variables vary together
\2 Sign shows the tendency (together, or opposite)
\2 Joint probability distribution: eg. A given B
......@@ -79,24 +84,19 @@
\2 Don't look at skewness
\2 Can only multiply means if independent
\2 \textit{When to use arithmetic vs. Geometric vs. harmonic mean}
\1 Means of ratios
\2 Case 1: Sum of numerators and denominators both have physical meanings
\3 eg. sum of CPU busy times over sum of experiment durations
\2 Case 1a: Arithmetic mean can be used if bases are constant
\2 Case 1b: Harmonic mean can be used if numerators are constant
\2 Case 2: If cases are ``expected'' to be $a_i = cb_i$, can estimate
$c$ by taking geometric mean
\3 Total is of interest (eg. time), product is of interest (eg.
\1 Picking index of dispersion
\2 Range (when bounded)
\3 Use a variance based metric when using mean, using a percentile based
metric when using median
\2 Var or stddev (stddev is in the right units) --- see also CoV
\2 Percentiles - 10 and 90, or 5 and 95 (want a sense of how long
things will take in the extreme case)
\2 SIQR (semi interquartile range): middle 50% / 2: very outlier-robust
\2 SIQR (semi interquartile range): middle 50\% / 2: very outlier-robust
\2 Mean absolute dev (use least)
\1 Quantile-quantile plots
\1 Quantile-quantile plots XXX
\2 For each quantile: plot pairs of what the theoretical distribution
should be, and what the empirical (sample) distribution actually is
\2 $x$-axis: theoretical distribution
......@@ -106,10 +106,8 @@
\2 Heavy tail / light tail
\1 For next time
\2 HW \#5 due tonight
\2 Reach Chapter 13 on comparing systems
\2 HW \#6 posted
\3 There is a part that you need to do \emph{before} class
\2 Read Chapter 13 on comparing systems
\2 HW \#5 due Friday
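
A minimal sketch of three quantities the outline mentions: the ratio of sums (Case 1 under "Means of ratios"), covariance written as $E[XY] - E[X]E[Y]$, and SIQR. This is not part of the commit. It assumes Python 3.8+ and the standard `statistics` module; the function names `ratio_of_sums`, `covariance`, and `siqr` and all sample values are hypothetical, chosen only for illustration.

```python
import statistics


def ratio_of_sums(numerators, denominators):
    """Case 1: both sums have physical meaning, so divide sum by sum."""
    return sum(numerators) / sum(denominators)


def covariance(xs, ys):
    """Population covariance computed as E[XY] - E[X]E[Y]."""
    mean_xy = statistics.fmean([x * y for x, y in zip(xs, ys)])
    return mean_xy - statistics.fmean(xs) * statistics.fmean(ys)


def siqr(samples):
    """Semi-interquartile range: half the width of the middle 50%."""
    # quantiles(n=4) returns the three cut points Q1, median, Q3
    # (default method is 'exclusive'; other methods give slightly
    # different quartiles on small samples).
    q1, _median, q3 = statistics.quantiles(samples, n=4)
    return (q3 - q1) / 2


if __name__ == "__main__":
    # Hypothetical samples, made up for illustration only.
    busy_seconds = [30, 10]      # CPU busy time per experiment
    total_seconds = [60, 40]     # duration of each experiment
    print("utilization:", ratio_of_sums(busy_seconds, total_seconds))

    latencies = [1, 2, 3, 4, 5, 6, 7000]  # one extreme outlier
    print("stddev:", statistics.stdev(latencies))
    print("SIQR:", siqr(latencies))

    # Sign of the covariance shows whether the variables move together.
    print("cov:", covariance([1, 2, 3], [2, 4, 6]))
```

With the outlier in `latencies`, the standard deviation is dominated by the single large value while the SIQR is unchanged from the outlier-free case, which is the sense in which the outline calls SIQR "very outlier-robust".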