Commit 158986f7 authored by Robert Ricci

Lecture notes

parent 66ce841e
......@@ -14,8 +14,107 @@
\begin{outline}
\1 Interesting things from the Remy study
\2 The authors of this paper did a pretty good job
\2 It was pretty easy to do
\2 Most people felt more confident in the results after
\3 But did not necessarily understand what was going on better
\3 Tradeoff between making it easy and teaching something
\3 Know your audience; you don't have to teach the basics
\2 Results came out similar, but not identical (Vegas consistently
had something funny going on)
\2 Parameters / reasons behind them were not documented
\2 It was not easy to discover how to re-run individual experiments
\3 Probably not all that hard to actually do, but hard to discover how
\2 One student suggested making VM images available
\3 \textit{What would the downside be?}---even more opaque
\2 The authors' institution is not useful input into whether you should
trust a paper
\3 Established credibility of individual authors can be, though
\3 But you should mostly be looking at the claims of the paper and
how well they are supported, both in the body of the paper and
in things it references
\3 This is a scientific process, not a popularity contest
\1 Definitions
\2 Repeatability: Do the same thing again, get the same results
\3 Requires access to the code and same environment
\2 Reproducibility: Do it independently
\3 Validate the idea, not just the code or environment
\3 Does not necessarily require code
\2 Benefaction: Avoid needless replication of work
\3 I don't like this characterization
\3 We do it for selfish reasons too
\3 In systems, we build on code not just algorithms
\1 Why I had you read the Collberg paper
\2 To get some important definitions
\2 To understand the state of reproducibility in our field
\2 To get a sense of what you are in for with the class project
\1 Main findings of the study
\2 Code is hard to get
\2 And often hard to build when you can get it
\1 Why are reproducibility and repeatability in CS so hard?
\2 ``All'' you have to do is run software!
\2 \textit{Brainstorm on ideas}
\3 Experiment environments hard to set up
\3 Gathering dependencies, esp. when those dependencies are themselves
research artifacts
\3 May not be documented
\3 Experiments may not be scripted
\3 May require special hardware
\3 May require access to licensed software
\3 Code not released
\3 Versioning
\2 Why people don't release code
\3 Afraid of having to support it
\3 Afraid others will scoop them
\3 Lose it
\3 License agreement with industrial funder/partner
\3 Control over use (don't use it unoptimized or in inappropriate
settings)
\3 Afraid of having someone else show that they can do better
\textbf{and the obvious flaw with this reasoning}
\2 Cultural problem:
\3 We are not rewarded for making it easy
\3 Or for doing reproductions
\1 Lessons from Collberg paper
\2 \textit{How many of these are feasible?}
\2 \textit{Which are most important?}
\2 Unless you have compelling reasons not to, plan to release the code
\2 Students will leave, plan for it
\2 Create permanent email addresses
\2 Create project websites
\2 Use a source code control system
\2 Back up your code
\2 Resolve licensing issues
\2 Keep your promises
\2 Plan for longevity
\2 Avoid cool but unusual designs
\2 Don't rely on the permanence of external software
\2 Plan for repeatable releases
\1 What we are going to do for the final project
\2 Pick a paper
\2 Pick one key graph
\2 Get the code
\2 Try to reproduce that graph
\2 Improve the evaluation some way (measure variance, compare to
a different system, improve representativeness or environment or
workload, etc.)
\2 Make a profile for CloudLab/Apt
\2 Document how you did it
\2 Start by picking two papers, picking the graph, and finding the code
\2 You are probably not going to do both; I want a backup in case
one turns out to be too hard
\1 For next time
\2 Read Chapter 14, linear regression
\2 Think of some papers, due as an assignment by Thursday
\2 Don't forget about papers3
\end{outline}
......