From 8356b2757fa0c8f376e7e79f5168128cc1b929be Mon Sep 17 00:00:00 2001 From: Kirk Webb <kwebb@flux.utah.edu> Date: Tue, 2 Nov 2004 23:29:29 +0000 Subject: [PATCH] Moving plab text from public doc space to private. --- doc/plab/APIcomp | 234 ----------------------------------- doc/plab/Failmodes | 77 ------------ doc/plab/Outline | 71 ----------- doc/plab/abs.tex | 3 - doc/plab/abs.txt | 17 --- doc/plab/conclusion.tex | 1 - doc/plab/defs.tex | 135 -------------------- doc/plab/design.tex | 13 -- doc/plab/eval.tex | 15 --- doc/plab/features.tex | 110 ---------------- doc/plab/from-nsf-report.tex | 75 ----------- doc/plab/goals.tex | 2 - doc/plab/impl.tex | 207 ------------------------------- doc/plab/intro.tex | 6 - doc/plab/outline-new.txt | 132 -------------------- doc/plab/pgformat.tex | 104 ---------------- doc/plab/plab.tex | 82 ------------ 17 files changed, 1284 deletions(-) delete mode 100644 doc/plab/APIcomp delete mode 100644 doc/plab/Failmodes delete mode 100644 doc/plab/Outline delete mode 100644 doc/plab/abs.tex delete mode 100644 doc/plab/abs.txt delete mode 100644 doc/plab/conclusion.tex delete mode 100644 doc/plab/defs.tex delete mode 100644 doc/plab/design.tex delete mode 100644 doc/plab/eval.tex delete mode 100644 doc/plab/features.tex delete mode 100644 doc/plab/from-nsf-report.tex delete mode 100644 doc/plab/goals.tex delete mode 100644 doc/plab/impl.tex delete mode 100644 doc/plab/intro.tex delete mode 100644 doc/plab/outline-new.txt delete mode 100644 doc/plab/pgformat.tex delete mode 100644 doc/plab/plab.tex diff --git a/doc/plab/APIcomp b/doc/plab/APIcomp deleted file mode 100644 index 480481fef3..0000000000 --- a/doc/plab/APIcomp +++ /dev/null @@ -1,234 +0,0 @@ -**** The Emulab-Planetlab interface: yesterday, today, and tomorrow. - -Emulab constantly grapples with the changing face of Planetlab. 
The -method for creating experiments containing planetlab nodes (vservers -on physical nodes) has gone from a mostly decentralized, high-speed -interface, to a monolithic central one, and now appears to be moving -back to a more distributed arrangement. - -The original dslice interface was closer to ideal, and they may get -back there eventually (we hope). In this interface, a central broker -dealt out tickets for lease redemption at individual node managers. -Querying node availability and obtaining tickets were the only -centralized operations, and they were quick to execute. A "slice" was -only logically represented, comprised of the constituent nodes where -tickets were traded for vserver leases; there was no central notion of -the slice at planetlab. Setup, manipulation, and renewal were all -negotiated with individual node managers. - -The worst of the interfaces, perhaps as bad as could be imagined, was -the original centralized PLC API. It was completely asynchronous with -horrible execution time (1 hr. sliver instantiation), no remote setup -failure indicators, and no (non ad hoc) way of querying the readiness -or status of a particular sliver. This prompted Emulab to push for a -synchronous call that would return upon sliver instantiation. While -this was implemented, it was far from perfect with numerous failure -modes and a high failure rate. - -The new API decouples the portion of monolithic PLC that actually -creates the slivers and gives it back to the individual node managers. -It can, optionally, do all the work w/o requiring the user to go to -the node managers though. - -**** Differences in Planetlab APIs: - -** Calls used in dslice: - -getads(): - -Get dslice advertisements (central agent). Used to find out which -nodes were currently up and running dslice. Returned a list of IP -addresses. - -newtickets(slice, ntickets, leaselen, ips): - -Grab tickets from central dslice agent to redeem for sliver leases via -particular dslice node managers. 
Arguments are: slice to perform action for; -number of tickets; lease length; and ip addresses of the recipient nodes. -Returned a structure representing a dslice ticket. - -newleasevm(ticketdata, privatekey, publickey): - -Redeem ticket(s) in exchange for a lease on a particular node. This RPC is -issued to each node's dslice nodemanager for which we have a ticket we wish -to redeem for a lease. A new vserver was created for the specified slice if -it did not already exist. It was also a convenience function, wrapping the -newlease() and newvm() calls. Returned a dslice lease structure. - -deletelease(slice): - -Revoke the lease for a slice on a particular node. This releases the vserver -allocated on the node for this slice (the sliver). This RPC is issued to the -individual dslice node managers when tearing down a plab vserver setup by -Emulab. Succeeded or threw an exception. - -addkey(slice, key): - -Add the public portion of an ssh keypair to the slice account on a particular -node. We called this when creating new Plab vservers to put boss' public -key into the authorized_keys file for the slice user inside the vserver. -Succeeded or threw an exception. - -renewlease(slice): - -Renew existing dslice leases on individual nodes. As no formal credit tracking -system was ever put into place in dslice, new tickets did not have to be -presented in order to renew. Renewal was assumed to have a time span of the -same length as the original lease length requested. Returned a new lease -structure for the renewed lease. - - -** Calls used in PLC: - -Unless otherwise stated, calls to PLC did not immediately affect plab nodes. -Instead, the central DB was updated, and the nodes would notice the change the -next time they checked in (up to an hour later). - -createSlice(): - -Created a new PLC slice; no nodes were affected at this point. - -deleteSlice(): - -Removed a previously created PLC slice. 
If any nodes were participating in -the slice, the vservers they hosted for it would eventually get reclaimed. - -AssignNodes(): - -Assigned a set of nodes to the slice. Slivers would eventually get set up -on them. - -UnAssignNodes(): - -Removed a set of nodes from the slice. Slivers on participating nodes would -eventually get removed. - -AssignUsers(): - -Added a set of (PLC registered) users to the slice. This allowed those users -to log into the slice as the slice user using the ssh keypair they have -registered with Plab. We only added the fake boss server account which was -used to bootstrap the Emulab vserver environment. - -AssignShares(): - -Add shares (and a lease length) to the slice. Had to be done before any nodes -could be added to the slice. Note that we really only used this to set the -lease length. The shares were ill defined and unused in this API; we just -added an arbitrary amount to the slice (and we essentially had an infinite -pool of them to use). - -InstantiateSliver(): - -A special call added by PLC just for Elab to eliminate the long wait necessary -to ensure a vserver had been setup for a slice (node checked into PLC) before -trying to use it. It effectively "pushed" the setup out from PLC, but failed -a good part of the time and took an unreasonable amount of time in many cases. -It did provide us with a synchronous interface for creating plab vnodes though. - -listSlice(): - -A call whose semantics changed several times. It was able to list all slices -belonging to a particular site (and the participating plab nodes) or list -the nodes participating in a particular slice. It also provided other info -on the slices, such as their recorded expiration time. We used this last -bit of info to sync our notion of expiration for Elab created slices w/ PLC's. - -** Calls potentially used in PLC/NM hybrid API: - -Shares will (allegedly) become first class citizens in this newest incarnation -of the PLC API. 
Part of the interface to the local Nodemanager has also been -exposed, but the details are sketchy at this point. - -The API presented on the PLC wiki at -https://www.planet-lab.org/Wiki/bin/view/Planetlab/PlcAPI -doesn't make it clear how shares are presented/acquired. They may be part of -the 'auth' parameter required by most calls. - -SliceCreate(): - -Register the slice with PLC database. - -SliceDelete(): - -Remove the slice from PLC db. - -SliceRenew(): - -Extend lease for slice. The effect this call has is unclear if the -slice is not 'instantiated'. It's also not clear how shares are recycled -and reapplied (if necessary at all). - -SliceUsersAdd(): - -Probably identical to AssignUsers() in PLC v2 above. - -SliceNodesAdd(): - -Probably identical to AssignNodes() in PLC v2. - -SliceNodesDel(): - -Probably identical to UnAssignNodes() in PLC v2. - -SliceNodesList() and SliceInfo(): - -Will be used to verify integrity/status of slices. - -*Attr*(): - -No idea what the planned/envisioned use for these slice attribute manipulation -functions are on PLC's end, so not sure we will need to use them. - -SliceInstanceStart(): - -Not sure we will use this. We may go straight to the individual node managers -and invoke create_sliver() (or whatever they end up calling it). Depends -on the semantics of this call; may have to call it to "activate" the slice. - -SliceInstanceUpdate(): - -No idea. May have to use this to manage shares. May push out changes w.r.t. -node membership for a slice. - -SliceTicketGet(): - -This is presumably where we will present shares to obtain tickets for use with -the node manager. - -SliceTicketAuthorize(): - -The description for this function is confusing. It claims to create the slice, -but other calls say they do the same thing. May be the mechanism for -presenting tickets obtained with the former call to PLC (if you want to use it -rather than the node manager to create the slivers). 
- -**** Rough translation of API functionality (needed by Elab): - -dslice PLC PLC/NM hybrid ------------------------------------------------------------------------------ - -getads() <parse static XML file> <parse static XML file?> - -newtickets() <N/A> SliceTicketGet()? - -newleasevm() AssignShares() SliceTicketAuthorize()? - AssignNodes() SliceNodesAdd() - InstantiateSliver() create_sliver() - -deletelease() UnAssignNodes() SliceNodesDel() - -addkey() AssignUsers() SliceUsersAdd() - -renewlease() AssignShares() SliceTicketGet()? - SliceRenew() - -<N/A> createSlice() SliceCreate() - SliceInstanceStart()? - -<N/A> deleteSlice() SliceDelete() - -<N/A> listSlice() SliceInfo() - SliceNodesList() - - diff --git a/doc/plab/Failmodes b/doc/plab/Failmodes deleted file mode 100644 index 6e53234f89..0000000000 --- a/doc/plab/Failmodes +++ /dev/null @@ -1,77 +0,0 @@ -Observed Failure modes in the Planetlab programmatic interface - -dslice: - -* Improperly initialized vservers; broken passwd file - -* No vserver setup at all; vserver creation race - -* Incomplete dslice service deployment (coverage not 100%) - -PLC (where do we start :) - -* NodeID is a moving target - -There is no one identifier that is guaranteed to remain fixed in the PLC -database for any particular node. Node index, IP, and hostname can and -have been observed to change (sometimes two simultaneously). - -* Disappearing slices - -We have seen slices simply cease to exist before they have expired. - -* Renewal mechanism does not enforce stated policy limits on duration. - -Leases can be pushed far into the future despite the stated two month -maximum (enforced by the PLC web interface, but not the prog API). - -* InstantiateSliver(): "node is not responding" - -The reason behind this is not always clear. Our three try redundancy -doesn't seem to help in this case, though trying perhaps five minutes later -sometimes succeeds. 
- -* InstantiateSliver(): indefinite hang - -We've seen this one often; a call to IS simply hangs and doesn't return in -a reasonable amount of time (we've let the call sit for an hour at most). -Seems to imply a lack of robustness in the IS semantics (duh..) - -* InstantiateSliver(): "error" - -A grossly defined condition that is often recoverable upon trying the -operation again after a few second delay. - -* Inaccessible slivers: - -Newly created slivers are not always accessible via ssh. Access is simply -denied, even though the node is listed as a member of the slice in PLC. This -condition is rarely seen. - -* Delayed sliver reaping leads to dirty reassignment: - -If an attempt is made to create a sliver on a particular node for a -particular slice on which the sliver was recently (< 20 minutes) -deallocated, the deallocated, "dirty" sliver will be reinstated into -the slice instead of a pristine, newly created vserver. - -* Disappearing Slivers - -Sometimes slivers in a particular slice will be blown away and replaced by -clean vservers, destroying any OOB data/state previously set up in the sliver. - -* Incomplete sliver setup: - -Even after InstantiateSliver() returns successfully for a particular -sliver, there are times when the sliver is not accessible via ssh. -It's likely that boss' public key did not properly get associated, or -the sliver didn't actually get created. There are likely other -failures that can be attributed to this problem which I have not -identified here. - -* Disappearing Nodes: - -As PLC has no callback mechanisms to alert users of changes in node status, -the occasional PLC node ID change or removal will cause the corresponding -sliver to get silently yanked out from under the slice. - diff --git a/doc/plab/Outline b/doc/plab/Outline deleted file mode 100644 index 7408ff83ae..0000000000 --- a/doc/plab/Outline +++ /dev/null @@ -1,71 +0,0 @@ -Evolving design_document/techreport/paper on Emulab's frontend to Planetlab. 
- -It will start as mostly "features" and "design/implementation", -with the design/impl being an informal account of how -it works, as an internals document for us. - -That will be releasable/public. - -It will evolve into a TR, also releasable. - -Then some version of it will evolve into a real paper -(with eval, more spin, etc). - - -OUTLINE - -Title: Emulab's PlanetLab Backend - % Emulab's Portal to PlanetLab - -Introduction - -Motivation - -Goals/Requirements - -Features - Also describe it in terms of ``services'', from llp's 10/03 slide - - His service taxonomy: - Slice == Experiment Portal - Create expt/slice - Resource Discovery [w/ Monitoring Service] - Resource Allocation - Boot Slice [w/ Environment Service] - - Maintain expt/slice - Software Upgrades [w/ Environment Service] - Monitor Health [w/ Monitoring Service] - Project membership - - *Control expt/slice - Node - Expt - - * = Not in llp's taxonomy; Elab-only - - As of 10/03 Plab people were developing; did not have anything: - Environment Service - Monitoring Service - Resource allocation - - -Design -[for first version, probably keep design and impl together, for -ease of brain-dumping] - - Rob: - -Assign and node selection - -SW distrib and update (state mgmt, hierarchical) - -Startup commands - (Wide-area event system) - -Implementation - -Evaluation - SWE: - Integration with rest of Elab's design and code - dslice perf - PLC/NM perf - -Conclusion diff --git a/doc/plab/abs.tex b/doc/plab/abs.tex deleted file mode 100644 index 2bc5ad8277..0000000000 --- a/doc/plab/abs.tex +++ /dev/null @@ -1,3 +0,0 @@ -\begin{abstract} - -\end{abstract} diff --git a/doc/plab/abs.txt b/doc/plab/abs.txt deleted file mode 100644 index b2b4fc9864..0000000000 --- a/doc/plab/abs.txt +++ /dev/null @@ -1,17 +0,0 @@ -Service oriented architectures, such as Planetlab, become much more -attractive to the user community when a unified and simple, but -flexible user interface is available. 
Aside from masking the -underlying details and complexities, the infrastructure supporting -this interface should be fault-tolerant and [resource-efficient]; -attempting to recover and (re)distribute resources in order to meet -user requests while minimizing system impact. Emulab provides such an -interface to Planetlab, exposing its resources in a powerful, -straightforward fashion. In the process of creating this interface -and tracking Planetlab's evolving interface, we have identified -several facilities we see as key components for a [Resource Platform] -to provide to external testbed integrators. We have also exposed API -semantics and resource management techniques that hinder integration -with testbed architectures such as Emulab with relatively rapid -experimental cycles. - -* Need better terms for the ones in brackets above. diff --git a/doc/plab/conclusion.tex b/doc/plab/conclusion.tex deleted file mode 100644 index 5e874985bd..0000000000 --- a/doc/plab/conclusion.tex +++ /dev/null @@ -1 +0,0 @@ -\section{Conclusion} diff --git a/doc/plab/defs.tex b/doc/plab/defs.tex deleted file mode 100644 index c817a2c284..0000000000 --- a/doc/plab/defs.tex +++ /dev/null @@ -1,135 +0,0 @@ -% Something to do before final version -\newcommand{\ToDo}[1]{\par{{\bf ToDo:} \sl #1}\par} -% \newcommand{\ToDo}[1]{} - -% We don't need no steenkin' equations - just gimme a working underscore! -\catcode`\_=\active - -\long\def\note#1{{\em {\bf Note: } #1}} - -\long\def\toolong#1{} % Stuff omitted for space reasons - -% Permanently commented out stuff. -\long\def\comment#1{} - -% Temporarily commented out stuff. -\long\def\com#1{} - -% Outline material -\long\def\outline#1{} % out for now - -\long\def\TEMP#1{} - -%% Stuff not to be included in public version of proposal. -% Form for full version -\long\def\private#1{#1} - % Form for public version % \long\def\private#1{} - -%% Stuff to be included only in public version of proposal. 
-% Form for full version -\def\public#1{} - % Form for public version - % \def\public#1{#1} - -% Scary stuff weasel out of committing to. -\def\weasel#1{} - -% Detail stuff, omit. -\long\def\detail#1{} - -% These questions need to be resolved before any kind of publication. -\long\def\xxx#1{{\em {\bf Fix: } #1}} -% \long\def\xxx#1{} - -\newcommand{\towrite}[1]{~~~~\emph{To write: #1}} - -\def\xcite#1{[#1]} - -%% For your last-minute space massaging needs, I present \captionsize -% \def\captionsize{\footnotesize} -\def\captionsize{\small} - -\newcommand{\etal}{{\it et al.}\xspace} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -\newcommand{\defterm}[1]{\emph{#1}} - -\newcommand{\Emulab}{Emulab\xspace} -\newcommand{\emulab}{\Emulab} -\newcommand{\emulabnet}{emulab.net\xspace} -\newcommand{\Emulabnet}{Emulab.net\xspace} -\newcommand{\Netbed}{Netbed\xspace} -\newcommand{\netbed}{\Netbed} -\newcommand{\netbedorg}{netbed.org\xspace} -\newcommand{\Netbedorg}{Netbed.org\xspace} - -\newcommand{\ns}{{\it ns}\xspace} - -\newcommand{\plab}{PlanetLab\xspace} -\newcommand{\dslice}{{\it dslice}\xspace} - -\newcommand{\boss}{\texttt{masterhost}\xspace} -\newcommand{\users}{\texttt{usershost}\xspace} -\newcommand{\fserver}{\texttt{fileserver}\xspace} -\newcommand{\tipserv}{\texttt{tipserv}\xspace} - -\newcommand{\tmc} {{\sc tmc}\xspace} -\newcommand{\tmcc} {{\sc tmcc}\xspace} -\newcommand{\tmcd} {{\sc tmcd}\xspace} -\newcommand{\tmccd}{{\sc tmcc/d}\xspace} - -\newcommand{\capture}{\code{capture}\xspace} - -\newcommand{\janos}{{\sc Janos}\xspace} -\newcommand{\Janos}{{\sc Janos}\xspace} -% -% \newcommand{\janos}{Janos\xspace} -% \newcommand{\Janos}{Janos\xspace} -% -% Boldface variations, for use in titles, etc. 
-% -\newcommand{\janosbf}{\textbf{\textsc{Janos}}\xspace} -\newcommand{\Janosbf}{\textbf{\textsc{Janos}}\xspace} - -\newcommand{\Khazana}{Khazana\xspace} - -\newcommand{\Knit}{Knit\xspace} -\newcommand{\Jiazzi}{Jiazzi\xspace} -\newcommand{\MzScheme}{MzScheme\xspace} - -\newcommand{\oskit}{OSKit\xspace} -% Virtual memory -\newcommand{\vm}{{\sc vm}} - -\newcommand{\NPM}{NPM\xspace} -\newcommand{\NPMJava}{\NPM Java\xspace} -\newcommand{\JavaNPM}{the Java \NPM} -\newcommand{\JavaNPMCap}{The Java \NPM} - -\newcommand{\spin}{\mbox{\sc Spin}} -% \def\xkernel{{\it x}-kernel} -\newcommand\xkernel{{\it x}-kernel\xspace} - -\newcommand{\posix}{{\sc posix}\xspace} - -\newcommand{\mlos}{ML/OS\xspace} -% \def\sros{SR/OS} -% \def\javaos{Java/PC} - -% NSF -% Replace with whatever platform we decide on -% \alpha is already a tex command. -\newcommand{\alphax}{DEC Alpha\xspace} -\newcommand{\Alphax}{DEC Alpha\xspace} -\newcommand{\alphaxbf}{\textbf{\textsc{DEC Alpha}}\xspace} -\newcommand{\Alphaxbf}{\textbf{\textsc{DEC Alpha}}\xspace} - -\newcommand{\code}[1]{\texttt{#1}} - -% ``Section'' is ambiguous-- I believe it means the lettered ones. - -\def\psinc(#1,#2)#3{ -\setlength{\unitlength}{1in} -\centering\begin{picture}(#1,#2) -\put(0,0){\special{#3}} -\end{picture}} diff --git a/doc/plab/design.tex b/doc/plab/design.tex deleted file mode 100644 index 26aa86adbe..0000000000 --- a/doc/plab/design.tex +++ /dev/null @@ -1,13 +0,0 @@ -\section{Design} -\label{design} - -\xxx{Maybe keep this and impl together to start with, - for ease of brain-dumping.} - - --Assign and node selection \towrite{Rob} - --SW distrib and update (state mgmt, hierarchical) - --Startup commands - (Wide-area event system) diff --git a/doc/plab/eval.tex b/doc/plab/eval.tex deleted file mode 100644 index fb1af441b2..0000000000 --- a/doc/plab/eval.tex +++ /dev/null @@ -1,15 +0,0 @@ -% Informal experience to start with, later more evaluation, -% or turn into evaluation. 
- -\section{Experience and Evaluation} -\label{experience} -\label{eval} - - - SWE: - Integration with rest of Elab's design and code - - dslice perf - - PLC/NM perf - diff --git a/doc/plab/features.tex b/doc/plab/features.tex deleted file mode 100644 index 80d9f07912..0000000000 --- a/doc/plab/features.tex +++ /dev/null @@ -1,110 +0,0 @@ -\section{Features} -\label{features} - -\small{ -\begin{verbatim} -Also describe it in terms of ``services'', from llp's 10/03 slide - -His service taxonomy: -Slice == Experiment Portal - Create expt/slice - Resource Discovery [w/ Monitoring Service] - Uses info from monitoring service - Monitoring service: - Node liveness, node resource availability (cpu/mem via loadavg, disk), - e2e health, pair-wise paths. - Resource Allocation - Uses info from monitoring service - Three ways, can be mixed - fixed, site/node-centric (w/ types), link-centric [works? no tunnels?] - port space (ssh; extend) - Can mix with Emulab nodes (works?) - - *Admission control and optional queuing (queuing and retry service) - - Boot Slice [w/ Environment Service] - - Node/slice management (``environment service'') - Node initialization - Emulab state, per-user state - Startup command - - Maintain expt/slice - Software Upgrades [w/ Environment Service] - Monitor Health [w/ Monitoring Service] - Project membership - - *Control expt/slice - Node - Expt - - *Naming service - Virtual (DNS) or physical - - *Admin management service/interface - Resource alloc params - Admission control parameters - Queue retry params - - -* = Not in llp's taxonomy; Elab-only - -As of 10/03 Plab people were developing these; did not have anything: - Environment Service - Monitoring Service - Resource allocation -\end{verbatim} -}%small - - - -\subsection{Creating slices} - -\subsubsection{Resource assignment} - -Users of Emulab's PlanetLab interface have three choices for selecting the -nodes that will be used for their slice. 
- -The most basic method is to manually choose nodes, as is done with -PlanetLab's own current interface. This is accomplished by using Emulab's -'tb-fix-node' syntax. In addition to letting the user manually select nodes, -this method gives the user the opportunity to run their own selection algorithm -before submitting their experiment specification to Emulab. - -Second, node selection can be done in a link-centric fashion. In this method, -the user specifies a set of virtual nodes, and a set of virtual links between -these nodes. Each virtual link can have a bandwidth, latency, and packet loss -rate specified for it. Emulab then matches, as best it can, these desired -link characteristics to end-to-end characteristics observed between PlanetLab -nodes, gathered from third-party sensors. Emulab uses a custom mapper based on -a genetic algorithm to find a good match. If support for creating tunnels is -added to PlanetLab at some point in the future, we will optionally set up -tunnels to create an overlay network that matches the experimenter's virtual -topology. - -The third, and most often used, method of node selection is node-centric. In -this scheme, users ask for a set of virtual nodes, with no specific links -between them. Emulab's type system is used to select nodes with varying levels -of specificity - for example, a user can ask for a 'pcplab' node to get any -node in PlanetLab. Or, they can ask for a 'pcplabinet' node to get a node on -the commodity Internet, or a 'pcplabdsl' node to get a node on a DSL line. -This is done using Emulab's 'tb-set-node-hardware' syntax. - -This data is then fed to Emulab's resource mapper, assign. Assign uses -simulated annealing, a randomized heuristic algorithm, to find a good mapping. -(See "A Solver for the Network Testbed Mapping Problem", Ricci, Lepreau, -Alfeld, SIGCOMM CCR April 2003.) Assign takes into account several mapping -goals, and can easily be expanded to take others into account as well. 
At the -present time, it attempts to balance the following goals: - -\begin{itemize} -\item Meeting type constraints mentioned above -\item Spreading slices out over the maximum possible number of distinct sites -\item Choosing nodes with low CPU and memory loads---Emulab can do admission - control for experiments, if there are not enough planetlab nodes available - with a specified CPU load or free memory -\end{itemize} - -\subsection{Maintaining slices} - -\subsection{Controlling slices} diff --git a/doc/plab/from-nsf-report.tex b/doc/plab/from-nsf-report.tex deleted file mode 100644 index 74a2327852..0000000000 --- a/doc/plab/from-nsf-report.tex +++ /dev/null @@ -1,75 +0,0 @@ -\begin{asparadesc} - -\item[Integration with and improved interface to PlanetLab.] -%% Note: doesn't directly correspond to any of our proposed milestones. -%% -%% Original author: Kirk Webb -%% -% Work on the Emulab portal to PlanetLab -% -In the interest of adding PlanetLab as an Emulab resource, we have put forth -significant development effort in bringing PlanetLab resources to Emulab -experimenters. We have also developed a powerful, easy to use interface -targeted at native PlanetLab experimenters. - -We provide a rich set of resource allocation semantics for experimenters, -including selection based on last-mile link type, load conditions, and -administrative/geographic domain (e.g., an experimenter can choose to allocate -a slice that spans one node at each PlanetLab ``site''). Our sophisticated -resource mapping program (assign) uses these specifiable criteria to select -the set of nodes to use. - -Underneath, Emulab is able to dynamically discover new PlanetLab nodes, -classify their last-mile link using an inference heuristic, and add them to its -database. Once discovered, these nodes are immediately available for use by -experimenters. 
Emulab also attempts to minimize the significant delays present -in PlanetLab vserver setup by running many instances of the initialization and -bootstrap processes in parallel. - -To conserve resources and provide ongoing statistical feedback, and support -rich Experimental setup features, Emulab sets up a service vserver (sliver) on -every PlanetLab node it becomes aware of. This service sliver reports node -load, provides liveness testing, and stores common vserver overlay files which -are utilized when a new Emulab-managed PlanetLab sliver is setup on a -particular node. - -Over the past year, Emulab has tracked the RPC interface provided by PlanetLab, -which has been subject to abrupt change with no transition period. Emulab's -PlanetLab back-end has been modularized for easy porting to new API schemas, -and has accounted for a wide range of resource and credential handling, -including resource tokens and per-slice authentication keys. - -Additionally, due to the inherent instability in research platforms such as -PlanetLab, Emulab's back-end to it has been hardened to handle a variety of -failure modes (with retry). Node outage, RPC call hangs, and other abnormal -termination and resource malformation (e.g., half setup vservers) have been -accounted for. When we find that a PlanetLab node is problematic, we move it -out of production into a special pool where a back-end daemon at Emulab central -will test it. This daemon will continually cycle through the nodes in the down -pool, testing them in batches. When a node successfully runs through an end to -end PlanetLab vserver setup, it is released back into production. - -\item[Planetlab EZ interface.] -%% Note: doesn't directly correspond to any of our proposed milestones. 
-%% -%% Original author: Rob Ricci -%% -One of the important advantages offered by \emulab's interface to PlanetLab -over PlanetLab's own native interface is an easy-to-use web interface for -configuring experiments (called ``slices'' in PlanetLab terminology). In -contrast to the PlanetLab interface, which requires users to select individual -nodes manually, our interface also allows the option to ask for a certain -number of nodes, and \emulab automatically decides which ones to use. Since a -substantial number of PlanetLab nodes are frequently down, and some often see -heavy use, \emulab constantly monitors PlanetLab and only assigns nodes which -are up and have a reasonable amount of load. The Web interface also provides a -mechanism for distributing files to every node in the experiment and for a -command to be run on each node at startup. It provides an easy way to start an -experiment that uses every (currently up) node in PlanetLab, or one node at -each site. - -\end{asparadesc} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%% End of file. diff --git a/doc/plab/goals.tex b/doc/plab/goals.tex deleted file mode 100644 index 3607bcc1d5..0000000000 --- a/doc/plab/goals.tex +++ /dev/null @@ -1,2 +0,0 @@ -\section{Goals [and/or ``Requirements'']} -\label{goals} diff --git a/doc/plab/impl.tex b/doc/plab/impl.tex deleted file mode 100644 index 12a4654ffb..0000000000 --- a/doc/plab/impl.tex +++ /dev/null @@ -1,207 +0,0 @@ -\section{Implementation} -\label{impl} - -\subsection{Planetlab in the Emulab Framework} - -~ - -* Emulab has an experiment life cycle - - swapin, swapout, terminate - -* Plab specific functions happen at alloc, setup, and swapout/teardown. - -\subsection{Emulab's Modular Planetlab Backend} - -As \plab is an evolving system, its interfaces often undergo radical -change. To cope with this variability, Emulab's \plab backend is -modular, supporting two different \plab API frontends, with a third on -the horizon. 
While the mechanisms to carry out slice creation and -vnode allocation have changed over time, the overall process and -abstractions have maintained enough consistency to make different -backend modules with a common management core possible. Each module -is expected to implement a set of functions callable by the core logic -that follow defined semantics for operations such as slice -creation/deletion, key management, and node allocation/deallocation. - -\xxx{probably merge with Node Alloc section} -\subsection{Slice Creation} - -When an experimental specification includes Planetlab vserver nodes, a -slice is created during experiment setup. The mechanism by which this -happens is selectable; PLC and dslice are currently supported. Direct -Node Manager creation will be implemented when this interface stabilizes. -The process is synchronous and must succeed for experiment creation to -continue. On failure, experimental setup is terminated. Details related -to the slice are stored in the Emulab database (e.g., when it -expires). - -\subsection{Node Allocation and Setup} - -As experiment creation (swapin) proceeds from slice creation, virtual -nodes are created and set up. On local Emulab physical nodes, these -virtual nodes are created as part of the boot time setup sequence on -the physical node itself. Callbacks are made to the Emulab central -server (the boss node) to determine whether virtual machine resources -need to be allocated on the physical host node. - -The Planetlab backend differs in an important way; we don't have -direct control over the physical machine, and must therefore take an -additional step to allocate a virtual machine on a particular node. -These physical nodes do not boot up as part of an Emulab experiment -swapin, rather, they are run independently and may house other, -unrelated virtual machines for slices created both outside of Emulab, -and through Emulab. 
Emulab's Planetlab backend coordinates with -Planetlab to allocate a new virtual machine on each \plab physical -node assigned to the Emulab experiment. As mentioned, the \plab -backend is structured in a modular fashion to allow communication with -current and future \plab frontends. As some \plab frontends have high -latency characteristics (order of minutes), Emulab attempts to speed up -the allocation process by performing several allocation calls in -parallel. The software keeps a parallelism window as full as possible -until all allocation RPCs have been made to \plab. The size of this -window has been empirically tuned using RPC failure rates and -anticipated load on the Emulab side (e.g. expected maximum number of -simultaneous \plab experimental swapin attempts). - -After \plab vnode allocation, Emulab's \plab backend prepares the -vnode for setup by doing two things. First, it transfers and overlays -a package containing a set of files needed for Emulab setup, including -the Emulab vnode startup scripts, configuration files, and other -supporting binaries. Second, it starts the vnode-specific ssh daemon. - -After vnode resources are successfully secured, Emulab proceeds -with vnode startup. This process is invoked in the same way for both -\plab vnodes and local Emulab vnodes; the Emulab boss node -contacts the vnodes via ssh and executes the startup script. Once the -top-level vnode setup script has activated, it sends a state change -message (BOOTING) back to the Emulab boss node. On \plab vnodes, this -script has additional responsibilities. It starts logging daemons, -creates Emulab-specific directories, and ultimately stays active in -order to coordinate node reboot requests. Vnode-agnostic services -such as a node watchdog are also started, user packages are installed -(rpms and tarballs), and startup commands are executed if applicable.
-Once all daemons and configuration activity have completed, it sends -another state change message to let Emulab know that the vnode is -ready (ISUP). - -\subsection{Node and Slice Deallocation} - -When an Emulab experiment containing \plab nodes is swapped out or -torn down, part of the process includes releasing the \plab vnode -resources allocated to that experiment. First Emulab contacts each -vnode and signals a waiting script that the node needs to be halted. -When this has been completed on a vnode, the Emulab boss node will -receive a state change message from the script just before it exits -indicating that the shutdown process is complete (SHUTDOWN). Next, -the Emulab \plab backend frees the \plab node. After all nodes have been -shut down and freed, the backend frees the slice. - -\subsection{Dslice Semantics} - -The original \plab resource allocation and management API and -framework created by Brent Chun was known as \dslice. Its model was -mainly decentralized, only involving a central ticket broker for -obtaining allocation rights for individual nodes. Slice creation was -implicit; there was no central notion of a slice although tickets were -marked with a slicename and node identifier. - -The Emulab \dslice \plab backend module made XMLRPC calls to the -central \dslice ticket agent for each \plab node in the Emulab -experiment, asking for a maximum resource lease. After obtaining a -ticket, the \dslice backend then spoke with a node manager running on -the corresponding physical \plab node, also via XMLRPC calls. It -would present the node manager with a ticket and obtain a lease in -response. The node manager would also create a Linux vserver as the -virtual machine resource; this process normally pulled the vserver -from a preallocated pool, making it quite fast. Once the lease -requisition RPC completed, Emulab was free to interact with the -vserver as available through \plab's slice demultiplexing ssh (keyed -on slice name).
Before proceeding with setup, however, Emulab would -use the \dslice node manager API to add the ssh public key for the -Emulab management user. This would allow the programmatic interaction -of the Emulab boss node with the \plab vnode for final setup, and -later vnode interaction and teardown. - -When the \plab experiment came to completion, and the experimenter -requested swapout or termination via Emulab, the Emulab \dslice module -would simply revoke the lease for each \plab vnode sliver by -communicating with the individual \dslice node managers. - -By design principles, the lease allocation would fail if the -node was oversubscribed, and tickets would require payment via shares -delegated to the experimenter from \plab. However, the \dslice -implementation did not track shares, and did not enforce or track -resource usage. - -\subsection{PLC Semantics} - -When \plab transitioned to their 2.0 platform, they deprecated dslice -in favor of their new centralized management API dubbed PLC (PlanetLab -Central). With PLC, all operations/RPCs are coordinated with a -central server. The server then coordinates resource allocation on -the physical nodes in the background. Most PLC API calls do not -effect immediate updates on the \plab nodes, but rather queue the -operations and let a PLC node manager running on each \plab node pull -them periodically (originally every half hour). In PLC, a slice is a -concrete entity that must be created with an API call. The Emulab PLC -\plab backend module does this, and also attaches shares to the slice -via the XMLRPC PLC API frontend. Additional \plab users can also be -granted permission to the slice, and the Emulab PLC module uses this -feature to add the Emulab management user to the slice. This, as in -\dslice, allows for future automated access to the virtual machines in -the slice via ssh. - -Also in contrast to \dslice, allocating the virtual machine resources -is accomplished through the PLC central server. 
Originally, there was -no programmatic way to determine when a particular vnode allocated -through PLC was ready. The Emulab PLC module was forced to either -wait the predefined maximum polling interval to allow the \plab nodes -to pull their latest vnode resource allocations from PLC, or poll for -readiness by trying periodically to ssh to them. The PLC API was -subsequently extended to include a call (InstantiateSliver) that would -block until a specified vnode was ready. It would further call -out to the PLC node manager on the physical \plab node to elicit an -immediate poll. The Emulab PLC module was changed to use this call to -programmatically determine when the \plab vnodes were ready during -experiment swapin. - -At experiment swapout or teardown time, the Emulab PLC module makes a -call to the PLC central server to remove the slice. This effectively -releases all the resources, and eventually causes the PLC node manager -on each node participating in the slice to reclaim the virtual machine -used for the corresponding slice. - -\subsection{Failure Handling and Recovery} - -Given two large distributed systems such as Emulab and \plab, failures -are a given and have many possible modes. We apply several mechanisms -to cope with these. The \plab backend defines wrapper functions that -call a requested remote API function and handle error conditions -encountered. There are three types of errors the handler can cope -with: fatal, retryable and continuable. On detecting a fatal error, -the backend halts the current operation and reports failure back to -the caller. For retryable error types, the wrapper will try the RPC -again; by default, the RPC wrapper will attempt a remote procedure -three times before giving up. Continuable errors are cases where the -error indicates that the goal has already been achieved (e.g., when a -node deallocation RPC reports that a node is no longer allocated).
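The three-way error classification described in this subsection can be sketched as follows. This is only an illustrative sketch, not the Emulab backend code: the names `ErrorClass`, `ERROR_CLASSES`, `classify`, `call_with_recovery`, and `RPCFailure` are hypothetical, and keying the hand-enumerated table on Python exception types is an assumption made for the example. The limit of three attempts and the retryable default follow the text.

```python
from enum import Enum


class ErrorClass(Enum):
    FATAL = "fatal"
    RETRYABLE = "retryable"
    CONTINUABLE = "continuable"


class RPCFailure(Exception):
    """Raised when a wrapped call fails fatally or exhausts its attempts."""


# Hypothetical hand-enumerated table (assumption: keyed on exception type).
# Any error not listed here is treated as retryable, the default class.
ERROR_CLASSES = {
    PermissionError: ErrorClass.FATAL,
    LookupError: ErrorClass.CONTINUABLE,  # e.g. "node is no longer allocated"
}


def classify(exc):
    """Look up the error class for an exception; default is RETRYABLE."""
    for exc_type, error_class in ERROR_CLASSES.items():
        if isinstance(exc, exc_type):
            return error_class
    return ErrorClass.RETRYABLE


def call_with_recovery(rpc, *args, max_attempts=3):
    """Call rpc(*args) and handle errors by class.

    Returns the RPC result, or None when a continuable error indicates
    that the goal had already been achieved.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return rpc(*args)
        except Exception as exc:
            error_class = classify(exc)
            if error_class is ErrorClass.CONTINUABLE:
                return None
            if error_class is ErrorClass.FATAL or attempt == max_attempts:
                raise RPCFailure(
                    f"RPC failed after {attempt} attempt(s): {exc}"
                ) from exc
            # Retryable with attempts remaining: loop and try again.
```

A caller would wrap each remote operation, for example `call_with_recovery(free_node, node_id)` with a hypothetical `free_node` function, and treat `RPCFailure` as the signal to abort the current operation and report failure.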
-The classification of these errors is defined in software; there is no -heuristic to determine when to continue or give up. The default -error classification is retryable. - -The outer Emulab infrastructure combined with the \plab backend track -the resources that are in use at any given time (swapin, active, -swapout). The \plab backend guarantees not to leave slices or nodes -allocated when their allocation ultimately fails. When a setup fails -or is canceled further into swapin, the Emulab infrastructure takes -care to call the appropriate \plab backend commands to free any -allocated resources. For example, when a \plab experiment setup fails -because some nodes fail to allocate or load and run the Emulab -client-side startup scripts (and setup failure is set to fatal for -these nodes), a full Emulab experiment termination will be activated. -This will result in the deallocation of any resources; \plab nodes -will be freed by whichever backend module is appropriate, and the -slice will be destroyed. No resources are leaked, and namespaces are -cleared so that future setups will not collide. - -\xxx{Talk about timeout handling} diff --git a/doc/plab/intro.tex b/doc/plab/intro.tex deleted file mode 100644 index f58ad88b94..0000000000 --- a/doc/plab/intro.tex +++ /dev/null @@ -1,6 +0,0 @@ -\section{Introduction} -\label{intro} - -\paragraph{Motivation} - -\input{from-nsf-report} diff --git a/doc/plab/outline-new.txt b/doc/plab/outline-new.txt deleted file mode 100644 index 261242e1b0..0000000000 --- a/doc/plab/outline-new.txt +++ /dev/null @@ -1,132 +0,0 @@ -\newcommand{\pliface}{Emulab-Planetlab User Interface} -\newcommand{\emulab}{Emulab} -\newcommand{\plab}{Planetlab} -\newcommand{\plapi}{Planetlab API} -\newcommand{\plciface}{Planetlab Central User Interface} -\newcommand{\plcapi}{Planetlab Central API} -\newcommand{\plcnmapi}{PLC / Node Manager Hybrid API} -\newcommand{\dslice}{Dslice} - -\pliface paper outline - -I.
Introduction - - * Motivation for providing \pliface - - bring new set of resources to \emulab - - \plab lacks resource discovery, automated setup, admission control - - \plciface cumbersome and incomplete - - * \pliface's role in llp service taxonomy and beyond - - environment service - - ... - - * Challenges assoc. with integration and interface presentation - - Instability of \plapi - - Tracking the evolving \plapi - - Matching interface mechanics to native \plab user expectations - + simple - -II. Essential integration mechanisms (maybe fold into intro section) - - * stuff we think is needed (and why) - - Timely API calls (synchronous) - - Strong authentication based on PKI - - Meaningful feedback - - Push or Pull setup supported by infrastructure - - Reliable liveness data - - Resource share transferability - - Low overhead vserver creation. - - Flexible 1st class authentication - - Stable node identification and reliable migration detection - - .. \XXX most of the above are what's missing in \plcapi, need - to identify other components that are there and work fine. - - * What's missing/wrong in \plab (past, present, future(?)) - - dslice - - \plcapi - + WRONG: async and highly unreliable sync instantiation interfaces - - \plcnmapi - - future (PDN 04-17) - - * What's superfluous in \plab - - ... - -III. \pliface Features - - * resource discovery - - * resource tracking - - * admission control - - * failover via retry on setup - - * more than one interface - - web-based wizard (\plab EZ) - - NS file (more powerful) - - * node selection based on various criteria (not all avail via EZ interface) - - manual specification a la \plciface - - last-mile link type - - load metrics (resources used/avail) - - pairwise link characteristics - - mixtures of the above - - all nodes - - one node at each site - - * speed up instantiation by performing parallel setup - - impact on \plcapi .. - - scaling properties - -IV.
Failure modes, and coping strategies - - * node unreachability - - * half-baked or dirty vservers - - * RPC call hangs - - * disappearing/wiped vservers - - * vserver startup scripts not executed - - * retry/timeout - - * hand-enumerated error conditions - - detect fatal failures - - proceed when failure really indicates success - - * automatic transitioning in/out of down pool - - e2e testing. - -V. \plapi Comparison - - (\dslice vs. \plcapi vs. \plcnmapi) - - * Setup timing - - * Fault ratios/probability - -VI. Future Work - - * Persistent vnode recovery - - keep tabs on vservers, and recover them when they die. - - we currently do nothing along these lines outside of setup - - * start using newish sensor data instead of home-rolled stat gathering - - * scale up large file/overlay distribution using something like CoDeploy - - * handling shares - - who provides the shares and who presents them - - us vs. them - - transference - -VII. Conclusions - - * dslice worked pretty well, \plcapi sucks, \plcnmapi is moving back - toward dslice semantics, which were already doing the right thing - according to PDN 04-17 before they were dumped. - - * eat your own dogfood / level playing field. - diff --git a/doc/plab/pgformat.tex b/doc/plab/pgformat.tex deleted file mode 100644 index 2f7e18436d..0000000000 --- a/doc/plab/pgformat.tex +++ /dev/null @@ -1,104 +0,0 @@ -\com{Take some (not all) of this later when need to make it look prettier. - -% This first one shrinks the text font size also, no good. -% Only want headings and space around them shrunk. 
-\makeatletter -%as Latex considers descenders in its calculation of interline spacing, -%to get 12 point spacing for normalsize text, must set it to 10 points -%\def\@normalsize{\@setsize\normalsize{12pt}\xpt\@xpt -%\abovedisplayskip 10pt plus2pt minus5pt\belowdisplayskip \abovedisplayskip -%\abovedisplayshortskip \z@ plus3pt\belowdisplayshortskip 6pt plus3pt -%minus3pt\let\@listi\@listI} - -%need an 11 pt font size for subsection and abstract headings -\def\subsize{\@setsize\subsize{12pt}\xipt\@xipt} - -%make part titles bold and 12 point, <2 blank lines before, <1 after -\def\part{\par - \addvspace{4ex} - \@afterindentfalse - \secdef\@part\@spart} - -\def\@part[#1]#2{\ifnum \c@secnumdepth >\m@ne - \refstepcounter{part} -% \addcontentsline{toc}{part}{\thepart -% \hspace{1em}#1}\else -% \addcontentsline{toc}{part}{#1}\fi - \fi - {\parindent \z@ \raggedright - \interlinepenalty \@M -% \ifnum \c@secnumdepth >\m@ne -% \Large \bf \partname~\thepart% THEN Print '\partname' and -% \par\nobreak -% \fi -% \Large \bf -% #2% -% \markboth{}{}\par - }\nobreak - \vskip 3ex - \@afterheading - } - -%make section titles bold and 12 point, <2 blank lines before, <1 after -\def\section{\@startsection {section}{1}{\z@}{-7pt plus -1pt minus -1pt} -{2pt plus 1pt minus 1pt}{\large\bf}} - -%make subsection titles bold and 11 point, 1 blank line before, 1 after -\def\subsection{\@startsection {subsection}{2}{\z@}{-4pt plus -1pt minus -1pt} -{2pt plus 1pt minus 1pt}{\normalsize\bf}} - -\def\subsubsection{\@startsection {subsubsection}{3}{\z@}{-4pt plus -1pt minus -1pt} -{1pt plus 1pt minus 1pt}{\subsize\bf}} - -\def\paragraph{\@startsection - {paragraph}{4}{\z@}{2pt plus2pt minus2pt}{-2pt}{\reset@font - \subsize\bf}} -\def\subparagraph{\@startsection - {subparagraph}{4}{\parindent}{2pt plus2pt minus - 2pt}{-2pt}{\reset@font\subsize\bf}} - -\makeatother - -}%com - -\com{Don't need space yet. -%% Every line helps. The default, BTW, is 10. I wouldn't go too much higher -%% than 100 or so. 
-%% -%% I should say that this will increase the average CPI, though, possibly -%% putting the text out of compliance with the typesetting rules of the NSF's -%% GPG. Caveat emptor. -%% -\linepenalty=150 - -%%%%%%%%%%%%%%%%%%%%%%% - - \newenvironment{changemargin}[2]{\begin{list}{}{ - \setlength{\topsep}{0pt}\setlength{\leftmargin}{0pt} - \setlength{\rightmargin}{0pt} - \setlength{\listparindent}{\parindent} - \setlength{\itemindent}{\parindent} - \setlength{\parsep}{0pt plus 1pt} - \addtolength{\leftmargin}{#1}\addtolength{\rightmargin}{#2} - }\item }{\end{list}} - - -% \setlength{\textheight}{8.9in} -% \setlength{\textwidth}{6.35in} -% \setlength{\topmargin}{-0.5in} -% \setlength{\oddsidemargin}{.1in} -% \setlength{\evensidemargin}{.1in} - -\itemsep=-10pt -}%com - -\usepackage{fullpage} - -% \raggedright - -\flushbottom % optional but recommended - -% \parskip .5ex - -% \setlength{\marginparwidth}{0in} -% \setlength{\marginparsep}{0pt} diff --git a/doc/plab/plab.tex b/doc/plab/plab.tex deleted file mode 100644 index 3160fda644..0000000000 --- a/doc/plab/plab.tex +++ /dev/null @@ -1,82 +0,0 @@ -%%%%% See file ``Outline'' - -% Why don't get date out? 
- -\documentclass[11pt]{article} -% \documentclass[twocolumn, 10pt]{article} - -%\usepackage{usenix} - - -\usepackage{times} -\usepackage{xspace} -\usepackage[footnotesize]{caption} -\usepackage{graphicx} -\usepackage{paralist} -\usepackage{fancyvrb} -\usepackage{amssymb} -\usepackage{amsbsy} -\usepackage{time} - -\input{defs} - - -\input{pgformat} - -% override some cheating in pgformat to match OSDI/USENIX requirements -\setlength{\textheight}{9.15in} -\setlength{\textwidth}{6.55in} -\setlength{\columnsep}{12pt} -\setlength{\footskip}{30pt} -\setlength{\topmargin}{0.0in} -\setlength{\headheight}{0.0in} -\setlength{\headsep}{0.0in} -\setlength{\oddsidemargin}{0in} -%\setlength{\parindent}{0pc} -%\setlength{\parskip}{\baselineskip} - - -% Make title bold and 14 pt font (Latex default is non-bold, 16 pt) -% {\LARGE \bf D R A F T}\\[1ex] - -\title{Emulab's PlanetLab Backend} -% Emulab's Portal to PlanetLab - -\author{The Emulab Team\\[1ex] - University of Utah -} - -\begin{document} - -\maketitle - -\input{abs} - -\input{intro} - -\input{goals} % or Goals and Requirements - -\input{features} -% Also describe it in terms of ``services'' - -\input{design} - -\input{impl} % Implementation - -\input{eval} % Experience and Evaluation - -\input{conclusion} - -\subsection*{Acknowledgements} -% Austin Clements (probably will be author), -Brent Chun, Mic Bowman, Steve Muir, Larry Peterson, other? -% Culler et al for pushing Plab? -% David Andersen? - -\bibliographystyle{capsabbrv} -{\footnotesize -\itemsep=0pt -\bibliography{sys,emulab,shash,ricci}%,prop,alg,agile,newbold} -} - -\end{document} -- GitLab