[This file is not kept entirely up to date.]

Leigh B. Stoller committed
* Other items. We had better start saving the thumbnails in the experiment
  archive directory too (expinfo). We should also be re-rendering after a
  modify. We should also save the XML representation to avoid having to
  reparse old NS files, although there are some versioning issues with this.

* Auto discovery of new nodes. 

* Change netbuild to speak XML (in both directions).

  Related: Might require addition of a DTD or Schema to our XML format.

* From: Jay Lepreau <lepreau@cs.utah.edu>
  Subject: Snapping an image - physnode menu item
  Date: Tue, 01 Jul 2003 17:39:36 MDT
  
  It's always seemed logical to me to have such a menu item on the
  (physical) "Node Information" page.
  In fact, several times I've looked for it there,
  only to remember it doesn't exist, and I need to go to
  "ImageIDs and OSIDs" (which is not an "action" item,
  and is not associated with an experiment).

* Email archive for project email lists. 

* Add NS file history, and links to view/create from old NS files.

	I think this might be pretty cool to do, and it's not that
	hard. Change the nsfiles table to index by the unique
	experiment idx, instead of pid/eid, and add a copy of the
	original description field to it. Then link it off the user's
	"Show History" link, which has his experiment log. We could
	load up the old NS files, but I think that's enough of a pain
	that I would not bother to.

	From Dave:

	> key by the md5 hash of the ns file and store it by that:
	> 
	> table 1:
	> exp pid date md5_of_file
	> 
	> table 2:
	> md5_of_file  contents_of_file
	> 
	> That has the advantage of keeping the commonly accessed 
	> table (table 1) small and statically sized, whereas the
	> slow and dynamic TEXT records are then in table 2.

	Could also hang it off the experiment_resources record, which
	would address the swapmod issue, since a new resource record
	is created at each swapmod.
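
	Sketch of Dave's two-table layout, for discussion only. It uses
	sqlite3 as a stand-in for our DB; the table names (ns_history,
	ns_contents) and the helper functions are made up for this
	example and do not exist anywhere in the tree.

```python
import hashlib
import sqlite3


def open_store(path=":memory:"):
    """Create the two hypothetical tables from Dave's note."""
    db = sqlite3.connect(path)
    # Table 1: small, fixed-width rows, one per NS file submission.
    db.execute("CREATE TABLE IF NOT EXISTS ns_history ("
               " exp TEXT, pid TEXT, date TEXT, md5_of_file TEXT)")
    # Table 2: the bulky text, stored once per distinct file.
    db.execute("CREATE TABLE IF NOT EXISTS ns_contents ("
               " md5_of_file TEXT PRIMARY KEY, contents_of_file TEXT)")
    return db


def record_nsfile(db, exp, pid, date, contents):
    """Store one NS file submission and return its md5 key."""
    digest = hashlib.md5(contents.encode("utf-8")).hexdigest()
    # Identical files hash to the same key, so the text is kept only once.
    db.execute("INSERT OR IGNORE INTO ns_contents VALUES (?, ?)",
               (digest, contents))
    db.execute("INSERT INTO ns_history VALUES (?, ?, ?, ?)",
               (exp, pid, date, digest))
    db.commit()
    return digest


def history(db, exp, pid):
    """Return (date, contents) pairs for one experiment, oldest first."""
    rows = db.execute(
        "SELECT h.date, c.contents_of_file"
        " FROM ns_history h JOIN ns_contents c"
        " ON c.md5_of_file = h.md5_of_file"
        " WHERE h.exp = ? AND h.pid = ? ORDER BY h.date",
        (exp, pid))
    return rows.fetchall()
```

	A swapmod that resubmits an unchanged NS file adds a history
	row but no new contents row.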

* Secure the event system so that experiments cannot spoof/snoop each other.

* Fix case sensitivity of event system (and agents, scheduler).

* Fix idleswap auditing. 

* Aren't users required to fill in their first and last names?
  Also note the state.

*** Major:

* Fix event scheduler for experiment modify so that it can add new
  events from the current time index, instead of from time 0. 

* Break up emulab into smaller components (for example, split off the
  account and group stuff so it's independent).

* Fix the entire nalloc/nfree/reloading mechanism. The state control
  stuff for it, which is scattered around nfree, tmcd, stated, and the
  reload daemon, needs a complete overhaul. Many races, many
  opportunities to fail. Mac is thinking about this.

* Event system startup cost. Abhijeet reported that after ISUP, it
  could take a very long time for events to start. This is because it
  takes a really long time to process the event stream in event-sched
  using Ian's original binary tree stuff. I hacked in a fix, but need
  to look at that algorithm and perhaps change it. Need to decide if
  insertion needs to be optimized over deletion.
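
  One alternative to look at, sketched here in Python: a binary heap
  keyed on fire time. This is not the existing event-sched code or
  Ian's tree, just a different structure for comparison. Insertion and
  removal of the earliest event are both O(log n), so neither has to
  be favored. The EventQueue class and its method names are made up
  for this example.

```python
import heapq
import itertools


class EventQueue:
    """Min-heap of (fire_time, sequence, event) tuples."""

    def __init__(self):
        self._heap = []
        # The sequence number keeps events with equal times in insertion order
        # and means the event payloads themselves are never compared.
        self._seq = itertools.count()

    def add(self, when, event):
        """Schedule an event; O(log n)."""
        heapq.heappush(self._heap, (when, next(self._seq), event))

    def pop_due(self, now):
        """Remove and return every (time, event) with time <= now, in order."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            when, _, event = heapq.heappop(self._heap)
            due.append((when, event))
        return due

    def __len__(self):
        return len(self._heap)
```

  New events can be added at any time index while the queue is
  draining, which is also what the experiment modify item under
  Major needs.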

* Continuing work on jails for both local and remote nodes.

* Need to default the OS id version (4.3, 7.1) if we are going to
  delay reloading, or else people can get old versions of the OS
  when in the same project (last_reservation). This might be moot
  depending on what we do wrt reloading when experiments are done.

* tmcd does not appear to be scaling with the advent of ssl. Rob
  suggested a combined tmcd command to return the entire node
  configuration in one message. We would still keep the individual
  calls, but provide a way to get all the data at once and save on a
  dozen connections per boot.

  LBS: Local nodes use tmcc-nossl, since we get security via the
  switch. Widearea nodes *do* use ssl.

* Complete event system overhaul (per-exp elvind, secure elvind,
  per-node elvind, distribution of event lists to nodes).
  Cannot multicast events to multiple agents at a time.

* Get the program agent working on ron nodes. This is related (and
  dependent) on securing the event system since we do not want anyone
  to be able to send an event to the program agent from some random node!

* Deal with two ends of a remote link being allocated to the same ron
  node! Need to catch the situation for now (and error), but
  eventually make sure it does not happen since setting up a tunnel
  from a node to itself sounds rather silly.

* Investigate other types of tunnels for widearea nodes. Perhaps ipsec
  AH tunnels.

* Fix mountd-invalidating-current-mounts problem.

* Automated swapping (with disk saving) support so that we can swap
  out experiments and save per-node images for people. Requires a lot
  of disk space.

*** Medium:

* Change batch system to handle limited-use disk images like timesys.

* CDROM changes:
    1) Add per host certificates.

* Switch rpm/tar file to non-nfs solution. Perhaps an ftpd-like daemon
  which does some of the same checks that tmcd does. Or maybe a tmcd
  variant that does nothing but serve up files accordingly. Maybe it
  does not need to be separate, but seems like putting this directly
  into tmcd is a bad idea. Maybe not.

  LBS: This is now done for tar files on widearea nodes, and in jails.
       Needs to be done with RPMs too.

* Clean up osid/imageid mess. 

* Need to add a "kill running frisbee" function so that creating new
  images does not get frisbee messed up.

* Front end support for changing delay/bw/plr asymmetrically in
  events. Currently, we can do queue params, but the basic delay, bw,
  and plr can only be done symmetrically.

  Related: Add NS event support for some of the tb- commands. For
  example, tb-set-lan-simplex-params. This is actually harder, since
  lans are not directly controllable at the link level anywhere in the
  system. delay_config chokes on lans at the moment for this reason.

* Support images with more than one slice (but not the entire disk).
  At present, people can make use of the 4th slice, but cannot save it
  with an image, unless they create an entire disk image, and we do
  not allow mere users to do that. The current disk image stuff is not
  flexible enough to support arbitrary slice definitions (save slice
  2-4).

  Related: Kill deltas!

* Add MBR initialization to all images, perhaps as a special Frisbee
  operation, like Mike did for slice 4. This is to prevent problems
  with people messing up the MBR.

* Change hardwired degree 4 for vrons->rons to more flexible DB
  management. Related would be dynamic creation of virtual nodes
  instead of hardwired entries in the nodes table, but that's a lot of
  work, and might not be worth it.

* Script to remove old log files from the mysql directory. Also remove
  old backups from the mysql backup dir. These files are taking up a
  huge amount of space, and /usr/testbed/log has been filling up a
  lot.
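
  A rough sketch of such a script, assuming plain age-based pruning of
  regular files in one directory. The function names, the dry-run
  default, and any retention period passed in are placeholders and
  not a decided policy. Which directories to point it at is left open.

```python
import os
import time


def old_files(directory, max_age_days, now=None):
    """List regular files in directory last modified over max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    result = []
    for entry in os.scandir(directory):
        # Skip subdirectories and symlinks; only plain files are candidates.
        if not entry.is_file(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            result.append(entry.path)
    return sorted(result)


def prune(directory, max_age_days, dry_run=True, now=None):
    """Return the files that are too old; delete them unless dry_run is set."""
    victims = old_files(directory, max_age_days, now)
    if not dry_run:
        for path in victims:
            os.remove(path)
    return victims
```

  Run it with dry_run=True first and look at the list before letting
  it delete anything.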

* Add some kind of host table support to RON nodes so that programs
  can figure out IPs. This is going to be a pain.

* Support for protocols other than IP. Mike reported some issues
  related to this in email of Fri, 17 May 2002 10:05:41.

* Bring in a bug tracking system we can use from the web interface.
  Need someone to look around for this. I hate GNATS!
  Rob mentioned RT (http://www.fsck.com/projects/rt). Eric mentioned
  Bugzilla and Jitterbug.

* Noswap bit to prevent users from swapping special experiments that
  have things like SPAN turned on.

*** Minor:

* When I syntax check an ns file, and it fails, it would be handy to have a
  one-click way to check the same file again.  (My Tcl isn't so good.)

* Change logs for group experiments from `/proj/<project>/logs/' to
  `/groups/<proj>/<group>/logs/'.

* Clean up ISADMIN() and ISADMINISTRATOR() calls in PHP pages.

* > Mapping RHL-STD on pc92 to emulab-ops-RHL73-STD.
  > *** Tarfile /usr/local for node pc96 does not exist!

  Can't we check the validity of these paths during the parse phase
  and fail a lot sooner?
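
  Something like the following check, run right after the parse, would
  do it. This is only a sketch: the shape of the input (node name
  mapped to a list of (install dir, tarfile path) pairs) and the
  function name are assumptions, not what the parser produces today.

```python
import os


def missing_tarfiles(tarfiles_by_node, exists=os.path.exists):
    """Return one error string per tarfile path that does not exist.

    tarfiles_by_node maps a node name to a list of
    (install_dir, tarfile_path) pairs. The exists argument can be
    swapped out so the check is testable without touching the disk.
    """
    problems = []
    for node, entries in sorted(tarfiles_by_node.items()):
        for _install_dir, tarfile in entries:
            if not exists(tarfile):
                problems.append(
                    f"Tarfile {tarfile} for node {node} does not exist!")
    return problems
```

  If the returned list is not empty, the parse phase can print it and
  fail before any nodes are allocated.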

* Macrofy the signature of the email (currently "Testbed Ops").

* FAQ entry for lilo:
  To access partitions on the disk outside of the C:H:S tuple limit (8.4
  GB), you must add 'lba32' to the global options section.

  Not a big deal, but requires someone who knows lilo to verify and to
  test it.

* Fix "no networks link warning" to deal with remote node links.

* DB consistency checker; to run at night and as part of flest.

* I'm sitting here looking at the "details" page for an experiment and
  nowhere obvious on this page does it show the name of the
  experiment.  If I scroll all the way down to experiment details,
  there it is. How about putting it over the vis image or making it
  part of the vis image? Somewhere right at the top.

* Allow users to specify OSIDs for their delay nodes. Not entirely sure
  how, since delays are chosen late in the game, but at the moment it's
  difficult for people to customize delay nodes.

* Fix up hostname generation from tmcd for lans. Currently, if you
  have a link and a lan to the same node, you get two aliases.

* Add web interface for generating simple hardwired topologies, i.e.:
  "Just give me some nodes in a lan."

> From: Jay Lepreau <lepreau@fast.cs.utah.edu>
> To: stoller@fast.cs.utah.edu
> Subject: For the 'todo' list -- project and user "archiving"
> Date: Tue, 11 Jun 2002 20:49:44 -0600 (MDT)
> 
> I don't want to delete projects and probably not users (typically).
> I want to "retire" them, or move them to "alumni"/archive status.  I
> want to keep them in the DB for analysis and statistics reasons, but
> not keep them around forever as active because they clutter things
> up.
> 
> Also, the users might be reactivated if they start a new project.
> 
> These take a little thought because of name space issues, at least.
> Maybe more issues (eg how does an inactive user get reactivated
> unless their password is still valid?).
> 
> Anyway, just something for the list, and keep this message in the
> details part.