1. 18 Dec, 2002 14 commits
    • Leigh B. Stoller's avatar
      A bunch of little scripts made common and moved from the linux,freebsd · c2ff0bfa
      Leigh B. Stoller authored
      directories into a common subdir. Also cleaned up some.
      c2ff0bfa
    • Leigh B. Stoller's avatar
      The basis of a reorganized (and common) directory structure for the · 9e18f58e
      Leigh B. Stoller authored
      client side code. Two new files, to be included by any client side
      script, which define all the paths. For /bin/sh scripts:
      
      	. /etc/emulab/paths.sh
      
      and for perl scripts:
      
      	BEGIN { require "/etc/emulab/paths.pm"; import emulabpaths; }
      
      Each defines a number of variables: ETCDIR, BINDIR, VARDIR, BOOTDIR,
      LOGDIR, and LOCKDIR. They also set the path properly for the script.
      
      This gets rid of the different hardwired paths, and reduces it to just
      one for everyone.
      9e18f58e
    • Leigh B. Stoller's avatar
      group,master.password: Add sshd, smmsp, mailnull, and sfs. · 77661f58
      Leigh B. Stoller authored
      rc.conf: Remove fixed -p argument. Now set by mkjail.
      rc.local,jailctl: Update for client side path reorg and cleanup.
      jaildog.pl,mkjail.pl: Numerous fixes for jailed nodes.
      77661f58
    • Leigh B. Stoller's avatar
      Bump version number. · dc3fc324
      Leigh B. Stoller authored
      dc3fc324
    • Leigh B. Stoller's avatar
      A doozy! I added two new modes of operation in support of jails. Only · 01234f97
      Leigh B. Stoller authored
      for BSD of course. First is a "proxy" mode that is used outside of a
      jail, to forward tmcc requests from inside the jail to boss over the
      normal ssl channel (when a remote node). We remove the pem files from
      inside the jail so it has no way to form a secure connection to tmcd
      on its own, and tmcd rejects non-ssl connections from remote nodes (it
      should probably reject them from local jails too). Second change is a
      "unix socket" mode that is the complement to the proxy; tmcc inside of
      a jail connects to the tmcc proxy outside the jail via a unix domain
      socket that can be shared between the two because the outer
      environment can see inside the jailed filesystems (the jail sees a
      chroot environment). When the jail is started, the initial root shell
      gets an environment variable called TMCCUNIXPATH which holds the path
      to the socket. This makes it easy for anything started from that shell
      of course, but it's still a minor pain when invoking tmcc from
      elsewhere, but that does not really happen, except when running it by
      hand. Anyway, tmcc forms a unix socket to the proxy and does its
      thing. The proxy filters out VNODE= and PRIVKEY= arguments, and
      inserts its own into the command string.  This prevents a jail from
      trying to impersonate another vnode.
      01234f97
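As an illustration only (this is not code from the commit), the proxy's filtering step could look roughly like the Python sketch below. It assumes a request is a single whitespace-separated command string; the function name `rewrite_request` and its parameters are hypothetical, and the real tmcc wire format may differ.

```python
def rewrite_request(command: str, vnode_id: str, privkey: str = "") -> str:
    """Drop any client-supplied VNODE=/PRIVKEY= tokens, then prepend trusted ones.

    Hypothetical helper: the token layout is an assumption, not the tmcc protocol.
    """
    blocked = ("VNODE=", "PRIVKEY=")
    # Keep only tokens that do not try to set an identity argument.
    kept = [tok for tok in command.split()
            if not tok.upper().startswith(blocked)]
    # The proxy, not the jail, decides which vnode the request speaks for.
    trusted = [f"VNODE={vnode_id}"]
    if privkey:
        trusted.append(f"PRIVKEY={privkey}")
    return " ".join(trusted + kept)
```

Because the identity arguments come only from the proxy, a jail that sends `VNODE=someone-else` simply has that token discarded before the request is forwarded.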
    • Leigh B. Stoller's avatar
      Ignore isalive from local nodes. The new image will run a watchdog · 678a5a34
      Leigh B. Stoller authored
      like the remote nodes do, but for now do not update the up/down status
      from that. I need to mess with db/node_status first to make sure there
      is agreement between the parties. Note that remote nodes send one UDP
      message every 60 seconds (isalive is sent over UDP). Local nodes
      will send them at a slower rate, as is the practice in db/node_status
      which wakes up every 5 minutes and fpings the world!
      678a5a34
    • Leigh B. Stoller's avatar
      Fix -Wall warnings. · 96f3d827
      Leigh B. Stoller authored
      96f3d827
    • Leigh B. Stoller's avatar
      Add client/server build targets after the millionth time I got an error · f568dc59
      Leigh B. Stoller authored
      message about no libmysql on a testbed node.
      f568dc59
    • Leigh B. Stoller's avatar
      Add some no-cache pragmas when we spit out the logfile. I was having · 59621100
      Leigh B. Stoller authored
      some trouble with old logs getting cached in the browser.
      59621100
    • Leigh B. Stoller's avatar
      New "restart" or perhaps better if named "replay" mode to swapexp. · d651dd42
      Leigh B. Stoller authored
      Attempts to replay an experiment by rebooting all the nodes, clearing
      the various startup bits (ready, startstatus, bootstatus, portstats),
      and then restarting the event system. I am dubious that this is a
      workable solution because of the asynchronous nature of the testbed
      (nodes happily cruise from TBRESET to ISUP and beyond without
      stopping), and so it's hard to truly replicate the initial lack of
      state that a freshly swapped in experiment has. Still, people
      requested it and I cheerfully provided it because that's what I do;
      service with a smile and not a whit of complaint. Is anyone reading
      this?
      d651dd42
    • Leigh B. Stoller's avatar
      Two new routines. TBNodeBootReset() resets the startup state for a · b485a466
      Leigh B. Stoller authored
      node. Used in new tbrestart code for replaying experiments. It remains
      to be seen if this is a workable approach.
      
      TBNodeStateWait() is really WaitTillAlive, which I need in several new
      spots now. It's not as general purpose as it seems though, since there
      are only a couple of terminal states (isup) that you can actually wait
      for by querying the DB. But, I'm loath to add any more event code to
      the system.
      b485a466
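A minimal sketch of the "wait until the node reaches a terminal state by polling" idea behind TBNodeStateWait(), written in Python for illustration. The function `wait_for_state`, its parameters, and the `get_state` callback (standing in for a DB query) are all hypothetical; the real routine's signature and timeouts are not given in the commit message.

```python
import time
from typing import Callable


def wait_for_state(get_state: Callable[[], str], wanted: str = "ISUP",
                   timeout: float = 300.0, interval: float = 5.0,
                   clock: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll get_state() until it returns `wanted` or `timeout` seconds pass.

    Returns True if the state was seen, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if get_state() == wanted:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Polling like this only works for states that a node stays in, which matches the caveat above that just a couple of terminal states can be waited for this way.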
    • Leigh B. Stoller's avatar
      Minor change to includevirt option. Instead of "[includevirt]", the · a07c3c83
      Leigh B. Stoller authored
      option is now "[[includevirt] or [virtonly[=<phys>]]]". In other
      words, you can ask to include virtual nodes, or you can ask for just
      virtual nodes. Optionally, you can ask for the virtual nodes for a
      specific physical node. I use this from assign_wrapper to map local
      jail nodes.
      a07c3c83
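To make the new option grammar concrete, here is a small Python sketch of how "[[includevirt] or [virtonly[=<phys>]]]" might be parsed. It is an illustration under assumptions: `parse_virt_option`, the returned mode names, and the node name `pc42` in the usage note are hypothetical and not taken from the actual script.

```python
from typing import Optional, Tuple


def parse_virt_option(arg: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (mode, phys_node) for an optional includevirt/virtonly argument."""
    if arg is None:
        # No option given: physical nodes only (assumed default).
        return ("physical", None)
    if arg == "includevirt":
        return ("includevirt", None)
    name, sep, phys = arg.partition("=")
    if name == "virtonly":
        if sep and not phys:
            raise ValueError("virtonly= needs a physical node name")
        # phys is only meaningful when "=<phys>" was supplied.
        return ("virtonly", phys if sep else None)
    raise ValueError(f"unknown option: {arg!r}")
```

For example, `parse_virt_option("virtonly=pc42")` would select only the virtual nodes hosted on the (made-up) physical node `pc42`.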
    • Leigh B. Stoller's avatar
      Allow slightly altered tb-fix-node syntax for creating jails on local · d564e0fb
      Leigh B. Stoller authored
      nodes. The second argument can now be an NS node instead of the name
      of a real testbed node. For example:
      
      	tb-set-hardware $node3  pc600
      	tb-set-hardware $nodev1 pcvm600
      	tb-fix-node $nodev1 $node3
      
      So, "fix" $nodev1 to $node3. The intent is that once $node3 is
      allocated by assign to a real testbed node, we can then allocate a
      virtual node on pcXX to $nodev1. I did this primarily to allow for
      easy testing of jails via my NS file, without having to hack
      assign_wrapper too deeply. Note there are still hacks in assign_wrapper to
      support this, but they are not extensive.
      
      Also my old usewatunnels stuff I never checked in:
      
      	tb-set-usewatunnels 0/1
      d564e0fb
    • Leigh B. Stoller's avatar
      Must always check return value from DBQueryWarn for a null value · 84f2b79a
      Leigh B. Stoller authored
      before using it!
      84f2b79a
  2. 17 Dec, 2002 5 commits
  3. 16 Dec, 2002 3 commits
    • Mac Newbold's avatar
      Decrease the sleep between loops from 2 to 1, and fix a typo. This should · 6bdba92c
      Mac Newbold authored
      help nodes in reload_pending get sucked into reloading faster. If it
      doesn't do enough, we'll need to do more batching of stuff, so we get some
      parallelism in os_load instead of forcing it to serialize by calling
      os_load one node at a time.
      
      I was tempted to nuke all the stuff that was in there from the netdisk
      reload type, but decided not to. It won't be too long (relatively
      speaking) before we have freed, the new "free node manager" that will
      replace/supersede our current reload_daemon anyway.
      6bdba92c
    • Mac Newbold's avatar
      Fix the 1-event-per-second limitations. Poll until I don't get more · a77a1559
      Mac Newbold authored
      events. This may delay handling of other stuff that happens in my main
      loop, but not by too much. To prevent skew, everything (including reload
      frequency) is done strictly by seconds elapsed, not by iterations or
      anything.
      
      I found that even polling for multiple events without sleeping, I could
      only handle a little over 1 per second when I was calling inuse/statetime
      for additional info on every event. Even though this only happens in the
      worst case (every event is wrong), it won't do. So I took that out. I'll
      probably end up adding a faster lookup of the info I need (mostly
      reservation, and what osid it thinks it is running). That change took it
      up to at least 4 per second (as fast as I could send them manually), more
      than 4x our previous performance. So we should be able to keep up now.
      
      Also, add the support for "announcements" to testbed ops when I die and
      such. (Been in a few days, but this is the first commit of it)
      a77a1559
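The loop structure described in this commit (drain every pending event, then do periodic work based on elapsed seconds rather than iteration count) can be sketched in Python as follows. This is not the daemon's code: `run_once`, `poll`, `handle`, and `do_reload` are hypothetical names, and `poll` is assumed to be non-blocking and to return `None` when no event is waiting.

```python
from typing import Any, Callable, Optional, Tuple


def run_once(poll: Callable[[], Optional[Any]],
             handle: Callable[[Any], None],
             last_reload: float, reload_every: float,
             do_reload: Callable[[], None],
             now: float) -> Tuple[int, float]:
    """One main-loop pass: drain all pending events, then run timed work.

    Returns (events handled, timestamp of the most recent reload).
    """
    handled = 0
    while True:
        event = poll()
        if event is None:      # queue is empty, stop draining
            break
        handle(event)
        handled += 1
    # Periodic work is keyed to wall-clock seconds, so a long drain
    # cannot skew how often it runs.
    if now - last_reload >= reload_every:
        do_reload()
        last_reload = now
    return handled, last_reload
```

The caller would invoke `run_once` in a loop, passing the current time each pass and sleeping briefly when nothing was handled.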
  4. 13 Dec, 2002 1 commit
  5. 12 Dec, 2002 6 commits
  6. 11 Dec, 2002 8 commits
    • Leigh B. Stoller's avatar
      Fix "wonkyness" as reported by Eric ... · a6cc917b
      Leigh B. Stoller authored
      a6cc917b
    • Mike Hibler's avatar
      all: · a6a648e9
      Mike Hibler authored
      	add -Wall to CFLAGS and clean up lint
      	update the TODO file
      	explicitly size the header fields (e.g., int32_t not int)
      
      imagezip:
      	Version 2.
      
      	Adds two ints to the header to help track free space.  Each chunk
      	now has a first and last sector number which can describe any free
      	block before or after the data contained in the chunk.  This is
      	needed in order to properly zero all free space when laying down
      	an image.  In practice: the first chunk describes any free space
      	before the first allocated range and any free space after its
      	contained ranges and before the first allocated range in the second
      	chunk.  Every other chunk then describes just free space following
      	itself (since the previous chunk has already described the space
      	before this chunk).  The point being, we only describe each free
      	range once.
      
      	Added "relocation" information.  Relocation entries go in the chunk
      	header along with region descriptors.  This allows us to identify
      	chunks of data which need to be absolute disk blocks instead of
      	offsets from the containing partition.  This is now used for BSD-slice
      	partition tables which contain absolute disk blocks.  We can now
      	create an image in one slice and reload it into another slice.
      
      	Allow zlib compression level 0 (no compression).  This might be
      	useful on machines that have slow CPUs: do just FS-compression and
      	transfer the image elsewhere faster where it could be re-zipped
      	with regular compression.
      
      	Fix goof.  Previously we were not saving any DOS partition with
      	an unrecognized type.  We should be naively compressing it instead.
      	This is what we now do.  We continue to skip partitions of type 0
      	("unused").
      
      	mikeism: add handler for SIGINFO (^T) to report progress of
      	a zip-age.
      
      	Added everybody's favorite "dots" mode for reporting progress.
      
      	Eliminate some excess copies left over from the conversion from
      	write-every-little-piece to buffer-up-a-full-chunk-and-then-write.
      
      	Eliminated the special case handling of no skips (ranges) in
      	compress_image by creating a single allocated range describing
      	the whole disk/partition in this case.
      
      	For NTFS, make the behavior of calling missing unicode routines
      	be to return an error rather than exit.  These calls happen,
      	but their failing doesn't seem to be fatal.
      
      	Lots of typical mike-pissing on everything else.
      
      imageunzip:
      
      	Modify to handle both V1 and V2 images.
      
      	In slice mode, make sure we don't write past the bounds of
      	the slice.  ES&D if we try.
      
      	Make output to unseekable devices work again (broken when
      	pwrite was added)
      
      	Add debug -F (Frisbee) option to randomize the presentation of
      	chunks to the unzip/write threads.  Used to simulate frisbee.
      
      	Add "-T DOS-type" option to tell imageunzip, when in slice mode,
      	to set the type of the slice in the DOS partition table.
      	This is useful if you are dropping, say, a BSD filesystem into
      	an unused slice, since you don't have to go back later and set
      	this with fdisk.  Considered making this info part of the image
      	itself (recorded by imagezip when creating a slice image),
      	but decided against it.
      
      	writezero takes an off_t for the size, since we can be asked to
      	write many gigabytes of zero at the end of a disk.
      
      	Turn off dots mode by default.  Ya wanna see spots?  Ya gotta
      	turn it on!
      
      	Lots of typical mike-pissing on everything else.
      
      imagedump:
      
      	New tool for checking/dumping the structure of an image and
      	reporting stats about it.
      a6a648e9
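The Version 2 free-space bookkeeping described for imagezip above (each chunk carries a first and last sector so that every free range is described exactly once) can be illustrated with the Python sketch below. It is an interpretation of the commit message, not the on-disk format: `chunk_bounds` and `free_ranges` are hypothetical names, ranges are assumed to be sorted, non-overlapping `(start, end)` sector pairs with `end` exclusive, and every chunk is assumed to hold at least one range.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (start_sector, end_sector), end exclusive


def chunk_bounds(chunks: List[List[Range]], total_sectors: int) -> List[Range]:
    """Compute (first, last) sector bounds for each chunk.

    The first chunk starts at sector 0 so it also covers leading free space;
    every chunk extends up to the next chunk's first allocated sector, and
    the final chunk extends to the end of the disk or partition.
    """
    bounds = []
    for i, ranges in enumerate(chunks):
        first = 0 if i == 0 else ranges[0][0]
        last = chunks[i + 1][0][0] if i + 1 < len(chunks) else total_sectors
        bounds.append((first, last))
    return bounds


def free_ranges(ranges: List[Range], first: int, last: int) -> List[Range]:
    """Gaps inside [first, last) not covered by `ranges`: the space to zero."""
    gaps, cursor = [], first
    for start, end in ranges:
        if start > cursor:
            gaps.append((cursor, start))
        cursor = end
    if cursor < last:
        gaps.append((cursor, last))
    return gaps
```

With bounds chosen this way, concatenating the gaps from all chunks yields each free range once, which is what an unzip tool would need in order to zero all free space while laying down the image.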
    • Mike Hibler's avatar
      Server: back to using a condvar since they seem to be fixed. · 2e77122f
      Mike Hibler authored
      Server: make file readsize independent of burstsize (previously
      readsize had to be a divisor of burstsize).  A subtle side-effect
      is that the dynamic burst rate is recalculated at the conclusion
      of every burst instead of after every readsize count of blocks has
      been sent (less than a burst).  This just seems to be more logical.
      
      Client: add "-T DOS-type" option to tell frisbee, when in slice
      mode, to set the type of the slice in the DOS partition table.
      This is useful if you are dropping, say, a BSD filesystem into
      an unused slice, since you don't have to go back later and set
      this with fdisk.  Considered making this info part of the image
      itself (recorded by imagezip when creating a slice image),
      but decided against it.
      2e77122f
    • Mike Hibler's avatar
      botched the path · a2ccf66a
      Mike Hibler authored
      a2ccf66a
    • Mike Hibler's avatar
      A better strategy for dealing with how to load network device drivers on · 78f2dd14
      Mike Hibler authored
      the different node types
      78f2dd14
    • Mike Hibler's avatar
      Retro shell-script version of setipod. · cd839813
      Mike Hibler authored
      Needed for the frisbee environment, so might as well use it everywhere.
      cd839813
    • Leigh B. Stoller's avatar
      Minor change to n.class clause. Look for class "pc" and "pct". pct is · 6fd445e7
      Leigh B. Stoller authored
      a temporary class for testing new images.
      6fd445e7
  7. 10 Dec, 2002 3 commits
    • Kirk Webb's avatar
      · fc985a64
      Kirk Webb authored
      Modified the timeout logic in create_image to track the image creation progress
      (size) rather than simply waiting a certain amount of time.  Also changed the
      code to report progress at regularly spaced intervals (adjustable), and
      to indicate when the timeout timer has been activated, or halted due to
      progress.  The changes also include an NFS cache slack factor, which makes
      the effective non-progress timeout equal to the sum of the slack time, plus
      the non-progress time (currently 3 + 5 = 8 minutes).
      
      Some changes were made to the error and cleanup logic to help revert the
      state of the DB and node as much as possible (node is not rebooted if the
      DB state cannot first be reverted) prior to exit.
      fc985a64
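The progress-based timeout described in this commit can be sketched in Python as below. This is a simplified illustration, not the create_image logic: `stalled` and its parameter names are hypothetical, and only the 3 minute slack and 5 minute non-progress figures come from the commit message.

```python
from typing import List, Tuple


def stalled(samples: List[Tuple[float, int]],
            idle_limit: float = 5 * 60,
            slack: float = 3 * 60) -> bool:
    """Decide whether image creation has stopped making progress.

    samples: (timestamp_seconds, image_size_bytes) pairs in time order.
    Returns True once the size has not grown for idle_limit + slack seconds
    (8 minutes with the defaults).
    """
    last_growth, prev_size = samples[0]
    for ts, size in samples[1:]:
        if size > prev_size:
            # Any growth counts as progress and restarts the timer.
            last_growth, prev_size = ts, size
    return samples[-1][0] - last_growth >= idle_limit + slack
```

The slack term models the delay before writes become visible through the NFS cache, so a slow but healthy image creation is not killed prematurely.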
    • Leigh B. Stoller's avatar
      Minor change to addpubkeys call. · 1ea32e42
      Leigh B. Stoller authored
      1ea32e42
    • Leigh B. Stoller's avatar
      Fix up ssh/openssh links. · ab2b24ca
      Leigh B. Stoller authored
      ab2b24ca