1. 08 May, 2002 1 commit
  2. 06 May, 2002 1 commit
  3. 02 May, 2002 1 commit
  4. 30 Apr, 2002 1 commit
    • Robert Ricci's avatar
      Added interswitch bandwidth tracking to ptopgen. This feature looks at · 0fc1f7a2
      Robert Ricci authored
      the vlans table to determine how much of the trunk bandwidth is
      currently 'reserved', and subtracts that from the trunk bandwidth
      reported in the output.
      
      This feature is disabled by default. However, you can enable it by
      putting the line
      TRACK_INTERSWITCH_BANDWIDTH=1
      in your defs file.
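
A minimal sketch of the subtraction described above, written in Python purely for illustration. The function name, the data layout, and the numbers are assumptions; the real ptopgen code and the vlans table schema are not shown in this log. Clamping at zero is also a choice made for the sketch.

```python
def available_trunk_bandwidth(trunk_capacity, reserved_by_vlan):
    """Return the trunk bandwidth left after subtracting reservations.

    trunk_capacity   -- total trunk bandwidth (any consistent unit)
    reserved_by_vlan -- hypothetical mapping of vlan id -> bandwidth
                        reserved on this trunk
    """
    reserved = sum(reserved_by_vlan.values())
    # Never report a negative figure if the trunk is oversubscribed.
    return max(trunk_capacity - reserved, 0)


# Made-up example: a trunk of 1,000,000 units carrying two vlans.
remaining = available_trunk_bandwidth(
    1_000_000, {"vlan1": 100_000, "vlan2": 250_000}
)
```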
  5. 21 Apr, 2002 1 commit
  6. 04 Apr, 2002 1 commit
    • Leigh B. Stoller's avatar
      First round of ssl'ification of tmcd/tmcc. This needs to be looked at · ffe40d2e
      Leigh B. Stoller authored
      by smarter brains than me (I have asked Dave to look it over). Anyway ...
      
      I added a top level ssl directory which has a bunch of goo for
      creating certificates and keys.  I currently create a Certificate
      Authority, a server certificate, and a client certificate. The private
      keys for all three are unencrypted, so no password is required. All
      key/cert combos can be installed on boss. The client side needs the
      key/cert pair (in one file), and the CA cert (no key!). There are
      install targets to do this. NOTE, you do not want to create/install
      these without being careful, since you could instantly invalidate all
      the clients!
      
      I have added the necessary SSL routines to tmcd/tmcc. See the ssl.c
      and ssl.h files. I have set it up so that all you need to do is
      uncomment three lines in the makefile, and accept, connect, read, write,
      and close are redirected to SSL'ified versions in ssl.c. The current
      security model is that the client and server both "demand" certificate
      verification from the other side (as opposed to just server side
      verification). tmcd reads in server.pem, while tmcc reads in
      client.pem. Both read in the emulab.pem (CA cert with no private
      key).
      
      Initial testing indicates I have done this at least partially
      correctly. Whoever invented this stuff has a really twisted mind
      though. There are some questions at the top of ssl.c that need to be
      answered.
      
      Oh, also redid all the syslog stuff throughout tmcd.
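
The mutual-verification model described in this commit (both sides demand a certificate from the other) can be sketched with Python's standard ssl module. This is an illustration only, not the code in ssl.c; the function name is hypothetical, and disabling host name checks is an assumption of the sketch.

```python
import ssl


def make_mutual_auth_context(server_side, cert_and_key=None, ca_cert=None):
    """Build a TLS context that demands a certificate from the peer."""
    protocol = ssl.PROTOCOL_TLS_SERVER if server_side else ssl.PROTOCOL_TLS_CLIENT
    ctx = ssl.SSLContext(protocol)
    # Peers are identified by CA-signed certificates rather than host
    # names (an assumption of this sketch).
    ctx.check_hostname = False
    # Both client and server refuse a peer that presents no valid cert.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_and_key is not None:
        # Key and certificate in one file, e.g. server.pem or client.pem.
        ctx.load_cert_chain(cert_and_key)
    if ca_cert is not None:
        # CA certificate only (no private key), e.g. emulab.pem.
        ctx.load_verify_locations(cafile=ca_cert)
    return ctx
```

A server would call something like make_mutual_auth_context(True, "server.pem", "emulab.pem") and a client make_mutual_auth_context(False, "client.pem", "emulab.pem"), using the file names mentioned above.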
  7. 02 Apr, 2002 1 commit
    • Leigh B. Stoller's avatar
      Ah, the things I do. Added web page and backend script to scroll the · 07323144
      Leigh B. Stoller authored
      experiment log file to the user as it gets generated. The web page
      does not redraw, it just never exits until the backend sees that the
      experiment transition is done, and then it exits, which terminates
      the script. I added a DB field to hold the logfile name and some
      routines in libdb, with the idea that this might be more generally
      useful at some point. Next time you create an experiment, look for the
      last sentence, and click on "realtime".
  8. 01 Apr, 2002 3 commits
    • Robert Ricci's avatar
      New perl event system functions: EventSend{,Warn,Fatal}(). These · e58adf16
      Robert Ricci authored
      basically work like the libdb.pm functions of the same name (and in
      fact much of the code was stolen from there.)
      
      Provides a simple single function call to send events. Intended for
      use in scripts whose primary purpose is _not_ to interface with the
      event system, like power and node_reboot. If more control/efficiency
      is required (for example, these functions reconnect to the event
      system every time they're called), it's better to use the C-like API.
      
      Example call:
      EventSendFatal(objtype   => "TBEXAMPLE",
                     eventtype => $ARGV[0],
                     host      => "*" );
    • Robert Ricci's avatar
      stated now gets installed in @prefix@/sbin · aa2bd0a2
      Robert Ricci authored
    • Leigh B. Stoller's avatar
      First cut at supporting RON (or more generally, remote nodes). · bd587829
      Leigh B. Stoller authored
      * tmcd/ron: A new directory of client code, based on the freebsd
        client code, but scaled back to the bare minimum. Does only account
        and group file maintenance. I redid the account stuff so that only
        emulab accounts are operated on. Does not require a stub file, but
        instead keeps a couple of local dbm files recording what groups and
        accounts were added by Emulab. There is a ton of paranoia checking
        to make sure that local accounts are not touched.
      
        The update script that runs on the client node detaches so that the
        ssh from boss returns immediately. update can also be run from the
        node periodically and at boottime. The script is installed setuid
        root, but checks to make sure that *only* root or "emulabman" has
        invoked it.
      
      * utils/sshremote: New file. For remote nodes, instead of using sshtb,
        use sshremote, which ssh's in as "emulabman", which needs to be a
        local non-root user, but with an authorized_keys file containing
        boss' public key.
      
      * web interface changes: Allow user to specify his own public key in
        addition to the emulab key.
      
        Add option in showexp page to update accounts on nodes in the
        experiment. I was originally intending to do this from approveuser,
        but this was easier and faster. I will add an option to do it on the
        approveuser page later.
      
      * libdb.pm: Add a TBIsNodeRemote() query to see if a node is in the
        local testbed or a pcRemote node. Currently, this test is hardwired
        to a check for class=pcRemote, but this will need to change to a
        node_types property at some point.
      
      * node_update: Reorg so that there is a maximum number of children
        created. Previously, a child was forked for each node, but that
        could chew up too many processes, especially for remote nodes which
        might hang up. For the same reason, we need to "lock" the experiment
        so that it cannot be terminated while a node_update is in progress.
        It might be possible to relax that, but this was easy for now. Also add
        distinction between local and remote, since for remote we use
        sshremote instead of sshtb. Various cleanup stuff.
      
      * mkacct: When generating a new account, include user supplied pub key
        in the authorized keys file, in addition to the emulab generated
        key. Both keys are stored in the DB in the users table. Anytime we
        update an account, get a fresh copy of the emulab pub key, in case
        user changes it.
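
The node_update change above (cap the number of concurrent children instead of one child per node) can be sketched as follows. This illustration uses a Python thread pool rather than forked processes, and the function and parameter names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def update_nodes(nodes, update_one, max_children=8):
    """Run update_one(node) for every node, at most max_children at a time.

    nodes      -- list of node names
    update_one -- callable doing the per-node work (for example an ssh
                  call); supplied by the caller
    Returns a dict mapping node -> result of update_one(node).
    """
    with ThreadPoolExecutor(max_workers=max_children) as pool:
        # map() preserves input order, so results line up with nodes.
        results = pool.map(update_one, nodes)
        return dict(zip(nodes, results))
```

With a fixed pool size, a remote node that hangs ties up one worker instead of adding to an unbounded pile of processes.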
  9. 28 Mar, 2002 1 commit
    • Robert Ricci's avatar
      New script: stated · 447bb8a5
      Robert Ricci authored
      Watches for events sent by TMCD regarding the state of nodes. Records
      this information in the database. Also watches for nodes that undergo
      invalid state transitions, or stay in the same state for too long.
      Right now, the only action it takes is to send email, but in the
      future, it will take action to 'unstick' nodes.
      
      Not yet installed by default.
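
A rough sketch of the two checks stated performs, invalid transitions and states held too long. The state names, transition table, and timeout values below are invented for illustration and are not taken from the real script.

```python
import time

# Hypothetical transition table: state -> states allowed to follow it.
VALID_NEXT = {
    "SHUTDOWN": {"BOOTING"},
    "BOOTING": {"ISUP", "SHUTDOWN"},
    "ISUP": {"SHUTDOWN"},
}

# Hypothetical per-state time limits, in seconds.
TIMEOUT = {"BOOTING": 300}


def check_transition(old_state, new_state):
    """Return True if new_state may follow old_state."""
    return new_state in VALID_NEXT.get(old_state, set())


def is_stuck(state, entered_at, now=None):
    """Return True if a node has sat in a timed state for too long."""
    now = time.time() if now is None else now
    limit = TIMEOUT.get(state)
    return limit is not None and now - entered_at > limit
```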
  10. 25 Mar, 2002 1 commit
  11. 22 Mar, 2002 1 commit
    • Leigh B. Stoller's avatar
      New "program agent" that runs on the client nodes (freebsd and linux) · 187a3a18
      Leigh B. Stoller authored
      and responds to PROGRAM events. Currently, just start and stop. Start
      takes a COMMAND= argument, and allows arbitrary command lines since I
      pass the whole thing off to the shell. Caveat: the agent runs as root
      and starts the program as root. You can have as many program objects in
      your NS file as you like, but each one can be started once; you have
      to either stop or wait for the old one to finish before trying to
      start again.
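
The start/stop rule described above (a program object cannot be started again until the previous run has stopped or finished) might look like this in outline. The class and method names are hypothetical, and the event-system plumbing is left out.

```python
import subprocess


class ProgramAgent:
    """Tracks one running process per named program object."""

    def __init__(self):
        self.running = {}  # program object name -> subprocess.Popen

    def start(self, name, command):
        """Start command for name; return False if it is still running."""
        proc = self.running.get(name)
        if proc is not None and proc.poll() is None:
            return False  # old run has not finished: refuse a second start
        # The whole command line is handed to the shell, as in the commit.
        self.running[name] = subprocess.Popen(command, shell=True)
        return True

    def stop(self, name):
        """Terminate the process for name, if one is still running."""
        proc = self.running.pop(name, None)
        if proc is not None and proc.poll() is None:
            proc.terminate()
            proc.wait()
```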
  12. 11 Mar, 2002 1 commit
    • Leigh B. Stoller's avatar
      Rename exports_setup.proxy and console_setup.proxy to .in versions and · 589d4872
      Leigh B. Stoller authored
      remove the originals, so that we can run the files through configure.
      
      NOTE: I wanted to keep the RCS history intact so I went over to the
      CVS directory on moab and copied the ,v file to the new names, and
      then did a normal cvs remove of the originals. This keeps the RCS history
      going without screwing up anyone. Not a recommended approach, but what
      the hell.
  13. 07 Mar, 2002 1 commit
  14. 05 Mar, 2002 1 commit
  15. 04 Mar, 2002 1 commit
    • Robert Ricci's avatar
      New script: schemacheck - Checks to see if the currently-running database · e42f812d
      Robert Ricci authored
      matches the one in the checked-out source.
      
      This now gets called as part of the 'boss-install' target, to guard
      against installing software that is out-of-sync with the running
      database. It is skipped if @prefix@ is not /usr/testbed, to avoid
      getting in the way of development.
      
      If you want to bypass this check, use the 'boss-install-force' target.
      Use of this, however, is not recommended.
  16. 01 Mar, 2002 1 commit
  17. 27 Feb, 2002 4 commits
  18. 24 Feb, 2002 1 commit
  19. 21 Feb, 2002 1 commit
    • Leigh B. Stoller's avatar
      Some whacking of the event system. I have implemented the addressing · 8305021f
      Leigh B. Stoller authored
      scheme that we discussed in email. Notifications and subscriptions now
      take an "address_tuple" argument (I know, crappy name) that is a
      structure that looks like this:
      
      	char		*site;		/* Which Emulab site. God only */
      	char		*expt;		/* Project and experiment IDs */
      	char		*group;		/* User defined group of nodes */
      	char		*host;		/* A specific host */
      	char		*objtype;	/* LINK, TRAFGEN, etc ... */
      	char		*objname;	/* link0, cbr0, cbr1, etc ... */
      	char		*eventtype;	/* START, STOP, UP, DOWN, etc ... */
      
      These can be a specific value, ADDRESSTUPLE_ANY if you are a
      subscriber, or ADDRESSTUPLE_ALL if you are a producer. The reason for
      the distinction is that you can optimize the match expression with the
      extra bit of information, and the above structure can make for a
      fairly lengthy match expression, which takes more time of course.
      You should use address_tuple_alloc() and address_tuple_free() rather
      than allocating them yourself. Note that host above is actually the
      ipaddr of the control interface. This turns out to be more convenient
      since free nodes do not have virtual names.
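
To make the wildcard idea concrete, here is a small Python model of matching a subscription against a notification. It is only a sketch: the real API is C, the placeholder values for ADDRESSTUPLE_ANY and ADDRESSTUPLE_ALL are invented, and the matching rule shown is an assumption about the intended semantics.

```python
# Placeholder values; the real constants are defined by the event library.
ADDRESSTUPLE_ANY = "<any>"  # subscriber wildcard
ADDRESSTUPLE_ALL = "<all>"  # producer wildcard

FIELDS = ("site", "expt", "group", "host", "objtype", "objname", "eventtype")


def subscription(**given):
    """Subscriber tuple: unspecified fields default to ADDRESSTUPLE_ANY."""
    return {f: given.get(f, ADDRESSTUPLE_ANY) for f in FIELDS}


def notification(**given):
    """Producer tuple: unspecified fields default to ADDRESSTUPLE_ALL."""
    return {f: given.get(f, ADDRESSTUPLE_ALL) for f in FIELDS}


def matches(sub, note):
    """Return True if the notification should reach the subscriber."""
    for f in FIELDS:
        if sub[f] == ADDRESSTUPLE_ANY or note[f] == ADDRESSTUPLE_ALL:
            continue  # a wildcard on either side matches this field
        if sub[f] != note[f]:
            return False
    return True
```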
      
      Also added a new tbgen directory. This directory includes 3 programs in
      the making:
      
      tbmevd: Is the Testbed Master Event Daemon, to be run on boss and will
      handle TBCONTROL events (reboot, reload, etc). It is just a shell of a
      program right now, that takes the events but does not do anything
      useful with them. Have not defined what the events are, and what DB
      state will be modified.
      
      tbmevc: Is the Testbed Master Event Client (akin to tmcc). It
      generates TBCONTROL events which the tbmevd will pick up and do
      something useful with. This program is intended to be wrapped by a
      perl script that will ask the tmcd for the name of the boss (running
      the event daemon).
      
      sample-client: This is a little client to demonstrate how to connect
      to the event system and use the address tuple to subscribe to events,
      and then how to get information out of notifications.
      
      Note that I have not created a proper build environment yet, so new
      programs should probably go in the event dir for now, and link using
      the same approach as in tbgen/GNUmakefile.in.
  20. 19 Feb, 2002 2 commits
  21. 14 Feb, 2002 1 commit
    • Leigh B. Stoller's avatar
      Respond to Shashi's message that users can cause the parser to go into · e45c4905
      Leigh B. Stoller authored
      an infinite loop rather easily via the NS file TCL hooks. Added a
      perl wrapper around parse.tcl called parse-ns, which forks a child to
      run the parser. The parser is invoked "nice +10" and the CPU limit for
      the child is set to 60 seconds, which should be enough for any parse.
      If the limit is exceeded, send email to tbops since this indicates a
      big problem or a user being dumb/malicious.
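
The wrapper approach described above (run the parser in a child, niced, with a CPU limit) can be sketched in Python on a Unix system. The function name is hypothetical; the real parse-ns is a perl script, and the email to tbops is omitted here.

```python
import os
import resource
import signal
import subprocess


def run_limited(argv, cpu_seconds=60, niceness=10):
    """Run argv in a child with a CPU limit; return (returncode, exceeded)."""

    def limit_child():
        # Runs in the child just before exec.
        os.nice(niceness)
        # The soft limit delivers SIGXCPU; the hard limit one second
        # later is a backstop that kills the process outright.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds + 1))

    proc = subprocess.run(argv, preexec_fn=limit_child)
    # A negative return code means the child died from that signal.
    exceeded = proc.returncode in (-signal.SIGXCPU, -signal.SIGKILL)
    return proc.returncode, exceeded
```

A caller would treat exceeded as the cue to report the runaway parse.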
  22. 12 Feb, 2002 1 commit
  23. 11 Feb, 2002 1 commit
  24. 08 Feb, 2002 2 commits
    • Leigh B. Stoller's avatar
      Kill off savevlans since it's simply a snapshot of DB state, and not · c8c2b569
      Leigh B. Stoller authored
      very useful by itself anyway.
    • Leigh B. Stoller's avatar
      Big round of image/osid changes. This is the first cut (final cut?) at · a73e627e
      Leigh B. Stoller authored
      supporting autocreating and autoloading images. The imageid form now
      sports a field to specify a nodeid to create the image from; if set,
      the backend create_image script is invoked. That's the easy part.
      Slightly harder is autoloading images based on the osid specified in
      the NS file. To support this, I have added a new DB table called
      osidtoimageid, which holds the mapping from osid/pctype to imageid.
      When users create images, they must specify what node types that image
      is good for. Obviously, the mappings have to be unique or it would be
      impossible to figure it out! Anyway, once that image mapping is
      in place and the image created, the user can specify that ID in the NS
      file. I've changed os_setup to look for IDs that are not loaded,
      and to try and find one in the osidtoimageid. If found, it invokes
      os_load. To keep things running in parallel as much as possible,
      os_setup issues all the loads/reboots (could be more than a single set
      of loads if multiple IDs are in the NS file) at once, and waits for
      all the children to exit. I've hacked up os_load a bit to try and be
      more robust in the face of PXE failures, which still happen and are
      rather troublesome. Need an event system!
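
The osid/pctype to imageid lookup, and the uniqueness requirement on it, can be pictured with a plain dictionary. This is a sketch only: the function names and the example IDs are invented, and the real mapping lives in the osidtoimageid DB table.

```python
def add_mapping(osid_to_imageid, osid, pctype, imageid):
    """Record a mapping, refusing conflicts so lookups stay unambiguous."""
    key = (osid, pctype)
    existing = osid_to_imageid.get(key)
    if existing is not None and existing != imageid:
        raise ValueError(f"{osid}/{pctype} already maps to {existing}")
    osid_to_imageid[key] = imageid


def find_image(osid_to_imageid, osid, pctype):
    """Return the image to load for (osid, pctype), or None if unmapped."""
    return osid_to_imageid.get((osid, pctype))


# Made-up example entries.
mappings = {}
add_mapping(mappings, "MYOS", "pc600", "MYOS-IMAGE-600")
add_mapping(mappings, "MYOS", "pc850", "MYOS-IMAGE-850")
```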
      
      Contained in this revision are unrelated changes to make the OS and
      Image IDs per-project unique instead of globally unique, since that's a
      pain for the users. This turns out to be very messy, since underneath
      we do not want to pass around pid/ID in all the various places it's
      used. Rather, I create a globally unique name and extended the OS and
      Image tables to include pid/name/ID. The user selects pid/name, and I
      create the globally unique ID. For the most part this is invisible
      throughout the system, except where we interface with the user, say in
      the web pages; the user should see his chosen name where possible, and
      should invoke scripts (os_load, create_image, etc) using his/her
      name not the internal ID. Also, in the front end the NS file should
      use the user name not the ID. All in all, this accounted for a number
      of annoying changes and some special cases that are unavoidable.
  25. 29 Jan, 2002 1 commit
    • Robert Ricci's avatar
      New script: interswitch · da928f5a
      Robert Ricci authored
      A simple little script to find links/lans that cross between switches,
      and print them out (including which switches they use, and how many
      members they have on each switch.)
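
The report described above could be computed along these lines. This is a sketch with hypothetical inputs (two in-memory mappings); the real script reads this information from the database.

```python
from collections import Counter


def interswitch_lans(lan_members, node_switch):
    """Return {lan: Counter(switch -> members)} for lans that span switches.

    lan_members -- hypothetical mapping of lan name -> list of member nodes
    node_switch -- hypothetical mapping of node -> switch it is wired to
    """
    crossing = {}
    for lan, members in lan_members.items():
        per_switch = Counter(node_switch[m] for m in members)
        if len(per_switch) > 1:  # members sit on more than one switch
            crossing[lan] = per_switch
    return crossing
```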
  26. 24 Jan, 2002 1 commit
    • Robert Ricci's avatar
      New script: dbcheck. Beginnings of a database consistency checker. · 441dfb4a
      Robert Ricci authored
      Right now, it loads foreign key information from the foreign_keys
      table of the database, and prints out information on rows that fail
      the consistency checks.
      
      The plan is that it will eventually check more things, such as the
      existence of files referenced in the database.
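
A foreign-key check of this kind can be sketched with Python's sqlite3 module. The column names assumed for the foreign_keys table (table1, column1, table2, column2) are guesses for the sake of the example, as are the function names.

```python
import sqlite3


def orphaned_values(db, table1, column1, table2, column2):
    """Return values of table1.column1 with no match in table2.column2."""
    # Identifiers come from the foreign_keys table, not from user input.
    query = (
        f"SELECT a.{column1} FROM {table1} AS a "
        f"WHERE a.{column1} IS NOT NULL AND NOT EXISTS "
        f"(SELECT 1 FROM {table2} AS b WHERE b.{column2} = a.{column1})"
    )
    return [row[0] for row in db.execute(query)]


def check_all(db):
    """Check every constraint listed in the foreign_keys table."""
    failures = {}
    keys = db.execute(
        "SELECT table1, column1, table2, column2 FROM foreign_keys"
    ).fetchall()
    for t1, c1, t2, c2 in keys:
        bad = orphaned_values(db, t1, c1, t2, c2)
        if bad:
            failures[(t1, c1)] = bad
    return failures
```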
  27. 18 Jan, 2002 3 commits
  28. 09 Jan, 2002 1 commit
  29. 08 Jan, 2002 1 commit
  30. 07 Jan, 2002 1 commit
    • Leigh B. Stoller's avatar
      Checkpoint first working version of Frisbee Redux. This version · 86efdd9e
      Leigh B. Stoller authored
      requires the linux threads package to give us kernel level pthreads.
      
      From: Leigh Stoller <stoller@fast.cs.utah.edu>
      To: Testbed Operations <testbed-ops@fast.cs.utah.edu>
      Cc: Jay Lepreau <lepreau@cs.utah.edu>
      Subject: Frisbee Redux
      Date: Mon, 7 Jan 2002 12:03:56 -0800
      
      Server:
      The server is multithreaded. One thread takes in requests from the
      clients, and adds the request to a work queue. The other thread processes
      the work queue in fifo order, spitting out the desired block ranges. A
      request is a chunk/block/blockcount tuple, and most of the time the clients
      are requesting complete 1MB chunks. The exception of course is when
      individual blocks are lost, in which case the clients request just those
      subranges.  The server is totally asynchronous; it maintains a list of who
      is "connected", but that's just to make sure we can time the server out
      after a suitable inactive time. The server really only cares about the work
      queue; as long as the queue is non-empty, it spits out data.
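
The two-thread server structure can be modelled in a few lines of Python. This is an illustration, not the frisbee source: the None sentinel, the function name, and the send_block callback are inventions of the sketch.

```python
import queue
import threading


def serve(requests, send_block):
    """Drain the work queue in FIFO order, sending each requested range."""
    while True:
        item = requests.get()
        if item is None:  # sentinel used by this sketch to stop the worker
            break
        chunk, first_block, count = item
        for block in range(first_block, first_block + count):
            send_block(chunk, block)


# One thread would take client requests and queue them; here the main
# thread plays that role while a worker thread processes the queue.
work = queue.Queue()
sent = []
worker = threading.Thread(
    target=serve, args=(work, lambda chunk, block: sent.append((chunk, block)))
)
worker.start()
work.put((0, 0, 3))   # chunk 0, blocks 0..2
work.put((7, 10, 2))  # chunk 7, blocks 10..11 (a re-request of lost blocks)
work.put(None)
worker.join()
```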
      
      Client:
      The client is also multithreaded. One thread receives data packets and
      stuffs them in a chunkbuffer data structure. This thread also requests more
      data, either to complete chunks with missing blocks, or to request new
      chunks. Each client can read ahead up to 2 chunks, although with multiple
      clients it might actually be much further ahead as it also receives chunks
      that other clients requested. I set the number of chunk buffers to 16,
      although this is probably unnecessary as I will explain below. The other
      thread waits for chunkbuffers to be marked complete, and then invokes the
      imageunzip code on that chunk. Meanwhile, the other thread is busily getting
      more data and requesting/reading ahead, so that by the time the unzip is
      done, there is another chunk to unzip. In practice, the main thread never
      goes idle after the first chunk is received; there is always a ready chunk
      for it. Perfect overlap of I/O! In order to prevent the clients from
      getting overly synchronized (and causing all the clients to wait until the
      last client is done!), each client randomizes its block request order. This
      is why we can retain the original frisbee name; clients end up catching random
      blocks flung out from the server until they have all the blocks.
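
The randomization step is simple to picture. In this sketch each client shuffles the order in which it asks for chunks; the function name is hypothetical, and whether the real client shuffles chunks, blocks, or both is not specified here.

```python
import random


def request_order(num_chunks, rng=random):
    """Return chunk numbers 0..num_chunks-1 in a random order."""
    order = list(range(num_chunks))
    rng.shuffle(order)  # each client ends up with its own ordering
    return order
```

Because every client still requests each chunk exactly once, the image completes regardless of order, while different clients tend to ask for different data at any given moment.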
      
      Performance:
      The single node speed is about 180 seconds for our current full image.
      Frisbee V1 compares at about 210 seconds. The two node speed was 181 and
      174 seconds. The amount of CPU used for the two node run ranged from 1% to
      4%, typically averaging about 2% while I watched it with "top."
      
      The main problem on the server side is how to keep boss (1GHZ with a Gbit
      ethernet) from spitting out packets so fast that 1/2 of them get dropped. I
      eventually settled on a static 1ms delay every 64K of packets sent. Nothing
      to be proud of, but it works.
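
The static pacing described above might look like this in outline. The sketch assumes "64K" means 64 KB of payload (rather than a packet count); the function and parameter names are hypothetical.

```python
import time


def paced_send(packets, send, burst_bytes=64 * 1024, delay=0.001,
               sleep=time.sleep):
    """Send packets, pausing `delay` seconds after every `burst_bytes`."""
    sent_in_burst = 0
    for packet in packets:
        send(packet)
        sent_in_burst += len(packet)
        if sent_in_burst >= burst_bytes:
            sleep(delay)  # give the receivers a chance to keep up
            sent_in_burst = 0
```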
      
      As mentioned above, the number of chunk buffers is 16, although only a few
      of them are used in practice. The reason is that the network transfer speed
      is perhaps 10 times faster than the decompression and raw device write
      speed. To know for sure, I would have to figure out the per byte transfer
      rate for 350 MBs via network, versus the time to decompress and write the
      1.2GB of data to the raw disk. With such a big difference, it's only
      necessary to ensure that you stay 1 or 2 chunks ahead, since you can
      request 10 chunks in the time it takes to write one of them.
  31. 04 Jan, 2002 1 commit
    • Robert Ricci's avatar
      New script: unixgroups. Pretty simple - just a convenient way to manage the · 469dacdb
      Robert Ricci authored
      unixgroup_membership table from the command line. Runs the appropriate
      commands to make changes in the 'real world' after the database has been
      updated. From the usage message:
      
      Usage: unixgroups <-h | -p | < <-a | -r> uid gid...> >
      -h            This message
      -p            Print group information
      -a uid gid... Add a user to one (or more) groups
      -r uid gid... Remove a user from one (or more) groups