1. 29 Jan, 2003 2 commits
    • Leigh B. Stoller authored · 19122fd2
      Add debugging.
    • Leigh B. Stoller authored · 37da2334
      Add package variable $DBQUERY_MAXTRIES, to control query retry in the
      case that the mysql server goes away suddenly. Defaults to 1, which
      gives us the behaviour we have now. Used like this after loading libdb:
      
      	$libdb::DBQUERY_MAXTRIES = 5;
      
      The query will be retried this many times. For infinite retry, set it
      to zero! Note that I'm not entirely sure if queries are atomic, so
      beware if you use this.
      
      Also some minor cleanup of $scriptname usage, some of which was
      duplicated in the connect routine.
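
The retry behaviour described for $DBQUERY_MAXTRIES could look roughly like the sketch below. This is not the libdb code: `query_with_retry` and the `$attempt` callback are hypothetical names, and the only assumption taken from the message is that the count limits the number of tries and that zero means retry forever.

```perl
use strict;
use warnings;

# Hypothetical stand-in for libdb's package variable; 0 means retry forever.
our $DBQUERY_MAXTRIES = 1;

# Call $attempt (a code ref returning a result, or undef on failure) until
# it succeeds or the try limit is reached. Returns the result or undef.
sub query_with_retry {
    my ($attempt) = @_;
    my $tries = 0;
    while ($DBQUERY_MAXTRIES == 0 || $tries < $DBQUERY_MAXTRIES) {
        $tries++;
        my $result = $attempt->($tries);
        return $result if defined $result;
    }
    return;
}

# Example: a fake query that fails twice, then succeeds.
my $calls = 0;
my $flaky = sub { $calls++; return $calls < 3 ? undef : "ok" };

$DBQUERY_MAXTRIES = 5;
my $answer = query_with_retry($flaky);
print defined $answer ? "$answer after $calls tries\n" : "gave up\n";
```

As the message warns, blindly re-running a query is only safe if a half-completed query cannot leave partial changes behind.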
  2. 15 Jan, 2003 3 commits
  3. 18 Dec, 2002 1 commit
    • Leigh B. Stoller authored · b485a466
      Two new routines. TBNodeBootReset() resets the startup state for a
      node. Used in new tbrestart code for replaying experiments. It remains
      to be seen if this is a workable approach.
      
      TBNodeStateWait() is really WaitTillAlive, which I need in several new
      spots now. It's not as general purpose as it seems though, since there
      are only a couple of terminal states (isup) that you can actually wait
      for by querying the DB. But, I'm loath to add any more event code to
      the system.
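
Waiting for a terminal state by querying the DB, as described above, amounts to a polling loop. The sketch below is an illustration only, not TBNodeStateWait() itself: `wait_for_state`, its arguments, and the `$lookup` callback (standing in for the DB query) are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: poll $lookup (a code ref standing in for a DB query)
# until it reports $wanted, giving up after $max_polls attempts.
# Returns 1 on success, 0 on timeout.
sub wait_for_state {
    my ($lookup, $wanted, $max_polls, $interval) = @_;
    for my $poll (1 .. $max_polls) {
        my $state = $lookup->();
        return 1 if defined $state && $state eq $wanted;
        # Pause between polls, but not after the last one.
        sleep($interval) if $interval > 0 && $poll < $max_polls;
    }
    return 0;
}

# Example: a fake node that reports BOOTING twice before ISUP.
my @reported = ("BOOTING", "BOOTING", "ISUP");
my $lookup   = sub { return shift @reported };

my $ok = wait_for_state($lookup, "ISUP", 10, 0);
print $ok ? "node is up\n" : "timed out\n";
```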
  4. 06 Dec, 2002 2 commits
  5. 14 Nov, 2002 1 commit
    • Leigh B. Stoller authored · bed7c3ee
      libdb: Add TBIPtoNodeID utility function. Also some minor jail-related
      changes; add optional jailflag argument to TBIsNodeVirtual, and add some
      constants for assigning port ranges to jailed nodes.
      
      newwanode: Allow reuse of existing node. So, if called with an IP that
      already exists in the DB, reuse those records (nodes, interfaces,
      reserved) rather than creating a new one. The web page makes sure that
      the calling node has a valid IP (REMOTE_ADDR equals IP it gives us),
      so it typically means an existing node is being reinstalled (happening
      with RON nodes right now!).
  6. 05 Nov, 2002 1 commit
  7. 18 Oct, 2002 1 commit
    • Mac Newbold authored · 5c961517
      Merge the newstated branch with the main tree.
      Changes to watch out for:
      
      - db calls that change boot info in nodes table are now calls to os_select
      
      - whenever you want to change a node's pxe boot info, or def or next boot
      osids or paths, use os_select.
      
      - when you need to wait for a node to reach some point in the boot process
      (like ISUP), check the state in the database using the lib calls
      
      - Proxydhcp now sends a BOOTING state for each node that it talks to.
      
      - OSs that don't send ISUP will have one generated for them by stated
      either when they ping (if they support ping) or immediately after they get
      to BOOTING.
      
      - States now have timeouts. Actions aren't currently carried out, but they
      will be soon. If you notice problems here, let me know... we're still
      tuning it. (Before all timeouts were set to "none" in the db)
      
      One temporary change:
      
      - While I make our new free node manager daemon (freed), all nodes are
      forced into reloading when they're nfreed and the calls to reset the os
      are disabled (that will move into freed).
  8. 04 Oct, 2002 1 commit
  9. 17 Sep, 2002 1 commit
  10. 16 Sep, 2002 1 commit
    • Leigh B. Stoller authored · 533dc18f
      Reorg of working directory and log file stuff for start/swap/end
      experiment. Here is mail to tbops:
      
      * Moved the working directory for experiment setup/swap/end to a new
        directory located on boss instead of over NFS to /proj/$pid/$eid. This
        new location is /usr/testbed/expwork/$pid/$eid.
      
      * Changed the name of the directories we create in /usr/testbed/expinfo to
        $pid-$eid.$index where $index is a new autoincrement field in the DB
        table. I really hated the names that were created before.
      
      * Changed where logs are written from /tmp to the new location in
        /usr/testbed/expwork/$pid/$eid.
      
      Okay, why.
      
      * We no longer operate on NFS mounted directories that might hang. It's
        easier to catch the situation where the copy of the log file at the
        end of experiment creation fails because of an NFS problem.
      
      * We no longer have user writable files that are inputs to other parts of
        the system (like top and ptop files).  Not that a user would be bad, but
        it closes a hole.
      
      * We no longer copy user writable files from /proj to boss where we might
        fill up an important filesystem because the user put a .ndz file in the
        working directory. Not that a user would be bad, but it closes a hole.
      
      * It's easier to save all the log files this way, for each swap in and
        out.
      
      * Removing a directory over NFS is a royal irritant when someone is CD'ed
        into that directory or looking at a file on the other side (the astute
        observer will peg this as the reason I went down this idiotic path in the
        first place!).
      
      * About 6 other reasons that I can no longer remember. Seriously, I really
        had more reasons I can no longer remember! :-)
  11. 20 Aug, 2002 2 commits
    • Leigh B. Stoller authored · a79c7d34
      Get rid of that ANNOYING extra 1 second delay that all scripts were
      exhibiting. Sheesh!
    • Leigh B. Stoller authored · a9909c42
      libdb: Add a couple of support routines to update the account_update
      flag in the nodes table when a user changes his info. Two routines,
      one to do it by type (as for widearea nodes) and another to do it by
      project (as for local nodes). This last is kinda inefficient, but
      probably not too big a deal.
      
      mkacct: Two changes.
      1. Use the above changes in libdb when a user changes his info. With
         this change, no longer need to do an account update in the
         experiment page for the ron/wa nodes. The nodes are marked as
         needing the update in mkacct, based on the nodes the user has
         access to. Note, this change applies only to widearea nodes; still
         need to use the update option in the experiment menu for local
         nodes, although I plan to change that too at some point by adding a
         watchdog on the local nodes.
      
      2. ssh2 key support. The DB can now store both ssh1 and ssh2 keys,
         however those keys are handled differently when creating the auth
         keys files for users. There are actually two files created now, the
         second being the ssh2 key file called authorized_keys2. This change
         is mirrored in the client side code as well.
  12. 11 Jul, 2002 1 commit
  13. 04 Jul, 2002 1 commit
  14. 16 Jun, 2002 1 commit
  15. 13 Jun, 2002 2 commits
  16. 11 Jun, 2002 3 commits
  17. 10 Jun, 2002 1 commit
  18. 04 Jun, 2002 1 commit
  19. 31 May, 2002 1 commit
  20. 28 May, 2002 1 commit
  21. 06 May, 2002 1 commit
  22. 22 Apr, 2002 1 commit
  23. 17 Apr, 2002 1 commit
    • Robert Ricci authored · 15c13c32
      Moved EventSend calls to the TBSetNodeEventState() function. This has
      two benefits: (1) More general (2) Regains ability to run without the
      event system. Previously, since programs that wanted to set node state
      had to 'use event', this broke our ability to run without the event
      system. Now, we can do a check in libdb for the event system, and not
      use it if EVENTSYS is not set. If not, we update state in the database
      directly rather than sending an event.
      
      Also added equivalent calls for node operational mode, as well as new
      constants for both state and mode.
      
      Converted power and node_reboot to use this new scheme.
  24. 02 Apr, 2002 1 commit
    • Leigh B. Stoller authored · 07323144
      Ah, the things I do. Added web page and backend script to scroll the
      experiment log file to the user as it gets generated. The web page
      does not redraw, it just never exits until the backend sees that the
      experiment transition is done, and then it exits, which terminates
      the script. I added a DB field to hold the logfile name and some
      routines in libdb, with the idea that this might be more generally
      useful at some point. Next time you create an experiment, look for the
      last sentence, and click on "realtime".
  25. 01 Apr, 2002 2 commits
    • Robert Ricci authored · d28662b8
      Fixed some event-system constants
    • Leigh B. Stoller authored · bd587829
      First cut at supporting RON (or more generally, remote nodes).
      * tmcd/ron: A new directory of client code, based on the freebsd
        client code, but scaled back to the bare minimum. Does only account
        and group file maintenance. I redid the account stuff so that only
        emulab accounts are operated on. Does not require a stub file, but
        instead keeps a couple of local dbm files recording what groups and
        accounts were added by Emulab. There is a ton of paranoia checking
        to make sure that local accounts are not touched.
      
        The update script that runs on the client node detaches so that the
        ssh from boss returns immediately. update can also be run from the
        node periodically and at boottime. The script is installed setuid
        root, but checks to make sure that *only* root or "emulabman" has
        invoked it.
      
      * utils/sshremote: New file. For remote nodes, instead of using sshtb,
        use sshremote, which ssh's in as "emulabman", which needs to be a
        local non-root user, but with an authorized_keys file containing
        boss' public key.
      
      * web interface changes: Allow user to specify his own public key in
        addition to the emulab key.
      
        Add option in showexp page to update accounts on nodes in the
        experiment. I was originally intending to do this from approveuser,
        but this was easier and faster. I will add an option to do it on the
        approveuser page later.
      
      * libdb.pm: Add a TBIsNodeRemote() query to see if a node is in the
        local testbed or a pcRemote node. Currently, this test is hardwired
        to a check for class=pcRemote, but this will need to change to a
        node_types property at some point.
      
      * node_update: Reorg so that there is a maximum number of children
        created. Previously, a child was forked for each node, but that
        could chew up too many processes, especially for remote nodes which
        might hang up. For the same reason, we need to "lock" the experiment
        so that it cannot be terminated while a node_update is in progress.
        Might be able to relax that, but this was easy for now. Also add
        distinction between local and remote, since for remote we use
        sshremote instead of sshtb. Various cleanup stuff.

      * mkacct: When generating a new account, include user supplied pub key
        in the authorized keys file, in addition to the emulab generated
        key. Both keys are stored in the DB in the users table. Anytime we
        update an account, get a fresh copy of the emulab pub key, in case
        user changes it.
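
The node_update change above, capping the number of forked children instead of forking one per node, can be sketched as follows. This is an illustration under assumptions, not the node_update code: `run_bounded`, the `$work` callback, and the node names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch: run $work->($node) in one child per node, never
# keeping more than $max_children alive at once. Returns a hash ref
# mapping node => child exit code.
sub run_bounded {
    my ($max_children, $work, @nodes) = @_;
    my (%running, %status);    # pid => node, node => exit code

    # Block until one child exits and record its exit code.
    my $reap = sub {
        my $pid = wait();
        die "no children left to reap" if $pid == -1;
        my $node = delete $running{$pid};
        $status{$node} = $? >> 8 if defined $node;
    };

    for my $node (@nodes) {
        # At the limit: wait for a slot before forking again.
        $reap->() while keys(%running) >= $max_children;
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        exit($work->($node)) if $pid == 0;    # child: do the work and exit
        $running{$pid} = $node;
    }
    $reap->() while keys %running;            # collect the stragglers
    return \%status;
}

# Example: four fake nodes, at most two children at a time; "pc3" fails.
my $work   = sub { my ($node) = @_; return $node eq "pc3" ? 1 : 0 };
my $status = run_bounded(2, $work, qw(pc1 pc2 pc3 pc4));
print "$_ exited with $status->{$_}\n" for sort keys %$status;
```

A child that hangs, which the message notes remote nodes can cause, would still occupy a slot here; a real version would presumably add a timeout.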
  26. 07 Mar, 2002 1 commit
  27. 05 Mar, 2002 1 commit
    • Leigh B. Stoller authored · fc2a5b7c
      Add TBdbfork(), a terribly bogus hack to deal with dropping and
      recreating the connection to the DB across a fork. It appears that
      with the connection shared, DB queries can fail. It would be nice if
      PERL had fork handlers.
      
      Add TBSetNodeEventState() and TBGetNodeEventState() library routines,
      and some constants for the event tags.
      
      Beef up the experiment access check code to handle destroy as a
      distinct case.
  28. 12 Feb, 2002 1 commit
  29. 08 Feb, 2002 1 commit
    • Leigh B. Stoller authored · a73e627e
      Big round of image/osid changes. This is the first cut (final cut?) at
      supporting autocreating and autoloading images. The imageid form now
      sports a field to specify a nodeid to create the image from; If set,
      the backend create_image script is invoked. That's the easy part.
      Slightly harder is autoloading images based on the osid specified in
      the NS file. To support this, I have added a new DB table called
      osidtoimageid, which holds the mapping from osid/pctype to imageid.
      When users create images, they must specify what node types that image
      is good for. Obviously, the mappings have to be unique or it would be
      impossible to figure it out! Anyway, once that image mapping is
      in place and the image created, the user can specify that ID in the NS
      file. I've changed os_setup to look for IDs that are not loaded,
      and to try and find one in the osidtoimageid. If found, it invokes
      os_load. To keep things running in parallel as much as possible,
      os_setup issues all the loads/reboots (could be more than a single set
      of loads if multiple IDs are in the NS file) at once, and waits for
      all the children to exit. I've hacked up os_load a bit to try and be
      more robust in the face of PXE failures, which still happen and are
      rather troublesome. Need an event system!
      
      Contained in this revision are unrelated changes to make the OS and
      Image IDs per-project unique instead of globally unique, since that's a
      pain for the users. This turns out to be very messy, since underneath
      we do not want to pass around pid/ID in all the various places it's
      used. Rather, I create a globally unique name and extended the OS and
      Image tables to include pid/name/ID. The user selects pid/name, and I
      create the globally unique ID. For the most part this is invisible
      throughout the system, except where we interface with the user, say in
      the web pages; the user should see his chosen name where possible, and
      should invoke scripts (os_load, create_image, etc) using his/her
      name not the internal ID. Also, in the front end the NS file should
      use the user name not the ID. All in all, this accounted for a number
      of annoying changes and some special cases that are unavoidable.
      a73e627e
  30. 17 Jan, 2002 1 commit
  31. 07 Jan, 2002 1 commit