1. 17 Nov, 2005 1 commit
    • Mike Hibler's avatar
      1. Beef up "admin mode" support. · 4ec701e7
      Mike Hibler authored
      * Add libadminmfs.pm with routines for entering/exiting, and executing
        commands in, the admin MFS.  Node admin and firewall swapout (see
        below) now use this; the image creation process does not yet.  (A
        usage sketch appears at the end of this list.)
      
      * Add swapout time hooks for running an admin mode process, likely to
        be used to collect swapout time state.  Currently controlled globally
        by two new sitevars.
      
      * Modified node_admin to use the library and added a "-c <command>"
        option to have nodes go into admin mode and run a command.  I don't
        really expect this to be useful; it was just a testing vehicle for
        the library.
      
      2. Improved the swapout process for firewalled experiments.  Largely
         just generalized what we already did for paniced experiments.
         At swapout, firewalled nodes are:
      
         - powered off
         - set to boot into admin mode and run a disk zapper
         - powered on
      
        The swapout process then waits for all nodes to successfully complete
        disk zapage, at which point the nodes are nfree'ed as usual.  Any
        failure of the above process marks the experiment as panic'ed (to
        ensure that we are involved in cleanup) and sends mail to testbed-ops
        describing the state of the nodes.
      
      3. Added the aforementioned disk zapper, a little C program in the MFS
         which zeroes out the MBR and partition boot blocks (but not the MBR
         partition table or FS superblocks).  This is added insurance that if
         a node somehow gets diverted after being nfree'd but before getting
         the disk reloaded (e.g., goes to hwdown), we cannot accidentally
         boot from the disk.  This program gets installed in the admin MFS.
      
      4. Related to firewalls, modified swapin to use the new documented
         "snmpit -N" to get the firewall VLAN number rather than parsing the
         output that was a side-effect of VLAN creation.
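
      The usage sketch promised above: a rough illustration of how a caller
      might drive the new library.  The routine names (AdminModeEnter,
      AdminModeRunCommand, AdminModeExit) and their arguments are placeholders,
      not libadminmfs.pm's actual interface:

          # Hypothetical caller: boot some nodes into the admin MFS, run a
          # state-collection command on each, then restore normal boot.
          use libadminmfs;    # routine names below are made up

          my @nodes = ("pc1", "pc2");

          AdminModeEnter("swapout", \@nodes) == 0
              or die("Could not get nodes into the admin MFS\n");

          if (AdminModeRunCommand("swapout", \@nodes,
                                  "/usr/local/bin/collect-swapout-state")) {
              warn("Command failed on one or more nodes\n");
          }
          AdminModeExit("swapout", \@nodes);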
      4ec701e7
  2. 04 Nov, 2005 1 commit
  3. 20 Oct, 2005 1 commit
    • Kirk Webb's avatar
      · 5326988f
      Kirk Webb authored
      New node_attributes facility and table.
      
      Auxiliary node attributes, such as service tag #, BIOS version, etc.,
      should now be placed into the node_attributes table.  This can be accomplished
      by either using the node_attributes command line tool, or by using the
      modnodeattributes_form.php3 form (not linked in anywhere yet, but will be
      in a moment).  Attribute names and values are checked for sanity using
      table_regex entries.  Also note that I started with the nodecontrol stuff
      as a template.
      
      The command line tool and web form (which simply calls the command line tool
      to actually do the modifications) can add, modify, and/or remove attributes.
      
      Finally, note that the bios_version column has been moved from the nodes
      table to the node_attributes table.  The Node Information page will show
      the list of current attributes at the bottom of the info table.
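
      To give a concrete sense of what the table buys us, here is the sort of
      lookup it enables.  This is only a sketch; the column names (attrkey,
      attrvalue) and the connect parameters are assumptions, not a schema
      reference:

          use DBI;

          my $dbh = DBI->connect("DBI:mysql:database=tbdb;host=localhost",
                                 "tbdbuser", "", { RaiseError => 1 });
          my $sth = $dbh->prepare(
              "select attrkey, attrvalue from node_attributes where node_id=?");
          $sth->execute("pc42");
          while (my ($key, $val) = $sth->fetchrow_array()) {
              print "pc42: $key = $val\n";
          }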
      5326988f
  4. 19 Sep, 2005 1 commit
    • Leigh B. Stoller's avatar
      Move all modification of the group_membership table to the backend, · cfba1ac7
      Leigh B. Stoller authored
      into a single new script called modgroups. Usage:
      
      	modgroups [-a pid:gid:trust[,pid:gid:trust]...]
                        [-m pid:gid:trust[,pid:gid:trust]...]
                        [-r pid:gid[,pid:gid]...] user
      
      So, -a to add groups, -r to remove groups, and -m to modify the trust
      value for a member of a group.
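
      For example, to add a user to two groups with different trust levels (the
      project, group, user, and trust values here are purely illustrative):

      	modgroups -a myproj:mygroup:user,myproj:otherguys:local_root jsmith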
      
      The reason for doing this is that previously, we had no idea in the
      backend what group changes actually happened; we just knew what the
      current groups were. This made it hard to add and remove users from
      mailing lists, chat server buddy lists, etc. This is cleaner ...
      cfba1ac7
  5. 14 Sep, 2005 1 commit
    • Mike Hibler's avatar
      Changes related to allowing a separate 'fs' (file server) node. · c53d5827
      Mike Hibler authored
      Entailed new instructions for manual setup as well as integration into
      elabinelab framework.  First, the manual path:
      
      setup.txt, setup-boss.txt, setup-ops.txt and new setup-fs.txt:
          Updated to reflect the potential for a separate fs node.  The organization
          here is a little dicey and could be confusing with ops+fs vs. ops and fs.
          It has not been field tested yet.
      
      */GNUmakefile.in: new fs-install target.
      
      configure, configure.in, defs-*:
          Somewhat unrelated: make the minimum uid/gid to use a defs setting.
          Also add configuration of the fs-install.in script.
      
      boss-install.in, ops-install.in and new fs-install.in:
          Handle distinct fs node.  If you have one, fs-install is run before
          ops-install.  All scripts rely on the defs file settings of FSNODE
          and USERNODE to determine if the fs node is separate.
      
      utils/checkquota.in:
          Just return "ok" if quotas are not used (i.e., if defs file FS_WITH_QUOTA
          string is null.
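
          Something like the following, presumably (a sketch only, not the
          actual script; FS_WITH_QUOTA comes from the text above and is
          normally substituted in by configure):

              my $FS_WITH_QUOTA = "";    # empty when quotas are not in use

              if ($FS_WITH_QUOTA =~ /^\s*$/) {
                  print "ok\n";
                  exit(0);
              }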
      
      install/ports/emulab-fs:
          Meta port for fs node specific stuff.  Also a patch for the samba port
          Makefile so it doesn't drag in CUPS, etc.  Note that the current samba
          port Makefile has this change; I am just backporting it to our version.
      
      Elabinelab specific changes:
      
      elabinelab-withfs.ns:
          NS fragment used in conjunction with
      	tb-elab-in-elab-topology "withfs"
          to setup inner-elab with fs node.
      
      elabinelab.ns:
          The hard work on the boss side.  Recognize the separate-fs config and handle
          running of rc.mkelab on that node.  fs setup happens before ops setup.
      
      rc.mkelab:
          The hard work on the client side.  Recognize FsNode setup as well as
          differentiate ops+fs from ops setup.
      
      Related stuff either not part of the repo or checked in previously:
          emulab-fs package
      c53d5827
  6. 13 Jun, 2005 1 commit
    • Timothy Stack's avatar
      · 5e43a771
      Timothy Stack authored
      Initial checkin of a "repositioning" daemon that moves robots back to
      their pens on swapout.
      
      	* configure, configure.in: Add tbsetup/repos_daemon.
      
      	* db/libdb.pm.in: Add constants for the
      	repositionpending/repositioning experiments.
      
      	* db/nfree.in: When freeing garcias, send them to
      	repositionpending instead of reloadpending.
      
      	* event/sched/event-sched.c: Deal with the rare case of no
      	SIMULATOR object being in the agent list for an experiment.
      
      	* robots/emc/emcd.c, robots/emc/locpiper.in: Fix some typos.
      
      	* robots/rmcd/masterController.h, robots/rmcd/masterController.c,
      	robots/rmcd/obstacles.h, robots/rmcd/obstacles.c: Ignore dynamic
      	obstacles that are far away and remove dynamic obstacles where the
      	robot is inside the natural obstacle area.
      
      	* sql/database-create.sql, sql/database-migrate.txt: Add a
      	reposition_status table that tracks the status of robots that are
      	being moved back to their pens.
      
      	* tbsetup/GNUmakefile.in: Install the repos_daemon script.
      
      	* tbsetup/reload_daemon.in: Move robots to the repositionpending
      	experiment, if they haven't already reached their pen.
      
      	* tbsetup/repos_daemon.in: Daemon that takes care of seeing robots
      	back to their pens after they are freed from an experiment.
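
      	For anyone trying to follow the flow above, the daemon's main loop
      	amounts to something like this sketch.  The helper routines and the
      	emulab-ops project name are placeholders for the real DB queries and
      	emcd/rmcd interactions:

      	    while (1) {
      	        # Robots that nfree parked in the repositionpending holding
      	        # experiment, waiting to be driven home.
      	        my @robots = NodesInExperiment("emulab-ops", "repositionpending");

      	        foreach my $robot (@robots) {
      	            next if (RobotInPen($robot));      # already back in its pen
      	            MoveToExperiment($robot, "emulab-ops", "repositioning");
      	            StartReposition($robot);  # hand off to rmcd/emcd to drive it
      	        }
      	        sleep(60);
      	    }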
      5e43a771
  7. 26 May, 2005 1 commit
  8. 18 Mar, 2005 1 commit
    • Mike Hibler's avatar
      Add whack-on-lan power cycle module. · d9d8a58e
      Mike Hibler authored
      To enable WhOL, you need to add outlets table entries for nodes which are
      whol-enabled.  The power_id encodes the interface on boss to use:
      
      +---------+-----------+--------+----------------+
      | node_id | power_id  | outlet | last_power     |
      +---------+-----------+--------+----------------+
      | pcwf6   | whol-fxp0 |      0 | 20050318152119 |
      +---------+-----------+--------+----------------+
      
      You then need interfaces and wires table entries for that interface on
      boss, so that snmpit works with syntax like "boss:1".  This is probably not
      really needed once the VLAN has been set up.
      
      You need a magic VLAN called "WhOL"; I used VLAN 999.  Add it to all the
      switches and trunks.  Put boss's port in it.  It will remain there, enabled,
      forever.
      
      In the interfaces table entry for every interface that supports WhOL,
      you need to set the 'whol' field to 1.
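
      In DB terms, the per-node setup boils down to something like this (via
      Perl DBI here; the iface name and connect parameters are illustrative,
      while the outlets columns are as in the table above):

          use DBI;

          my $dbh = DBI->connect("DBI:mysql:database=tbdb;host=localhost",
                                 "tbdbuser", "", { RaiseError => 1 });

          # Outlet entry whose power_id names the boss interface to use.
          $dbh->do("replace into outlets (node_id, power_id, outlet) " .
                   "values ('pcwf6', 'whol-fxp0', 0)");

          # Mark the node's interface as WhOL-capable.
          $dbh->do("update interfaces set whol=1 " .
                   "where node_id='pcwf6' and iface='fxp4'");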
      d9d8a58e
  9. 04 Feb, 2005 1 commit
    • Kirk Webb's avatar
      · 79387881
      Kirk Webb authored
      Support for reloading garcias.
      
      Currently this is a total hack; we simply rsync the stargate as it is freed
      from the experiment.  However, until we unravel os_load dependencies between
      nodes and subnodes (motes and stargates in this case), doing this through
      the appropriate setup channels won't work.
      
      The tbrsync script borrows much of its infrastructure from Rob's tbuisp
      script.
      79387881
  10. 24 Jan, 2005 1 commit
    • Timothy Stack's avatar
      · 3c1a5bad
      Timothy Stack authored
      Robot related stuff: power via e-mail, client-install fixups, checking
      coords against camera boundaries.
      
      	* configure, configure.in: Add tbsetup/power_mail.pm to the list
      	of template files.
      
      	* doc/cross-compiling.txt: More stargate notes.
      
      	* event/sched/rpc.cc: Updates for the addition of the cameras
      	table.
      
      	* robots/GNUmakefile.in, robots/emc/GNUmakefile.in,
      	robots/mtp/GNUmakefile.in, robots/rmcd/GNUmakefile.in,
      	robots/tbsetdest/GNUmakefile.in, robots/vmcd/GNUmakefile.in:
      	client-install fixups.
      
      	* tbsetup/GNUmakefile.in: Add power_mail.pm.
      
      	* tbsetup/os_setup.in: Don't skip reboot of robots anymore.
      
      	* tbsetup/power.in: Add special case for a power_id of "mail",
      	which calls into the power_mail.pm backend.
      
      	* tbsetup/power_mail.pm.in: E-mail backend for power; it sends an
      	e-mail to tbops and waits for the outlets.last_power value to be
      	updated from the power.php3 web page.
      
      	* tbsetup/ns2ir/parse-ns.in: Add the contents of the cameras table
      	to the TBCOMPAT namespace.
      
      	* tbsetup/ns2ir/sim.tcl.in: More checking of "setdest" inputs.
      
      	* tbsetup/ns2ir/topography.tcl: Update the checkdest method to
      	check destination points against the camera list.
      
      	* www/powertime.php3: Webpage used to update the last power time
      	for nodes.
      
      	* www/shownode.php3: Add "Update Power Time" menu button.
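
      	The power_mail.pm flow, roughly.  A sketch under assumptions: the mail
      	command line, polling interval, and ops address are placeholders;
      	outlets.last_power is the column named above:

      	    use DBI;

      	    sub MailPowerCycle {
      	        my ($node_id) = @_;
      	        my $dbh = DBI->connect("DBI:mysql:database=tbdb;host=localhost",
      	                               "tbdbuser", "", { RaiseError => 1 });
      	        my ($before) = $dbh->selectrow_array(
      	            "select last_power from outlets where node_id='$node_id'");

      	        system("echo 'Please power cycle $node_id, then mark it done " .
      	               "on the web page' | " .
      	               "mail -s '$node_id needs a power cycle' testbed-ops\@example.net");

      	        # Wait (bounded) for the web page to bump last_power.
      	        for (my $i = 0; $i < 60; $i++) {
      	            my ($now) = $dbh->selectrow_array(
      	                "select last_power from outlets where node_id='$node_id'");
      	            return 0 if (defined($now) && $now ne $before);
      	            sleep(30);
      	        }
      	        return -1;   # give up; caller treats this as a power failure
      	    }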
      3c1a5bad
  11. 13 Jan, 2005 1 commit
  12. 16 Dec, 2004 3 commits
    • Leigh B. Stoller's avatar
      Nothing special. · 760bf11b
      Leigh B. Stoller authored
      760bf11b
    • Robert Ricci's avatar
      Add support (admins only for now) for restarting the event system via · 1d13cde6
      Robert Ricci authored
      the web interface.
      1d13cde6
    • Leigh B. Stoller's avatar
      The panic button ... · 87dd2e60
      Leigh B. Stoller authored
      * tbsetup/panic.in: New backend script to implement the panic button
        feature. When used, it will sever the connection to the
        firewall node by using snmpit to disable the port. Sets the panic
        bit (and date) in the experiments table, and changes the state of
        the experiment from "active" to "paniced" to ensure that the
        experiment cannot be messed with (swapped out or modified). Sends
        email to tbops when the panic button is pressed.
      
        Used with the -r option, it reverses the above. The state is set back
        to active, the panic bit is cleared, and the port is re-enabled with
        snmpit.
      
      * tbsetup/tbswap.in: During swapout, a firewalled experiment that has
        been paniced will get a cleaning; the nodes are powered off, then
        the osids for all the nodes are reset (with os_select) so that they
        will boot the MFS, and then the nodes are powered on. Then the
        control network is turned back on, and I wait for the nodes to
        reboot (this is simply because we do not record in the DB that a node
        is turned off, and if I do not wait, the reload daemon will end up
        hitting the power button again if they do not reboot in time). We can
        fix this later.
      
        I am not planning to apply this to general firewalled experiments
        yet, as the power cycling is going to be hard on the nodes, so I would
        rather we at least have a half-baked plan before we do that.
      
      * www/showexp.php3: If experiment is firewalled, show the Panic
        Button, linked to the panic button web script. If the experiment has
        already had the panic button pressed, show a big warning message and
        explain that the user must talk to tbops to swap the experiment out.
        Also fiddle with menu options so that the terminate link is gone,
        and the swap link is visible only in admin mode. In other words, only
        an admin person can swap an experiment once it is paniced. And of
        course, an admin person can run the backend panic script above with
        the -r option, but that's not something to be done lightly.
      
      * db/libdb.pm.in: Add "paniced" as an experiment state (EXPTSTATE_PANICED).
        Add utility functions: TBExptSetPanicBit(), TBExptGetPanicBit(), and
        TBExptClearPanicBit().
      
      * tbsetup/swapexp.in: Minor state fiddling so that an experiment can
        be swapped while in paniced state, but only when in admin mode. Also
        clear the panic bit when experiment is swapped out.
      
      * www/dbdefs.php3.in: Add "paniced" as an experiment state. Add a
        utility function TBExptFirewall() to see if experiment is firewalled.
      
      * www/panicbutton.php3: New web script to invoke the backend panic
        script mentioned above, after the usual confirm song and dance.
      
      * www/panicbutton.gif: New gif of a red panic button that I stole off
        the net. If anyone sees/has a better one, feel free to replace
        this one.
      
      * utils/node_statewait.in: Add -s option so that I can pass in the
        state I want to wait for (used from tbswap above to wait for nodes
        to reach ISUP after power on).
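
      For the record, the panic/unpanic core reduces to something like the
      following sketch.  The TBExptSetPanicBit/TBExptClearPanicBit calls are
      the libdb.pm additions described above; the snmpit arguments are an
      assumption, and the real script also does locking, permission checks,
      and email:

          use libdb;

          sub Panic {
              my ($pid, $eid, $fwport) = @_;

              # Cut the firewall node off at its switch port (snmpit option
              # here is an assumption).
              system("snmpit -d $fwport") == 0
                  or die("Could not disable port $fwport\n");

              TBExptSetPanicBit($pid, $eid);
              # ... then move the experiment from "active" to "paniced"
              # (EXPTSTATE_PANICED) and send mail to tbops.
              return 0;
          }

          sub UnPanic {
              my ($pid, $eid, $fwport) = @_;

              system("snmpit -e $fwport") == 0
                  or die("Could not re-enable port $fwport\n");

              TBExptClearPanicBit($pid, $eid);
              # ... and the state goes back to "active".
              return 0;
          }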
      87dd2e60
  13. 14 Dec, 2004 1 commit
  14. 16 Nov, 2004 1 commit
    • Leigh B. Stoller's avatar
      ElabInElab Addition: New script that uses the frisbee client to · 6777d279
      Leigh B. Stoller authored
      download images from the outer emulab. This script is invoked from
      frisbeelauncher when ELABINELAB=1 and the filename does not exist
      (thus attempting to get the image file before bailing). The
      frisbeeimage script uses a new method in the RPC server to fire up a
      frisbeed (using frisbeelauncher on the outer Emulab), subject to the
      usual permission checks against creator of the elabinelab experiment
      (I assume that the creator will have access to any outer images that
      are used inside the inner emulab). If outer frisbeelauncher succeeds,
      its return value is the load_address (IP:port), which is used to fire
      up a frisbee client to get the image file and write it out (using
      Mike's new -N option that just dumps the raw data to file). Once the
      image is downloaded, control returns to inner frisbeelauncher and
      proceeds as normal.
      
      I whacked this together pretty quickly. Under heavy usage it might hit
      a race condition or two, but I do not expect that to happen in an
      inner elab for a while.
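
      The shape of the thing, for reference (the RPC helper name and the
      frisbee client arguments other than -N are placeholders):

          # Ask the outer boss (via the XMLRPC proxy) to fire up frisbeed for
          # the image, then run the frisbee client to dump the raw data to disk.
          my ($imageid, $localfile) = @ARGV;

          my $load_address = OuterBossFrisbeeLauncher($imageid); # returns "IP:port"
          die("Outer frisbeelauncher failed for $imageid\n")
              if (!defined($load_address));

          my ($ip, $port) = split(":", $load_address);
          system("frisbee -N -m $ip -p $port $localfile") == 0
              or die("frisbee client failed for $imageid\n");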
      6777d279
  15. 15 Nov, 2004 1 commit
    • Leigh B. Stoller's avatar
      A bunch of ElabInELab changes. · 10b116e0
      Leigh B. Stoller authored
      * snmpit: When ElabInElab is true, use the routines in the new
        snmpit_remote.pm library for setting up and tearing down vlans for an
        experiment. At present, only these two operations are proxied out to
        the outer emulab.
      
      * snmpit_remote.pm: A new little library that uses the XMLRPC server on
        the outer emulab to setup and destroy vlans for an inner experiment.
        This code is used from snmpit (see above).
      
      * snmpit_lib.pm: A couple of minor changes for the server side of the
        proxy operation.
      
      * snmpit.proxy.in: A new perl module that is invoked from the RPC
        server.  This proxy sets up and tears down vlans for an inner elab.
        The basic model is that the container experiment will have lots of
        vlans for various individual experiments running on the inner emulab.
      
      * swapexp: A couple of minor elabinelab hacks.
      
      * tbswap: For elabinelab experiments, reconfig/restart dhcpd when
        tearing down the experiment, and call out to new elabinelab script
        when setting up an elabinelab experiment. There is no provision for
        swapmod at this time.
      
      * elabinelab: A new script to create the inner emulab. Does all kinds of
        gross DB stuff then more gross stuff on the inner ops and boss.
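
      The snmpit side of this is basically a fork in the road, something like
      the following sketch (the routine names are placeholders for whatever
      snmpit_remote.pm actually exports):

          use snmpit_remote;

          my $ELABINELAB = 1;                  # normally from the defs file
          my ($pid, $eid) = ("myproj", "myexp");

          if ($ELABINELAB) {
              # Inner elab: ask the outer boss to create the vlans inside the
              # container experiment, subject to its permission checks.
              RemoteVlanSetup($pid, $eid) == 0
                  or die("Could not set up vlans via the outer emulab\n");
          }
          else {
              # Normal case: program the switches directly.
              LocalVlanSetup($pid, $eid);
          }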
      10b116e0
  16. 12 Nov, 2004 1 commit
  17. 08 Oct, 2004 1 commit
  18. 30 Aug, 2004 1 commit
    • Leigh B. Stoller's avatar
      The bulk of the event system changes. · 9aa6b5ca
      Leigh B. Stoller authored
      * The per-experiment event scheduler now runs on ops instead of boss.
        Boss still runs elvind and uses events internally, but the user part
        of the event system has moved.
      
      * Part of the guts of eventsys_control moved to a new script, eventsys.proxy,
        which runs on ops and fires off the event scheduler. The only tricky part
        of this is that the scheduler runs as the user, but killing it has to be
        done as root since a different person might swap out the experiment. So,
        the proxy is a perl wrapper invoked from a root ssh from boss, which
        forks, writes the pid file into /var/run/emulab/evsched/$pid_$eid.pid,
        then flips to the user and execs the event scheduler (which is careful
        not to fork). Obviously, if the kill is done as root, the pid file has to
        be stored someplace the user is not allowed to write (see the sketch at
        the end of this message).
      
      * The event scheduler has been rewritten to use Tim's C++ interface to the
        sshxmlrpc server on boss. Actually, I reorg'ed the scheduler so that it
        can be built either as a mysql client or as an RPC client. Note that it can
        also be built to use the SSL version of the XMLRPC server, but that will
        not go live until I finish the server stuff up. Also some goo for dealing
        with building the scheduler with C++.
      
      * Changes to several makefiles to install the ops binaries over NFS to
        /usr/testbed/opsdir. Makes life easier, but only if boss and ops are
        running the same OS. For now, using static linking on the event scheduler
        until ops is upgraded to the same rev as boss.
      
      * All of the event clients got little tweaks for dealing with the new CNAME
        for the event system server (event-server). We will need to build new
        images at some point. Old images and clients will continue to work because
        of an inetd hack on boss that uses netcat to transparently redirect elvind
        connections to ops.
      
      * Note that eventdebug needs some explaining. In order to make the inetd
        redirect work, elvind cannot be listening on the standard port. So, the
        boss event system uses an alternate port since there are just a few
        subsystems on boss that use the server, and it's easy to propagate changes
        on boss. Anyway, the default for eventdebug is to connect to the standard
        port on localhost, which means it will work as expected on ops, but will
        require the -b argument on boss.
      
      * Linktest changes were slightly more involved. No longer run linktest on
        boss when called from the experiment swapin path, but ssh over to ops to
        fire it off. This is done as the user of course, and there are some
        tricks to make it possible to kill a running linktest and its ssh when
        experiment swapin is canceled (or from the command line) by forcing
        allocation of a tty. I will probably revisit this at some point, but I
        did not want to spend a bunch of time on linktest.
      
      * The upgrade path detailed in doc/UPDATING is necessarily complicated and
        bound to cause consternation at remote sites doing an upgrade.
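
      The eventsys.proxy dance mentioned above, in skeleton form (paths, the
      scheduler binary name, and its arguments are placeholders; the real
      script also handles signals and errors):

          use POSIX qw(setsid);

          my ($pid, $eid, $user) = @ARGV;
          my $pidfile = "/var/run/emulab/evsched/${pid}_${eid}.pid";

          my $child = fork();
          die("fork failed\n") if (!defined($child));
          exit(0) if ($child);      # parent returns to the root ssh from boss

          # Child: while still root, record our pid someplace the user cannot
          # write, so a later root-initiated kill can find it.
          setsid();
          open(PIDFILE, "> $pidfile") or die("Cannot write $pidfile\n");
          print PIDFILE "$$\n";
          close(PIDFILE);

          # Flip to the user and exec the scheduler (which is careful not to fork).
          my (undef, undef, $uid, $gid) = getpwnam($user);
          $( = $gid; $) = $gid;
          $< = $uid; $> = $uid;
          exec("/usr/testbed/sbin/event-sched", "-e", "$pid/$eid");
          die("exec of the event scheduler failed\n");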
      9aa6b5ca
  19. 09 Aug, 2004 1 commit
    • Leigh B. Stoller's avatar
      Major rework of the script interface to Emulab. Up to now we have been · 5ef8f70a
      Leigh B. Stoller authored
      supporting both a shell script driven interface and the newer XMLRPC
      interface. This change removes the script driven interface from boss,
      replacing it with just the XMLRPC interface. Since we like to maintain
      backward compatibility with interfaces we have advertised to users (and
      which we know are being used), I have implemented a script wrapper that
      exports the same interface, but which converts the operations into XMLRPC
      requests to the server. This wrapper is written in python and uses our
      locally grown xmlrpc-over-ssh library. Like the current "demonstration"
      client, you can take this wrapper to your machine that has python and ssh
      installed, and use it there; you do not need to use these services from
      just users.emulab.net. Other things to note:
      
      * The wrapper is a single python script that has a "class" for each wrapped
        script. Running the wrapper without any arguments will list all of the
        operations it supports. You can invoke the wrapper with the operation as
        its argument:
      
          {987} stoller$ script_wrapper.py swapexp --help
          swapexp -e pid,eid in|out
          swapexp pid eid in|out
          where:
               -w   - Wait for experiment to finish swapping
               -e   - Project and Experiment ID
               in   - Swap experiment in  (must currently be swapped out)
              out   - Swap experiment out (must currently be swapped in)
      
          Wrapper Options:
              --help      Display this help message
              --server    Set the server hostname
              --login     Set the login id (defaults to $USER)
              --debug     Turn on semi-useful debugging
      
         But more convenient is to create a set of symlinks so that you can just
         invoke the operation by its familiar scriptname. This is what I have
         done on users.emulab.net.
      
          {987} stoller$ /usr/testbed/bin/swapexp --help
          swapexp -e pid,eid in|out
          swapexp pid eid in|out
      
      
      * For those of you talking directly to the RPC server from python, I have
        added a wrapper class so that you can issue requests to any of the
        modules from a single connection. Instead of using /xmlrpc/modulename, you
        can use just /xmlrpc, and use method names of the form experiment.swapexp,
        node.reboot, etc.
      
        Tim, this should be useful for the netlab client, which I think opens up
        multiple ssh connections?
      
      * I have replaced the paperbag shell with a stripped down xmlrpcbag shell
        that is quite a bit simpler since we no longer allow access to anything
        but the RPC server. No interactive mode, no argument processing, no
        directory changing, etc. My main reason for reworking the bag is to make
        it easier to understand, maintain, and verify that it is secure. The new
        bag also logs all connections to syslog (something we should have done in
        the original). I also added some setrlimit calls (core, maxcpu). I also
        thought about nicing the server down, but that would put RPC users at a
        disadvantage relative to web interface users. When we switch the web
        interface to use the XMLRPC backend, we can add this (renicing from the
        web server would be a pain because of its scattered implementation).
      5ef8f70a
  20. 28 Jul, 2004 1 commit
  21. 01 Jun, 2004 1 commit
  22. 26 Apr, 2004 1 commit
    • Mike Hibler's avatar
      Cleanup Makefiles: · 297019fb
      Mike Hibler authored
      1. "make clean" will just remove stuff built in the process of a regular build
      2. "make distclean" will also clean out configure generated files.
      
      This is how it was always supposed to be; there was just some bitrot.
      297019fb
  23. 07 Apr, 2004 1 commit
  24. 08 Mar, 2004 1 commit
    • Leigh B. Stoller's avatar
      Converted os_load and node_reboot into libraries. Basically that meant · 9bfe3d61
      Leigh B. Stoller authored
      splitting the existing code between a frontend script that parses arguments
      and does taint checking, and a backend library where all the work is done
      (including permission checks). The interface to the libraries is simple
      right now (I didn't want to spend a lot of time designing the interface
      without knowing whether the approach would work long term).
      
      	use libreboot;
      	use libosload;
      
              nodereboot(\%reboot_args, \%reboot_results);
              osload(\%reload_args, \%reload_results);
      
      Arguments are passed to the libraries in the form of a hash. For example,
      in os_setup:
      
      	$reload_args{'debug'}     = $dbg;
      	$reload_args{'asyncmode'} = 1;
      	$reload_args{'imageid'}   = $imageid;
      	$reload_args{'nodelist'}  = [ @nodelist ];
      
      Results are passed back both as a return code (-1 means total failure right
      away, while a positive value indicates the number of nodes that failed),
      and in the results hash, which gives the status for each individual node. At
      the moment it is just success or failure (0 or 1), but in the future might
      be something more meaningful.
      
      os_setup can now find out about individual failures, both in reboot and
      reload, and alter how it operates afterwards. The main thing is to not wait
      for nodes that fail to reboot/reload, and to terminate with no retry when
      this happens, since at the moment it indicates an unusual failure, and it
      is better to terminate early. In the past an os_load failure would result
      in a tbswap retry, and another failure (multiple times). I have already
      tested this by trying to load images that have no file on disk; it is nice
      to see those failures caught early and the experiment fail much more
      quickly!
      
      A note about "asyncmode" above. In order to promote parallelism in
      os_setup, asyncmode tells the library to fork off a child and return
      immediately. Later, os_setup can block and wait for status by calling
      back into the library:
      
      	my $foo = nodereboot(\%reboot_args, \%reboot_results);
      	nodereboot_wait($foo);
      
      If you are wondering how the child reports individual node status back to
      the parent (so it can fill in the results hash), Perl really is a kitchen
      sink. I create a pipe with Perl's pipe function and then fork a child to do
      the work; the child writes the results to the pipe (status for each node),
      and the parent reads that back later when nodereboot_wait() is called,
      moving the results into the %reboot_results hash. The parent meanwhile can
      go on and, in the case of os_setup, make more calls to reboot/reload other
      nodes, later calling the wait() routines once all have been initiated.
      Also worth noting that in order to make the libraries "reentrant" I had to
      do some cleaning up and reorganizing of the code. Nothing too major though,
      just removal of lots of global variables. I also did some mild unrelated
      cleanup of code that had been run over once too many times with a tank.
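
      Boiled down, the trick looks about like this (a sketch, not the library
      code; bookkeeping and error handling omitted):

      	sub nodereboot_sketch {
      	    my ($args, $results) = @_;

      	    pipe(RESULTS_READ, RESULTS_WRITE);
      	    defined(my $child = fork())
      	        or die("fork failed\n");

      	    if ($child == 0) {
      	        # Child: do the work, write one status line per node.
      	        close(RESULTS_READ);
      	        foreach my $node (@{ $args->{'nodelist'} }) {
      	            my $failed = 0;      # ... actually reboot $node here ...
      	            print RESULTS_WRITE "$node $failed\n";
      	        }
      	        close(RESULTS_WRITE);
      	        exit(0);
      	    }
      	    # Parent: return immediately (asyncmode); keep a token for later.
      	    close(RESULTS_WRITE);
      	    return { 'child' => $child, 'results' => $results };
      	}

      	sub nodereboot_wait_sketch {
      	    my ($token) = @_;

      	    while (<RESULTS_READ>) {
      	        my ($node, $failed) = split;
      	        $token->{'results'}->{$node} = $failed;
      	    }
      	    close(RESULTS_READ);
      	    waitpid($token->{'child'}, 0);
      	    return ($? >> 8);
      	}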
      
      So how did this work out? Well, for os_setup/os_load it works rather
      nicely!
      
      node_reboot is another story. I probably should have left it alone, but
      since I had already climbed the curve on osload, I decided to go ahead and
      do reboot. The problem is that node_reboot needs to run as root (its a
      setuid script), which means it can only be used as a library from something
      that is already setuid. os_setup and os_load runs as the user. However,
      having a consistent library interface and the ability to cleanly figure out
      which individual nodes failed, is a very nice thing.
      
      So I came up with a suitable approach that is hidden in the library. When the
      library is entered without proper privs, it silently execs an instance of
      node_reboot (the setuid script), and then uses the same trick mentioned
      above to read back individual node status. I create the pipe in the parent
      before the exec, and set the no-close-on-exec flag. I pass the fileno along
      in an environment variable, and the library uses that to write the
      results, just like above. The result is that os_setup sees the same
      interface for both os_load and node_reboot, without having to worry that
      one or the other needs to be run setuid.
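
      And the fd-passing trick, in miniature (the environment variable name and
      the node_reboot path are made up; the real code is better about errors):

          use Fcntl;

          my @nodes = ("pc1", "pc2");
          my %reboot_results;

          pipe(RESULTS_READ, RESULTS_WRITE);

          # Pipes get close-on-exec by default; clear it on the write side so
          # the fd survives the exec of the setuid script.
          my $flags = fcntl(RESULTS_WRITE, F_GETFD, 0);
          fcntl(RESULTS_WRITE, F_SETFD, $flags & ~FD_CLOEXEC);

          if (fork() == 0) {
              # Tell the library inside the setuid script where to write results.
              $ENV{'REBOOT_RESULTS_FD'} = fileno(RESULTS_WRITE);
              exec("/usr/testbed/bin/node_reboot", @nodes);
              die("exec failed\n");
          }
          close(RESULTS_WRITE);

          # Read back per-node status, exactly as in the asyncmode case above.
          while (<RESULTS_READ>) {
              my ($node, $status) = split;
              $reboot_results{$node} = $status;
          }
          wait();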
      9bfe3d61
  25. 12 Feb, 2004 1 commit
    • Leigh B. Stoller's avatar
      * Removed startexp, and merged its contents into batchexp. There has been · aef08532
      Leigh B. Stoller authored
        no reason for the separation for a long time, and it made maintenance more
        difficult because of duplication between batchexp and startexp (batch was
        the sole user of startexp). Cleaner solution.
      
      * Check argument processing for batchexp, swapexp, endexp to make sure the
        taint checks are correct. All three of these scripts will now be
        available from ops. I especially watched the filename processing, which was
        pretty loose before and could allow someone to grab a file on boss by trying
        to use it as an NS file (the scripts all run as the user, of course). The web
        interface generates filenames that are hard to guess, so rather than
        wrapping these scripts when invoked from ops, just allow the usual paths
        (/proj, /groups, /users) but also the /tmp/$uid-XXXXXX.nsfile pattern, which
        should be hard enough to guess that users will not be able to get
        anything they are not supposed to (a sketch of this check follows below).
      
      * Add a -w (waitmode) option to all three scripts. In waitmode, the backend
        detaches, but the parent remains waiting for the child to finish so it
        can exit with the appropriate status (for scripting). The user can
        interrupt (^C), but it has no effect on the backend; it just kills the
        parent side that is waiting (the backend is in a new session ID). Log output
        still goes to the file (available from the web page) and is emailed.
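
      The NS file path check referenced above amounts to something like this
      sketch (the real code also untaints and stats the file):

          sub AllowedNSFile {
              my ($uid, $nsfile) = @_;

              # The usual user-visible filesystems ...
              return 1 if ($nsfile =~ m#^/proj/.+#   ||
                           $nsfile =~ m#^/groups/.+# ||
                           $nsfile =~ m#^/users/.+#);

              # ... plus the hard-to-guess temp files the web interface writes.
              return 1 if ($nsfile =~ m#^/tmp/\Q$uid\E-[-\w]+\.nsfile$#);

              return 0;
          }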
      aef08532
  26. 10 Feb, 2004 1 commit
  27. 02 Feb, 2004 1 commit
  28. 15 Jan, 2004 1 commit
  29. 16 Dec, 2003 2 commits
  30. 15 Dec, 2003 1 commit
    • Shashi Guruprasad's avatar
      Distributed NSE changes. In other words, simulation resources are · d266bd71
      Shashi Guruprasad authored
      now mapped to more than one PC if required. The simnode_capacity
      column in the node_types table determines how many sim nodes can
      be packed on one PC. The packing factor can also be controlled via
      tb-set-colocate-factor to be smaller than simnode_capacity.
      
      - No frontend code changes. To summarize:
        $ns make-simulated {
          ...
        }
        is still the easy way to put a whole bunch of Tcl code to be
        in simulation.
        One unrelated fix in the frontend code is to fix the
        xmlencode() function which prior to this would knock off
        newlines from columns in the XML output. This affected
        nseconfigs since it is one of the few columns with embedded
        newlines. Also changed the event type and event object type
        in traffic.tcl from TRAFGEN/MODIFY to NSE/NSEEVENT.
      
      - More Tcl code in a new directory tbsetup/nseparse
        -> Runs on ops similar to the main parser. This is invoked
           from assign_wrapper in the end if there are simnodes
        -> Partitions the Tcl code into multiple Tcl specifications
           and updates the nseconfigs table via xmlconvert
        -> Comes with a lot of caveats. Arbitrary Tcl code such as user
           specified objects or procedures will not be re-generated. For
           example, if a user wanted a procedure to be included in Tcl
           code for all partitions, there is no way for code in nseparse
           to do that. Besides that, it needs to be tested more thoroughly.
      
      - xmlconvert has a new option -s. When invoked with this option,
        the experiments table is not allowed to be modified. Also,
        virtual tables are just updated (as opposed to deleting
        all rows in the first invocation before inserting new rows)
      
      - nse.patch has all the IP address related changes committed in
        version 1.11, plus 2 other changes: 1) MTU discovery support in
        the ICMP agent 2) "$ns rlink" mechanism for sim node to real
        node links
      
      - nseinput.tcl includes several client side changes to add IP
        routes in NSE and the kernel routing table for packets crossing
        pnodes. Also made the parsing of tmcc command output more robust
        to new changes. Other client side changes in libsetup.pm and other
        scripts to run nse are also in this commit.
      
      - Besides the expected changes in assign_wrapper for simulated nodes,
        the interfaces and veth_interfaces tables are updated with
        routing table identifiers (rtabid). The tmcd changes are already
        committed. This field is used only by sim hosts on the client side.
        Of course, they can be used by jails as well if desired.
      d266bd71
  31. 02 Dec, 2003 1 commit
  32. 01 Dec, 2003 1 commit
    • Robert Ricci's avatar
      New scripts: tarfiles_setup, fetchtar.proxy, and webtarfiles_setup . · c0c6547c
      Robert Ricci authored
      The idea is to give us hooks for grabbing experimenters' tarballs (and
      RPMs) from locations other than files on ops. Mainly, to remove
      another dependence on users having shells on ops.
      
      tarfiles_setup supports fetching files from http and ftp URLs right
      now, through wget. It places them into the experiment directory, so
      that they'll go away when the experiment is terminated, and the rest
      of the chain (i.e., downloading to clients and os_setup's checks)
      remains unchanged.  It is now tarfiles_setup's job to copy tarballs and
      RPMs from the virt_nodes table to the nodes table for allocated nodes.
      This way, it can translate URLs into the local filenames it
      constructs. It gets invoked from tbswap.
      
      Does the actual fetching over on ops, running as the user, with
      fetchtar.proxy.
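
      The URL handling reduces to roughly this (paths and the proxy invocation
      are illustrative; fetchtar.proxy's real arguments may differ):

          sub FetchTarball {
              my ($pid, $eid, $user, $url) = @_;

              # Plain files on ops pass through untouched.
              return $url if ($url !~ m#^(http|ftp)://#);

              # Construct a local name under the experiment directory so the
              # file goes away when the experiment is terminated.
              my ($basename) = ($url =~ m#([^/]+)$#);
              my $localfile  = "/proj/$pid/exp/$eid/tarfiles/$basename";

              # Do the wget over on ops, as the user.
              system("sshtb -host ops /usr/testbed/libexec/fetchtar.proxy " .
                     "-u $user '$url' $localfile");
              return ($? == 0) ? $localfile : undef;
          }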
      
      Should be idempotent, so we should be able to give the user a button
      to run webtarfiles_setup (none exists yet) to 'freshen' their
      tarballs. (We'd also have to somehow let the experiment's nodes know
      they need to re-fetch their tarballs.)
      
      One funny side effect of this is that the separator in
      virt_nodes.tarfiles is now ';' instead of ':' like nodes.tarballs,
      since we can now put URLs in the former. Making these consistent is a
      project for another day.
      c0c6547c
  33. 18 Nov, 2003 1 commit
  34. 13 Nov, 2003 1 commit
  35. 09 Oct, 2003 1 commit
    • Leigh B. Stoller's avatar
      Reorg of two aspects of node update. · 2641af4d
      Leigh B. Stoller authored
      * install-rpm, install-tarfile, spewrpmtar.php3, spewrpmtar.in: Pumped
        up even more! The db file we store in /var/db now records both the
        timestamp (of the file, or if remote the install time) and the MD5
        of the file that was installed. Locally, we can get this info when
        accessing the file via NFS (copymode on or off). Remote, we use wget
        to get the file, and so pass the timestamp along in the URL request,
        and let spewrpmtar.in determine if the file has changed. If the
        timestamp it gets is >= to the timestamp of the file, an error code
        of 304 (Not Modifed) is returned. Otherwise the file is returned.
      
        If the timestamps are different (remotely, the server sends back an actual
        file), the MD5 of the file is compared against the value stored. If
        they are equal, update the timestamp in the db file to avoid
        repeated MD5s (or server downloads) in the future. If the MD5 is
        different, then reinstall the tarball or rpm, and update the db file
        with the new timestamp and MD5. Presto, we have auto update capability!
        (A sketch of this check appears at the end of this message.)
      
        Caveat: I pass along the old MD5 in the URL, but it is currently
        ignored. I do not know if doing the MD5 on the server is a good
        idea, but obviously it is easy to add later. At the moment it
        happens on the node, which means wasted bandwidth when the timestamp
        has changed, but the file has not (probably not something that will
        happen in typical usage).
      
        Caveat: The timestamp used on remote nodes is the time the tarfile
        is installed (GM time of course). We could arrange to return the
        timestamp of the local file back to the node, but that would mean
        complicating the protocol (or using an http header) and I was not in
        the mood for that. In typical usage, I do not think that people will
        be changing tarfiles and rpms so rapidly that this will make a
        difference, but if it does, we can change it.
      
      * node_update.in, client side watchdog, and various web pages:
        Deflated node_update, removing all of the older ssh code. We now
        assume that all nodes will auto update on a periodic basis, via the
        watchdog that runs on all client nodes, including plab nodes.
      
        Changed the permission check to look for new UPDATE permission (used
        to be UPDATEACCOUNT). As before, it requires local_root or better.
        The reason for this is that node_update now implies more than just
        updating the accounts/mounts. The web pages have been changed to
        explain that in addition to mounts/accounts, rpms and tarfiles will
        also be updated. At the moment, this is still tied to a single
        variable (update_accounts) in the nodes table, but as Kirk requested
        at the meeting, it will probably be nice to split these out in the
        future.
      
        Added the ability to node_update a single node in an experiment (in
        addition to the all-nodes option on the showexp page). This has been
        added to the shownode webpage menu options.
      
        Changed locking code to use the newer wrapper states, and to move
        the experiment to RUNNING_LOCKED until the update completes. This is
        to prevent mayhem in the rest of the system (which could be dealt
        with, but is not worth the trouble; people have to wait until their
        initiated update is complete, before they can swap out the
        experiment).
      
        Added "short" mode to shownode routine, equiv to the recently added
        short mode for showexp. I use this on the confirmation page for
        updating a single node, giving the user a couple of pertinent (feel
        good) facts before they comfirm.
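
      A sketch of the update check referenced above (the db-file helpers and
      the fetch helper are hypothetical stand-ins for the real
      install-tarfile/spewrpmtar plumbing):

          sub MaybeReinstall {
              my ($dbfile, $url) = @_;
              my ($oldstamp, $oldmd5) = ReadDbFile($dbfile);

              # Hypothetical wrapper around the wget request described above;
              # returns undef when the server answers 304 (our copy is
              # current), otherwise the path of the downloaded file.
              my $newfile = FetchIfNewer($url, $oldstamp);
              return 0 if (!defined($newfile));

              chomp(my $newmd5 = `md5 -q $newfile`);
              if ($newmd5 eq $oldmd5) {
                  # Same content after all: just refresh the timestamp so we
                  # skip repeated MD5s (or downloads) next time.
                  WriteDbFile($dbfile, time(), $oldmd5);
                  return 0;
              }
              # Content changed: reinstall, then record the new stamp and MD5.
              system("tar xzf $newfile -C /");
              WriteDbFile($dbfile, time(), $newmd5);
              return 1;
          }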
      2641af4d
  36. 25 Sep, 2003 1 commit
  37. 18 Sep, 2003 1 commit