1. 19 Aug, 2004 1 commit
  2. 18 Aug, 2004 1 commit
    • Christopher Alfeld's avatar
      Fix for ALWAYSUP nodes and fix for switches with interface entries. · a7b4249d
      Christopher Alfeld authored
      In detail:
      
      1. Added TBDB_NODESTATE_ALWAYSUP to libdb.pm for representing the ALWAYSUP
      eventstate.
      
      2. Modified free node calculation in ptopgen to include ALWAYSUP nodes.
      
      3. Added code to ptopgen to correctly handle the case of a NULL iface
      column, which happens when switches have interface entries (as they do
      in Wisconsin), but assign_wrapper expects (null) for their iface rather
      than "".
      a7b4249d
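Item 3 of a7b4249d can be illustrated with a minimal sketch. ptopgen itself is not shown in this log, so the function name `iface_label` is hypothetical, and treating an empty string the same as NULL is an assumption made for illustration.

```python
def iface_label(iface):
    """Return the token to emit for an interface column value.

    None models a SQL NULL. Treating "" like NULL is an assumption here,
    not something the commit message states.
    """
    if iface is None or iface == "":
        return "(null)"
    return iface
```

With this shape, a switch row with no iface is written as the literal `(null)` that assign_wrapper is said to expect.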
  3. 17 Aug, 2004 4 commits
  4. 16 Aug, 2004 1 commit
  5. 13 Aug, 2004 2 commits
    • Robert Ricci's avatar
      Fix a harmless typo. · e45c4f50
      Robert Ricci authored
      e45c4f50
    • Robert Ricci's avatar
      Use features and desires to make sure that users get a node that · 68307cfd
      Robert Ricci authored
      supports the OS they are asking for. Puts features 'OS-<osid>' on each
      pnode, listing the OSes that the pnode supports, and a desire on each
      vnode with a weight of 1 for the OS the user wants to run. This way,
      assign will not accidentally pick a node (such as a wireless PC) that
      the user's image will not run on.
      
      Note: This makes ptop files much, much larger, and makes assign take
      somewhat longer to run.
      68307cfd
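The feature/desire scheme in 68307cfd can be sketched roughly as follows. The helper names and the OS identifiers in the usage note are hypothetical, and the sketch assumes that a desire with weight 1 behaves as a hard requirement, which is the effect the commit message describes.

```python
def pnode_features(supported_osids):
    """Build one 'OS-<osid>' feature per OS the physical node supports."""
    return [f"OS-{osid}" for osid in sorted(supported_osids)]


def vnode_desire(osid, weight=1.0):
    """Build the desire a virtual node carries for the OS the user asked for."""
    return (f"OS-{osid}", weight)


def can_host(features, desire):
    """Assumed rule: a weight-1 desire must be matched by a pnode feature."""
    name, _weight = desire
    return name in features
```

For example, `can_host(pnode_features({"FBSD"}), vnode_desire("RHL"))` is false, so a node that only supports the hypothetical OS `FBSD` would not be picked for a vnode asking for `RHL`. Emitting one feature per supported OS per pnode is also why the ptop files grow so much.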
  6. 11 Aug, 2004 1 commit
    • Leigh B. Stoller's avatar
      Add new per-lan table, which currently is just for Mike: · d09d9696
      Leigh B. Stoller authored
      1.269: Add new table to generate a per virt_lan index for use with
             veth vlan tags. This would be so much easier if the virt_lans
             table had been split into virt_lans and virt_lan_members.
             Anyway, this table might someday become the per-lan table, with a
             table of member settings. This would reduce the incredible amount of
             duplicate info in virt_lans!
      
      	CREATE TABLE virt_lan_lans (
      	  pid varchar(12) NOT NULL default '',
      	  eid varchar(32) NOT NULL default '',
      	  idx int(11) NOT NULL auto_increment,
      	  vname varchar(32) NOT NULL default '',
      	  PRIMARY KEY  (pid,eid,idx),
      	  UNIQUE KEY vname (pid,eid,vname)
      	) TYPE=MyISAM;
      
             This arrangement will provide a unique index per virt_lan, within
             each pid,eid. That is, it starts from 1 for each pid,eid. That is
             necessary since the limit is 16 bits, so a global index would
             quickly overflow. The above table is populated with:
      
      	insert into virt_lan_lans (pid, eid, vname)
                  select distinct pid,eid,vname from virt_lans;
      d09d9696
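The numbering that d09d9696 relies on (an index that restarts at 1 within each pid,eid) can be modeled in a few lines. This is only an in-memory sketch of the intended result, not the database code; the function name is hypothetical.

```python
from collections import defaultdict


def assign_lan_indices(rows):
    """Map each distinct (pid, eid, vname) to an index that restarts at 1
    for every (pid, eid) pair, mirroring the per-experiment numbering."""
    next_idx = defaultdict(lambda: 1)
    indices = {}
    for pid, eid, vname in rows:
        key = (pid, eid, vname)
        if key in indices:
            continue  # same effect as SELECT DISTINCT in the populate query
        indices[key] = next_idx[(pid, eid)]
        next_idx[(pid, eid)] += 1
    return indices
```

In MySQL, a MyISAM table with an auto_increment column placed second or later in a multi-column key computes the next value per key prefix, which is what gives the table above this behavior.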
  7. 10 Aug, 2004 3 commits
  8. 09 Aug, 2004 2 commits
    • Leigh B. Stoller's avatar
      Some cleanups and performance improvements: · f604dc33
      Leigh B. Stoller authored
      * Be more selective about what lists are regenerated; we were generating
        way too many lists on every call. When called from tbswap, use the new
        -t option to generate just the active lists. When called from setgroups,
        use -p option to generate lists just for the project. Add update option
        for when user changes email address (and all lists really do need to be
        regenerated).
      
      * Add "diff" processing. Instead of blindly firing each new list over to
        ops with ssh, store a copy of all of the lists in
        /usr/testbed/lists. After we generate the new list, diff it against the
        stored copy. If the same, skip it. Otherwise stash new copy and fire it
        over. This should reduce the wait times by quite a bit since the lists
        rarely change (except for the activity lists of course).
      
      * Add -n (impotent) option for debugging; skips the ssh over to ops.
      
      * Reorg a lot of stuff; it was getting hard to follow.
      f604dc33
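The "diff" processing described in f604dc33 amounts to a compare-then-send step. Below is a minimal sketch under stated assumptions: the function name, the `send` callback (standing in for the ssh transfer to ops), and the file layout are all hypothetical.

```python
import os


def update_list(name, new_text, store_dir, send):
    """Send a regenerated list only if it differs from the stored copy.

    Returns True when the list changed (stored copy replaced, send called),
    False when it was identical and skipped.
    """
    path = os.path.join(store_dir, name)
    try:
        with open(path, encoding="utf-8") as f:
            old_text = f.read()
    except FileNotFoundError:
        old_text = None  # no stored copy yet, so always send

    if old_text == new_text:
        return False

    # Stash the new copy first, then fire it over.
    with open(path, "w", encoding="utf-8") as f:
        f.write(new_text)
    send(name, new_text)
    return True
```

Since most lists rarely change, most calls return early without any transfer, which is where the reduced wait time would come from.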
    • Leigh B. Stoller's avatar
      Major rework of the script interface to Emulab. Up to now we have been · 5ef8f70a
      Leigh B. Stoller authored
      supporting both a shell script driven interface, plus the newer XMLRPC
      interface. This change removes the script driven interface from boss,
      replacing it with just the XMLRPC interface. Since we like to maintain
      backwards compatibility with interfaces we have advertised to users (and
      which we know are being used), I have implemented a script wrapper that
      exports the same interface, but which converts the operations into XMLRPC
      requests to the server. This wrapper is written in python and uses our
      locally grown xmlrpc-over-ssh library. Like the current "demonstration"
      client, you can take this wrapper to your machine that has python and ssh
      installed, and use it there; you do not need to use these services from
      just users.emulab.net. Other things to note:
      
      * The wrapper is a single python script that has a "class" for each wrapped
        script. Running the wrapper without any arguments will list all of the
        operations it supports. You can invoke the wrapper with the operation as
        its argument:
      
          {987} stoller$ script_wrapper.py swapexp --help
          swapexp -e pid,eid in|out
          swapexp pid eid in|out
          where:
               -w   - Wait for experiment to finish swapping
               -e   - Project and Experiment ID
               in   - Swap experiment in  (must currently be swapped out)
              out   - Swap experiment out (must currently be swapped in)
      
          Wrapper Options:
              --help      Display this help message
              --server    Set the server hostname
              --login     Set the login id (defaults to $USER)
              --debug     Turn on semi-useful debugging
      
         But more convenient is to create a set of symlinks so that you can just
         invoke the operation by its familiar scriptname. This is what I have
         done on users.emulab.net.
      
          {987} stoller$ /usr/testbed/bin/swapexp --help
          swapexp -e pid,eid in|out
          swapexp pid eid in|out
      
      
      * For those of you talking directly to the RPC server from python, I have
        added a wrapper class so that you can issue requests to any of the
        modules from a single connection. Instead of using /xmlrpc/modulename, you
        can use just /xmlrpc, and use method names of the form experiment.swapexp,
        node.reboot, etc.
      
        Tim, this should be useful for the netlab client which I think opens up
        multiple ssh connections?
      
      * I have replaced the paperbag shell with a stripped down xmlrpcbag shell
        that is quite a bit simpler since we no longer allow access to anything
        but the RPC server. No interactive mode, no argument processing, no
        directory changing, etc. My main reason for reworking the bag is to make
        it easier to understand, maintain, and verify that it is secure. The new
        bag also logs all connections to syslog (something we should have done in
        the original). I also added some setrlimit calls (core, maxcpu). I also
        thought about niceing the server down, but that would put RPC users at a
        disadvantage relative to web interface users. When we switch the web
        interface to use the XMLRPC backend, we can add this (reniceing from the
        web server would be a pain cause of its scattered implementation).
      5ef8f70a
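The dispatch behavior described in 5ef8f70a (one class per wrapped script, invocation either by symlink name or by first argument, and `module.method` names on a single /xmlrpc endpoint) can be sketched as below. This is not the actual script_wrapper.py: the class, the request parameter names, and the helper functions are assumptions for illustration.

```python
import os


class SwapExp:
    """Hypothetical wrapper class for one wrapped script."""

    def build_request(self, args):
        pid, eid, direction = args
        # Parameter names are illustrative; the real RPC arguments are not
        # shown in the commit message.
        return "experiment.swapexp", {"proj": pid, "exp": eid, "direction": direction}


HANDLERS = {"swapexp": SwapExp}


def resolve(argv):
    """Pick the operation from the invoked name (symlink case), else from
    the first argument (script_wrapper.py <operation> ... case)."""
    invoked = os.path.basename(argv[0])
    if invoked in HANDLERS:
        return invoked, argv[1:]
    if len(argv) > 1 and argv[1] in HANDLERS:
        return argv[1], argv[2:]
    raise SystemExit("operations: " + ", ".join(sorted(HANDLERS)))


def split_method(name):
    """Split 'experiment.swapexp' into ('experiment', 'swapexp')."""
    module, _, method = name.partition(".")
    return module, method
```

Resolving on the basename of `argv[0]` is what lets a directory of symlinks stand in for the old per-operation scripts without duplicating any code.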
  9. 30 Jul, 2004 4 commits
  10. 29 Jul, 2004 5 commits
    • Jonathon Duerig's avatar
      Added new ratio-cut partitioning scheme and a METIS search-for-ratio-cut · c7498dfa
      Jonathon Duerig authored
      partitioning scheme. They seem to perform about the same, which is not what was expected. Further tests and tweaks may uncover the cause.
      c7498dfa
    • Leigh B. Stoller's avatar
      * Set $libdb::DBQUERY_MAXTRIES to zero; infinite retry. · 66a2c7db
      Leigh B. Stoller authored
      * Change use of TBGetSiteVar to the non-fatal variant to prevent the
        batch daemon from exiting when mysql goes whacky.
      66a2c7db
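The "zero means infinite retry" convention from 66a2c7db can be illustrated with a small retry loop. This is a sketch only: the function name, the use of `ConnectionError` as the retryable failure, and the `delay` parameter are assumptions, not the libdb implementation.

```python
import time


def query_with_retry(run_query, max_tries=0, delay=0.0):
    """Call run_query() until it succeeds.

    max_tries == 0 means retry forever; any positive value caps the number
    of attempts and re-raises the last error once the cap is reached.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return run_query()
        except ConnectionError:
            if max_tries and attempt >= max_tries:
                raise
            time.sleep(delay)
```

A long-running daemon would pass `max_tries=0` so that a temporary database outage stalls it rather than killing it.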
    • Leigh B. Stoller's avatar
      Two unrelated bug fixes (with some related cleanups and tweaks) · 9f4edbba
      Leigh B. Stoller authored
      * The first involves swapmod. When a swapmod on an active experiment fails,
        tbswap will reswap the experiment back to the original configuration. The
        problem is that it is reswapping it with the *new* virtual state of the
        experiment in the DB. It is not until later when control returns to
        swapexp that the virtual state is restored. This is plainly wrong, and in
        fact was causing the event scheduler grief cause it was starting up,
        reading the virtual topo, which was different, wrong, and about to be
        blown away.
      
        I reorganized the modify section of swapexp so that virtual state is
        restored only when it's a swapmod on a swapped experiment. On an active
        experiment, I moved that code down into tbswap, which now does all
        of the virtual and physical state restore before it does the reswap back
        to the original experiment. Just for kicks, it's also done if tbswap
        decides to swap the experiment cause of a fatal error.
      
        Cleanups: I changed $NoRecover to $CanRecover. My feeble brain cannot
        deal with !$NoRecover. I know, two knots make a wright for most people.
      
        Renderer: I was annoyed by the fact that we rerun the renderer on a
        failed swapmod. The original reason is that the renderer runs in the
        background and so vis_nodes cannot be saved with the rest of the virtual
        state tables cause the renderer might still be running when the user
        fires off the swapmod. Well, the hell with that. We lock the vis_nodes
        table anyway in the renderer during update, so we are certain to get a
        consistent snapshot. We store the renderer pid in the experiments table,
        so if the renderer was running, just fire off another one; mostly this is
        not going to happen. In addition, tbprerun no longer starts a new
        renderer when doing the swapmod; I start the new renderer later after
        swapmod succeeds. I might end up tweaking this a bit depending on what
        people notice as being different.
      
      * Termination changes to batchexp and swapexp: I've rearranged the
        termination code using an END block so that any uncontrolled exit from
        either batchexp or swapexp will go through the cleanup code, and
        hopefully insert a stats record, as well as not leave the experiment in
        some inbetween state. I've set the max DB retry count to zero in both
        cases, which means infinite retry. I've also added SIGTERM handlers to
        both so that again, we can kill a hung batch/swap and have it clean up
        things more or less. Note that END blocks are not run when a signal
        causes the program to die; you have to catch it and then die() so that
        the END block is executed.
      
        Eventually, we need to clean up the various libraries so that we do not
        use DBQueryFatal(), but rather use DBQueryWarn(), and look for failure.
        Ditto for event system interface.
      9f4edbba
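The END block and SIGTERM handling described in 9f4edbba has a close analogue in Python, where `atexit` handlers are likewise skipped when an unhandled signal kills the process. The sketch below shows the same pattern (catch the signal, then exit normally so cleanup runs); the function names and the cleanup body are placeholders, not the batchexp/swapexp code.

```python
import atexit
import signal
import sys


def cleanup():
    # Placeholder for the real work: insert a stats record and make sure
    # the experiment is not left in an in-between state.
    print("cleanup ran")


def on_sigterm(signum, frame):
    # Convert the signal into a normal interpreter exit. Without this,
    # the process dies immediately and atexit handlers never run.
    sys.exit(128 + signum)


def install_handlers():
    atexit.register(cleanup)
    signal.signal(signal.SIGTERM, on_sigterm)
```

Calling `install_handlers()` early in the main program means that killing a hung process with SIGTERM still goes through `cleanup()`.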
    • Leigh B. Stoller's avatar
      Set $libdb::DBQUERY_MAXTRIES = 0, which means infinite retry. · 719a65c4
      Leigh B. Stoller authored
      That will show the devil who means business. Right on.
      719a65c4
  11. 28 Jul, 2004 3 commits
    • Leigh B. Stoller's avatar
      Fix merge error in last revision. · 55575967
      Leigh B. Stoller authored
      55575967
    • Leigh B. Stoller's avatar
      Fix rather serious indexing bug that was causing experiment indices · 95ad01c1
      Leigh B. Stoller authored
      to be reused if the DB is dropped and recreated, since when that
      happens, auto_increment history is lost and it will go back to using
      the latest highest index in the table. Usually not a problem, but
      since we cross index three other tables using the experiment index,
      this causes quite a bit of grief.
      
      So, my solution is to do my own auto_increment using the
      experiment_stats table (locked of course), which we never delete
      entries from without deleting all entries from the other cross
      referenced tables.
      
          DBQueryFatal("select MAX(exptidx) from experiment_stats");
      
      I also added a sanity check to make sure the new index is not
      currently in use in any of the tables. I also cleaned up the
      error path when something goes wrong.
      95ad01c1
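The hand-rolled auto_increment from 95ad01c1 can be sketched with an in-memory database. This uses sqlite3 purely so the example is self-contained; the `experiments` table, its `idx` column, and the function name are assumptions, and a transaction stands in for the table lock mentioned in the commit.

```python
import sqlite3


def next_experiment_index(conn, in_use_tables=("experiments",)):
    """Allocate MAX(exptidx) + 1 from experiment_stats, after checking that
    the new index is not already present in the cross-indexed tables."""
    with conn:  # commits on success, rolls back if the sanity check fails
        (highest,) = conn.execute(
            "SELECT MAX(exptidx) FROM experiment_stats"
        ).fetchone()
        idx = (highest or 0) + 1
        for table in in_use_tables:
            # Table names come from a fixed tuple, never from user input.
            row = conn.execute(
                f"SELECT 1 FROM {table} WHERE idx = ?", (idx,)
            ).fetchone()
            if row is not None:
                raise RuntimeError(f"index {idx} already in use in {table}")
        conn.execute("INSERT INTO experiment_stats (exptidx) VALUES (?)", (idx,))
    return idx
```

Because rows are never deleted from the stats table on their own, its maximum keeps growing even when the live experiment rows are gone, so an index is not handed out twice.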
    • Leigh B. Stoller's avatar
      Redo savelogs so that it works again. Rather than copying log files via · d92a0489
      Leigh B. Stoller authored
      NFS, convert to using a proxy that runs on ops, which does the copying
      locally.
      d92a0489
  12. 27 Jul, 2004 1 commit
  13. 26 Jul, 2004 1 commit
    • Leigh B. Stoller's avatar
      Okay, let's clear up some confusion when swapmod fails and 1) the · fdac8b89
      Leigh B. Stoller authored
      experiment is swapped or 2) the experiment is completely terminated.
      In these cases, let's put explicit swapout/destroy events into
      testbed_stats so that the record is not confused by experiments that
      appear to start when they are still running. This really throws off
      the summary stats web page!
      fdac8b89
  14. 22 Jul, 2004 1 commit
  15. 20 Jul, 2004 1 commit
  16. 19 Jul, 2004 3 commits
    • Leigh B. Stoller's avatar
      If no matching rows, return () instead of None so that it looks like · a7f6662e
      Leigh B. Stoller authored
      a query that succeeded but did not return any matches.
      a7f6662e
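The return convention in a7f6662e can be shown with a tiny sketch. The function name and the `error` parameter are hypothetical, and reading `None` as "the query failed" is an assumption drawn from the wording of the commit message.

```python
def query_result(rows, error=None):
    """Normalize a query outcome.

    None is reserved for a failed query (assumption); a successful query
    with no matching rows yields an empty tuple.
    """
    if error is not None:
        return None
    return tuple(rows) if rows else ()
```

Callers can then loop over the result of a successful query without a special case for "no rows", and test `is None` only when they care about failure.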
    • Kirk Webb's avatar
      · 03921d0e
      Kirk Webb authored
      A bunch of plab interface updates that I've worked on over the last while.
      Most significant is the revamped renewal code that tries to push the leases
      out to the policy defined maximum of two months during each iteration through
      the plabrenewd daemon loop.
      
      - added python lib code to get SiteVars
      - Fixed up comments to reflect current code operation
      - revamped renewal code (again)
        - changed all times to UTC for consistency
        - removed node-level renew invocation in favor of slice-level
          - if backend module requires node-level renewals, it must handle them
            itself in the slice-level function
          - better reporting
      - set admin bit if creating svc slice
        - other updates to ensure admin bit is preserved
      - update rootball handling function naming
      - updated tryXmlRpcCmd() to accept two new sets of strings, and a callback
        function.  The strings represent Faults that either 1) indicate success,
        or 2) indicate failure.  The callback is another optional error handling
        method, allowing the caller to decide how to treat individual faults as
        they see fit.
      - updated the backend module code to take advantage of the new string
        match status identifiers in tryXmlRpcCmd()
      - completely revamped slice renewal code in mod_PLC backend
        - compare against real lease expiration data gathered directly from PLC;
          we used to just infer it from our originally requested lease length
        - warn when our notion of expiration doesn't match PLC's
      - added agent caching and lease expiration info caching to mod_PLC
        backend.
      03921d0e
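The tryXmlRpcCmd() change in 03921d0e (two sets of fault strings plus an optional callback) could look roughly like the sketch below. The real signature is not shown in this log, so the function name, parameter names, retry count, and the convention that a truthy callback result means "give up" are all assumptions.

```python
import xmlrpc.client


def try_xmlrpc_cmd(cmd, ok_strs=(), fail_strs=(), callback=None, max_tries=3):
    """Run cmd(), classifying XML-RPC faults by their text.

    ok_strs:   fault substrings that actually indicate success
    fail_strs: fault substrings that indicate failure (no retry)
    callback:  optional function(fault) -> bool; True means give up
    max_tries: number of attempts, must be at least 1
    """
    last_fault = None
    for _ in range(max_tries):
        try:
            return cmd()
        except xmlrpc.client.Fault as fault:
            last_fault = fault
            text = fault.faultString
            if any(s in text for s in ok_strs):
                return None  # the fault says the work is already done
            if any(s in text for s in fail_strs):
                raise  # known failure: retrying will not help
            if callback is not None and callback(fault):
                raise  # caller decided this fault is fatal
    raise last_fault
```

A backend module could then pass a string such as `"already exists"` in `ok_strs` so that repeating an operation is treated as success rather than retried.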
    • Robert Ricci's avatar
      Explicitly make the feedback logfile group-writable so that others in · c7f382e7
      Robert Ricci authored
      the project can swap/modify experiments.
      c7f382e7
  17. 15 Jul, 2004 2 commits
    • Leigh B. Stoller's avatar
      Couple of minor tweaks to make sure that experiment state events · d1a35ea9
      Leigh B. Stoller authored
      get sent properly; need to call TBdbfork(), and add a couple more
      event sends in libdb.
      d1a35ea9
    • Leigh B. Stoller's avatar
      Overview: Add Event Groups: · ed964507
      Leigh B. Stoller authored
      	set g1 [new EventGroup $ns]
      	$g1 add  $link0 $link1
      	$ns at 60.0 "$g1 down"
      
      See the new advanced tutorial section on event groups for a better
      example.
      
      Changed tbreport to dump the event groups table when in summary mode.
      At the same time, I changed tbreport to use the recently added
      virt_lans:vnode and ip slots, deprecating virt_nodes:ips in one more
      place. I also changed the web interface to always dump the event and
      event group summaries.
      
      The parser gets a new file (event.tcl), and the "at" method deals with
      event group events by expanding them inline into individual events
      sent to each member. For some agents, this is unavoidable; traffic
      generators get the initial params in the event, so it is not possible
      to send a single event to all members of the group. Same goes for
      program objects, although program objects do default to the initial
      command now, at least on new images.
      
      Changed the event scheduler to load the event groups table. The
      current operation is that the scheduler expands events sent to a
      group, into a set of distinct events sent to each member of the
      group. At some point we probably want to optimize this by telling the
      agents (running on the nodes) what groups they are members of.
      
      Other News: Added a "mustdelay" slot to the virt_lans table so the
      parser can tell assign_wrapper that a link needs to be delayed, say if
      there are events or if the link is red/gred. Previously,
      assign_wrapper tried to figure this out by looking at the event list,
      etc. I have removed that code; see database-migrate for instructions
      on how to initialize this slot in existing experiments. assign_wrapper
      is free to ignore or insert delays anyway, but having the parser do
      this makes more sense.
      
      I also made some "rename" changes to the parser wrt queues and lans
      and links. Not really necessary, but I got sidetracked (for several
      hours!) trying to understand that rename stuff a little better, and
      now I do.
      ed964507
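The group expansion that ed964507 describes for the scheduler (one event sent to a group becomes a distinct event per member) can be sketched as follows. The tuple layout and function name are hypothetical; only the expansion behavior comes from the commit message.

```python
def expand_group_events(events, groups):
    """Expand events addressed to a group into one event per member.

    events: list of (time, target, action) tuples
    groups: dict mapping a group name to its list of members
    Targets that are not group names pass through unchanged.
    """
    expanded = []
    for when, target, action in events:
        for member in groups.get(target, [target]):
            expanded.append((when, member, action))
    return expanded
```

With `groups = {"g1": ["link0", "link1"]}`, the `$ns at 60.0 "$g1 down"` example above corresponds to two events at time 60.0, one for `link0` and one for `link1`.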
  18. 13 Jul, 2004 4 commits