1. 08 Feb, 2002 1 commit
    • Big round of image/osid changes. This is the first cut (final cut?) at · a73e627e
      Leigh B. Stoller authored
      supporting autocreating and autoloading images. The imageid form now
      sports a field to specify a nodeid to create the image from; if set,
      the backend create_image script is invoked. That's the easy part.
      Slightly harder is autoloading images based on the osid specified in
      the NS file. To support this, I have added a new DB table called
      osidtoimageid, which holds the mapping from osid/pctype to imageid.
      When users create images, they must specify what node types that image
      is good for. Obviously, the mappings have to be unique or it would be
      impossible to figure out which image to load! Anyway, once that image
      mapping is in place and the image created, the user can specify that
      ID in the NS file. I've changed os_setup to look for IDs that are not
      loaded, and to try to find one in the osidtoimageid table. If found,
      it invokes os_load. To keep things running in parallel as much as
      possible, os_setup issues all the loads/reboots (could be more than a
      single set of loads if multiple IDs are in the NS file) at once, and
      waits for all the children to exit. I've hacked up os_load a bit to
      try to be more robust in the face of PXE failures, which still happen
      and are rather troublesome. Need an event system!
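A minimal sketch of the lookup and parallel dispatch described above, written in Python rather than the language of the actual scripts. The table contents, the names `find_image` and `load_all`, and the stand-in `fake_os_load` are all hypothetical. The real os_setup waits for child processes; the thread pool here only approximates that.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical contents of the osidtoimageid table: (osid, pctype) -> imageid.
# Keying on the pair is what makes each mapping unique.
OSID_TO_IMAGEID = {
    ("my-FBSD", "pc600"): "my-FBSD-img",
    ("my-FBSD", "pc850"): "my-FBSD850-img",
}


def find_image(osid, pctype, table=OSID_TO_IMAGEID):
    """Return the imageid for an osid on a node type, or None if unmapped."""
    return table.get((osid, pctype))


def fake_os_load(imageid, nodes):
    """Stand-in for invoking os_load; reports what it would have loaded."""
    return (imageid, sorted(nodes))


def load_all(requests, loader=fake_os_load):
    """Issue one load per image, all at once, then wait for every one.

    requests: iterable of (node, osid, pctype) for nodes whose OS is not loaded.
    """
    by_image = {}
    for node, osid, pctype in requests:
        imageid = find_image(osid, pctype)
        if imageid is None:
            raise LookupError(f"no image maps {osid} onto {pctype}")
        by_image.setdefault(imageid, []).append(node)
    if not by_image:
        return []
    with ThreadPoolExecutor(max_workers=len(by_image)) as pool:
        futures = [pool.submit(loader, img, nodes)
                   for img, nodes in by_image.items()]
        # result() blocks, so this returns only after every load has finished.
        return [f.result() for f in futures]
```

Grouping by imageid first means two node types that need different images for the same osid end up in separate load sets, which is the "more than a single set of loads" case.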
      
      Contained in this revision are unrelated changes to make the OS and
      Image IDs per-project unique instead of globally unique, since that's a
      pain for the users. This turns out to be very messy, since underneath
      we do not want to pass around pid/ID in all the various places it's
      used. Rather, I create a globally unique name and have extended the OS
      and Image tables to include pid/name/ID. The user selects pid/name, and
      I create the globally unique ID. For the most part this is invisible
      throughout the system, except where we interface with the user, say in
      the web pages; the user should see his/her chosen name where possible,
      and should invoke scripts (os_load, create_image, etc.) using that
      name, not the internal ID. Also, in the front end the NS file should
      use the user's chosen name, not the ID. All in all, this accounted for
      a number of annoying changes and some special cases that are unavoidable.
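The pid/name to internal ID translation could look roughly like the sketch below. The commit does not say how the globally unique ID is derived, so the `pid-name` scheme, the numeric suffix on collision, and the `NameRegistry` class itself are assumptions made purely for illustration.

```python
class NameRegistry:
    """Maps user-visible (pid, name) pairs to globally unique internal IDs."""

    def __init__(self):
        self._id_by_name = {}   # (pid, name) -> internal ID
        self._name_by_id = {}   # internal ID -> (pid, name)

    def create(self, pid, name):
        """Register a per-project name and return its internal ID."""
        if (pid, name) in self._id_by_name:
            raise ValueError(f"{pid}/{name} already exists")
        base = f"{pid}-{name}"          # assumed naming scheme
        internal, n = base, 1
        while internal in self._name_by_id:
            # "a-b"/"c" and "a"/"b-c" would collide, so add a suffix.
            n += 1
            internal = f"{base}-{n}"
        self._id_by_name[(pid, name)] = internal
        self._name_by_id[internal] = (pid, name)
        return internal

    def to_internal(self, pid, name):
        """User-facing edge (web page, NS file, script args) -> internal ID."""
        return self._id_by_name[(pid, name)]

    def to_user(self, internal):
        """Internal ID -> the (pid, name) the user should be shown."""
        return self._name_by_id[internal]
```

Only the code at the user-facing edges would call `to_internal` and `to_user`; everything underneath keeps passing the single internal ID around.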
  2. 17 Jan, 2002 1 commit
  3. 07 Jan, 2002 1 commit
  4. 05 Dec, 2001 1 commit
    • Even *more* inventive ways to avoid real work; Add DB table to hold · c884cd89
      Leigh B. Stoller authored
      extra unix groups (unixgroup_membership) for special local users that
      need more groups than just their project membership (e.g., flux,
      wheel). In mkacct-ctrl, no longer use the admin bit to determine extra
      groups (which were hardwired in), but get the extra group list from
      the DB. This applies to accounts on boss/users; experimental nodes
      still use the admin bit (via tmcd) to get wheel added to the group
      set. Might be worth doing the same there at some point.
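As a rough illustration of the new behavior, the group list for an account can be thought of as project groups plus whatever extra rows exist for that user. The function name, the dictionary layout, and the sample user below are hypothetical; the real unixgroup_membership schema is not shown in this commit.

```python
def unix_groups(uid, project_groups, extra_groups):
    """Return the ordered group list for uid: project groups, then extras.

    project_groups: uid -> list of project group names.
    extra_groups:   uid -> list of extra group names (standing in for rows
                    from the unixgroup_membership table; layout assumed).
    """
    groups = []
    candidates = list(project_groups.get(uid, [])) + list(extra_groups.get(uid, []))
    for gid in candidates:
        if gid not in groups:   # never list the same group twice
            groups.append(gid)
    return groups
```

The admin bit plays no part here, which is the point of the change: extra groups come from data rather than from a hardwired list.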
  5. 30 Nov, 2001 1 commit
  6. 06 Nov, 2001 1 commit
  7. 05 Nov, 2001 1 commit
    • Changes to node control (web page). Added a backend script to do this · f9cfddd4
      Leigh B. Stoller authored
      stuff so that the web page did not need to do anything except display
      and form processing. Add tbsetup/node_control for backend so that it
      can be called from the command line too. The virt_nodes table is also
      updated (for those values that have virt_nodes equivalents), and this
      mostly implies that changes can be applied only to swapped in
      experiments since we use the reserved table to map pcXXX to its vname
      so that the virt_nodes table can be updated. It is an easy extension
      to allow changes based on the pid/eid/vname, but I do not see a reason
      to support this ability yet. Note usage:
      
          Usage: node_control name=value [name=value ...] node [node ...]
                 node_control -e pid,eid name=value [name=value ...]
                 node_control -l
          For multiword values, use name='word0 ... wordN'
          Use -l to get a list of operational parameters you can change.
          Use -e to change parameters of all nodes in an experiment.
      
          {824} stoller$ /build/testbed/install//bin/node_control -l
            next_boot_osid            - (administrators only)
            startup_command
            bios_version              - (administrators only)
            rpms                      - (multiple options allowed)
            default_boot_cmdline
            default_boot_path
            default_boot_osid
            next_pxe_boot_path        - (administrators only)
            tarfiles                  - (multiple options allowed)
            pxe_boot_path             - (administrators only)
            next_boot_cmdline         - (administrators only)
            deltas                    - (multiple options allowed)
            next_boot_path            - (administrators only)
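The first usage form (name=value pairs followed by node names) can be sketched as below. The parameter names and the administrators-only split are taken from the `-l` listing above; the function name, the error types, and the omission of the `-e` and `-l` forms are choices made for this illustration, and the meaning of "multiple options allowed" is not modeled.

```python
ADMIN_ONLY = {
    "next_boot_osid", "bios_version", "next_pxe_boot_path",
    "pxe_boot_path", "next_boot_cmdline", "next_boot_path",
}
USER_SETTABLE = {
    "startup_command", "rpms", "default_boot_cmdline", "default_boot_path",
    "default_boot_osid", "tarfiles", "deltas",
}


def parse_node_control(argv, is_admin=False):
    """Split argv into ({name: value}, [nodes]).

    Leading arguments containing '=' are settings; the rest are node names.
    """
    settings = {}
    i = 0
    while i < len(argv) and "=" in argv[i]:
        # Split on the first '=' only, so values may themselves contain '='.
        name, value = argv[i].split("=", 1)
        if name not in ADMIN_ONLY | USER_SETTABLE:
            raise ValueError(f"unknown parameter: {name}")
        if name in ADMIN_ONLY and not is_admin:
            raise PermissionError(f"{name} is for administrators only")
        settings[name] = value
        i += 1
    nodes = argv[i:]
    if not settings or not nodes:
        raise ValueError("need at least one name=value and one node")
    return settings, nodes
```

A multiword value arrives as a single argv element because the shell has already handled the quotes, so no extra quoting logic is needed here.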
  8. 30 Oct, 2001 1 commit
  9. 29 Oct, 2001 1 commit
    • A bunch of lastlogin changes! The user and experiment information · 4658545e
      Leigh B. Stoller authored
      pages now show the lastlogin info that is gathered from sshd syslog
      reporting to users. That info is parsed by security/genlastlog.c, and
      entered into the DB in the nodeuidlastlogin and uidnodelastlogin
      tables. If not obvious from the names, for each user we want the last time
      they logged in anyplace, and for each node we want the last time anyone
      logged into it. The latter is obviously more useful for scheduling
      purposes. All of the various images have new /etc/syslog.conf files,
      and the 6.2 got new sshd_configs (all cvsup'ed with kill -HUP). There
      is an entry in boss:/etc/crontab and users:/etc/syslog.conf. All of
      this is described in greater detail in security/genlastlog.c.
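The two views described above (last login per user, last login per node) can be sketched as a single pass over login events. This is not a port of genlastlog.c: the event tuples, the function name, and the in-memory dictionaries standing in for the two DB tables are assumptions.

```python
def last_logins(events):
    """Reduce login events to the two 'last login' views.

    events: iterable of (uid, node, timestamp); timestamps must be comparable.
    Returns (by_user, by_node) where
      by_user[uid]  = (node, timestamp) of that user's latest login anywhere
      by_node[node] = (uid, timestamp) of the latest login by anyone
    """
    by_user = {}
    by_node = {}
    for uid, node, when in events:
        if uid not in by_user or when > by_user[uid][1]:
            by_user[uid] = (node, when)
        if node not in by_node or when > by_node[node][1]:
            by_node[node] = (uid, when)
    return by_user, by_node
```

The per-node view is the one a scheduler would consult, since it shows how long a node has gone without anyone logging in.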
  10. 25 Oct, 2001 1 commit
  11. 24 Oct, 2001 1 commit
    • Largish rework of nfree. Started out that I just wanted to map the · 895a44f6
      Leigh B. Stoller authored
      default OSID from the node_types table, to a specific OSID from the
      partition table on the actual node. This is to avoid setting the boot
      OSID to RHL_STD when the node is released, which causes a boot
      failure. Okay, so I added a library routine to do this (yanked out of
      os_setup where I did the code originally). This would solve most of
      the problems, except where there was no OS loaded that would satisfy
      the mapping, in which case the user must have done an os_load, and now
      that auto schedules a reload. Anyway, seemed like this should work.
      Ha! MySQL locking is downright dumb; all tables used within a lock
      region must be locked. nfree was already locking 9 tables, and in
      order to call out to library routines (which might use anything) I
      would have to lock the world, which is not actually possible anyway.
      Why all this locking in nfree in the first place? The idea is that
      there is a race between releasing the node from reserved, and cleaning
      up all those tables (interfaces, delays, nodes, etc). We don't want to
      free a node, and have it get allocated to another experiment before
      the cleanup is done, since that would mess up the state of the node.
      The solution (albeit a crufty one) was to lock just the reserved table
      (which guards against multiple people trying to nfree the same node at
      the same time) and switch the reservation out of the pid,eid and into
      a holding reservation. This effectively removes the node from the
      user's control, but keeps it reserved. Then I unlock the reserved
      table. With that done, I can clean up all those tables without any
      locking, since the node is still reserved. After cleanup, I can either
      delete the reservation, or move it to the next reserve or reload
      reservation if those were pending. No locking is needed at this point
      since single table changes are atomic (and nalloc locks reserved
      anyway). Okay, so now we sit back and see if this was a good idea.
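The holding-reservation trick can be sketched as follows. An in-memory dictionary and a `threading.Lock` stand in for the reserved table and its table lock; the `HOLDING` value, the `Reserved` class, and the function signature are hypothetical and only illustrate the ordering of steps.

```python
import threading

# Hypothetical (pid, eid) used to park a node while it is being cleaned up.
HOLDING = ("HOLDING", "HOLDING")


class Reserved:
    """Stand-in for the reserved table: node -> (pid, eid)."""

    def __init__(self):
        self.lock = threading.Lock()   # plays the role of locking the table
        self.rows = {}


def nfree(reserved, node, pid, eid, cleanup, next_reservation=None):
    """Release node from pid/eid without exposing it mid-cleanup."""
    # Step 1: under the lock, move the node into the holding reservation.
    with reserved.lock:
        if reserved.rows.get(node) != (pid, eid):
            raise LookupError(f"{node} is not reserved to {pid}/{eid}")
        reserved.rows[node] = HOLDING
    # Step 2: lock released. The node is out of the user's control but still
    # reserved, so the other tables can be cleaned up without locking them.
    cleanup(node)
    # Step 3: a single-row change, so no lock is taken here.
    if next_reservation is None:
        del reserved.rows[node]
    else:
        reserved.rows[node] = next_reservation
```

Because `cleanup` runs outside the lock, it is free to call library routines that touch any table, which was the original obstacle.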
  12. 22 Oct, 2001 1 commit
    • Add -e pid,eid option to sched_reload to make it easier to schedule · 6adf504b
      Leigh B. Stoller authored
      reloads for nodes in an experiment.
      Change os_load to schedule a default image reload whenever a mereuser
      loads an image that is not the default image for that node type.
      Add some support stuff in libdb (TBSetSchedReload) and some constant
      definitions for sched_reload and for nodelog.
  13. 20 Oct, 2001 1 commit
  14. 17 Oct, 2001 1 commit
    • Rework of the batch experiment code. Unified it with the immediate · 4d420b21
      Leigh B. Stoller authored
      experiment code. No longer uses another table. Rather, the experiment
      record contains a couple of extra fields for the batch system. Also
      combined some of the backend code (no longer a killbatch script).
      Also added scriptable experiments; the batchexp program in the bin
      directory can start an experiment from the command line, and in fact
      is used from the web page for both batch experiments and immediate
      experiments (-i option). All of the DB code that was in the web
      interfaces was moved to batchexp.
  15. 16 Oct, 2001 1 commit
  16. 11 Oct, 2001 1 commit
  17. 28 Sep, 2001 1 commit
    • Interface change: · f870a7e9
      Leigh B. Stoller authored
          Usage: os_load [-s | -w] [-r] [-i <imageid>] <node> [node ...]
          Usage: sched_reload [-f | -p] [-r] [-i <imageid>] <node> [node ...]
      
      The imageid is now an optional argument. After continually forgetting
      what imageid to use, or just plain forgetting the argument, and having
      it try to load imageid pc53 on pcXX, I decided this interface was
      bogus. With no imageid, select the default imageid for each node
      provided. This is actually convenient since you can load multiple
      types of nodes in one shot.
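A small sketch of why the optional imageid makes mixed-type loads convenient. The function name, the two lookup dictionaries, and the sample node, type, and image names are hypothetical; only the rule itself (explicit imageid wins, otherwise each node gets its type's default) comes from the text above.

```python
def plan_loads(nodes, node_type, default_image, imageid=None):
    """Return {imageid: [nodes]} describing what would be loaded where.

    node_type:     node -> type name
    default_image: type name -> default imageid for that type
    imageid:       explicit image for every node, or None to use defaults
    """
    plan = {}
    for node in nodes:
        if imageid is not None:
            chosen = imageid
        else:
            chosen = default_image[node_type[node]]
        plan.setdefault(chosen, []).append(node)
    return plan
```

With no imageid, a single invocation naming nodes of two different types yields two load sets, one per default image.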
  18. 26 Sep, 2001 1 commit
  19. 19 Sep, 2001 1 commit
  20. 17 Sep, 2001 2 commits
  21. 11 Sep, 2001 1 commit
  22. 29 Aug, 2001 1 commit
  23. 28 Aug, 2001 1 commit
    • Cleanup of Chris' TB scripts. Cosmetic in principle, but reworked · c874636d
      Leigh B. Stoller authored
      to use the DB library access routines, which also changed in response
      to what the tb scripts needed. Added some functions and more constants.
      Removed the -nologfile option from all the scripts (startexp and
      endexp too), since there is no reason for these scripts to worry about
      log files. That's handled in the wrappers. Tested with the testsuite
      and live in my own tree.
  24. 22 Aug, 2001 1 commit
  25. 14 Aug, 2001 1 commit
  26. 01 Aug, 2001 1 commit
    • An attempt at making image creation an easy/automatic operation. HA! · 27f26d99
      Leigh B. Stoller authored
      This uses the pxe booted freebsd kernel and MFS. In addition, I use
      the standard testbed mechanism of specifying a startup command to
      run, which will do the imagezip to NFS mounted /proj/<pid>/.... The
      controlling script on paper sets up the database, reboots the node,
      and then waits for the startstatus to change. Then it resets the DB
      and reboots the node so that it returns back to its normal OS. The
      format of operation is:
      
      	create_image <node> <imageid> <filename>
      
      Node must be under the user's control of course. The filename must
      reside in the node's project (/proj/<pid>/whatever) since that's the
      directory that is mounted by the testbed config software when the
      machine boots. The imageid already exists in the DB, and is used to
      determine what part of the disk to zip up (say, using the slice option
      to the zipper). Since this operation is rather time consuming, it does
      the usual trick of going to background and sending email status later.
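The filename restriction and the order of operations can be sketched as below. The helper names and the `steps` object with its five methods are hypothetical; they stand in for whatever the real script does to update the DB, reboot the node, and wait on startstatus, and the background/email behavior is left out.

```python
import posixpath


def image_path_ok(filename, pid):
    """True if filename resolves to somewhere under /proj/<pid>/."""
    root = f"/proj/{pid}"
    # normpath collapses ".." so /proj/<pid>/../other is rejected.
    return posixpath.normpath(filename).startswith(root + "/")


def create_image(node, imageid, filename, pid, steps):
    """Run the image-creation sequence in the order described above.

    steps: any object providing setup_db, reboot, wait_for_startstatus
           and reset_db (all hypothetical names).
    """
    if not image_path_ok(filename, pid):
        raise ValueError(f"{filename} is not under /proj/{pid}/")
    steps.setup_db(node, imageid, filename)   # arrange the MFS boot and zip command
    steps.reboot(node)                        # node boots and writes the image
    steps.wait_for_startstatus(node)          # block until the startup command reports
    steps.reset_db(node)                      # restore normal boot settings
    steps.reboot(node)                        # back to the node's normal OS
```

Checking the path before touching the DB avoids rebooting a node only to find that the target directory is not mounted on it.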
  27. 25 Jul, 2001 1 commit
  28. 23 Jul, 2001 1 commit
  29. 17 Jul, 2001 1 commit
    • Some minor changes, plus endless hours of PERL confusion. Anyway, add · d1c90991
      Leigh B. Stoller authored
      a bootstatus field to the nodes table. os_setup sets this to one of
      okay, failed, unknown. This is to be used with the still to be defined
      method of specifying certain nodes that can fail reboot on experiment
      creation. Right now sharks are wired to this, and this information is
      presented in the web page. It's also essential for the batch system,
      which needs to consider nodes that failed to reboot, or else batch
      experiments would never end. Might still need a way for an experiment
      to tell the batch system it's done though.
  30. 13 Jul, 2001 1 commit
  31. 05 Jun, 2001 1 commit
  32. 31 May, 2001 1 commit
  33. 29 May, 2001 1 commit
  34. 25 May, 2001 1 commit
    • New libdb module. A library of some useful routines that will · 7e4ad150
      Leigh B. Stoller authored
      hopefully get bigger and reduce the amount of typing that we all
      do. I hacked up sched_reload and os_load to use it. Pretty simple to
      start with.
      
      I'm not planning to go much further on this until we sync up with the
      dbtoir branch since it will just create needless branch merge errors.