1. 20 Jan, 2005 1 commit
  2. 18 Jan, 2005 1 commit
    • Here is a checkpoint of the admission control stuff I have been working on. · 54f55585
      Leigh Stoller authored
      The last part is the stuff to hook it in from assign_wrapper, and some
      additional support in assign that Rob is adding for me. This comment is
      from the top of new file db/libadminctrl.pm.in and describes everything in
      detail.
      
      # Admission control policies. These are the ones I could think of, although
      # not all of these are implemented.
      #
      #  * Number of experiments per type/class (only one expt using robots).
      #
      #  * Number of experiments per project
      #  * Number of experiments per subgroup
      #  * Number of experiments per user
      #
      #  * Number of nodes per project      (nodes really means pc testnodes)
      #  * Number of nodes per subgroup
      #  * Number of nodes per user
      #
      #  * Number of nodes of a class per project
      #  * Number of nodes of a class per group
      #  * Number of nodes of a class per user
      #
      #  * Number of nodes of a type per project
      #  * Number of nodes of a type per group
      #  * Number of nodes of a type per user
      #
      #  * Number of nodes with attribute(s) per project
      #  * Number of nodes with attribute(s) per group
      #  * Number of nodes with attribute(s) per user
      #
      # So we have group (pid/gid) policies and user policies. These are stored
      # into two different tables, group_policies and user_policies, indexed in
      # the obvious manner. Each row of the table defines a count (experiments,
      # nodes, etc) and a type of thing being counted (experiments, nodes, types,
      # classes, etc). When we test for admission, we look for each matching row
      # and test each condition. All conditions must pass. No conditions means a
      # pass. There is also some "auxdata" which holds extra information needed
      # for the policy (say, the type of node being restricted).
      #
      # The user_policies table:
      #
      #      uid:     a uid
      #   policy:     'experiments', 'nodes', 'type', 'class', 'attribute'
      #    count:     a number
      #  auxdata:     a string (optional)
      #
      # Example: A user policy of ('mike', 'nodes', 10) says that poor mike is
      # not allowed to have more than 10 nodes at a time, while ('mike', 'type',
      # 10, 'pc850') says that mike cannot allocate more than 10 pc850s.
      #
      # The group_policies table:
      #
      #      pid:     a pid
      #      gid:     a gid
      #   policy:     'experiments', 'nodes', 'type', 'class', 'attribute'
      #    count:     a number
      #  auxdata:     a string (optional)
      #
      # Example: A project policy of ('testbed', 'testbed', 'experiments', 10)
      # says that the testbed project may not have more than 10 experiments
      # swapped in at a time, while ('testbed', 'TG1', 'nodes', 10) says that the
      # TG1 subgroup of the testbed project may not use more than 10 nodes at
      # a time.
      #
      # In addition to group and user policies (which are policies that apply to
      # specific users/projects/subgroups), we also need policies that apply to
      # all users/projects/subgroups (ie: do not want to specify a particular
      # restriction for every user!). To indicate such a policy, we use a special
      # tag in the tables (for the user or pid/gid):
      #
      #      '+'  -  The policy applies to all users (or project/groups).
      #
      # Example: ('+','experiments',10) says that no user may have more than 10
      # experiments swapped in at a time. The rule overrides anything more
      # specific (say a particular user is restricted to 20 experiments; the above
      # rule overrides that, and the user, like all users, is restricted to 10).
      #
      # Sometimes, you want one of these special rules to apply to everyone, but
      # *allow* it to be overridden by a more specific rule. For that we use:
      #
      #      '-'  -  The policy applies to all users (or project/groups),
      #              but can be overridden by a more specific rule.
      #
      # Example: The rules:
      #
      #	('-','type',0, 'garcia')
      #       ('testbed', 'testbed', 'type', 10, 'garcia')
      #
      # says that no one is allowed to allocate garcias, unless there is a specific
      # rule that allows it; in this case the testbed project can allocate them.
      #
      # There are other global policies we would like to enforce. For example,
      # "only one experiment can be using the robot testbed." Encoding this kind
      # of policy is harder, and leads down a path that can get arbitrarily
      # complex. That path leads to ruination, and so we want to avoid it at
      # all costs.
      #
      # Instead we define a simple global policies table that applies to all
      # experiments currently active on the testbed:
      #
      #   policy:     'nodes', 'type', 'class', 'attribute'
      #     test:     'max', others I cannot think of right now ...
      #    count:     a number
      #  auxdata:     a string
      #
      # Example: A global policy of ('nodes', 'max', 10, '') says that the maximum
      # number of nodes that may be allocated across the testbed is 10. That's not
      # a very realistic policy of course, but ('type', 'max', 1, 'garcia') says
      # that a max of one garcia can be allocated across the testbed, which
      # effectively means only one experiment will be able to use them at once.
      # This is of course very weak, but I want to step back and give it some
      # more thought before I redo this part.
      #
      # Is that clear? Hope so, cause it gets more complicated. Some admission
      # control tests can be done early in the swap phase, before we really do
      # anything (before assign_wrapper). Other tests (type and class) cannot
      # be done here; only assign can figure out how an experiment is going to map
      # to physical nodes (remember virtual types too), and in that case we need
      # to tell assign what the "constraints" are and let it figure out what is
      # possible.
      #
      # So, in addition to the simple checks we can do, we also generate an array
      # to return to assign_wrapper with the maximum counts of each node type and
      # class that is limited by the policies. assign_wrapper will dump those
      # values into the ptop file so that assign can enforce those maximum values
      # regardless of what hardware is actually available to use. As per discussion
      # with Rob, that will look like:
      #
      #	set-type-limit <type> <limit>
      #
      # and assign will spit out a new type of violation that assign_wrapper will
      # parse.
      #
      # NOTES:
      #
      #  1) Admission control is skipped in admin mode; returns okay.
      #  2) Admission control is skipped when the pid is emulab-ops; returns okay.
      #  3) When calculating current usage, nodes reserved to emulab-ops are
      #     ignored.
      #  4) The sitevar "swap/use_admission_control" controls the use of admission
      #     control; defaults to 1 (on).
      #  5) The current policies can be viewed in the web interface. See
      #     https://www.emulab.net/showpolicies.php3
      #  6) The global policy stuff is weak. I plan to step back and think about it
      #     some more before redoing it, but it will tide us over for now.
      #
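The matching rules described in the comment above can be illustrated with a small sketch. This is Python rather than the Perl of db/libadminctrl.pm.in, and it is not the actual implementation: the `Policy` tuple, the function names, and the `usage` dictionary are hypothetical. It assumes that specific rows and '+' rows must all pass, and that a '-' row is skipped when a specific row exists for the same policy and auxdata. Only the user table is modeled; the `set-type-limit <type> <limit>` line format is taken from the comment.

```python
from typing import NamedTuple


class Policy(NamedTuple):
    who: str            # a uid, or the special tags '+' / '-'
    policy: str         # 'experiments', 'nodes', 'type', 'class', 'attribute'
    count: int
    auxdata: str = ""   # e.g. the node type being restricted


def applicable(policies, uid):
    """Return the rows that must be tested for this uid (assumed semantics)."""
    specific = [p for p in policies if p.who == uid]
    overridden = {(p.policy, p.auxdata) for p in specific}
    rules = list(specific)
    for p in policies:
        if p.who == "+":
            # '+' always applies, regardless of more specific rows.
            rules.append(p)
        elif p.who == "-" and (p.policy, p.auxdata) not in overridden:
            # '-' applies only when no specific row overrides it.
            rules.append(p)
    return rules


def admit(policies, uid, usage):
    """All matching conditions must pass; no matching rows means a pass.

    usage maps (policy, auxdata) to the total the user would hold.
    """
    return all(usage.get((p.policy, p.auxdata), 0) <= p.count
               for p in applicable(policies, uid))


def type_limits(policies, uid):
    """Build the per-type limit lines that would be handed to assign."""
    limits = {}
    for p in applicable(policies, uid):
        if p.policy == "type":
            limits[p.auxdata] = min(p.count, limits.get(p.auxdata, p.count))
    return [f"set-type-limit {t} {n}" for t, n in sorted(limits.items())]
```

With a '+' experiments row of 10 and a specific row of 20 for mike, a request that would leave mike with 11 experiments is rejected, which matches the override example in the comment.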
      54f55585
  3. 14 Jan, 2005 1 commit
  4. 13 Jan, 2005 1 commit
  5. 12 Jan, 2005 3 commits
    • Another little hack for Mike; Add a "lockdown" bit to the experiments · d8b17f2c
      Leigh Stoller authored
      table that will prevent an experiment from being swapped/modified. The
      toggle is on the showexp page, and the toggle is *not* admin
      over-ridable; you must turn the toggle off (and of course, you must be
      an admin to do that).
      d8b17f2c
    • regenerate · 850e3f4c
      Timothy Stack authored
      850e3f4c
    • Timothy Stack's avatar
      · f45f9c16
      Timothy Stack authored
      Fix some robot related stuff that I broke with the last checkin and add in
      some other tweaks.
      
      	* robots/primotion/garcia-pilot.cc,
      	robots/primotion/pilotClient.hh, robots/primotion/pilotClient.cc:
      	Broadcast any goto/stop commands to clients observing the robot.
      
      	* robots/rmcd/rmcd.c: Change the behavior to reorient the robot as
      	the last step in a goto, so we avoid doing unnecessary pivots.
      	Need to send an init packet to the robot so it knows who is
      	talking to it.
      
      	* robots/vmcd/vmc-client.c: Oops, supposed to use M_PI_2, not
      	M_PI, when translating from camera coords to world.
      
      	* www/telemetry.php3: Make the size of the applet a little bigger.
      
      	* www/garcia-telemetry/GarciaTelemetry.java,
      	www/garcia-telemetry/UpdateThread.java,
      	www/garcia-telemetry/main.xml: Display a log of goto/stop commands
      	sent to the robot.
      f45f9c16
  6. 11 Jan, 2005 1 commit
    • Add "obstacles" for robots to avoid. · eda7add2
      Leigh Stoller authored
      * New database table to store obstacles, in the usual coord system;
        x1,y1, is the upper left corner.
      
      * New web page to dump the entire obstacle list
      
      	https://www.emulab.net/obstacle_list.php3
      
      * New web page to dump a single obstacle, referenced by the above list
        page, and by the floormap generator.
      
      * Hack up the floormap code to add obstacles to the areamap, so that
        when you mouse over them, you get a balloon showing the description,
        and a link to the above mentioned page.
      eda7add2
  7. 10 Jan, 2005 4 commits
    • Leigh Stoller's avatar
    • A quick hack job to get the webcams onto the web interface. · d46902e1
      Leigh Stoller authored
      * Add new DB table "webcams" which holds the id of the webcam, the
        server it is attached to, and the last update time.
      
      * Add new sitevars webcam/anyone_can_view and webcam/admins_can_view.
        Should be obvious what they mean.
      
      * Add trivial script grabwebcams (invoked from cron) to grab the images
        from the servers and stash in /usr/testbed/webcams. The images are
        grabbed with scp, protected by a 5 second timeout. Fine for a couple
        of cameras.
      
      * Add web page stuff to display webcams, linked from the robot map page.
      
      Permission to view the webcams is currently admin, or in a project that is
      allowed to use a robot. We can tighten this up later as needed.
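
The "grab with a timeout" step can be sketched as follows. This is a Python illustration, not the grabwebcams script itself; the helper name is hypothetical, and the actual scp arguments are not given in the commit, so the sketch wraps an arbitrary command.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds=5.0):
    """Run cmd; return True only if it exits with status 0 within the timeout.

    On timeout, subprocess.run kills the child before raising, so a hung
    transfer does not pile up across cron invocations.
    """
    try:
        return subprocess.run(cmd, timeout=seconds, check=False).returncode == 0
    except subprocess.TimeoutExpired:
        return False


def demo():
    """Stand-ins for a fast and a hung transfer (no real scp involved)."""
    fast = run_with_timeout([sys.executable, "-c", "pass"])
    hung = run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(5)"], seconds=0.2)
    return fast, hung
```

Grabbing the cameras one after another with a fixed timeout bounds the worst-case run time at roughly the number of cameras times the timeout, which is consistent with the remark that this is fine for a couple of cameras.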
      d46902e1
    • Make it clear to the poor admin (eg, Jay) how he can recover from thinking · 6841d600
      Jay Lepreau authored
      that text will get emailed with the "postpone" menu selection.
      6841d600
    • Timothy Stack's avatar
      · 89bf0a7f
      Timothy Stack authored
      A bunch of engineering on the robot code.  I'm sure I've broken something,
      but the majority of it is done and I wanted to get a checkpoint in.
      
      	* GNUmakerules: Add rpcgen rules.
      
      	* Makeconf.in: Add PATH and host_cpu variables so
      	cross-compilation works properly.  Add JAVAC and JAR for java
      	compilation.  Add BRAINSTEM_DIR that refers to a brainstem build
      	directory to be used for the robot build.
      
      	* configure, configure.in: Prepend the arm cross-compile dir to
      	PATH.  Detect java for building applets.  Add --with-brainstem to
      	specify the brainstem build dir.  Add --enable-mezzanine to turn
      	on the mezzanine build.
      
      	* robots/GNUmakefile.in: Add client target that builds the
      	subdirs.
      
      	* robots/emc/GNUmakefile.in, robots/emc/emcd.h, robots/emc/emcd.c,
      	test_emcd.sh.in, robots/rmcd/GNUmakefile.in, robots/rmcd/rmcd.c,
      	robots/rmcd/test_rmcd.sh.in, robots/vmcd/test_vmc-client.sh.in,
      	robots/vmcd/test_vmcd.sh.in, robots/vmcd/test_vmcd2.sh,
      	robots/vmcd/test_vmcd3.sh, robots/vmcd/test_vmcd4.sh,
      	robots/vmcd/vmc-client.c, robots/vmcd/vmcd.c: Updates for the mtp
      	switch to using rpcgen.
      
      	* robots/emc/test_emcd.config: Restore missing config line.
      
      	* robots/mtp/GNUmakefile.in, robots/mtp/global_bound.java,
      	robots/mtp/mtp.h, robots/mtp/mtp.c, robots/mtp/mtp.java,
      	robots/mtp/mtp.x, robots/mtp/mtp_command_goto.java,
      	robots/mtp/mtp_command_stop.java, robots/mtp/mtp_config_rmc.java,
      	robots/mtp/mtp_config_vmc.java, robots/mtp/mtp_control.java,
      	robots/mtp/mtp_dump.c, robots/mtp/mtp_garcia_telemetry.java,
      	robots/mtp/mtp_opcode_t.java, robots/mtp/mtp_packet.java,
      	robots/mtp/mtp_payload.java, robots/mtp/mtp_recv.c,
      	robots/mtp/mtp_request_id.java,
      	robots/mtp/mtp_request_position.java,
      	robots/mtp/mtp_robot_type_t.java, robots/mtp/mtp_role_t.java,
      	robots/mtp/mtp_send.c, robots/mtp/mtp_status_t.java,
      	robots/mtp/mtp_telemetry.java, robots/mtp/mtp_update_id.java,
      	robots/mtp/mtp_update_position.java, robots/mtp/robot_config.java,
      	robots/mtp/robot_position.java, robots/mtp/test_mtp.sh: Replace
      	hand-generated stubs with xdr stubs for C and java.  Java stubs
      	were generated by "remotetea's" jrpcgen.
      
      	* robots/primotion/GNUmakefile.in,
      	robots/primotion/buttonManager.hh,
      	robots/primotion/buttonManager.cc, robots/primotion/dashboard.hh,
      	robots/primotion/dashboard.cc, robots/primotion/flash-user-led.cc,
      	robots/primotion/garcia-pilot.cc, robots/primotion/garciaUtil.hh,
      	robots/primotion/garciaUtil.cc, robots/primotion/ledManager.hh,
      	robots/primotion/ledManager.cc,
      	robots/primotion/pilotButtonCallback.hh,
      	robots/primotion/pilotButtonCallback.cc,
      	robots/primotion/pilotClient.hh, robots/primotion/pilotClient.cc,
      	robots/primotion/watch-user-button.cc,
      	robots/primotion/wheelManager.hh,
      	robots/primotion/wheelManager.cc: Replace gorobot with
      	garcia-pilot, a beefed up daemon for controlling the robots.
      	Improvements include: making use of the user LED and button to
      	give some feedback and let the wrangler run a test sequence,
      	reboot, and shutdown the robot; Logging of the battery level, how
      	often the robot has moved and for how long, and the distance
      	traveled; telemetry is sent back to emulab clients; movements are
      	now just pivot-move instead of pivot-move-pivot, since the second
      	pivot ends up being extra work most of the time; the robot will
      	move backwards to cut down on the amount of rotation; and just
      	generic cleanups to the code.
      
      	* robots/primotion/garcia.config: The configuration file currently
      	used on the garcias.
      
      	* www/GNUmakefile.in: Add garcia-telemetry subdir to the build.
      
      	* www/dbdefs.php3.in: Add TBNodeClass and TBNodeStatus functions.
      
      	* www/garcia-telemetry/Base64.java,
      	www/garcia-telemetry/GNUmakefile.in,
      	www/garcia-telemetry/GarciaTelemetry.java,
      	www/garcia-telemetry/UpdateThread.java,
      	www/garcia-telemetry/main.xml: A telemetry applet for the garcia,
      	it displays readouts for the various sensors and other bits of data
      	gathered by the garcia-pilot daemon.  Hopefully, it will make a
      	handy debugging tool.
      
      	* www/garcia-telemetry.jar, www/mtp.jar, www/oncrpc.jar,
      	www/thinlet.jar: Java jars used by the robot telemetry applet.
      
      	* www/servicepipe.php3: A slightly enhanced version of
      	ledpipe.php3 that can be used for other services, like robot
      	telemetry.
      
      	* www/shownode.php3: Add "Show Telemetry" menu item to robot
      	nodes.
      
      	* www/telemetry.php3: Telemetry page for the garcia-telemetry
      	applet.
      89bf0a7f
  8. 07 Jan, 2005 1 commit
  9. 06 Jan, 2005 1 commit
    • A bunch of boot changes. Read carefully. · 94ccc3f4
      Leigh Stoller authored
      * Add boot_errno to the nodes table so that nodes can report in a
        subcode to indicate what went wrong. At present, we do not report any
        real error codes; that is going to take some time to work out since it
        will require a bunch of changes to the boot scripts.
      
      * Add new table node_bootlogs to store logs provided by the nodes. Not
        a full console log, but a log of the tmcd client side part. We can
        make it a full log if we want though; just means mucking about with
        the boot phase a bit.
      
      * Add new state transition to NORMALv2 and PCVM state machines. "TBFAILED"
        is a new state that is sent (after TBSETUP) if a node fails somewhere in
        the tmcd client side.
      
      * Change TBNodeStateWait() to take a list of states (instead of single
        state) and an optional pass by reference parameter to return the actual
        state that the node landed in. Change all calls to TBNodeStateWait() of
        course.
      
      * Change os_setup (and libreboot in wait mode) to look for both TBFAILED
        and ISUP. If a TBFAILED event is seen, we can terminate the wait early
        and not retry os_setup on physical nodes (although still retry virtual
        nodes). The nice thing about this is that the wait should terminate much
        earlier (rather than waiting for timeout), especially for virtual nodes
        which can take a really long time when there are a couple of hundred.
      
      * Add new routines dobooterrno() and dobootlog() to tmcd. Bump version
        number and increase the buffer size to allow for the larger packets that
        a console log will generate (added MAXTMCDPACKET variable, set to 0x4000).
      
      * Add new -f option to tmcc to specify a datafile to send along as the last
        argument to tmcd. This is more pleasing than trying to send a console log
        in on the command line. For example: "tmcc -f /tmp/log BOOTLOG" will send
        a BOOTLOG command along with the contents of /tmp/log.
      
        Also close the write side of the pipe so that server sees EOF on
        read. See aside comment below.
      
      * Changes to rc.bootsetup:
           1. Use perl tricks to capture all output, duping to the console and to
              a log file in /var/emulab/logs.
           2. On any error, send a status code (boot_errno) and the bootlog to
              tmcd.
           3. Generate a TBFAILED state transition.
      
      * Changes to rc.injail:
           1. Same as rc.bootsetup, but do not send log files; that would pummel
              boss. Leave them on the physical node.
      
      * Change vnodesetup (which calls mkjail) to watch for any error and send a
        TBFAILED state transition. This should catch almost all errors, and
        dramatically reduce waiting when something fails.
      
      * Changes to rc.cdboot are essentially the same as rc.bootsetup, although a
        bootlog is sent all the time (success or failure), and I do not generate
        a boot_errno yet. Also, instead of TBFAILED, generate a PXEFAILED state
        since the CDROM is actually operating within the PXEFBSD opmode. I have
        yet to work this into the rest of the system though; waiting to get a new
        CD built and actually experiment with it.
      
      * Add new menu option and web page to display the node bootlog. We store
        only the latest bootlog, but maybe someday store more than one. Display
        boot_errno on node page.
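
The reworked wait (a list of acceptable states, with the state actually reached handed back to the caller) can be sketched like this. It is a Python illustration, not the real TBNodeStateWait(); the function name, the polling callback, and the timeout values are assumptions.

```python
import time


def node_state_wait(get_state, states, timeout=5.0, poll=0.01):
    """Poll get_state() until it returns one of `states`.

    Returns the state the node landed in, or None on timeout. Returning
    the actual state lets the caller tell ISUP from TBFAILED and stop
    waiting early on failure.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in states:
            return state
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


def should_retry(final_state, is_virtual):
    """Retry rule as described above: physical nodes are not retried after
    TBFAILED, while virtual nodes still are. The timeout case is an
    assumption: it is treated as retryable here."""
    if final_state == "ISUP":
        return False
    if final_state == "TBFAILED":
        return is_virtual
    return True
```

A caller in the style of os_setup would pass both "ISUP" and "TBFAILED" as acceptable states, so a failed node ends the wait as soon as it reports in rather than at the timeout.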
      
      Aside: I made a big mistake in the tmcd protocol; I did not envision
      passing more than a small amount of data (one fragment) and so I do not
      include a record terminator (ie: close of the write side on the client
      sends EOF) or a size field at the beginning. No big deal since small
      requests are sent in one fragment and the server sees the entire
      thing. Well, with a large console log, that will end up as multiple
      fragments, and the server will often not get the entire thing on the first
      read, and there are no subsequent reads (with no EOF or known size, it
      would block forever). Well, fixing this in a backwards compatible manner
      (for old images) was way too much pain. Instead, tmcc now closes the write
      side, and the server does subsequent reads *only* in the new dobootlog()
      routine. Note that it *is* possible to fix this in a backwards compatible
      manner, but I did not want to go down that path just yet.
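
The half-close technique described in this aside can be sketched with plain sockets. This is a Python illustration under stated assumptions, not tmcc or tmcd code: the function names and the "OK" reply are hypothetical, and a local socket pair stands in for the real connection.

```python
import socket
import threading


def send_with_eof(sock, payload):
    """Send the whole payload, then close only the write side.

    The peer sees EOF once it has read everything, while this end can
    still read a reply on the same connection.
    """
    sock.sendall(payload)
    sock.shutdown(socket.SHUT_WR)


def read_until_eof(sock, chunk=1024):
    """Keep reading until recv() returns b"" (EOF).

    A single recv() may return only the first fragment of a large
    request, which is the problem described above.
    """
    parts = []
    while True:
        data = sock.recv(chunk)
        if not data:
            break
        parts.append(data)
    return b"".join(parts)


def demo(payload):
    """Round trip over a local socket pair; returns (received, reply)."""
    client, server = socket.socketpair()
    with client, server:
        sender = threading.Thread(target=send_with_eof, args=(client, payload))
        sender.start()
        received = read_until_eof(server)
        server.sendall(b"OK")
        sender.join()
        reply = client.recv(16)
    return received, reply
```

The alternative mentioned in the aside, a size field at the start of the request, would let the server know how much to read without relying on EOF, at the cost of changing the wire format for old clients.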
      94ccc3f4
  10. 03 Jan, 2005 5 commits
  11. 27 Dec, 2004 1 commit
  12. 22 Dec, 2004 1 commit
  13. 21 Dec, 2004 1 commit
    • Rework old XMLRPC code that I stuck into defs.php3 a long time ago, · 98d2ab5f
      Leigh Stoller authored
      but never made use of. Moved to its own file (www/xmlrpc.php3.in)
      and made to be more like the perl library I did a couple of months ago,
      that presents an interface to an sslxmlrpc server, via the sslxmlrpc
      client program operating in "raw" mode (takes raw xml on stdin, and
      returns raw xml on stdout).
      
      Added ELABINELAB code to nodetipacl.php3 so that you can click on
      console icon on an inner emulab web page, and it will ask the outer
      emulab sslxmlrpc server for the stuff it needs, and return that to the
      user.
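
What "raw xml on stdin, raw xml on stdout" means in practice can be sketched with Python's standard xmlrpc.client module. This is not the www/xmlrpc.php3.in code; the helper names and the method name used in the test are hypothetical, and the sslxmlrpc client invocation itself is omitted because its options are not given here.

```python
import xmlrpc.client


def build_request(method, args):
    """Serialize one call into the raw XML a client would read on stdin."""
    return xmlrpc.client.dumps((args,), methodname=method)


def parse_response(raw_xml):
    """Decode the raw XML a client would write on stdout; returns the
    single response value."""
    params, _method = xmlrpc.client.loads(raw_xml)
    return params[0]
```

The caller is responsible only for marshalling and unmarshalling; transport and SSL are left entirely to the external client program.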
      98d2ab5f
  14. 16 Dec, 2004 10 commits
    • Slight improvement. · c8c996c2
      Leigh Stoller authored
      c8c996c2
    • Describe setting the rdesktop size. · ff400e49
      Russ Fish authored
      ff400e49
    • 1280x1024 is more normal than 1200x1024 for rdesktop. · df4255df
      Russ Fish authored
      You can specify any display resolution you want; it doesn't have to be
      one of the "normal" ones.  And you can switch back and forth by just starting
      a new rdesktop and "grabbing" the rlogin session away from the previous one.
      
      But once an rdesktop is started up, its display resolution is fixed.  If you make it
      smaller than the previous one, it will push your windows around to fit.
      df4255df
    • Add a special overlay icon for elabinelab to make it very clear which · c8bd673f
      Leigh Stoller authored
      web interface you are talking to!
      c8bd673f
    • Add support (admins only for now) for restarting the event system via · 1d13cde6
      Robert Ricci authored
      the web interface.
      1d13cde6
    • Add dashed line to indicate where vision system stops cause camera 0 · bef556f1
      Leigh Stoller authored
      is not currently working. Simply revert to previous revision when
      camera is fixed.
      bef556f1
    • Strip down the moteleds page a bit using the 'view' options to · 17cc2489
      Robert Ricci authored
      PAGEHEADER() and PAGEFOOTER(). Pop it up in a new window.
      17cc2489
    • Robert Ricci's avatar
    • The panic button ... · 87dd2e60
      Leigh Stoller authored
      * tbsetup/panic.in: New backend script to implement the panic button
        feature. When used, it will sever the connection to the
        firewall node by using snmpit to disable the port. Sets the panic
        bit (and date) in the experiments table, and changes the state of
        the experiment from "active" to "paniced" to ensure that the
        experiment cannot be messed with (swapped out or modified). Sends
        email to tbops when the panic button is pressed.
      
        Used with -r option, reverses the above. State is set back to
        active, the panic bit is cleared, and the port is re-enabled with
        snmpit.
      
      * tbsetup/tbswap.in: During swapout, a firewalled experiment that has
        been paniced will get a cleaning; The nodes are powered off, then
        the osids for all the nodes are reset (with os_select) so that they
        will boot the MFS, and then the nodes are powered on. Then the
        control network is turned back on, and then I wait for the nodes to
        reboot (this is simply cause we do not record in the DB that a node
        is turned off, and if I do not wait, the reload daemon will end up
        hitting the power button again if they do not reboot in time). We can
        fix this later.
      
        I am not planning to apply this to general firewalled experiments
        yet as the power cycling is going to be hard on the nodes, so would
        rather that we at least have a 1/2 baked plan before we do that.
      
      * www/showexp.php3: If experiment is firewalled, show the Panic
        Button, linked to the panic button web script. If the experiment has
        already had the panic button pressed, show a big warning message and
        explain that user must talk to tbops to swap the experiment out.
        Also fiddle with menu options so that the terminate link is gone,
        and the swap link is visible only in admin mode. In other words, only
        an admin person can swap an experiment once it is paniced. And of
        course, an admin person can run the backend panic script above with the
        -r option, but that's not something to be done lightly.
      
      * db/libdb.pm.in: Add "paniced" as an experiment state (EXPTSTATE_PANICED).
        Add utility functions: TBExptSetPanicBit(), TBExptGetPanicBit(), and
        TBExptClearPanicBit().
      
      * tbsetup/swapexp.in: Minor state fiddling so that an experiment can
        be swapped while in paniced state, but only when in admin mode. Also
        clear the panic bit when experiment is swapped out.
      
      * www/dbdefs.php3.in: Add "paniced" as an experiment state. Add a
        utility function TBExptFirewall() to see if experiment is firewalled.
      
      * www/panicbutton.php3: New web script to invoke the backend panic
        script mentioned above, after the usual confirm song and dance.
      
      * www/panicbutton.gif: New gif of a red panic button that I stole off
        the net. If anyone sees/has a better one, feel free to replace
        this one.
      
      * utils/node_statewait.in: Add -s option so that I can pass in the
        state I want to wait for (used from tbswap above to wait for nodes
        to reach ISUP after power on).
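
The state rules spread across the files above can be summarized in one small sketch. It is a Python illustration, not the libdb or swapexp code: the `Experiment` class, the function names, and the "swapped" state name are hypothetical, and the snmpit port handling and email to tbops are left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Experiment:
    state: str = "active"
    panic_bit: bool = False
    panic_date: Optional[datetime] = None


def press_panic(expt, now):
    """Set the panic bit and date, and move 'active' -> 'paniced'."""
    if expt.state != "active":
        raise ValueError("panic button only applies to an active experiment")
    expt.panic_bit = True
    expt.panic_date = now
    expt.state = "paniced"


def clear_panic(expt):
    """Reverse of press_panic, as with the -r option.

    What happens to the panic date on reversal is not described above,
    so it is left untouched here.
    """
    if expt.state != "paniced":
        raise ValueError("experiment is not paniced")
    expt.panic_bit = False
    expt.state = "active"


def may_swap(expt, admin_mode):
    """A paniced experiment can be swapped only in admin mode."""
    return expt.state == "active" or (expt.state == "paniced" and admin_mode)


def swap_out(expt, admin_mode):
    """Swap out if permitted; the panic bit is cleared on swapout."""
    if not may_swap(expt, admin_mode):
        raise PermissionError("only an admin can swap a paniced experiment")
    expt.state = "swapped"   # assumed name for the swapped-out state
    expt.panic_bit = False
```

Keeping the permission check in one predicate mirrors the split above, where the web page hides the links and swapexp enforces the same rule on the backend.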
      87dd2e60
    • Add a few obstacles. Very rough ... · 9e7b2287
      Leigh Stoller authored
      9e7b2287
  15. 15 Dec, 2004 2 commits
  16. 14 Dec, 2004 5 commits
  17. 13 Dec, 2004 1 commit