- 22 Jan, 2005 1 commit
-
-
Timothy Stack authored
-
- 18 Jan, 2005 1 commit
-
-
Leigh B. Stoller authored
The last part is the stuff to hook it in from assign_wrapper, and some additional support in assign that Rob is adding for me. This comment is from the top of new file db/libadminctrl.pm.in and describes everything in detail.

# Admission control policies. These are the ones I could think of, although
# not all of these are implemented.
#
#  * Number of experiments per type/class (only one expt using robots).
#
#  * Number of experiments per project
#  * Number of experiments per subgroup
#  * Number of experiments per user
#
#  * Number of nodes per project (nodes really means pc testnodes)
#  * Number of nodes per subgroup
#  * Number of nodes per user
#
#  * Number of nodes of a class per project
#  * Number of nodes of a class per group
#  * Number of nodes of a class per user
#
#  * Number of nodes of a type per project
#  * Number of nodes of a type per group
#  * Number of nodes of a type per user
#
#  * Number of nodes with attribute(s) per project
#  * Number of nodes with attribute(s) per group
#  * Number of nodes with attribute(s) per user
#
# So we have group (pid/gid) policies and user policies. These are stored
# into two different tables, group_policies and user_policies, indexed in
# the obvious manner. Each row of the table defines a count (experiments,
# nodes, etc) and a type of thing being counted (experiments, nodes, types,
# classes, etc). When we test for admission, we look for each matching row
# and test each condition. All conditions must pass. No conditions means a
# pass. There is also some "auxdata" which holds extra information needed
# for the policy (say, the type of node being restricted).
#
#  uid:     a uid
#  policy:  'experiments', 'nodes', 'type', 'class', 'attribute'
#  count:   a number
#  auxdata: a string (optional)
#
# Example: A user policy of ('mike', 'nodes', 10) says that poor mike is
# not allowed to have more than 10 nodes at a time, while ('mike', 'type',
# '10', 'pc850') says that mike cannot allocate more than 10 pc850s.
#
# The group_policies table:
#
#  pid:     a pid
#  gid:     a gid
#  policy:  'experiments', 'nodes', 'type', 'class', 'attribute'
#  count:   a number
#  auxdata: a string (optional)
#
# Example: A project policy of ('testbed', 'testbed', 'experiments', 10)
# says that the testbed project may not have more than 10 experiments
# swapped in at a time, while ('testbed', 'TG1', 'nodes', 10) says that the
# TG1 subgroup of the testbed project may not use more than 10 nodes at
# a time.
#
# In addition to group and user policies (which are policies that apply to
# specific users/projects/subgroups), we also need policies that apply to
# all users/projects/subgroups (ie: do not want to specify a particular
# restriction for every user!). To indicate such a policy, we use a special
# tag in the tables (for the user or pid/gid):
#
#  '+' - The policy applies to all users (or project/groups).
#
# Example: ('+','experiments',10) says that no user may have more than 10
# experiments swapped in at a time. The rule overrides anything more
# specific (say a particular user is restricted to 20 experiments; the above
# rule overrides that and the user (all users) is restricted to 10).
#
# Sometimes, you want one of these special rules to apply to everyone, but
# *allow* it to be overridden by a more specific rule. For that we use:
#
#  '-' - The policy applies to all users (or project/groups),
#        but can be overridden by a more specific rule.
#
# Example: The rules:
#
#  ('-','type',0, 'garcia')
#  ('testbed', 'testbed', 'type', 10, 'garcia')
#
# say that no one is allowed to allocate garcias, unless there is a specific
# rule that allows it; in this case the testbed project can allocate them.
#
# There are other global policies we would like to enforce. For example,
# "only one experiment can be using the robot testbed." Encoding this kind
# of policy is harder, and leads down a path that can get arbitrarily
# complex. That path leads to ruination, and so we want to avoid it at
# all costs.
#
# Instead we define a simple global policies table that applies to all
# experiments currently active on the testbed:
#
#  policy:  'nodes', 'type', 'class', 'attribute'
#  test:    'max', others I cannot think of right now ...
#  count:   a number
#  auxdata: a string
#
# Example: A global policy of ('nodes', 'max', 10, '') says that the maximum
# number of nodes that may be allocated across the testbed is 10. That's not
# a very realistic policy of course, but ('type', 'max', 1, 'garcia') says
# that a max of one garcia can be allocated across the testbed, which
# effectively means only one experiment will be able to use them at once.
# This is of course very weak, but I want to step back and give it some
# more thought before I redo this part.
#
# Is that clear? Hope so, cause it gets more complicated. Some admission
# control tests can be done early in the swap phase, before we really do
# anything (before assign_wrapper). Other tests (type and class) cannot
# be done here; only assign can figure out how an experiment is going to map
# to physical nodes (remember virtual types too), and in that case we need
# to tell assign what the "constraints" are and let it figure out what is
# possible.
#
# So, in addition to the simple checks we can do, we also generate an array
# to return to assign_wrapper with the maximum counts of each node type and
# class that is limited by the policies. assign_wrapper will dump those
# values into the ptop file so that assign can enforce those maximum values
# regardless of what hardware is actually available to use. As per discussion
# with Rob, that will look like:
#
#  set-type-limit <type> <limit>
#
# and assign will spit out a new type of violation that assign_wrapper will
# parse.
#
# NOTES:
#
# 1) Admission control is skipped in admin mode; returns okay.
# 2) Admission control is skipped when the pid is emulab-ops; returns okay.
# 3) When calculating current usage, nodes reserved to emulab-ops are
#    ignored.
# 4) The sitevar "swap/use_admission_control" controls the use of admission
#    control; defaults to 1 (on).
# 5) The current policies can be viewed in the web interface. See
#    https://www.emulab.net/showpolicies.php3
# 6) The global policy stuff is weak. I plan to step back and think about it
#    some more before redoing it, but it will tide us over for now.
#
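The user-policy rules described in the comment above can be illustrated with a small Python sketch. This is not the code in db/libadminctrl.pm.in (which is Perl and is not shown here); the `Policy` tuple, the `applicable` and `admit` helpers, and the shape of the `usage` dictionary are all hypothetical names chosen for the illustration. It assumes a '-' row is overridden only by a specific row with the same policy and auxdata.

```python
from typing import NamedTuple


class Policy(NamedTuple):
    """One row of a hypothetical in-memory user_policies table."""
    uid: str           # a real uid, '+' (everyone) or '-' (everyone, overridable)
    policy: str        # 'experiments', 'nodes', 'type', 'class', 'attribute'
    count: int         # maximum allowed
    auxdata: str = ""  # e.g. the node type being restricted


def applicable(policies, uid):
    """Return the rows that must be tested for this uid."""
    rows = []
    for p in policies:
        if p.uid in (uid, "+"):
            # Specific rows and '+' rows always apply.
            rows.append(p)
        elif p.uid == "-":
            # Assumption: a '-' row is dropped when the user has a specific
            # row for the same policy and auxdata.
            overridden = any(
                q.uid == uid and q.policy == p.policy and q.auxdata == p.auxdata
                for q in policies
            )
            if not overridden:
                rows.append(p)
    return rows


def admit(policies, uid, usage):
    """usage maps (policy, auxdata) to the count the user would hold.

    All matching conditions must pass; no matching rows means a pass.
    """
    return all(
        usage.get((p.policy, p.auxdata), 0) <= p.count
        for p in applicable(policies, uid)
    )


EXAMPLE_POLICIES = [
    Policy("+", "experiments", 10),
    Policy("mike", "experiments", 20),
    Policy("mike", "nodes", 10),
    Policy("-", "type", 0, "garcia"),
    Policy("leigh", "type", 10, "garcia"),
]
```

Because every applicable row is tested, the '+' row wins over mike's looser limit of 20 without any special casing, which matches the "overrides anything more specific" wording in the comment.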
-
- 15 Jan, 2005 1 commit
-
-
Timothy Stack authored
timeline or sequence.
-
- 13 Jan, 2005 2 commits
-
-
Leigh B. Stoller authored
-
Timothy Stack authored
-
- 12 Jan, 2005 2 commits
-
-
Leigh B. Stoller authored
table that will prevent an experiment from being swapped/modified. The toggle is on the showexp page, and the toggle is *not* admin-overridable; you must turn the toggle off (and of course, you must be an admin to do that).
-
Leigh B. Stoller authored
out of the reserved table. Mostly this happens in nfree and nalloc, but there are a couple of other moves, in libdb and in the reload daemon. The uid and experiment are stored, along with a timestamp.
-
- 11 Jan, 2005 3 commits
-
-
Robert Ricci authored
that is both in the experimental and control nets.
-
Mike Hibler authored
Also, reflect new way of populating default_firewall_rules (not done here anymore, done in the new firewall subdirectory)
-
Leigh B. Stoller authored
* New database table to store obstacles, in the usual coord system; x1,y1 is the upper left corner.
* New web page to dump the entire obstacle list: https://www.emulab.net/obstacle_list.php3
* New web page to dump a single obstacle, referenced by the above list page, and by the floormap generator.
* Hack up the floormap code to add obstacles to the areamap, so that when you mouse over them, you get a balloon showing the description, and a link to the above mentioned page.
-
- 10 Jan, 2005 1 commit
-
-
Leigh B. Stoller authored
* Add new DB table "webcams" which holds the id of the webcam, the server it is attached to, and the last update time.
* Add new sitevars webcam/anyone_can_view and webcam/admins_can_view. Should be obvious what they mean.
* Add trivial script grabwebcams (invoked from cron) to grab the images from the servers and stash them in /usr/testbed/webcams. The images are grabbed with scp, protected by a 5 second timeout. Fine for a couple of cameras.
* Add web page stuff to display webcams, linked from the robot map page. Permission to view the webcams is currently admin, or in a project that is allowed to use a robot. We can tighten this up later as needed.
-
- 07 Jan, 2005 1 commit
-
-
Leigh B. Stoller authored
-
- 06 Jan, 2005 1 commit
-
-
Leigh B. Stoller authored
* Add boot_errno to the nodes table so that nodes can report in a subcode to indicate what went wrong. At present, we do not report any real error codes; that is going to take some time to work out since it will require a bunch of changes to the boot scripts.
* Add new table node_bootlogs to store logs provided by the nodes. Not a full console log, but a log of the tmcd client side part. We can make it a full log if we want though; just means mucking about with the boot phase a bit.
* Add new state transition to NORMALv2 and PCVM state machines. "TBFAILED" is a new state that is sent (after TBSETUP) if a node fails somewhere in the tmcd client side.
* Change TBNodeStateWait() to take a list of states (instead of a single state) and an optional pass by reference parameter to return the actual state that the node landed in. Change all calls to TBNodeStateWait() of course.
* Change os_setup (and libreboot in wait mode) to look for both TBFAILED and ISUP. If a TBFAILED event is seen, we can terminate the wait early and not retry os_setup on physical nodes (although still retry virtual nodes). The nice thing about this is that the wait should terminate much earlier (rather than waiting for timeout), especially for virtual nodes which can take a really long time when there are a couple of hundred.
* Add new routines dobooterrno() and dobootlog() to tmcd. Bump version number and increase the buffer size to allow for the larger packets that a console log will generate (added MAXTMCDPACKET variable, set to 0x4000).
* Add new -f option to tmcc to specify a datafile to send along as the last argument to tmcd. This is more pleasing than trying to send a console log in on the command line. For example: "tmcc -f /tmp/log BOOTLOG" will send a BOOTLOG command along with the contents of /tmp/log. Also close the write side of the pipe so that the server sees EOF on read. See aside comment below.
* Changes to rc.bootsetup:
  1. Use perl tricks to capture all output, duping to the console and to a log file in /var/emulab/logs.
  2. On any error, send a status code (boot_errno) and the bootlog to tmcd.
  3. Generate a TBFAILED state transition.
* Changes to rc.injail:
  1. Same as rc.bootsetup, but do not send log files; that would pummel boss. Leave them on the physical node.
* Change vnodesetup (which calls mkjail) to watch for any error and send a TBFAILED state transition. This should catch almost all errors, and dramatically reduce waiting when something fails.
* Changes to rc.cdboot are essentially the same as rc.bootsetup, although a bootlog is sent all the time (success or failure), and I do not generate a boot_errno yet. Also, instead of TBFAILED, generate a PXEFAILED state since the CDROM is actually operating within the PXEFBSD opmode. I have yet to work this into the rest of the system though; waiting to get a new CD built and actually experiment with it.
* Add new menu option and web page to display the node bootlog. We store only the latest bootlog, but maybe someday store more than one. Display boot_errno on node page.

Aside: I made a big mistake in the tmcd protocol; I did not envision passing more than a small amount of data (one fragment) and so I did not include a record terminator (ie: close of the write side on the client sends EOF) or a size field at the beginning. No big deal since small requests are sent in one fragment and the server sees the entire thing. Well, with a large console log, that will end up as multiple fragments, and the server will often not get the entire thing on the first read, and there are no subsequent reads (with no EOF or known size, it would block forever). Well, fixing this in a backwards compatible manner (for old images) was way too much pain. Instead, tmcc now closes the write side, and the server does subsequent reads *only* in the new dobootlog() routine. Note that it *is* possible to fix this in a backwards compatible manner, but I did not want to go down that path just yet.
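The framing problem in the aside can be sketched in a few lines of Python. This is not tmcc or tmcd code; the function names and the "command, newline, file contents" wire layout are assumptions made only for the demonstration. What it shows is the general technique the aside describes: the sender half-closes its side of the connection, and the receiver keeps reading until it sees EOF instead of trusting a single read.

```python
import socket
import threading

BUFSIZE = 0x4000  # same value the commit gives for MAXTMCDPACKET


def send_with_eof(sock, command, payload):
    """Send a request, then half-close so the peer sees EOF."""
    # Hypothetical layout: command, newline, then the raw file contents.
    sock.sendall(command + b"\n" + payload)
    sock.shutdown(socket.SHUT_WR)


def read_until_eof(sock, bufsize=BUFSIZE):
    """Collect every fragment until recv() returns b'' (EOF)."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo(payload_size=100_000):
    """Push a payload far larger than one read through a socket pair."""
    client, server = socket.socketpair()
    payload = b"x" * payload_size
    # The sender runs in a thread so a full socket buffer cannot deadlock us.
    sender = threading.Thread(
        target=send_with_eof, args=(client, b"BOOTLOG", payload)
    )
    sender.start()
    data = read_until_eof(server)
    sender.join()
    client.close()
    server.close()
    command, _, body = data.partition(b"\n")
    return command, len(body)
```

A single `recv(BUFSIZE)` on the server side would return at most 0x4000 bytes of the 100,000 sent, which is the truncation the aside describes; the loop only terminates cleanly because the client shut down its write side.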
-
- 03 Jan, 2005 1 commit
-
-
Leigh B. Stoller authored
battery_voltage float default NULL, battery_percentage float default NULL, battery_timestamp int(10) unsigned default NULL,
-
- 21 Dec, 2004 1 commit
-
-
Robert Ricci authored
to assume that the leader of a stack is the switch after which it was named - we can now name stacks things like 'Control' or 'Experiment'.
-
- 15 Dec, 2004 1 commit
-
-
Leigh B. Stoller authored
button pressed, and when.
-
- 14 Dec, 2004 1 commit
-
-
Russ Fish authored
from the Unix usr_pswd MD5 hash string.
-
- 13 Dec, 2004 3 commits
-
-
Russ Fish authored
-
Leigh B. Stoller authored
meter for the image.
-
Mike Hibler authored
-
- 09 Dec, 2004 1 commit
-
-
Robert Ricci authored
-
- 06 Dec, 2004 1 commit
-
-
Leigh B. Stoller authored
-
- 03 Dec, 2004 1 commit
-
-
Leigh B. Stoller authored
* Add security_level to experiments table.
* Add a cross link between an experiment and its elabinelab container. This will likely change at some point, but just messing around right now.
* Add elabinelab flag, security level, and cross eid to experiment_stats table.
-
- 01 Dec, 2004 1 commit
-
-
Mike Hibler authored
ID in firewalls table:
-
- 18 Nov, 2004 1 commit
-
-
Mike Hibler authored
column in the reserved table and not the nodes table. Also fix a cut/paste error and renumbered some items; we went from 1.279 to 1.270 and started counting up again.
-
- 09 Nov, 2004 2 commits
-
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
archive (and restore) of news items. Add button at top to toggle the display of archived items. All of this is admin mode only; mere users see just the news.
-
- 05 Nov, 2004 1 commit
-
-
Leigh B. Stoller authored
assigned outer elab vlan ids.
-
- 01 Nov, 2004 1 commit
-
-
Leigh B. Stoller authored
used for ElabinElab.
-
- 26 Oct, 2004 1 commit
-
-
Mike Hibler authored
-
- 08 Oct, 2004 1 commit
-
-
Mike Hibler authored
This checkin adds the necessary NS and client-side changes. You get such a firewall by creating a firewall object and doing: $fw set-type ipfw2-vlan In addition to the usual firewall setup, it sets the firewall node command line to boot "/kernel.fw" which is an IPFW2-enabled kernel with a custom bridge hack. The client-side setup for firewalled nodes is easy: do nothing. The client-side setup for the firewall is more involved, using vlan devices and bridging and all sorts of geeky magic. Note finally that I don't yet have a decent set of default rules for anything other than a completely open firewall. The rules might be slightly different than for the "software" firewall since they are applied at layer2 (and we want them just to be applied at layer2 and not multiple times)
-
- 29 Sep, 2004 1 commit
-
-
Mike Hibler authored
-
- 17 Sep, 2004 1 commit
-
-
Leigh B. Stoller authored
Add elab_in_elab boolean to experiments table. Add inner_elab_role to virt_nodes table, which is one of boss,ops,node.
-
- 08 Sep, 2004 2 commits
-
-
Mike Hibler authored
nextosid mechanism of 1.114 making it possible to map a generic *-STD OSID based on the time at which an experiment is created. This provides backward compatibility for old experiments when the standard images are changed.

The osid_map table lookup is triggered when the value of the nextosid field is set to 'MAP:osid_map'. The nextosid also continues to behave as before: if it contains a valid osid, that OSID value is used to map independent of the experiment creation time. The two styles can also be mixed, for example FBSD-JAIL has a nextosid of FBSD-STD which in turn is looked up and redirects to the osid_map and selects one of FBSD47-STD or FBSD410-STD depending on the time.

    CREATE TABLE osid_map (
      osid varchar(35) NOT NULL default '',
      btime datetime NOT NULL default '1000-01-01 00:00:00',
      etime datetime NOT NULL default '9999-12-31 23:59:59',
      nextosid varchar(35) default NULL,
      PRIMARY KEY (osid,btime,etime)
    ) TYPE=MyISAM;

Yeah, yeah, I'm using another magic date as a sentinel value. Tell ya what, in 7995 years, find out where I'm buried, dig me up, and kick my ass for being so short-sighted...

The following commands are not strictly needed, they just give an example default population of the table. They cause the standard images to be revectored through the table and then remapped, based on two time ranges, to the exact same image. Obviously, the second set would normally be mapped to a different set of images (say RHL90 and FBSD410):

    INSERT INTO osid_map (osid,etime,nextosid) VALUES \
      ('RHL-STD','2004-09-08 08:59:59','emulab-ops-RHL73-STD');
    INSERT INTO osid_map (osid,etime,nextosid) VALUES \
      ('FBSD-STD','2004-09-08 08:59:59','emulab-ops-FBSD47-STD');
    INSERT INTO osid_map (osid,btime,nextosid) VALUES \
      ('RHL-STD','2004-09-08 09:00:00','emulab-ops-RHL73-STD');
    INSERT INTO osid_map (osid,btime,nextosid) VALUES \
      ('FBSD-STD','2004-09-08 09:00:00','emulab-ops-FBSD47-STD');
    UPDATE os_info SET nextosid='MAP:osid_map' \
      WHERE osname IN ('RHL-STD','FBSD-STD');
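The lookup described above can be sketched in Python to make the chaining concrete. This is an illustration only, not the testbed's actual resolver: `resolve_osid`, `MapRow`, the in-memory `OS_INFO` dictionary, and the simplified image names (without the `emulab-ops-` prefix) are hypothetical, and the second time range is mapped to FBSD410-STD purely as the "different set of images" case the message mentions.

```python
from datetime import datetime
from typing import NamedTuple

# Sentinel defaults copied from the osid_map table definition.
BTIME_DEFAULT = datetime(1000, 1, 1, 0, 0, 0)
ETIME_DEFAULT = datetime(9999, 12, 31, 23, 59, 59)


class MapRow(NamedTuple):
    osid: str
    nextosid: str
    btime: datetime = BTIME_DEFAULT
    etime: datetime = ETIME_DEFAULT


CUTOVER = datetime(2004, 9, 8, 9, 0, 0)

# Hypothetical stand-in for the nextosid column of os_info.
OS_INFO = {
    "FBSD-JAIL": "FBSD-STD",      # old style: points straight at another osid
    "FBSD-STD": "MAP:osid_map",   # new style: consult the time-based map
    "FBSD47-STD": None,
    "FBSD410-STD": None,
}

OSID_MAP = [
    MapRow("FBSD-STD", "FBSD47-STD", etime=datetime(2004, 9, 8, 8, 59, 59)),
    MapRow("FBSD-STD", "FBSD410-STD", btime=CUTOVER),
]


def resolve_osid(osid, created, os_info=OS_INFO, osid_map=OSID_MAP, max_hops=8):
    """Follow nextosid links, using the experiment creation time for MAP rows."""
    for _ in range(max_hops):
        nextosid = os_info.get(osid)
        if nextosid is None:
            return osid  # nothing further to follow
        if nextosid == "MAP:osid_map":
            matches = [
                row.nextosid
                for row in osid_map
                if row.osid == osid and row.btime <= created <= row.etime
            ]
            if not matches:
                raise LookupError(f"no osid_map row for {osid} at {created}")
            osid = matches[0]
        else:
            osid = nextosid
    raise RuntimeError("nextosid chain too long")
```

An experiment created before the cutover keeps resolving FBSD-JAIL to the older image, while one created afterwards gets the newer one, which is the backward compatibility the message is after. The `max_hops` guard is an addition of this sketch to avoid looping on a cyclic chain.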
-
Leigh B. Stoller authored
-
- 01 Sep, 2004 2 commits
-
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
* SSL based server (sslxmlrpc_server.py) that wraps the existing Python classes (what we export via the existing ssh XMLRPC server). I also have a demo client that is analogous to the ssh demo client (sslxmlrpc_client.py). This client looks for an ssl cert in the user's .ssl directory, or you can specify one on the command line. The demo client is installed on ops, and is in the downloads directory with the rest of the xmlrpc stuff we export to users. The server runs as root, forking a child for each connection, and logs connections to /usr/testbed/log/sslxmlrpc.log via syslog.
* New script (mkusercert) generates SSL certs for users. Two modes of operation; when called from the account creation path, generates an unencrypted private key and certificate for use on Emulab nodes (this is analogous to the unencrypted SSH key we generate for users). The other mode of operation is used to generate an encrypted private key so that the user can drag a certificate to their home/desktop machine.
* New webpage (gensslcert.php3) linked in from the My Emulab page that allows users to create a certificate. The user is prompted for a pass phrase to encrypt the private key, as well as the user's current Emulab login password. mkusercert is called to generate the certificate, and the result is stored in the user's ~/.ssl directory, and spit back to the user as a text file that can be downloaded and placed in the user's homedir on their local machine.
* The server needs to associate a certificate with a user so that it can flip to that user in the child after it forks. To do that, I have stored the uid of the user in the certificate. When a connection comes in, I grab the uid out of the certificate and check it against the DB. If there is a match (see below) the child does the usual setgid,setgroups,setuid to the user, instantiates the Emulab server class, and dispatches the method. At the moment, only one request per connection is dispatched. I'm not sure how to do a persistent connection on the SSL path, but probably not a big deal right now.
* New DB table user_sslcerts that stores the PEM formatted certificates and private keys, as well as the serial number of the certificate, for each user. I also mark if the private key is encrypted or not, although not making any use of this data. At the moment, each user is allowed to get one unencrypted cert/key pair and one encrypted cert/key pair. No real reason except that I do not want to spend too much time on this until we see how/if it gets used. Anyway, the serial number is used as a crude form of certificate revocation. When the connection is made, I suck the serial number and uid out of the certificate, and look for a match in the table. If the cert serial number does not match, the connection is rejected. In other words, revoking a certificate just means removing its entry from the DB for that user. I could also compare the certificate itself, but I am not sure what purpose that would serve since that is what the SSL handshake is supposed to take care of, right?
* Updated the documentation for the XMLRPC server to mention the existence of the SSL server and client, with a pointer into the downloads directory where users can pick up the client.
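The serial-number check described above amounts to a table lookup, which the following Python sketch illustrates. It is not taken from sslxmlrpc_server.py; `CertStore`, its methods, and the in-memory dictionary standing in for the user_sslcerts table are hypothetical, and extracting the uid and serial from a real certificate (plus the SSL handshake itself) is outside the sketch.

```python
class CertStore:
    """Hypothetical in-memory stand-in for the user_sslcerts table."""

    def __init__(self):
        # Maps uid to the set of certificate serial numbers still on file.
        self._serials = {}

    def issue(self, uid, serial):
        """Record a freshly generated certificate for a user."""
        self._serials.setdefault(uid, set()).add(serial)

    def revoke(self, uid, serial):
        """Revocation is just removing the row for that user."""
        self._serials.get(uid, set()).discard(serial)

    def accept(self, cert_uid, cert_serial):
        """Accept a connection only if (uid, serial) is still listed."""
        return cert_serial in self._serials.get(cert_uid, set())


def demo():
    store = CertStore()
    store.issue("mike", 1001)   # e.g. the unencrypted cert/key pair
    store.issue("mike", 1002)   # e.g. the encrypted cert/key pair
    before = store.accept("mike", 1001)
    store.revoke("mike", 1001)
    after = store.accept("mike", 1001)
    other = store.accept("mike", 1002)
    stranger = store.accept("leigh", 1002)
    return before, after, other, stranger
```

In this sketch a serial only counts for the uid it was issued to, so a certificate naming a different user is rejected even if its serial number happens to exist for someone else.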
-
- 25 Aug, 2004 1 commit
-
-
Mike Hibler authored
and default_firewall_rules.
-
- 23 Aug, 2004 1 commit
-
-
Robert Ricci authored
-
- 11 Aug, 2004 1 commit
-
-
Leigh B. Stoller authored
1.269: Add new table to generate a per virt_lan index for use with veth vlan tags. This would be so much easier if the virt_lans table had been split into virt_lans and virt_lan_members. Anyway, this table might someday become the per-lan table, with a table of member settings. This would reduce the incredible amount of duplicate info in virt_lans!

    CREATE TABLE virt_lan_lans (
      pid varchar(12) NOT NULL default '',
      eid varchar(32) NOT NULL default '',
      idx int(11) NOT NULL auto_increment,
      vname varchar(32) NOT NULL default '',
      PRIMARY KEY (pid,eid,idx),
      UNIQUE KEY vname (pid,eid,vname)
    ) TYPE=MyISAM;

This arrangement will provide a unique index per virt_lan, within each pid,eid. That is, it starts from 1 for each pid,eid. That is necessary since the limit is 16 bits, so a global index would quickly overflow.

The above table is populated with:

    insert into virt_lan_lans (pid, eid, vname)
      select distinct pid,eid,vname from virt_lans;
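The per-experiment numbering can be mimicked in Python to show why the index stays small. This is only a sketch of the effect of the table definition and the populating insert, not database code; `assign_lan_indexes` and the sample rows are hypothetical, and the 16 bit check reflects the limit stated in the message.

```python
MAX_TAG = 2**16 - 1  # the commit message says the limit is 16 bits


def assign_lan_indexes(virt_lans):
    """Give each distinct (pid, eid, vname) an index that restarts at 1
    for every (pid, eid), like the auto_increment column in the key.

    virt_lans is an iterable of (pid, eid, vname) tuples and may contain
    one entry per lan member, so duplicates are expected.
    """
    last_idx = {}   # (pid, eid) -> highest index handed out so far
    table = {}      # (pid, eid, vname) -> idx
    for pid, eid, vname in virt_lans:
        if (pid, eid, vname) in table:
            continue  # same effect as "select distinct"
        idx = last_idx.get((pid, eid), 0) + 1
        if idx > MAX_TAG:
            raise OverflowError(f"too many lans in {pid}/{eid}")
        last_idx[(pid, eid)] = idx
        table[(pid, eid, vname)] = idx
    return table


SAMPLE_ROWS = [
    ("testbed", "exp1", "lan0"),
    ("testbed", "exp1", "lan0"),   # second member of the same lan
    ("testbed", "exp1", "lan1"),
    ("testbed", "exp2", "lan0"),
]
```

With the sample rows, both experiments get a lan numbered 1, so the counter is bounded by the number of lans in a single experiment rather than by the number of lans ever created on the testbed.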
-