- 01 Oct, 2018 1 commit
-
-
Leigh B Stoller authored
1. Split the resource handling (where we ask for an advertisement and process it) into a separate script, since it takes a long time to cycle through because of the size of the ads from the big clusters.
2. On the monitor, distinguish offline (nologins) from actually being down.
3. Add a table to store changes in status so we can see how much of the time the aggregates are usable.
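As a rough illustration of item 3, the sketch below computes the fraction of a time window during which an aggregate was usable, given a list of recorded status changes. The record layout (timestamp, status) and the status names "up", "offline", and "down" are assumptions made for this example, not the actual table schema.

```python
from datetime import datetime, timedelta

def usable_fraction(changes, window_start, window_end):
    """Return the fraction of [window_start, window_end) spent in status "up".

    `changes` is a list of (timestamp, status) tuples sorted by timestamp.
    The status names are hypothetical; only "up" counts as usable here.
    """
    total = (window_end - window_start).total_seconds()
    if total <= 0:
        raise ValueError("window_end must be after window_start")

    usable = 0.0
    current = None          # status in effect; unknown until the first record
    cursor = window_start   # start of the interval currently being measured
    for ts, status in changes:
        if ts <= window_start:
            # Changes at or before the window only set the initial status.
            current = status
            continue
        if ts >= window_end:
            break
        if current == "up":
            usable += (ts - cursor).total_seconds()
        cursor = ts
        current = status

    # Account for the tail of the window after the last change.
    if current == "up":
        usable += (window_end - cursor).total_seconds()
    return usable / total
```

Because "offline" and "down" are stored as separate statuses, the same loop could report them separately by accumulating per-status totals instead of a single counter.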
-
- 29 Aug, 2018 1 commit
-
-
Leigh B Stoller authored
-
- 08 Aug, 2018 1 commit
-
-
Leigh B Stoller authored
* I started out to add just deferred aggregates: those that are offline when starting an experiment (and marked in the apt_aggregates table as being deferrable). When an aggregate is offline, we add an entry to the new apt_deferred_aggregates table and periodically retry starting the missing slivers. To accomplish this, I split create_instance into two scripts: the first part creates the instance in the DB, and the second (create_slivers) creates slivers for the instance. The daemon calls create_slivers for any instances in the deferred table until all deferred aggregates are resolved. On the UI side, there are various changes to allow experiments to be partially created. For example, we used to wait until we had all the manifests before showing the topology. Now we show the topology on the first manifest and add the rest as they come in. Various parts of the UI had to change to deal with missing aggregates; I am sure I did not get them all.
* Once I had that, I realized that "scheduled" experiments were an "easy" addition, since they are just a degenerate case of deferred. For this I added some new slots to the tables to hold the scheduled start time, and added a started stamp so we can distinguish between the time an experiment was created and the time it was actually started. Lots of data. On the UI side, there is a new fourth step on the instantiate page to give the user a choice of immediate or scheduled start. I moved the experiment duration to this step. I was originally going to add a calendar choice for termination, but I did not want to change the existing 16 hour max duration policy yet.
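The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual daemon: the in-memory `deferred` mapping stands in for the apt_deferred_aggregates table, and the `aggregate_online` and `create_slivers` callables are hypothetical stand-ins for the real status check and script.

```python
def retry_deferred(deferred, aggregate_online, create_slivers):
    """Run one pass of the retry loop over deferred instances.

    deferred:          dict mapping instance id -> set of aggregate names
                       whose slivers have not been created yet (assumed layout).
    aggregate_online:  callable(name) -> bool, hypothetical status check.
    create_slivers:    callable(instance_id, names) -> iterable of the
                       aggregate names whose slivers were actually created.
    """
    for instance_id in list(deferred):
        pending = deferred[instance_id]
        # Only retry aggregates that have come back online.
        ready = sorted(name for name in pending if aggregate_online(name))
        if not ready:
            continue
        started = create_slivers(instance_id, ready)
        pending.difference_update(started)
        # Drop the instance once every deferred aggregate is resolved.
        if not pending:
            del deferred[instance_id]
    return deferred
```

A daemon would call a function like this periodically. A scheduled start fits the same loop if an instance is treated as deferred until its start time has passed.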
-
- 09 Jul, 2018 1 commit
-
-
Leigh B Stoller authored
hand). Also add enable sitevar since we run this only on clusters that support portstats on the control network.
-
- 30 Oct, 2017 1 commit
-
-
Hussamuddin Nasir authored
-
- 25 Oct, 2017 2 commits
-
-
Leigh B Stoller authored
work, but tested here okay. Who knows, does not affect us. This reverts commit 0e4b83ef.
-
Leigh B Stoller authored
it does not hang up.
-
- 12 Nov, 2016 1 commit
-
-
Leigh B Stoller authored
-
- 17 Oct, 2016 1 commit
-
-
Leigh B Stoller authored
BOOTINFO_EVENTS=0 (not sending PXEBOOTING/BOOTING from bootinfo).
-
- 19 May, 2016 2 commits
-
-
Leigh B Stoller authored
Note that the shared-node-listener should be doing this, but this is easier.
-
Leigh B Stoller authored
-
- 11 Mar, 2016 1 commit
-
-
Mike Hibler authored
So Mike doesn't get worried...
-
- 29 Jan, 2016 2 commits
- 27 Jan, 2016 1 commit
-
-
Leigh B Stoller authored
receiving Geni style events from event enabled clusters. On clusters where CLUSTER_PORTAL is defined, start up an SSL enabled pubsub notification forwarder, to send geni style events to the portal pubsubd.
-
- 16 Dec, 2015 1 commit
-
-
Gary Wong authored
-
- 08 Dec, 2015 1 commit
-
-
Gary Wong authored
-
- 12 Feb, 2015 1 commit
-
-
Leigh B Stoller authored
daemon.
-
- 19 Aug, 2014 1 commit
-
-
Leigh B Stoller authored
every now and then. Seems to happen a lot on the racks, and there are lots of them.
-
- 25 Apr, 2014 1 commit
-
-
Mike Hibler authored
-
- 19 Feb, 2014 1 commit
-
-
Mike Hibler authored
-
- 23 Jan, 2014 1 commit
-
-
Mike Hibler authored
Currently it is configured (hardwired) to run every 15 minutes; even that may be too frequent, as things don't happen very fast in lease-world.
-
- 28 Aug, 2013 1 commit
-
-
Leigh B Stoller authored
-
- 09 Aug, 2013 1 commit
-
-
Leigh B Stoller authored
-
- 22 Jul, 2013 1 commit
-
-
Leigh B Stoller authored
-
- 26 Sep, 2012 1 commit
-
-
Gary Wong authored
-
- 07 Aug, 2012 1 commit
-
-
Mike Hibler authored
Otherwise, pubsubd won't start until after the testbed startup. Since checknodes_daemon wants to send an event, it will hang forever if pubsubd is not running.
-
- 22 Jun, 2012 1 commit
-
-
Mike Hibler authored
-
- 15 Mar, 2012 1 commit
-
-
Leigh B Stoller authored
with testbed-control, and then I reboot boss, I do not want the daemons to start up until I call testbed-control again.
-
- 07 Nov, 2011 2 commits
-
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
- 18 Jan, 2011 1 commit
-
-
Mike Hibler authored
No more frisbeelauncher or assorted subboss frisbee stuff.
-
- 11 Jan, 2011 1 commit
-
-
Mike Hibler authored
More work on the hierarchical configuration for subboss. When doing host-based authentication, allow the client to pass an explicit host (IP) to the mserver. If the mserver is configured to allow it, that IP is used for authenticating the request instead of the caller's IP. Add a default ("null") configuration so the mserver can operate out-of-the-box with no config file. The goal of these two changes is for an mserver instance with the default config and a proxy option to serve the needs of a subboss node (i.e., so that no explicit configuration is needed).
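The host-selection rule described above might look like the sketch below. The function name and the `allow_proxy` flag are hypothetical. Falling back to the caller's IP when proxying is not allowed is an assumption, since the message does not say what the mserver does in that case.

```python
import ipaddress

def auth_address(caller_ip, explicit_host=None, allow_proxy=False):
    """Pick the IP address used to authenticate a request.

    caller_ip:     address the request actually came from.
    explicit_host: optional address supplied by the client (e.g. a subboss
                   acting on behalf of a node).
    allow_proxy:   hypothetical server-side option permitting the override.
    """
    caller = ipaddress.ip_address(caller_ip)
    if explicit_host is not None and allow_proxy:
        # Raises ValueError if the client passed something that is not an IP.
        return ipaddress.ip_address(explicit_host)
    # Assumption: without the proxy option, the explicit host is ignored.
    return caller
```

For example, `auth_address("10.0.0.1", "10.0.0.42", allow_proxy=True)` yields 10.0.0.42, while the same call with `allow_proxy=False` yields 10.0.0.1.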
-
- 23 Jun, 2010 1 commit
-
-
Leigh B Stoller authored
currently does is probe the known and enabled CMs every 24 hours to see what version they are running (which tells us whether they are online), and then send email to geni-dev-utah.
-
- 18 May, 2010 1 commit
-
-
Leigh B Stoller authored
-
- 10 May, 2010 1 commit
-
-
Leigh B Stoller authored
that they all write proper pid files in /var/run. You can not actually "stop" the testbed daemons from the command line.
-
- 22 Dec, 2009 1 commit
-
-
Leigh B. Stoller authored
-
- 05 Aug, 2009 1 commit
-
-
Leigh B. Stoller authored
-
- 26 Jan, 2009 1 commit
-
-
Leigh B. Stoller authored
-
- 08 Jan, 2009 1 commit
-
-
Leigh B. Stoller authored
-