- 12 Nov, 2014 1 commit
-
-
Leigh B Stoller authored
-
- 11 Nov, 2014 13 commits
-
-
Kirk Webb authored
-
Mike Hibler authored
I was attempting to read back any last words the program might have uttered, but if it said nothing, we would hang. I would not have expected this behavior from a pipe (actually, socketpair) when the other end has gone away! To be safe, make it non-blocking before we read.
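A minimal sketch of the non-blocking read described above, in Python rather than the project's own code. The helper name `read_last_words` and the 4096-byte limit are assumptions made for illustration.

```python
import socket

def read_last_words(sock, max_bytes=4096):
    """Return whatever the peer left in the socket without ever blocking."""
    # Hypothetical helper: switch to non-blocking mode before reading.
    sock.setblocking(False)
    try:
        return sock.recv(max_bytes)
    except BlockingIOError:
        # Nothing is buffered and the peer has not closed its end:
        # return empty instead of hanging.
        return b""

parent, child = socket.socketpair()
child.sendall(b"fatal: out of memory")
said = read_last_words(parent)    # the program's last words
silent = read_last_words(parent)  # nothing more was said; returns at once
```

With a blocking socket, the second call would wait for as long as the other end stays open and silent.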
-
Leigh B Stoller authored
-
Mike Hibler authored
I was expanding a global list in a loop for every node. So for each node, I was finding all the delta images in the ever-growing list and adding their dependencies (again!), making the list even larger. In an experiment loading a two-level delta image on 8 nodes, the list included 40+ copies of the same three images to load by the time we got to the last node. However, no node attempted to load all those images because tmcd exceeded its reply buffer size on the "loadinfo" call and would not return anything. Of course, by then we had computed a max wait time based on image.max_wait * 45, so the experiment suffered a slow, lingering death even though the nodes were not doing anything. Beware: I do not know if I got the "access key" code right for remote nodes. I am not even sure we use that path anymore. I attempted to fix it in libosload; I did not even try in libosload_new.
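The shape of the fix can be sketched as follows, assuming each delta image names a single parent image. The function `images_with_dependencies`, the `depends_on` mapping, and the image names are hypothetical and are not taken from libosload. The point is that the list is built fresh for each node and each image appears once.

```python
def images_with_dependencies(requested, depends_on):
    """Return the requested images plus their delta-chain ancestors, each once."""
    ordered = []
    seen = set()

    def visit(image):
        if image in seen:
            return
        seen.add(image)
        parent = depends_on.get(image)
        if parent is not None:
            visit(parent)  # the base must be loaded before the delta
        ordered.append(image)

    for image in requested:
        visit(image)
    return ordered

# Hypothetical two-level delta chain: delta2 -> delta1 -> base.
depends_on = {"delta2": "delta1", "delta1": "base"}
nodes = ["node%d" % i for i in range(8)]

# A per-node list, not one global list that grows on every iteration.
per_node = {n: images_with_dependencies(["delta2"], depends_on) for n in nodes}
```

Every node, including the last one, ends up with the same three images rather than an accumulated list.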
-
Kirk Webb authored
-
Kirk Webb authored
Taints applied to a node are now extracted from the database and can be used wherever taint checking needs to happen in tmcd. I've applied them to the "accounts" command. tmcd will never set the "root" flag for non-admin users on OSes tainted with "useronly". It won't return any regular user accounts on nodes with the "blackbox" taint (but will pass along admin accounts).
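As a rough illustration of the two taint rules applied to the "accounts" command, here is a Python sketch. The record layout (`login`, `admin`, `root` keys) and the function name `filter_accounts` are assumptions and do not reflect tmcd's actual data structures.

```python
def filter_accounts(accounts, taints):
    """Apply the "useronly" and "blackbox" taint rules to an account list."""
    result = []
    for acct in accounts:
        if "blackbox" in taints and not acct["admin"]:
            continue  # regular users are not returned on blackbox nodes
        root = acct["root"]
        if "useronly" in taints and not acct["admin"]:
            root = False  # never hand out root to non-admins on useronly OSes
        result.append({**acct, "root": root})
    return result

accounts = [
    {"login": "alice", "admin": True, "root": True},
    {"login": "bob", "admin": False, "root": True},
]
useronly = filter_accounts(accounts, {"useronly"})
blackbox = filter_accounts(accounts, {"blackbox"})
```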
-
Kirk Webb authored
* Do not "reset" taint states to match partitions after OS load. Encumber node with any additional taint states found across the OSes loaded on a node's partitions (union of states). Change the name of the associated Node object method to better represent the functionality.
* Clear all taint states when a node exits "reloading". When the reload_daemon is finished with a node and ready to release it, it will now clear any/all taint states set on the node. This is the only automatic way to have a node's taint states cleared. Users cannot clear node taint states by os_load'ing away all tainted partitions after this commit; nodes must travel through reloading to get cleared.
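The two rules above amount to "taints only accumulate on OS load, and only reloading clears them". A small Python sketch, with hypothetical function names that are not the Node object's real methods:

```python
def taints_after_os_load(node_taints, partition_os_taints):
    """Union the node's existing taints with those of every loaded OS."""
    merged = set(node_taints)
    for os_taints in partition_os_taints:
        merged |= set(os_taints)
    return merged

def taints_after_reload():
    """Leaving "reloading" is the only automatic way taints get cleared."""
    return set()

node = {"blackbox"}
# Loading an untainted OS over the tainted partition does not remove the taint.
node = taints_after_os_load(node, [set()])
still_tainted = set(node)
# Loading a "useronly" OS adds to what is already there.
node = taints_after_os_load(node, [{"useronly"}, set()])
combined = set(node)
node = taints_after_reload()
```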
-
Mike Hibler authored
-
Mike Hibler authored
In our usage info, let's not count nodes in hwdown and hwbroken as "in use", as it makes our node utilization overly high (well, at least for pc600s and pc850s!). Also, a couple more hacks to try to work around inconsistencies in the node_history data. We really just need to fix up the history records!
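A sketch of the counting rule, under two assumptions that the message does not state: each node is mapped to the name of whatever currently holds it (or `None` when free), and hwdown/hwbroken nodes stay in the denominator. The `utilization` function and the node names are illustrative only.

```python
# Holders that should not count as real use.
NOT_IN_USE = {"hwdown", "hwbroken"}

def utilization(nodes):
    """Fraction of nodes in real use, ignoring hwdown and hwbroken holders."""
    if not nodes:
        return 0.0
    in_use = sum(
        1 for holder in nodes.values()
        if holder is not None and holder not in NOT_IN_USE
    )
    return in_use / len(nodes)

# Hypothetical snapshot: one node really in use, two broken, one free.
nodes = {"pc1": "myexp", "pc2": "hwdown", "pc3": None, "pc4": "hwbroken"}
rate = utilization(nodes)
```

Counting the two broken nodes as in use would have reported 0.75 for this snapshot instead of 0.25.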
-
Leigh B Stoller authored
-
Leigh B Stoller authored
This is not exposed to users. The main reason for this is so that the name space for leases (datasets) is per-group instead of per-project. We need this when creating datasets via the geni interface (backend to APT), since all leases are created in the holding project. Without a subgroup, we would run into name collisions on the backend. It also gives us finer access permission control for the same reason. Note that I yanked out the lease cache from Lease.pm (not worth the trouble), and I expanded Lookup to allow for the usual variety of possibilities that we allow in other Lookup methods.
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
- 10 Nov, 2014 2 commits
-
-
Leigh B Stoller authored
-
Mike Hibler authored
When locating the root device, if a BSD disk partition fills the entire DOS partition, then Linux will not create a separate /dev entry for it. In that case, we use the DOS partition device. Also, a couple of changes to resync with BSD slicefix.
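The fallback can be sketched as below. The device names are hypothetical, and the `exists` parameter is only there so the example runs without real block devices; the actual script's logic may differ.

```python
import os

def pick_root_device(bsd_dev, dos_dev, exists=os.path.exists):
    """Prefer the BSD partition's /dev entry; fall back to the DOS partition."""
    # When the BSD partition fills the whole DOS partition, Linux creates
    # no separate entry for it, so bsd_dev will not exist.
    return bsd_dev if exists(bsd_dev) else dos_dev

# Simulated /dev contents in which only the DOS partition entry is present.
present = {"/dev/sda1"}
chosen = pick_root_device("/dev/sda5", "/dev/sda1", exists=lambda p: p in present)
```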
-
- 09 Nov, 2014 2 commits
-
-
Mike Hibler authored
We still use realpath to validate the path up front, but we pass the original (DB) path on to the client-side. Passing the resolved path was wrong anyway for clients that write images across NFS, because the path the client uses could be different than that computed on the server (e.g., /proj/foo vs. /.amd_mnt/ops/proj/foo) due to the way mounts are done. Note that the server will again validate the client-provided path, so if someone were to mess with a symlink in the path between when create_image verifies it and when it gets used, there is still no danger. This will probably eliminate the need for the AMD hack, but I'll leave it just to be safe.
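A minimal sketch of "validate with realpath, pass the original path along", assuming validation means the resolved path must fall under one of a set of allowed directories. The function name `validate_image_path` and the `allowed_roots` parameter are assumptions, not create_image's real interface.

```python
import os
import tempfile

def validate_image_path(db_path, allowed_roots):
    """Check the resolved path, but return the original (DB) path unchanged."""
    resolved = os.path.realpath(db_path)
    for root in allowed_roots:
        root_resolved = os.path.realpath(root)
        if os.path.commonpath([resolved, root_resolved]) == root_resolved:
            # Hand back db_path, not resolved: the client may see a
            # different mount layout than the server does.
            return db_path
    raise ValueError("path is outside the allowed directories: %s" % db_path)

proj_root = tempfile.mkdtemp()  # stands in for a project directory
db_path = os.path.join(proj_root, "images", "foo.ndz")
passed_on = validate_image_path(db_path, [proj_root])
```

Because the server validates the client-provided path again, returning the unresolved path here does not weaken the check.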
-
Mike Hibler authored
-
- 08 Nov, 2014 1 commit
-
-
Mike Hibler authored
-
- 07 Nov, 2014 4 commits
-
-
Mike Hibler authored
If an available partition device (e.g., the 4th partition on the system disk) represents less than 5% of the spare space we have found, ignore it. This will allow us to continue to use the 4th partition on the system disk of the d710s (450GB or so) and the second disk (250GB), but not use the 2nd partition (3GB), which would make us thrash about on the system disk even more than usual. Mostly this is for the new HP server boxes, so it doesn't pick up the 10GB left over on the (virtual) system disk when we have 21TB available on the second (virtual) disk. Another hack until blockstores rule the world...
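The 5% rule can be sketched as follows, using the sizes from the message. The device names and the `usable_devices` function are hypothetical, and it is an assumption that "the spare space we have found" is the total across all candidates, including the one being tested.

```python
def usable_devices(spare_gb, min_fraction=0.05):
    """Drop devices that hold less than min_fraction of the total spare space."""
    total = sum(spare_gb.values())
    return {dev: size for dev, size in spare_gb.items()
            if size >= min_fraction * total}

# d710: 450GB 4th partition, 250GB second disk, 3GB 2nd partition.
d710 = usable_devices({"sda4": 450, "sdb": 250, "sda2": 3})
# HP server: 10GB left on the system disk, 21TB on the second disk.
hp = usable_devices({"sda4": 10, "sdb": 21000})
```

On the d710 the cutoff works out to about 35GB, so only the 3GB partition is dropped; on the HP box the 10GB remainder falls well under the roughly 1TB cutoff.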
-
Mike Hibler authored
-
Mike Hibler authored
-
Leigh B Stoller authored
-
- 06 Nov, 2014 2 commits
-
-
Leigh B Stoller authored
treated like normal hosts (tmcd already was changed). This is fine for XEN but will break OpenVZ; we can burn that bridge later.
-
Leigh B Stoller authored
-
- 05 Nov, 2014 12 commits
-
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
Kirk Webb authored
Also loads a couple of Infiniband modules so that Infiniband tools work properly.
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
Jonathon Duerig authored
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
- 04 Nov, 2014 3 commits
-
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-