1. 16 Dec, 2014 1 commit
  2. 12 Dec, 2014 2 commits
    • Double the timeout period for waiting for the master server to reply. · a78a427e
      Mike Hibler authored
      This is a "temporary" hack to get around the single-threaded nature of
      the master server. That in itself is not the killer; the problem is that
      it sleeps for 2 seconds after it spawns a child process (frisbeed, frisbee
      or uploader). So if three images are requested simultaneously by a handful
      of clients, it will take 6 seconds to get the frisbeed processes
      started, exceeding the 5 second timeout in the clients. Other clients
      in the queue behind the three that are starting images will time out
      before the master server handles them.
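
      The arithmetic above can be sketched as follows. This is only an
      illustrative model, not the master server's code: it assumes requests
      are handled strictly one at a time and that the only delay is the
      2 second sleep after each spawn. The function names are hypothetical.

```python
SPAWN_SLEEP_S = 2                   # sleep after each child spawn (from the commit message)
OLD_TIMEOUT_S = 5                   # client timeout before this commit
NEW_TIMEOUT_S = 2 * OLD_TIMEOUT_S   # "double the timeout period"


def wait_before_service(spawns_ahead, spawn_sleep_s=SPAWN_SLEEP_S):
    """Seconds a client waits while `spawns_ahead` spawns are handled first.

    Assumption: the server is single-threaded and nothing else adds delay.
    """
    return spawns_ahead * spawn_sleep_s


def times_out(spawns_ahead, timeout_s, spawn_sleep_s=SPAWN_SLEEP_S):
    """True if the client's timeout expires before the server reaches it."""
    return wait_before_service(spawns_ahead, spawn_sleep_s) > timeout_s
```

      Under this model a client queued behind three spawns waits 6 seconds,
      which exceeds the old timeout but not the doubled one. Doubling only
      moves the threshold; enough simultaneous image requests would still
      exceed it.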
    • Add a "failstop" command line argument for frisbee MFS. · a271d365
      Mike Hibler authored
      If specified on the kernel command line in the pxelinux.cfg config,
      the init script will drop to a shell prompt when frisbee fails.
      With failstop off, it will instead report via "tmcc bootlog", wait
      a couple of seconds, and then reboot in order to try again.
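
      A minimal sketch of the decision described above. The real logic lives
      in the MFS init script, so this Python version is purely illustrative.
      It assumes "failstop" appears as a bare word on the kernel command line;
      the function names and the action labels are hypothetical.

```python
def has_kernel_flag(cmdline, flag="failstop"):
    """True if `flag` appears as a whitespace-separated word in `cmdline`.

    Assumption: the flag is a bare token, not a key=value pair.
    """
    return flag in cmdline.split()


def on_frisbee_failure(cmdline):
    """Return the (hypothetical) ordered actions to take when frisbee fails."""
    if has_kernel_flag(cmdline):
        return ["shell"]                     # drop to a shell prompt
    return ["report", "wait", "reboot"]      # tmcc bootlog, pause, try again
```

      On a Linux MFS the command line could be read from /proc/cmdline,
      for example `on_frisbee_failure(open("/proc/cmdline").read())`.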
  3. 11 Dec, 2014 1 commit
  4. 10 Dec, 2014 4 commits
  5. 09 Dec, 2014 3 commits
  6. 08 Dec, 2014 2 commits
  7. 04 Dec, 2014 3 commits
  8. 03 Dec, 2014 3 commits
  9. 02 Dec, 2014 6 commits
  10. 01 Dec, 2014 2 commits
    • Rudimentary TRIM support. · bdde40fe
      Mike Hibler authored
      We pass through a flag in the tmcd loadinfo call to tell whether to attempt
      to do a TRIM when loading the disk (or after loading the disk). If TRIM=1
      then we do so.
      
      Since it is not clear from what I have read whether repeated TRIMming is
      a detriment to SSD life, we throttle it as follows:
      
      1. We don't TRIM at all unless the sitevariable general/bootdisk_trim_interval
         is non-zero. If it is set, we will wait at least that many seconds after
         the previous TRIM before we do it again.
      
      2. We keep track of the last trim via the node_attribute "bootdisk_lasttrim"
         which is a unix timestamp of the last time that tmcd responded to a
         loadinfo request in which it returned TRIM=1.
      
      3. We track, on a per-node basis, whether the boot disk should be TRIMmed
         or not. If the node or node-type attribute "bootdisk_trim" is non-zero,
         we will attempt a trim if the interval has passed since the last trim.
      
      So, we never trim if the sitevariable is 0 (the default value). If it is
      non-zero, we only trim the boot disk of those nodes that have the node or
      node_type attribute set and only after a sufficient interval has passed.
      
      This does not address non-boot disks, but currently frisbee won't mess
      with any other disk anyway. Eventually, we will have to have per-disk or
      per-disktype attributes if we want to do this better.
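
      The throttling rules above can be summarized in a short sketch. This is
      not the tmcd implementation (tmcd is not written in Python); the
      function and parameter names are hypothetical, and a node that has
      never been trimmed is assumed to have a last-trim timestamp of 0.

```python
def should_trim(site_interval_s, bootdisk_trim, last_trim_ts, now_ts):
    """Decide whether a loadinfo reply would carry TRIM=1.

    site_interval_s: value of general/bootdisk_trim_interval (0 disables TRIM)
    bootdisk_trim:   node or node-type "bootdisk_trim" attribute
    last_trim_ts:    "bootdisk_lasttrim" unix timestamp (assumed 0 if never set)
    now_ts:          current unix timestamp
    """
    if site_interval_s == 0:
        return False            # rule 1: site-wide switch is off
    if not bootdisk_trim:
        return False            # rule 3: node is not marked for trimming
    # rules 1 and 2: at least the interval must have passed since the last TRIM
    return now_ts - last_trim_ts >= site_interval_s
```

      When this returns true, the caller would also need to record the
      current time as the new "bootdisk_lasttrim" value, as rule 2 describes.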
    • Setup ipod before going into pxewait. · 0f22833d
      Mike Hibler authored
  11. 25 Nov, 2014 2 commits
  12. 23 Nov, 2014 1 commit
  13. 19 Nov, 2014 1 commit
    • Sprinkle taint checks throughout tmcd to avert privilege escalation. · d9c27fac
      Kirk Webb authored
      Also add utility function to allow the node to get the exact details of
      the image it is running ('imageinfo').
      
      Some of the taint checks are rather heavy-handed presently.  Pretty much
      any vector that could be used by the user to do something as root has
      been severed right at the top of the relevant tmcd calls.
      
      Calls affected:
      
      manifest ('blackbox' and 'useronly' taintstates)
      rpms ('blackbox' and 'useronly' taintstates)
      tarballs ('blackbox' and 'useronly' taintstates)
      blobs ('blackbox' and 'useronly' taintstates)
      startupcmd ('blackbox' taintstate)
      mounts ('blackbox' taintstate)
      programs ('blackbox' taintstate)
      
      Taint handling for the 'accounts' call was dealt with in a prior commit.
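
      The list of affected calls can be read as a lookup table. The sketch
      below is illustrative only: the table contents come from the list
      above, but the names and the check itself are hypothetical, and it
      assumes a call is refused outright when the node carries any of the
      listed taint states. Calls not in the table are treated as unaffected
      by this commit.

```python
# Taint states that block each tmcd call, taken from the list above.
BLOCKED_BY = {
    "manifest":   {"blackbox", "useronly"},
    "rpms":       {"blackbox", "useronly"},
    "tarballs":   {"blackbox", "useronly"},
    "blobs":      {"blackbox", "useronly"},
    "startupcmd": {"blackbox"},
    "mounts":     {"blackbox"},
    "programs":   {"blackbox"},
}


def call_allowed(call, node_taints):
    """True unless the node has a taint state that blocks `call`."""
    blocking = BLOCKED_BY.get(call, set())
    return not (blocking & set(node_taints))
```

      Checking at the top of each call, as the commit does, corresponds to
      evaluating `call_allowed` before any other work is done for the request.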
  14. 14 Nov, 2014 1 commit
  15. 11 Nov, 2014 1 commit
    • Attempt to prevent progmode capture from hanging on program death. · 07a25b09
      Mike Hibler authored
      I was attempting to read back any last words the program might have
      uttered, but if it said nothing, we would hang. I would not have
      expected this behavior from a pipe (actually, socketpair) when the
      other end has gone away! Regardless, make it non-blocking before we
      read, to be safe.
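
      A minimal sketch of the technique, assuming a Unix socketpair. It is
      not the capture code itself, and `read_last_words` is a hypothetical
      name. The read is attempted once in non-blocking mode; if nothing is
      pending, it returns an empty result instead of waiting.

```python
import socket


def read_last_words(sock, max_bytes=4096):
    """Read whatever is already pending on `sock` without ever blocking.

    Returns b"" both when the peer has closed (EOF) and when no data is
    available yet, so the caller can never hang here.
    """
    sock.setblocking(False)
    try:
        return sock.recv(max_bytes)
    except BlockingIOError:
        # Nothing to read right now; a blocking recv() would have hung.
        return b""
```

      One situation in which a blocking read can hang after the program dies
      is when some other process still holds the far end open, so no EOF is
      ever delivered. Whether that was the cause here is not stated in the
      commit.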
  16. 10 Nov, 2014 1 commit
    • Fix Linux MFS issue. · 254d0d6d
      Mike Hibler authored
      When locating the root device, if a BSD disk partition fills the entire
      DOS partition, then Linux will not create a separate /dev entry for it.
      In that case, we use the DOS partition device.
      
      Also, a couple of changes to resync with BSD slicefix.
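
      The root device fallback described above might look like this in
      outline. The real logic is in the MFS scripts, so this is only a
      sketch; the function name and the device paths used to exercise it are
      hypothetical.

```python
import os


def pick_root_device(bsd_part_dev, dos_part_dev, exists=os.path.exists):
    """Prefer the BSD partition device, falling back to the DOS partition.

    Assumption: when the BSD partition fills the whole DOS partition, the
    kernel creates no device node for it, so `exists` returns False.
    """
    if exists(bsd_part_dev):
        return bsd_part_dev
    return dos_part_dev
```

      The `exists` parameter is there only so the choice can be exercised
      without real device nodes.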
  17. 09 Nov, 2014 1 commit
  18. 08 Nov, 2014 1 commit
  19. 07 Nov, 2014 2 commits
    • The latest in logic to have findSpareDisks not use the system disk. · 2eab9b24
      Mike Hibler authored
      If an available partition device (aka, the 4th partition on the system disk)
      represents less than 5% of the spare space we have found, ignore it.
      
      This will allow us to continue to use the 4th partition on the system
      disk of the d710s (450GB or so) and the second disk (250GB), but not use
      the 2nd partition (3GB), which would make us thrash about on the system
      disk even more than usual.
      
      Mostly this is for the new HP server boxes, so it doesn't pick up the 10GB
      left over on the (virtual) system disk when we have 21TB available on the
      second (virtual) disk.
      
      Another hack til blockstores rule the world...
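
      The 5% rule can be sketched as a filter. This is not findSpareDisks
      itself; the function name, the tuple layout, and the device labels are
      hypothetical. It assumes "the spare space we have found" is the sum of
      all candidate devices, including the partition being tested, and it
      uses the approximate sizes quoted above.

```python
def usable_spares(devices, min_fraction=0.05):
    """Drop partition devices that are a tiny share of the total spare space.

    devices: list of (name, size_gb, is_partition) tuples.
    Whole disks are always kept; only partitions are subject to the rule.
    """
    total = sum(size for _, size, _ in devices)
    return [
        name
        for name, size, is_partition in devices
        if not (is_partition and size < min_fraction * total)
    ]


# Approximate figures from the commit message (sizes in GB).
d710 = [("sysdisk-p4", 450, True), ("disk2", 250, False), ("sysdisk-p2", 3, True)]
hp = [("sysdisk-leftover", 10, True), ("disk2", 21000, False)]
```

      With these inputs the 3GB and 10GB partitions fall below 5% of the
      total and are ignored, while the 450GB partition is kept.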
    • Fix for CentOS. Liberalize an RE. · 2472e72a
      Mike Hibler authored
  20. 05 Nov, 2014 1 commit
  21. 23 Oct, 2014 1 commit