    Bunch of pretty good-sized changes to stated: · b438d5f5
    Mac Newbold authored
    1. Change from the inefficient timeout search algorithm, which ran once per
    second, to a priority queue for managing timeouts. Now instead of checking
    every node's timestamps, we just look at the head of the queue, and we
    often do so much less frequently than once a second, since we know how
    long we have until the next timeout.
    2. Start using a blocking poll for events, so I can sleep for long periods
    of time instead of having to wake up at least once a second to check for
    timeouts and events. The block timeout is set to the shortest of: the
    time until the next batch of queued emails goes out, the time until the
    next timeout may occur, or, when there are no mails waiting and no
    timeouts possible, 10 minutes. The poll returns as soon as an event
    comes in.
    3. Given the above two items, we no longer need a sleep(1) in our main
    loop.
    One small glitch is in the process of being fixed. When using blocking
    polls, things hang when trying to unregister from the event system. Not a
    big deal, just ^C twice to kill it. (May cause it to need two SIGUSR1's
    to get it to restart, too.)
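As a rough illustration of items 1 and 2, here is a minimal Python sketch of a timeout priority queue and the block-timeout calculation. It is not the stated code itself: the names TimeoutQueue, block_timeout and IDLE_BLOCK are hypothetical, and dropping stale heap entries lazily is an assumption about how rescheduled timeouts might be handled.

```python
import heapq
import itertools

IDLE_BLOCK = 600.0  # 10 minutes, used when nothing is pending (hypothetical name)


class TimeoutQueue:
    """Min-heap of (deadline, seq, node); stale entries are dropped lazily."""

    def __init__(self):
        self._heap = []
        self._live = {}  # node -> seq of its current (valid) entry
        self._seq = itertools.count()

    def set(self, node, deadline):
        """Schedule or reschedule the timeout for a node."""
        seq = next(self._seq)
        self._live[node] = seq
        heapq.heappush(self._heap, (deadline, seq, node))

    def cancel(self, node):
        """Forget a node's timeout; its heap entry becomes stale."""
        self._live.pop(node, None)

    def _drop_stale(self):
        # Discard head entries that were cancelled or superseded.
        while self._heap and self._live.get(self._heap[0][2]) != self._heap[0][1]:
            heapq.heappop(self._heap)

    def next_deadline(self):
        """Deadline at the head of the queue, or None if the queue is empty."""
        self._drop_stale()
        return self._heap[0][0] if self._heap else None

    def pop_expired(self, now):
        """Remove and return all nodes whose deadline is <= now."""
        expired = []
        while True:
            self._drop_stale()
            if not self._heap or self._heap[0][0] > now:
                return expired
            _, _, node = heapq.heappop(self._heap)
            del self._live[node]
            expired.append(node)


def block_timeout(now, next_timeout, next_mail, idle_block=IDLE_BLOCK):
    """Seconds to block in the poll: the sooner of the next timeout and the
    next mail batch, or idle_block when neither is pending."""
    pending = [t - now for t in (next_timeout, next_mail) if t is not None]
    if not pending:
        return idle_block
    return max(0.0, min(pending))
```

With this shape, the main loop only needs to look at next_deadline() to decide how long it can block, rather than scanning every node once a second.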
    In the next update, look for:
     - Really take action on timeouts.
       - keep track of how many times we've retried, and notify if something
         may be wrong with the node.
       - Find out policy on taking action with timeouts.
         - Do it if the expt is in transition or the node is free
         - Probably don't touch if the expt is established.
         - Maybe? in active expt, send (good) email to expt owner on timeouts
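The tentative policy above could be written down roughly as follows. This is only a sketch of one reading of it: the policy is explicitly not settled yet, and the names Action, timeout_action and max_retries (with its default of 3) are hypothetical. Whether to email the expt owner in an active expt is left out, since that is still an open question.

```python
from enum import Enum


class Action(Enum):
    RETRY = "retry"              # take action on the node again
    NOTIFY = "notify"            # something may be wrong with the node
    LEAVE_ALONE = "leave_alone"  # established expt: probably don't touch


def timeout_action(node_free, expt_in_transition, retries, max_retries=3):
    """Decide what to do when a node times out.

    Act only if the node is free or its expt is in transition; after
    max_retries attempts (an assumed threshold), notify instead of retrying.
    """
    if node_free or expt_in_transition:
        if retries >= max_retries:
            return Action.NOTIFY
        return Action.RETRY
    return Action.LEAVE_ALONE
```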
    Related "coming soon" items:
    os_load/os_setup etc.:
     - Add the waitforstate stuff we've talked about
     - make os_load/os_setup use it