    Largish rework of nfree. · 895a44f6
    Leigh B. Stoller authored
    Started out that I just wanted to map the
    default OSID from the node_types table, to a specific OSID from the
    partition table on the actual node. This is to avoid setting the boot
    OSID to RHL_STD when the node is released, which causes a boot
    failure. Okay, so I added a library routine to do this (yanked out of
    os_setup where I did the code originally). This would solve most of
    the problems, except where there was no OS loaded that would satisfy
    the mapping, in which case the user must have done an os_load, and now
    that auto schedules a reload. Anyway, seemed like this should work.
    Ha! MySQL locking is downright dumb; all tables used within a lock
    region must be locked. nfree was already locking 9 tables, and in
    order to call out to library routines (which might use anything) I
    would have to lock the world, which is not actually possible anyway.
    Why all this locking in nfree in the first place? The idea is that
    there is a race between releasing the node from reserved, and cleaning
    up all those tables (interfaces, delays, nodes, etc). We don't want to
    free a node, and have it get allocated to another experiment before
    the cleanup is done, since that would mess up the state of the node.
    The solution (albeit a crufty one) was to lock just the reserved table
    (which guards against multiple people trying to nfree the same node at
    the same time) and switch the reservation out of the pid,eid and into
    a holding reservation. This effectively removes the node from the
    user's control, but keeps it reserved. Then I unlock the reserved
    table. With that done, I can clean up all those tables without any
    locking, since the node is still reserved. After cleanup, I can either
    delete the reservation, or move it to the next reserve or reload
    reservation if those were pending. No locking is needed at this point
    since single table changes are atomic (and nalloc locks reserved
    anyway). Okay, so now we sit back and see if this was a good idea.
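
The holding-reservation scheme described above can be sketched roughly as follows. This is a minimal in-memory model in Python, not the actual nfree code: the tables are plain dicts, a `threading.Lock` stands in for the lock on the reserved table, and every name here (`HOLDING`, `per_node_state`, `pending`, and the function signatures) is a hypothetical chosen for illustration.

```python
import threading

# Hypothetical holding reservation; the real pid/eid are not given in the message.
HOLDING = ("holding-pid", "holding-eid")

reserved_lock = threading.Lock()  # stands in for locking only the reserved table
reserved = {}        # node_id -> (pid, eid)
per_node_state = {}  # node_id -> state; stands in for interfaces, delays, nodes, etc.
pending = {}         # node_id -> (pid, eid) of a pending next reserve or reload


def nfree(node_id, pid, eid):
    """Release node_id from experiment (pid, eid); return True on success."""
    # Step 1: under the lock, move the node into the holding reservation.
    # A concurrent nfree of the same node sees HOLDING and backs off.
    with reserved_lock:
        if reserved.get(node_id) != (pid, eid):
            return False
        reserved[node_id] = HOLDING

    # Step 2: no lock held. The node is still reserved, so the allocator
    # cannot hand it to another experiment while cleanup runs.
    per_node_state.pop(node_id, None)

    # Step 3: a single change to the reserved table, either moving the node
    # to its pending reservation or deleting the reservation outright.
    next_reservation = pending.pop(node_id, None)
    if next_reservation is not None:
        reserved[node_id] = next_reservation
    else:
        del reserved[node_id]
    return True


def nalloc(node_id, pid, eid):
    """Allocate node_id to (pid, eid) only if it has no reservation at all."""
    with reserved_lock:
        if node_id in reserved:
            return False
        reserved[node_id] = (pid, eid)
        return True
```

The point of the design is that the lock is held only for the one-row swap in step 1. The cleanup in step 2 can then call arbitrary library routines without having to lock every table they might touch.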