  • Fix a bug when expanding the list of images implied by delta images. · a62de08a
    Mike Hibler authored
    I was expanding a global list in a loop for every node. So for each
    node, I was finding all the delta images in the ever-growing list and
    adding their dependencies (again!) making the list even larger. In an
    experiment loading a two-level delta image on 8 nodes, the list included
    40+ copies of the same three images to load by the time we got to the last
    node. However, no node attempted to load all those images because tmcd
    exceeded its reply buffer size on the "loadinfo" call and would not
    return anything. Of course, by then we had computed a max wait time
    based on image.max_wait * 45 so the experiment suffered a slow, lingering
    death even though the nodes were not doing anything.
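
The failure mode described above can be illustrated with a minimal sketch. This is not the libosload code (which is not shown here); the image names, the `BASE_OF` table, and both functions are hypothetical and only model a two-level delta image whose dependency list is expanded either once per node against a shared list, or once overall with duplicates skipped.

```python
# Hypothetical model: each image names the image it is a delta of (None = full image).
BASE_OF = {"delta2": "delta1", "delta1": "base", "base": None}


def chain(image):
    """Return the image followed by every image it depends on, in load order."""
    out = []
    while image is not None:
        out.append(image)
        image = BASE_OF[image]
    return out


def expand_buggy(nodes, requested):
    """Expand a shared list once per node, so dependencies are re-added each time."""
    images = list(requested)
    for _ in nodes:
        # Iterate over a snapshot; every delta already in the list
        # appends its dependencies again.
        for img in list(images):
            images.extend(chain(img)[1:])
    return images


def expand_fixed(requested):
    """Expand the requested images once, skipping anything already listed."""
    images = []
    for img in requested:
        for dep in chain(img):
            if dep not in images:
                images.append(dep)
    return images


if __name__ == "__main__":
    nodes = range(8)
    print(len(expand_buggy(nodes, ["delta2"])))  # grows with the node count
    print(expand_fixed(["delta2"]))              # three images regardless of node count
```

In this toy model the shared list grows with every node even though only three distinct images exist, while the one-time expansion stays at three entries. The real code may differ in how it tracks per-node image lists.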
    
    Beware: I do not know if I got the "access key" code right for remote
    nodes. I am not even sure we use that path anymore. I attempted to fix it
    in libosload; I did not even try in libosload_new.