David Johnson authored
A long time ago, I punted on how to handle the PARTIALSYM optimization for two cases: 1) inlined symbols that reference their origin symbols; and 2) symbols that reference their datatypes. Basically, if we are loading just a single symbol, but it references something else, we can't just jump to the referenced DIE's offset in the file, load it, and link it into the AST immediately, because when we jump to the offset, we don't know where in the nesting hierarchy we are.

In practice this isn't a problem for C datatypes (and probably not usually for C++ datatypes either), because types tend to be global and at level == 1 (level == 0 is the "root" CU symbol for a single source file). That meant our hack of jumping to the referenced DIE and assuming we're at level 1 was "correct".

However, it was definitely *not* correct for inlined origin symbols. Think function params, which would be at level == 2 at least, if not deeper. To make sure they are correctly inserted into the hierarchy, we must first have loaded the prior level == 1 parent symbol; at that point, we are sure the hierarchy is correct. This is harder than it sounds; hence the punt.

Anyway, I got really tired of waiting 20+ seconds for a 165MB debuginfo file to fully load, so I started using PARTIALSYM all the time. But then I tried to backtrace through inlined functions or something, and watched the world crumble to pieces. So I fixed this up.

Here are the key comments; commit messages are fun but documentation awaits:

     * Make a range-searchable list of the top-level die offsets in
     * this CU.  We have to do this because if we load DIEs out of
     * order, one DIE may reference another DIE that is in a
     * different parent hierarchy -- and we might not have loaded
     * that parent hierarchy!  We could throw in all kinds of
     * optimizations to try to figure out exactly how many DIEs in
     * that hierarchy we have to load, but since dwarf_load_cu isn't
     * well-ordered towards loading individual DIEs and reconciling
     * their parent hierarchy *after* loading, what we will do is,
     * when loading a particular referenced DIE, we will fully load
     * the top-level die containing it.  Again, this is a balance
     * between simplicity of implementation and runtime speed for
     * the PARTIALSYM case.
     *
     * We create this list like this.  If PARTIALSYM, we build it up
     * as we come across new top-level DIEs during our in-order
     * traversal.  If not PARTIALSYM, and (CUHEADERS || PUBNAMES),
     * we pre-scan the top-level DIEs IFF they have sibling
     * attributes (but if there are no sibling attributes, we will
     * just have to expand the whole CU to the one DIE we need!).
     *
     * Then, when we want to load a partial symbol at DIE offset X,
     * we just find the previous top-level DIE to load!

and

     * If we're doing a partial CU load (i.e., loading specific DIE
     * offset within this CU), any other symbols referenced by this
     * symbol need to get appended to our DIE load list if we haven't
     * already loaded them!
     * We can't do this for inlined formal params, vars, or labels,
     * because we will jump into the middle of a subprogram, without
     * knowing we are in that subprogram DIE -- so our symtab
     * hierarchy will be screwed up!
     *
     * Ok, now we *can* do this, because our symbol expander will
     * load the containing top-level symbol even if the symbol is a
     * param, var, or label that is not at level 1!

(I had written some snarky text about how hard this is, but I'll just say I wish somebody would pay me so I could refactor it into chunks that actually made sense. But I'm not sure if that's possible, and it might be slower anyway.)
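
For anyone reading along, the "find the previous top-level DIE" lookup from the first comment can be sketched roughly like this. It is a minimal illustration, not the code in this commit: `struct toplevel_index`, `toplevel_index_add`, and `toplevel_index_find` are hypothetical names, and the sketch assumes top-level DIE offsets are recorded in ascending order (as an in-order traversal would produce them), so a binary search can find the greatest recorded offset that is <= X.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container; not the loader's actual data structure. */
struct toplevel_index {
    uint64_t *offsets;  /* top-level DIE offsets, strictly ascending */
    size_t len;         /* number of offsets recorded */
    size_t cap;         /* allocated capacity, in elements */
};

/*
 * Record a top-level DIE offset as the in-order traversal reaches it.
 * Returns 0 on success, -1 on allocation failure or out-of-order input.
 */
int toplevel_index_add(struct toplevel_index *idx, uint64_t offset)
{
    assert(idx != NULL);
    if (idx->len > 0 && offset <= idx->offsets[idx->len - 1])
        return -1;
    if (idx->len == idx->cap) {
        size_t ncap = idx->cap ? idx->cap * 2 : 16;
        uint64_t *n = realloc(idx->offsets, ncap * sizeof(*n));
        if (n == NULL)
            return -1;
        idx->offsets = n;
        idx->cap = ncap;
    }
    idx->offsets[idx->len++] = offset;
    return 0;
}

/*
 * Find the recorded top-level DIE at or before die_offset, i.e. the
 * greatest recorded offset <= die_offset.  Returns 0 and stores the
 * result in *out, or -1 if die_offset precedes every recorded entry.
 */
int toplevel_index_find(const struct toplevel_index *idx,
                        uint64_t die_offset, uint64_t *out)
{
    size_t lo = 0, hi;

    assert(idx != NULL && out != NULL);
    hi = idx->len;                      /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (idx->offsets[mid] <= die_offset)
            lo = mid + 1;
        else
            hi = mid;
    }
    /* lo is now the number of recorded offsets <= die_offset. */
    if (lo == 0)
        return -1;
    *out = idx->offsets[lo - 1];
    return 0;
}
```

With offsets 11, 50, and 200 recorded, a lookup for a DIE at offset 75 yields 50, which is the top-level DIE to fully load. Note that the answer is only as good as the index: if the list is still being built during a partial traversal, the last recorded entry may not actually contain an offset far beyond it, so a real loader would presumably have to keep scanning forward in that case.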
7d27a009