1. 24 Sep, 2012 1 commit
    • Eric Eide's avatar
      Replace license symbols with {{{ }}}-enclosed license blocks. · 6df609a9
      Eric Eide authored
      This commit is intended to make the license status of Emulab and
      ProtoGENI source files more clear.  It replaces license symbols like
      "EMULAB-COPYRIGHT" and "GENIPUBLIC-COPYRIGHT" with {{{ }}}-delimited
      blocks that contain actual license statements.
      
      This change was driven by the fact that today, most people acquire and
      track Emulab and ProtoGENI sources via git.
      
      Before the Emulab source code was kept in git, the Flux Research Group
      at the University of Utah would roll distributions by making tar
      files.  As part of that process, the Flux Group would replace the
      license symbols in the source files with actual license statements.
      
      When the Flux Group moved to git, people outside of the group started
      to see the source files with the "unexpanded" symbols.  This meant
      that people acquired source files without actual license statements in
      them.  All the relevant files had Utah *copyright* statements in them,
      but without the expanded *license* statements, the licensing status of
      the source files was unclear.
      
      This commit is intended to clear up that confusion.
      
      Most Utah-copyrighted files in the Emulab source tree are distributed
      under the terms of the GNU Affero General Public License, version 3
      (AGPLv3).
      
      Most Utah-copyrighted files related to ProtoGENI are distributed under
      the terms of the GENI Public License, which is a BSD-like open-source
      license.
      
      Some Utah-copyrighted files in the Emulab source tree are distributed
      under the terms of the GNU Lesser General Public License, version 2.1
      (LGPL).
      6df609a9
  2. 13 Jan, 2011 1 commit
  3. 19 Oct, 2010 1 commit
  4. 11 Oct, 2010 1 commit
    • Leigh Stoller's avatar
      Work on an optimization to the perl code. · 92f83e48
      Leigh Stoller authored
      Maybe you have noticed, but starting any one of our scripts can take
      a second or two. That time is spent including and compiling tens of
      thousands of lines of perl code, both from our libraries and from the
      perl libraries.
      
      Mostly this is just a maintenance thing; we just never thought about
      it much and we have a lot more code these days.
      
      So I have done two things.
      
      1) I have used SelfLoader() on some of our biggest perl modules.
         SelfLoader delays compilation until code is used. This is not as
         good as AutoLoader() though, and so I did it with just a few 
         modules (the biggest ones).
      
      2) Mostly I reorganized things:
      
        a) Split libdb into an EmulabConstants module and all the rest of
           the code, which is slowly getting phased out.
      
        b) Move little things around to avoid including libdb or Experiment
           (the biggest files).
      
        c) Change "use foo" in many places to a "require foo" in the
           function that actually uses that module. This was really a big
           win because we have dozens of cases where we would include a
           module, but use it in only one place and typically not at all.
      
      Most things are now starting up in 1/3 the time. I am hoping this will
      help to reduce the load spiking we see on boss, and also help with the
      upcoming Geni tutorial (which killed boss last time).
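
The "use foo" to "require foo" change in 2c can be illustrated with a minimal sketch. It is not taken from the Emulab tree: the helper name checksum_of is hypothetical, and Digest::MD5 merely stands in for a large module that only one function needs.

```perl
use strict;
use warnings;

# Hypothetical helper: only this function needs Digest::MD5, so the module
# is loaded on the first call instead of at script start-up.
sub checksum_of {
    my ($text) = @_;
    require Digest::MD5;                  # run-time load, recorded in %INC
    return Digest::MD5::md5_hex($text);
}

my $loaded_before = exists $INC{'Digest/MD5.pm'} ? 1 : 0;
my $sum           = checksum_of("emulab");
my $loaded_after  = exists $INC{'Digest/MD5.pm'} ? 1 : 0;
print "loaded before call: $loaded_before, after call: $loaded_after\n";
```

A script that never calls checksum_of never pays the compile cost for the module, which is the saving the commit describes.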
      92f83e48
  5. 20 Aug, 2009 2 commits
  6. 02 Mar, 2009 1 commit
  7. 30 Nov, 2006 1 commit
    • Kevin Atkinson's avatar
      · 1253f479
      Kevin Atkinson authored
      The IO::Handle::opened method doesn't work when a ref to STDOUT/ERR is stored in
      a variable, since it is really a glob in perl 5.005.  In 5.8 it works for some
      reason.  To fix, use IO::Handle::opened($$this), not $$this->opened().
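
A minimal sketch of the call style this fix settles on. The variable names are illustrative, and the nesting (a reference to a glob reference) is an assumption about how a tied-handle object might hold STDOUT.

```perl
use strict;
use warnings;
use IO::Handle;

my $saved = \*STDOUT;   # glob reference to STDOUT
my $this  = \$saved;    # reference to it, as a tied-handle object might hold

# Call opened() as a plain function with the handle as its argument,
# rather than as a method on the dereferenced value.
my $is_open = IO::Handle::opened($$this) ? 1 : 0;
print "STDOUT opened: $is_open\n";
```

Calling the function by its full name avoids method resolution on the glob, which is the part that misbehaved on the older perl.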
      1253f479
  8. 06 Nov, 2006 1 commit
    • Kevin Atkinson's avatar
      libaudit related changes: · e89ee617
      Kevin Atkinson authored
        - Added "LIBAUDIT_FANCY" option to AuditStart.  When this option is
          used libaudit will send a different email than it normally sends,
          and on error call tblog_find_error() to determine the error.
      
        - Also add audit function AddAuditInfo which adds additional
          information for libaudit to use in SendAuditMail when AUDIT_FANCY
          is set.
      
        - Modify template_swapin, template_instantiate, and template_create
          to use the new audit functionality.
      
        - Suppress calling tblog_find_error and sending the error email
          when auditing in swapexp and batchexp
      
      tblog changes:
      
        - Shorten the message sent to the user when the error is unknown.
          Remove all parts about lack of free nodes as it no longer really
          applies as tblog now correctly identifies those errors and handles
          them separately.  The message is now just "Please look at the log
          below to see what happened."
      
        - Improve algo. used to determine the other error when canceled.
          Will now work by removing all errors related to the cancel request
          and then essentially rerunning tblog_find_error.  If the cause of
          the error is still canceled, repeat and try again until the cause
          is something other than canceled or no errors are left.
      
        - Refactor tblog_find_error, which involves creating new internal
          functions: tblog_determine_single_error, tblog_store_error,
          tblog_dump_error
      
        - Add section on Primary vs Secondary Errors to the inline POD
          documentation.
      
        - Other minor enhancements and bug fixes.
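
The cancel handling described in the tblog changes above can be sketched as a loop. This is only an illustration under stated assumptions: find_error is a stand-in that picks the last logged error (the real tblog_find_error does more), and the cancel_id field and the 'hardware' cause are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for tblog_find_error: treat the last logged error
# as the cause.
sub find_error {
    my @errors = @_;
    return @errors ? $errors[-1] : undef;
}

# Returns (primary, other). "other" is undef when only cancel-related
# errors were logged. The cancel_id field is an assumption of this sketch.
sub find_primary_and_other {
    my @errors  = @_;
    my $primary = find_error(@errors);
    return ($primary, undef)
        unless $primary && $primary->{cause} eq 'canceled';

    my $other = $primary;
    while ($other && $other->{cause} eq 'canceled') {
        my $cancel_id = $other->{cancel_id} // '';
        # Drop every error tied to this cancel request, then look again.
        @errors = grep { ($_->{cancel_id} // '') ne $cancel_id } @errors;
        $other  = find_error(@errors);
    }
    return ($primary, $other);
}

my @log = (
    { cause => 'hardware', msg => 'node failed to boot' },
    { cause => 'canceled', msg => 'swapin canceled',      cancel_id => 'c1' },
    { cause => 'canceled', msg => 'cleanup after cancel', cancel_id => 'c2' },
);
my ($primary, $other) = find_primary_and_other(@log);
printf "primary: %s, other: %s\n",
    $primary->{cause}, $other ? $other->{cause} : 'none';
```

The loop stops either when a non-canceled cause surfaces or when the error list is empty, matching the two exit conditions in the message.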
      e89ee617
  9. 26 Oct, 2006 1 commit
    • Kevin Atkinson's avatar
      Various tblog changes: · 95a3a6a7
      Kevin Atkinson authored
        Make an attempt to discover what the error was before a swap-* was
        canceled, if any.  Both the main error (canceled) and the other error
        are stored in the error table.  To support this a new column in the
        error table is added "rank".  The primary error has a rank 0 while the
        other error has a rank 1.
      
        Make an attempt to determine when an error is a "me too" error or the
        real cause of the problem.  "Me too" errors are errors which are
        generally reported when the callee script determined that the caller
        script fails.  The caller script should have reported the error, but
        in some cases the error didn't make it into the database.  Thus if a
        "me too" is reported as the cause of a "swap-*" more info is needed to
        determine the true cause.  When a "me too" error is reported it is
        followed by a "..." on its own line.  It is also recorded in the
        errors table under the new column "need_more_info".
      
        Add inferred column to the errors table.  This is the same value as
        the inferred variable in tblog_find_error.
      
        Add revision column to errors table to make it easy to tell which
        algorithm was used to determine the error.
      95a3a6a7
  10. 27 Sep, 2006 1 commit
    • Kevin Atkinson's avatar
      · 7293bbc0
      Kevin Atkinson authored
      Second attempt to fix the problem of duplicate log entries.  I am
      99.99% sure this will get 100% of the cases, and 99.999% sure it won't
      break anything.
      
      It basically detects when the DB handle belongs to a child process and if
      so sets "InactiveDestroy" before the database handle DESTROY method is called.
      Since the DB handle can be closed in several different places I created a
      new class to override the DB handle (the Mysql class) DESTROY method. The
      other alternative is to add special code anywhere the database handle
      could be destroyed, which is whenever a reconnect is done and when the
      module exits.  The latter would have involved putting code in the END block.
      I think the new class method is simpler for that reason.
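
A rough sketch of the idea, not the class from the commit: a wrapper remembers which process created it and, when destroyed in any other process, flags the handle so no disconnect is sent. The class name ForkAwareHandle is hypothetical, and a plain hash stands in for the real database handle so the example runs without a database.

```perl
use strict;
use warnings;

package ForkAwareHandle;    # hypothetical name

sub new {
    my ($class, $dbh) = @_;
    # Remember which process created the wrapper.
    return bless { dbh => $dbh, owner_pid => $$ }, $class;
}

sub DESTROY {
    my ($self) = @_;
    my $dbh = $self->{dbh} or return;
    # In a child process, mark the handle so destroying it does not
    # disconnect the connection the parent still uses.
    $dbh->{InactiveDestroy} = 1 if $$ != $self->{owner_pid};
}

package main;

my %child_dbh  = (InactiveDestroy => 0);   # stand-in for a DB handle
my %parent_dbh = (InactiveDestroy => 0);
{
    my $in_child = ForkAwareHandle->new(\%child_dbh);
    $in_child->{owner_pid} = -1;           # pretend another process made it
    my $in_parent = ForkAwareHandle->new(\%parent_dbh);
}   # both wrappers are destroyed here
print "child copy: $child_dbh{InactiveDestroy}, ",
      "parent copy: $parent_dbh{InactiveDestroy}\n";
```

Putting the check in DESTROY covers every place the handle can go away, which is the reason the message gives for preferring a class over scattered special cases.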
      
      
      Also, add a note about patching Mysql.pm in doc/UPDATING.
      7293bbc0
  11. 31 Aug, 2006 1 commit
    • Kevin Atkinson's avatar
      · 964b8d11
      Kevin Atkinson authored
      Add patch to modify Mysql.pm to allow setting the "InactiveDestroy" in
      the underlying DB handle.  Also avoid disconnecting the file handle
      explicitly on DESTROY as that will be taken care of in the DESTROY
      method for the DB handle.
      
      Override perl version of fork() to set InactiveDestroy in all open
      database handles in the child so that it won't send a disconnect when
      the handle is destroyed as this will also close the database handle
      for the parent.  It will also call tblog_new_child_process in the
      child process to properly inform tblog of the new process. This will
      be a NoOp if the libtblog module is not loaded.
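
One way to wrap fork() along these lines is sketched below. It is an assumption-laden illustration rather than the code from this commit: the @DB_HANDLES registry is hypothetical, plain hashes stand in for database handles, and the override is installed through CORE::GLOBAL at compile time.

```perl
use strict;
use warnings;

our @DB_HANDLES;    # hypothetical registry of open database handles

BEGIN {
    no warnings 'once';
    # Install the wrapper at compile time so later fork() calls use it.
    *CORE::GLOBAL::fork = sub {
        my $pid = CORE::fork();
        if (defined($pid) && $pid == 0) {
            # Child: never send a disconnect on the parent's connections.
            $_->{InactiveDestroy} = 1 for @DB_HANDLES;
        }
        return $pid;
    };
}

push @DB_HANDLES, { InactiveDestroy => 0 };   # stand-in for a real handle

my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    # Child reports through its exit status whether the flag was set.
    exit($DB_HANDLES[0]{InactiveDestroy} ? 0 : 1);
}
waitpid($pid, 0);
my $child_status = $? >> 8;
print "child exit status: $child_status\n";
```

The parent's copy of the handle is left unchanged, so the parent still disconnects normally when it is done.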
      964b8d11
  12. 25 Aug, 2006 1 commit
    • Kevin Atkinson's avatar
      · 312021d4
      Kevin Atkinson authored
      More tbreport changes from Mike Kasick <mkasick@andrew.cmu.edu>:
      
      - Added tblog support to nscheck.
      
      - Added ns_parse_failed error to nscheck.
      
      - Added invocation column to report_errors to differentiate between assign
        runs in infeasible resource assignments.
      312021d4
  13. 16 Aug, 2006 1 commit
    • Kevin Atkinson's avatar
      - Added tbreport database schema (added three tables), storage for tbreport errors & context. · 9c5d3308
      Kevin Atkinson authored
      
      - Modified fatal() in swapexp, batchexp, and tbprerun, and die_noretry()
        in os_setup to pass hash parameter to tblog functions.
      
      - Added tbreport error & context information for select errors in
        swapexp, tbswap, assign_wrapper2, snmpit_lib, snmpit, batchexp,
        assign_wrapper, os_setup, parse-ns, & tbprerun.
      
      - Added assign error parser in assign_wrapper2.
      
      - Added parse.tcl error parser in parse-ns.
      
      - Added severity constants for tbreport in libtblog_simple.
      
      - Added tbreport() function & context table mapping for reporting
        discrete error types to libtblog.
      9c5d3308
  14. 14 Aug, 2006 1 commit
    • Kevin Atkinson's avatar
      · 07dda0d8
      Kevin Atkinson authored
      Prep for Mike Kasick report code.  Updated database schema and
      installed hooks for his code.
      
      Cleaned up how errors were handled in tblog(...).
      
      Allow SENDMAIL to be called before the path is untainted in '-T' scripts.
      
      Other small changes.
      07dda0d8
  15. 20 Jul, 2006 1 commit
    • Kevin Atkinson's avatar
      · 5710c340
      Kevin Atkinson authored
      Various tblog changes:
      
      Added message about recovery action when a swap-modify failed to the
      top of the email.
      
      Fine tuned os_setup summary error.  Added (possibly partial) list of
      nodes that fail; if a large number fail, only show as many as will
      fit on a single line.  Other tweaks.
      
      Flagged assign_wrapper errors of an Invalid OS as user errors.
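
The "as many as will fit on a single line" rule for the failed-node list could look roughly like this. The function name, the separator, and the "..." marker are assumptions made for illustration; the actual os_setup summary format is not shown in this log.

```perl
use strict;
use warnings;

# Hypothetical helper: join node names with spaces, stopping before the
# line would exceed $width, and mark a partial list with "...".
sub failed_node_line {
    my ($width, @nodes) = @_;
    my $line  = "";
    my $shown = 0;
    for my $node (@nodes) {
        my $next = $shown ? "$line $node" : $node;
        # Leave room for the " ..." marker unless this is the last node.
        my $reserve = ($shown + 1 < @nodes) ? length(" ...") : 0;
        last if length($next) + $reserve > $width;
        $line = $next;
        $shown++;
    }
    $line .= ($shown ? " " : "") . "..." if $shown < @nodes;
    return $line;
}

my $full    = failed_node_line(80, qw(pc1 pc2 pc3));
my $partial = failed_node_line(12, qw(pc1 pc2 pc3 pc4));
print "$full\n$partial\n";
```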
      5710c340
  16. 05 Jul, 2006 1 commit
    • Kevin Atkinson's avatar
      Many changes to tblog code. Database update needed: · 183040de
      Kevin Atkinson authored
      1) Added summary of failed nodes in os_setup.  The cause of the error is now
      classified as "user" if it is only user images that failed and the user
      image failed on every pc of a particular type.  Otherwise I leave the cause
      as "unknown" since it is really hard to tell what the real cause is.
      
      2) Raised the confidence threshold for most errors so that they will appear
      on the top.
      
      3) Added a special error when an experiment is canceled.  The cause is
      "canceled" and testbed-ops won't see these errors.
      
      4) Fixed a bug in assign_wrapper where it will incorrectly report "This
      experiment cannot be instantiated on this testbed..." when really the user
      canceled the swapin.
      
      5) Fixed a bug where os_setup errors were being incorrectly reported as
      assign errors.  This happens when os_setup fails for some reason and
      tbswap tries again, but the second time around there are not enough nodes.
      So the last error is coming from assign even though the true cause of the
      error is due to failed nodes.  The fix for this involved adding a new column
      to the log table, "attempt", which will be 1 for the first attempt and then
      incremented for each new attempt.  tblog_find_error will then simply ignore
      any errors with "attempt > 1".
      
      6) Also fixed a potential problem when there is an error during the cleanup
      phase by adding another column "cleanup".  tblog_find_error will
      also ignore any errors with the cleanup bit set.
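
Items 5 and 6 amount to a filter over the logged errors. A minimal sketch, assuming each row is a hash with the "attempt" and "cleanup" columns named above; the function name and the sample rows are made up for illustration.

```perl
use strict;
use warnings;

# Keep only errors from the first attempt that did not occur during cleanup.
sub candidate_errors {
    my @rows = @_;
    return grep { $_->{attempt} <= 1 && !$_->{cleanup} } @rows;
}

my @rows = (
    { script => 'os_setup', attempt => 1, cleanup => 0 },
    { script => 'assign',   attempt => 2, cleanup => 0 },   # retry: ignored
    { script => 'tbswap',   attempt => 1, cleanup => 1 },   # cleanup: ignored
);
my @kept = candidate_errors(@rows);
print join(", ", map { $_->{script} } @kept), "\n";
```

With the retry and cleanup rows dropped, the os_setup failure remains as the error to report, which is the outcome item 5 is after.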
      183040de
  17. 29 May, 2006 1 commit
  18. 08 May, 2006 1 commit
    • Kevin Atkinson's avatar
      · 95f529d3
      Kevin Atkinson authored
      Refactor "log" table to move some stuff into a new table.
      95f529d3
  19. 27 Mar, 2006 1 commit
    • Kevin Atkinson's avatar
      · d8625ddd
      Kevin Atkinson authored
      Change the email going to testbed-errors from:
      
        From: user
        To: user
        Cc: testbed-errors
      
      to
      
        From: testbed-ops
        To: user
        Bcc: testbed-errors
        X-NetBed-Cc: testbed-errors
      
      This should cause all replies to these messages to go to testbed-ops
      instead of testbed-errors.  The only thing is that you need to be
      careful when replying since if you only reply to the sender then it
      will go to testbed-ops only and not to the user.  I was also thinking
      about changing the other swap* failure messages still going to
      testbed-ops in a similar way but due to the reply issue for those on
      testbed-ops I will hold off on that for now unless someone else thinks
      it's a good idea.
      
      The addition of the X-NetBed-Cc header is to make filtering the email
      easier since the fact that it is going to testbed-errors will no
      longer be in the header.  This header is also present in the swap*
      failure messages still going to testbed-ops.
      d8625ddd
  20. 24 Mar, 2006 1 commit
    • Kevin Atkinson's avatar
      · bcbd18aa
      Kevin Atkinson authored
      Have swapexp/batchexp dump the error when the "-w" option is
      specified.  The error will look something like:
      
        ERROR:: <cause desc>
      
        <text of the error>
      
        Cause: <cause>
        Confidence: <confidence>
      
      This will be the last thing printed.  The "::" is there to make
      recognizing the error easy for scripts since they can just look for the
      "ERROR::".
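
A script consuming this output might pick the block apart as sketched below. The regular expression is a guess at a tolerant parser for the layout shown above, and the sample text, cause value, and confidence value are invented for the example.

```perl
use strict;
use warnings;

# Parse the trailing "ERROR::" block; returns a hash ref, or undef if the
# output does not end with such a block.
sub parse_error_block {
    my ($output) = @_;
    return undef unless $output =~ /
        ^[ \t]*ERROR::[ \t]*(.*?)\n      # cause description
        \s*\n(.*?)\n                     # text of the error
        \s*\n[ \t]*Cause:[ \t]*(.*?)\n
        [ \t]*Confidence:[ \t]*(.*?)\s*\z
    /xms;
    return { desc => $1, text => $2, cause => $3, confidence => $4 };
}

# Invented sample output; real values come from swapexp or batchexp.
my $sample = join "\n",
    "swapin starting",
    "ERROR:: Not enough free nodes",
    "",
    "assign could not map the experiment",
    "",
    "Cause: temp",
    "Confidence: 0.9",
    "";
my $err = parse_error_block($sample);
print "$err->{cause} ($err->{confidence}): $err->{desc}\n" if $err;
```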
      bcbd18aa
  21. 23 Mar, 2006 1 commit
    • Kevin Atkinson's avatar
      · d3ca9c2d
      Kevin Atkinson authored
      Change @TBBASE@ to @TBDOCBASE@ when referring to KB entry.
      d3ca9c2d
  22. 21 Mar, 2006 2 commits
    • Kevin Atkinson's avatar
      · 1fa07472
      Kevin Atkinson authored
      Fix bug causing strange errors from snmpit due to an invalid assumption
      about __DIE__ handler in libtblog.pm.in.
      1fa07472
    • Kevin Atkinson's avatar
      · d258dde6
      Kevin Atkinson authored
      Changed format of email sent to user on errors.  The error will now
      appear instead of the generic message when I am confident it is
      accurate.  The subject line will also change to reflect the cause of
      an error.
      
      Avoid sending mail to testbed-ops during failed swap related events
      in some cases.  It will instead be sent to a new mailing list
      testbed-errors.
      
      Added a new row in the experiment info table "Last Error:" which
      states the cause of the error, and links to a new page displaying the
      error.
      
      Made some assign/assign_wrapper errors more informative.
      
      The error (as determined by tblog) is now stored in the database in a
      more structured fashion.  This includes adding a column for the session
      (in the log table) to testbed_stats to link each swap event with the
      logs and possibly the error.
      
      Other changes to the database, see sql/database-migrate.txt
      d258dde6
  23. 13 Mar, 2006 1 commit
    • Kevin Atkinson's avatar
      · 6e488e77
      Kevin Atkinson authored
      Added inline POD documentation of libtblog.
      
      TODO: Automatically generate HTML page from it and have it installed
      with the other HTML docs.
      6e488e77
  24. 23 Feb, 2006 1 commit
    • Kevin Atkinson's avatar
      · 10fd4a08
      Kevin Atkinson authored
      Fix tied STDIN and STDOUT with perl 5.8 in libtblog.
      
      Fix a prototype warning in os_load.in
      10fd4a08
  25. 21 Feb, 2006 1 commit
    • Leigh Stoller's avatar
      Neuter the perltie stuff under perl 5.8 since it does not work properly. · 85488e5b
      Leigh Stoller authored
      I got close to getting it to work by adding this:
      
      	sub FILENO  { my $this = shift; fileno($$this) }
      	sub CLOSE   { my $this = shift; close($$this) }
      
      	sub OPEN {
      	    my $this = shift;
      
      	    close($$this) if defined(fileno($$this));
      	    @_ == 1 ? open($$this, $_[0]) : open($$this, $_[0], $_[1]);
      	}
      
      But subprocesses were not seeing the right stdout/stderr after doing
      something like:
      
      	open(STDERR, ">> $logname");
      	open(STDOUT, ">> $logname");
      
      So, I will let Kevin work on it; I've spent too much time on it
      already!
      85488e5b
  26. 09 Feb, 2006 1 commit
  27. 26 Jan, 2006 1 commit
    • Kevin Atkinson's avatar
      · 05015359
      Kevin Atkinson authored
      Merged in changes from tblog-2-branch:
      
                Move parts of libtblog into libtblog_simple.  libtblog_simple
                provides the basic logging functions but doesn't touch anything.
                Moreover including libtblog_simple doesn't automatically start
                the logging subsystem.  It also doesn't have testbed dependencies
                which means 1) it can be used in the core testbed libraries (such
                as libdb, libtestbed) without introducing a circular dependency
                and 2) can be used independently.
      
                Reworked DBFatal and DBWarn to use tblog.  It will still email
                testbed-ops, however.
      
                Make use of the "cause" field to determine the cause of the bug.
                In particular tblog_find_error will look at the value of this
                field and report the "cause".  In the future different actions
                can be taken based on the ultimate "cause" of the bug, such as if
                testbed-ops should be notified.
      
                Change format of Error Message reported by libtblog, as per the
                email "Format of Error Messages" to testbed-dev.
      
                Have libtblog use its own Database handle to avoid problems with
                locked tables.
      
                Also set DBCONN_MAXTRIES to 3 for most important queries.  For
                queries that are not important don't send mail on error.
      05015359
  28. 19 Dec, 2005 1 commit
    • Kevin Atkinson's avatar
      · 45f997fd
      Kevin Atkinson authored
      Updates to the Error Logging API Code.
      
      You should start seeing much better error messages coming from my
      system.  Errors coming from parse.proxy and assign (the two most
      frequent sources of errors) should now be concise and to the point.
      Errors coming from libosload/libreboot (the next most frequent source
      of errors) should now also be much better, but not perfect.  Getting
      perfect errors will likely require a rework of how errors are handled in
      libosload/libreboot; just adding tberror/tbwarn/tbnotice calls is not
      enough.  I can do this at a later date if necessary.
      
      A few minor database changes.
      
      Some changes to the API.  A few bug fixes. Lots of tberror/tbwarn/tbnotice
      added to scripts.
      
      Since assign is a C program, and at this time my API is perl only, I wrote a
      second wrapper around assign, assign_wrapper2.  When assign fails errors are
      now parsed in assign_wrapper2, sent to stderr and logged.  This means that
      RunAssign() just returns when assign fails rather than echoing some of
      assign.log output and then quitting.  The output to the activity log remains
      unchanged.
      
      Since "parse.proxy" is run from ops I couldn't use my API in it, even though
      it is a perl program.  Instead I parse the errors coming from it in
      parse-ns.
      45f997fd
  29. 04 Nov, 2005 1 commit