1. 19 Jun, 2012 1 commit
    • Mike Hibler's avatar
      Make frisbee more directly IGMP (v2) aware. · 66e07584
      Mike Hibler authored
      Add "-Q <interval>" option to the master server to allow it to act as an
      IGMP V2 querier in environment where there is otherwise not one. It does
      essentially what the perl-based querier (code.google.com/p/perl-igmp-querier/)
      does, sending out a v2 membership query at the specified interval.
      
      This eliminates the need to run mrouted in some environments (e.g., elabinelab)
      just to issue IGMP queries. As a result, all the boss-install and elabinelab
      setup related to using mrouted to perform this function has been removed.
      The elabinelab CONFIG_MROUTED option has been changed to CONFIG_QUERIER
      (the former is still recognized and mapped to the latter). The undocumented
      defs-* variable NEEDMROUTED has been changed to NEEDMCQUERIER (the former
      still exists in install/installvars.pm.in but is always set to 0) to more
      accurately reflect the variable's purpose. If NEEDMCQUERIER is set, then
      the mfrisbeed startup script is modified to add the "-Q 30" option.
      
      The implementation of the client and server "-K <interval>" keep-alive option
      has been changed to directly send IGMP v2 membership reports containing the
      associated MC address.
      
      Note that the -K options have always been a hack to work-around assorted
      IGMP-related misconfigurations and incompatibilities, and really should
      only be used as a last resort. As implemented, they could cause the host
      machine to be pruned out of other MC groups at the nearest switch since
      they only report membership in the frisbee MC group. With the master server
      acting as an IGMP querier, instances of the frisbee server on that host
      should no longer need to do keep alives. We still have one case where it
      is needed on the client-side: a FreeBSD 8.x or later host connected to an
      IGMPv2-only switch. It appears that the IGMPv3 implementation added in
      FreeBSD 8.x always sends v3 reports, even when the default is configured
      (via sysctl or even recompiling the kernel) as v2.
      66e07584
  2. 20 Mar, 2012 1 commit
  3. 10 Oct, 2011 1 commit
  4. 22 Sep, 2011 1 commit
  5. 12 Sep, 2011 1 commit
  6. 30 Aug, 2011 1 commit
  7. 26 Aug, 2011 2 commits
  8. 23 Aug, 2011 1 commit
  9. 16 Aug, 2011 1 commit
  10. 27 Jul, 2011 1 commit
  11. 02 Jun, 2011 1 commit
  12. 18 May, 2011 1 commit
    • Mike Hibler's avatar
      Support image PUT (aka, "upload") and assorted minor changes. · 77dbad39
      Mike Hibler authored
      1. Support for PUT.
      
      The big change is support for uploading via the master server, based heavily
      on the prototype that Grant did. Currently only host-based (IP-based)
      authentication is done as is the case with download. Grant's SSL-based
      authentication code is "integrated" but has not even been compiled in.
      
      The PUT protocol allows for assorted gewgaws, like specifying a maximum size,
      setting a timeout value, returning size and signature info, etc.
      
      There is a new, awkwardly-named client utility "frisupload" which, like the
      download client, takes an "image ID" as an argument and requests to upload
      (PUT) that image via the master server. As with download, the image ID can
      be either of the form "<pid>/<emulab-image-name>", to upload/update an actual
      Emulab image or it can start with a "/" in which case it is considered to
      be a pathname on the server.
      
      On the server side, the master server takes PUT requests, verifies permission
      to upload the image, fires up a separate instance of an upload daemon (with
      the even catchier moniker "frisuploadd"), and returns the unicast addr/port
      info to the client which then begins the upload. The master server also acts
      as a traffic cop to make sure that downloads and uploads (or uploads and
      uploads) don't overlap.
      
      This has been integrated into the Emulab "create image" process in a
      backward-compatible way (i.e., so old admin MFSes will continue to work).
      Boy, was that fun. One not-so-desirable effect of this integration is that
      images now traverse our network twice, once to upload from node to boss and
      once for boss to write out the image file across NFS to ops. This is not
      really something that should be "fixed" in frisbee, it is only "undesirable"
      because we have a crappy NFS server.
      
      What has NOT been done includes: support of hierarchical PUT operations
      (we don't need it for either the elabinelab or subboss case), support for
      uploading standard images stored on boss (we really want something better
      than host-based authentication here), and the aforementioned support of
      SSL-based authentication.
      
      2. Other tidbits that got mixed in with PUT support:
      
      Added two new site variables:
          images/frisbee/maxrate_std
          images/frisbee/maxrate_usr
      which replace the hardwired (in mfrisbeed and frisbeelauncher before that)
      bandwidth limits for image download. mfrisbeed reads these (and the
      images/create/* variables) when it starts up or receives a HUP signal.
      These could be read from the DB on every GET/PUT, but they really don't change
      much and I needed something to test the reread-the-config-on-a-HUP code!
      
      Fixed avoidance of "problematic multicast addresses" so it would actually
      work as intended.
      
      Lots of internal "refactoring" to make up for things I did wrong the first
      time and to give the general impression that "Wow, Mike did a LOT!"
      77dbad39
  13. 02 Feb, 2011 1 commit
    • Mike Hibler's avatar
      Various changes related to the port mfrisbeed listens on. · d77c33b7
      Mike Hibler authored
      1. Add -i option to tie mfrisbeed to a particular interface and use that
         option for a multi-homed boss node (only listen on inside interface).
      
      2. Make that option play nice with the localhost proxying features of a
         previous commit: we also listen on localhost in this case.  The localhost
         proxying feature now has the compile time option USE_LOCALHOST_PROXY
         rather than mixing it up with the USE_EMULAB_CONFIG option (even though
         it probably will only ever be used in the Emulab config).
      
      3. Both of those were really to fix one bug: if we got a request from
         localhost to start up a frisbeed daemon, we were starting it up bound
         to the localhost interface.  Now we will make sure the server gets
         started up using the "other" interface, i.e., the one specified with -i.
      d77c33b7
  14. 01 Feb, 2011 1 commit
    • Mike Hibler's avatar
      Implement limited backward compatibility with the old frisbee setup. · 1017ccce
      Mike Hibler authored
      The big backward compatibility issue is that we no longer store running
      frisbeed info in the DB.  This means that loadinfo could not return
      address:port info to clients and thus old frisbee MFSes could no longer
      work.  While not a show stopper to require people to update their MFS first,
      I made a token effort to implement backward compat as follows.
      
      When an old frisbee MFS does "tmcc loadinfo" (as identified by a tmcd
      version < 33), tmcd will invoke "frisbeehelper" to startup a daemon.
      Sound like frisbeelauncher?  Well sorta, but vastly simplified and I only
      want this to be temporary.  The helper just uses the frisbee client to make
      a "proxy" request to the localhost master server.  The Emulab configuration
      of the master server now allows requests from localhost to proxy for another
      node.
      
      frisbeehelper is also used by webfrisbeekiller to kill a running daemon
      (yes, just like frisbeelauncher).  It makes a proxy status request on
      localhost and uses the returned info to identify the particular instance
      and kill it.
      1017ccce
  15. 17 Jan, 2011 1 commit
    • Mike Hibler's avatar
      Random Windows (Cygwin) fixes. · 975cc86e
      Mike Hibler authored
      Client-side builds again, but haven't got node to boot correctly.
      Need to get pubsubd installed correctly as a service in place of elvind.
      975cc86e
  16. 13 Jan, 2011 1 commit
  17. 11 Jan, 2011 1 commit
    • Mike Hibler's avatar
      More work toward getting this working on subboss. · 8d80301e
      Mike Hibler authored
      More work on the hierarchical configuration for subboss. When doing host-based
      authentication, allow client to pass an explicit host (IP) to the mserver.
      If the mserver is configured to allow it, that IP is used for authenticating
      the request instead of the caller's IP. Add a default ("null") configuration
      so the mserver can operate out-of-the-box with no config file. The goal of
      these two changes is for an mserver instance with the default config and a
      proxy option to serve the needs of a subboss node (i.e., so no explicit
      configuration will be needed).
      8d80301e
  18. 13 Dec, 2010 1 commit
    • Mike Hibler's avatar
      Aerformed. If the mserver receiving the request allows the caller to be a · c5b7cceb
      Mike Hibler authored
      proxy, then it will perform its checks against the hostIP provided rather
      than the IP of the message sender.  For the Emulab subboss case, subboss
      nodes (as determined from the DB) are allowed to explicitly specify hosts
      that are under their control.  The master server on real boss will run with
      proxying enabled.
      
      Also, in a fit of madness, I added a version number to the master server
      protocol.  This way, if at some distant point in the future (say next week)
      I realize I screwed up the protocol, I can fix it without resorting to creative
      retrofitting of a version number (see imagezip) or the more Orwellian
      eradication and denial of past versions (what I am doing now). Furthermore,
      using an ill-thought-out insight, I made the version number be an ASCII string
      in case I decide to change to an all-text protocol at some equally distant
      point in the future.
      c5b7cceb
  19. 03 Dec, 2010 1 commit
    • Mike Hibler's avatar
      Hierarchical downloading is now implemented. · 21b01bbb
      Mike Hibler authored
      The master server can be started with a "parent" mserver that it can call
      if it doesn't have an image.  The master server will return an EAGAIN type
      error to any clients that contact it while it is downloading an image from
      its parent.  The client now has a "-B N" option to tell it to try again
      every N seconds as long as it gets back that error.  This is the "store and
      forward" mode.  The mserver also has a "cut through" style where it will
      return to clients the mcast info it got from its parent so they can download
      directly from the parent until the local mserver has it.
      21b01bbb
  20. 24 Nov, 2010 1 commit
    • Mike Hibler's avatar
      First crack at a frisbee "master server" for handling GET (download) requests. · a2a896ab
      Mike Hibler authored
      There are a couple of new packet types in the frisbee protocol which are
      exchanged via TCP with the master server: GETREQUEST and GETREPLY.  The
      client passes to the master server an opaque imageid and a couple of options
      and gets back the addr/port to use to actually download the image.  The
      implementation of the master server is fragile and is more of a test
      framework, Grant is working on a more robust master server.  I am mostly
      doing a backend that communicates with the Emulab DB to do its authentication
      and making the client changes.
      
      The client now uses the -S option to specify the IP address of the master
      server and the -F option to specify an imageid.  If no error is returned,
      the image is downloaded using the returned addr/port.  If -Q is used in place
      of -F, then the client makes a "status only" call getting back info about
      whether the named image is accessible to the client and whether a server is
      currently running.
      
      On the server side, the new master server (mserver.c) has an Emulab
      configuration "backend" that supports host-based authentication.
      The IP address of the caller is mapped to a node_id/pid/gid/eid combo
      that is used to determine access.  On a request, the specified imageid is
      treated either as a pathname (if it starts with '/') or an image identifier
      of the form "<pid>/<imagename>".  If it is a pathname, we check to make
      sure that pathname (after running through "realpath") is contained in one
      of the directories accessible to that node in its current experiment context;
      i.e., /share, /proj/<pid>, /groups/<pid>/<gid>, or /users/<swapper-uid>.
      If it is an image identifier, the DB is queried to ensure that access is
      allowed to that image; i.e., it must be "global" or in the appropriate
      project/group.
      
      The master server forks a frisbeed for each valid request, if one is not
      already running.  The multicast address selection is still based on the
      emulab_indicies.frisbee_index field, but the address/port/server info is no
      longer stored in the frisbee_blobs table (frisbee_pid, load_address,
      load_busy are not set).
      
      Note that this is not yet integrated in the os_load path.  Further work is
      required to replace frisbeelauncher.
      a2a896ab
  21. 02 Jul, 2010 2 commits
  22. 28 Sep, 2009 1 commit
    • Mike Hibler's avatar
      Changes: · 8fd4b67e
      Mike Hibler authored
      Support for jumbo packets.  Setting WITH_JUMBO on the make command line
      will change the image block size to 8192 bytes and reduces the number of
      block per chunk to 256 (to maintain the 1MB chunk size for compat with old
      images).  The default is still 1024.
      
      Added the notion of a "dubious" chunk buffer in the client.  If an incoming
      chunk buffer is marked as CHUNK_DUBIOUS, then its contents can be evicted and
      the buffer reused for a more promising chunk.  This is a crude replacement
      mechanism that is currently only used in one place: if we miss part of a
      chunk and the server switches to sending a new chunk for which we have no
      free buffer, we switch to collecting the new chunk.  The reasoning is that
      it will take a while for the server to switch back to completing the former
      chunk, during which time it may send one or more complete chunks that we
      could more fruitfully use (decompress and write out).
      
      Changed the meaning of the "done" field for a chunk.  It used to mean either
      that we have completely processed the chunk or that we are currently collecting
      it.  It took additional work (scanning all chunk buffers) to differentiate
      these cases, so I make it explicit.
      
      Allow the client and server to dynamically determine the maximum socket
      buffer size.
      
      Fix a couple more on-the-wire data structure size/alignment issues that
      showed up on a 64-bit OS.
      
      A few minor speedups to the bitmap handling code.  Think: "rearranging deck
      chairs on the Titanic" here.  We need more serious algorithmic changes
      to scale all this code going forward.
      
      Add some more TRACE events and refine what is already there.
      
      Added some hacks to allow frisbee client/server to run on the same machine.
      We had made it remarkably hard to do this.  But then again, why would you
      want to!  Look for SAME_HOST_HACK in the makefile.
      8fd4b67e
  23. 19 Aug, 2009 1 commit
  24. 25 May, 2007 1 commit
  25. 09 Jan, 2007 1 commit
    • Mike Hibler's avatar
      Frisbee MFS changes: · 346c0562
      Mike Hibler authored
       * support FreeBSD 6
       * client-side changes to support enable/disable of ACPI via slicefix
       * use dynamically linked Emulab binaries in frisbee MFS (for size)
      346c0562
  26. 01 Dec, 2006 1 commit
    • Mike Hibler's avatar
      Bug fixes from Annette DeSchon <deschon@ISI.EDU> and Keith Sklower · 8ea641c1
      Mike Hibler authored
      <sklower@vangogh.CS.Berkeley.EDU> for the following, related to the -z (zero)
      option in imageunzip/frisbee:
      
        1. For the case where a full-disk image is smaller than
           the disk the image is being unzipped onto, we added
           code to zero the area between the end of the image and
           the end of the disk.
      
        2. During the unzipping process, when zeros are being written
           at the end of a chunk, a write() that returned a length
           different from the expected value previously caused an
           infinite loop.  We noticed this problem at ISI, on a number
           of pc733s, which we suspect may have (relatively minor)
           hardware disk problems.
      
      The latter addressed a Mike-o that has existed for 4 years.  Call it failure
      resilient computing or just plain denial, but because of a botched conditional,
      I was ignoring failed writes to the disk.  This lead to one of those infinite loop
      thingees if you actually had a bad disk.
      8ea641c1
  27. 16 Nov, 2004 1 commit
  28. 12 Nov, 2004 1 commit
  29. 03 Nov, 2004 1 commit
  30. 28 Oct, 2004 1 commit
    • Mike Hibler's avatar
      Minor tweaks from a one-day binge of performance analysis. · 1a76e634
      Mike Hibler authored
      The only meaningful change was to insert a sched_yield() in the frisbee
      decompressor path.  Apparently, the decompressor can run long enough to
      cause the incoming socket buffer to overflow.  I was under the assumption
      that the decompressor would not run much longer than a single time slice
      (0.001 seconds, about 8 packets) before its priority would force it to
      be context switched.  But it was running much longer than that!  Forcing
      a periodic yield seems to have taken care of this.
      
      One other cause of retransmitted blocks that I saw was where the server
      was taking a long time to read data from a file (up to 0.25 seconds).
      This would stall the clients and force them to rerequest blocks (which
      they do after about 0.10 seconds).  We can improve on this by splitting
      the file reading off to a seperate thread.
      
      Most other changes are related to the event logging code.
      1a76e634
  31. 29 Sep, 2004 1 commit
  32. 10 May, 2004 1 commit
  33. 22 Mar, 2004 1 commit
  34. 08 Mar, 2004 1 commit
  35. 15 Oct, 2003 1 commit
    • Mike Hibler's avatar
      Uniform syslog'ing. Change everything I could find to use a syslog facility · cc6d6fa7
      Mike Hibler authored
      as defined in the defs-* file (e.g. "TBLOGFACIL=local2").  The default is
      "local5" which is what we are setup to use so you shouldn't need to mess
      with your defs- file!
      
      perl scripts just get this value configured in when configure is run.
      C programs get the value in two ways.  For programs that are intimate with
      the testbed infrastructure, and include "config.h", they just get it from
      that file.  For programs that we sometimes use outside the Emulab build
      environment (e.g., frisbee, capture) and that don't include config.h,
      the value is set via a "-DLOG_TESTBED=..." in the GNUmakefile build line.
      If the value isn't set, it defaults to what it used to be (usually LOG_USER).
      
      Still to do: healthd, hmcd (whose build doesn't seem to be completely
      integrated) and plabdaemon.in (since its icky python :-)
      cc6d6fa7
  36. 14 Jun, 2003 3 commits