1. 19 May, 2005 4 commits
  2. 18 May, 2005 9 commits
    • [IPV4/IPV6] Ensure all frag_list members have NULL sk · 2fdba6b0
      Herbert Xu authored
      Having frag_list members which hold wmem of an sk leads to nightmares
      with partially cloned frag skbs.  The reason is that once you unleash
      a skb with a frag_list that has individual sk ownerships into the stack
      you can never undo those ownerships safely as they may have been cloned
      by things like netfilter.  Since we have to undo them in order to make
      skb_linearize happy this approach leads to a dead-end.
      So let's go the other way and make this an invariant:
      	For any skb on a frag_list, skb->sk must be NULL.
      That is, the socket ownership always belongs to the head skb.
      It turns out that the implementation is actually pretty simple.
      The above invariant is actually violated in the following patch
      for a short duration inside ip_fragment.  This is OK because the
      offending frag_list member is either destroyed at the end of the
      slow path without being sent anywhere, or it is detached from
      the frag_list before being sent.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
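The invariant stated in this commit message can be illustrated with a small, self-contained sketch. It is a toy model, not kernel code: `struct toy_skb`, `struct toy_sock`, and both helper functions are hypothetical names invented here, and folding each fragment's `truesize` into the head is an assumption about how ownership might be moved, not a description of the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; only the fields needed here.
 * All names are hypothetical and exist only for this illustration. */
struct toy_sock {
    int id;
};

struct toy_skb {
    struct toy_sock *sk;        /* owning socket, or NULL */
    struct toy_skb  *next;      /* next skb on the same frag_list */
    struct toy_skb  *frag_list; /* head skb only: chain of fragments */
    int              truesize;  /* memory charged for this skb */
};

/* Returns 1 if every skb on head's frag_list has sk == NULL. */
static int frag_list_invariant_holds(const struct toy_skb *head)
{
    const struct toy_skb *frag;

    for (frag = head->frag_list; frag != NULL; frag = frag->next)
        if (frag->sk != NULL)
            return 0;
    return 1;
}

/* Assumed approach: fold each owned fragment's truesize into the head
 * and clear the fragment's sk, so only the head carries ownership. */
static void toy_move_ownership_to_head(struct toy_skb *head)
{
    struct toy_skb *frag;

    for (frag = head->frag_list; frag != NULL; frag = frag->next) {
        if (frag->sk != NULL) {
            head->truesize += frag->truesize;
            frag->sk = NULL;
        }
    }
}
```

After `toy_move_ownership_to_head()` runs, `frag_list_invariant_holds()` returns 1 and the head's `truesize` covers the fragments that previously carried their own ownership.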
    • [XFRM]: skb_cow_data() does not set proper owner for new skbs. · d4810200
      Evgeniy Polyakov authored
      It looks like skb_cow_data() does not set the
      proper owner for a newly created skb.
      If we have several fragments for an skb and some of them
      are shared(?) or cloned (as in async IPsec), there
      might be a situation where we need to recreate the skb and
      thus use skb_copy() for it.
      The newly created skb has neither a destructor nor a socket
      associated with it; both must be copied from the old skb.
      As far as I can see, the current code sets the destructor and socket
      for the first skb only and uses the truesize of the first skb
      only to increment the sk_wmem_alloc value.
      If the above "analysis" is correct, then the attached patch fixes that.
      Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TG3]: Refine DMA boundary setting. · 59e6b434
      David S. Miller authored
      Extract DMA boundary bit selection into a separate
      function, tg3_calc_dma_bndry().  Call this from
      Make DMA test more reliable by using no DMA boundary
      setting during the test.  If the test passes, then
      use the setting we selected before the test.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Michael Chan <mchan@broadcom.com>
    • [TG3]: Set minimal hw interrupt mitigation. · 15f9850d
      David S. Miller authored
      Even though we do software interrupt mitigation
      via NAPI, it still helps to have some minimal
      hw assisted mitigation.
      This helps particularly on systems where register
      I/O overhead is high relative to CPU speed.
      For example, it helps on NUMA systems.  In such cases
      the PIO overhead to disable interrupts for NAPI accounts
      for the majority of the packet processing cost.  The
      CPU is fast enough such that only a single packet is
      processed by each NAPI poll call.
      Thanks to Michael Chan for reviewing this patch.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [TG3]: Add tagged status support. · fac9b83e
      David S. Miller authored
      When supported, use the TAGGED interrupt processing support
      the chip provides.  In this mode, instead of an "on/off" binary
      semaphore, an incrementing tag scheme is used to ACK interrupts.
      All MSI supporting chips support TAGGED mode, so the tg3_msi()
      interrupt handler uses it unconditionally.  This invariant is
      verified when MSI support is tested.
      Since we can invoke tg3_poll() multiple times per interrupt under
      high packet load, we fetch a new copy of the tag value in the
      status block right before we actually do the work.
      Also, because the tagged status tells the chip exactly which
      work we have processed, we can make two optimizations:
      1) tg3_restart_ints() need not check tg3_has_work()
      2) the tg3_timer() need not poke the chip 10 times per
         second to keep from losing interrupt events
      Based upon valuable feedback from Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
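The tag scheme described in this commit message can be sketched as a toy model. Nothing below reflects the real tg3 register layout or driver structures: `struct toy_chip`, `struct toy_driver`, and every function are hypothetical, and the rule that the chip treats a stale acknowledged tag as pending work is a simplifying assumption about how the hardware behaves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical chip model: the tag advances on every status update. */
struct toy_chip {
    uint32_t status_tag; /* incremented by the chip on each status update */
    uint32_t acked_tag;  /* last tag the driver wrote back as an ACK */
};

/* Hypothetical driver state: the tag seen before the last round of work. */
struct toy_driver {
    uint32_t last_tag;
};

/* Chip side: a new event bumps the tag in the status block. */
static void toy_chip_event(struct toy_chip *chip)
{
    chip->status_tag++;
}

/* Chip side (assumed rule): work is pending while the ACKed tag is stale. */
static int toy_chip_irq_pending(const struct toy_chip *chip)
{
    return chip->acked_tag != chip->status_tag;
}

/* Driver side: snapshot the tag right before doing the work, so events
 * that arrive during processing are not covered by the later ACK. */
static void toy_poll(struct toy_driver *drv, const struct toy_chip *chip)
{
    drv->last_tag = chip->status_tag;
    /* ... packet processing would happen here ... */
}

/* Driver side: ACK with the snapshot; no separate "has work" check needed,
 * because a stale tag already tells the chip that more work exists. */
static void toy_restart_ints(const struct toy_driver *drv,
                             struct toy_chip *chip)
{
    chip->acked_tag = drv->last_tag;
}
```

In this model, an event that arrives between `toy_poll()` and `toy_restart_ints()` leaves `toy_chip_irq_pending()` true, which is why the driver does not need to re-check for work or rely on a periodic timer to recover lost events.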
    • [PATCH] Avoid console spam with ext3 aborted journal. · 30121624
      Stephen Tweedie authored
      ext3 usually reports error conditions that it detects in its environment.
      But when its journal gets aborted due to such errors, it can sometimes
      continue to report that condition forever, spamming the console to such
      an extent that the original cause of the journal abort can be lost.
      When the journal aborts, we put the filesystem into readonly mode.  Most
      subsequent filesystem operations will get rejected immediately by checks
      for MS_RDONLY either in the filesystem or in the VFS.  But some paths do
      not have such checks --- for example, if we continue to write to a file
      handle that was opened before the fs went readonly.  (We only check for
      the ROFS condition when the file is first opened.)  In these cases, we
      can continue to generate log errors similar to
        EXT3-fs error (device $DEV) in start_transaction: Journal has aborted
      for each subsequent write.
      There is really no point in generating these errors after the initial
      error has been fully reported.  Specifically, if we're starting a
      completely new filesystem operation, and the filesystem is *already*
      readonly (i.e. the ext3 layer has already detected and handled the
      underlying jbd abort), and we see an EROFS error, then there is simply
      no point in reporting it again.
      Signed-off-by: Stephen Tweedie <sct@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
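The suppression rule described in this commit message comes down to a three-part condition, sketched below. This is not the ext3 code: `toy_should_report()`, its parameters, and the `TOY_MS_RDONLY` flag value are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the superblock read-only flag. */
#define TOY_MS_RDONLY 1u

/* Returns 1 if the error should be logged, 0 if it is a repeat of an
 * abort that has already been reported.
 *
 * err             - negative errno value from the failed operation
 * starting_new_op - nonzero if this is a completely new filesystem
 *                   operation rather than one already in progress
 * sb_flags        - superblock flags (only TOY_MS_RDONLY is examined)
 */
static int toy_should_report(int err, int starting_new_op,
                             unsigned int sb_flags)
{
    if (err == -EROFS && starting_new_op && (sb_flags & TOY_MS_RDONLY))
        return 0; /* abort already handled; do not spam the console */
    return 1;
}
```

Only the combination of all three conditions suppresses the message, so the first report of the abort, and any error other than EROFS, still reaches the log.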
    • [PATCH] Fix filp being passed through raw ioctl handler · e72022e1
      Stephen Tweedie authored
      Don't pass meaningless file handles to block device ioctls.
      The recent raw IO ioctl-passthrough fix started passing the raw file
      handle into the block device ioctl handler.  That's unlikely to be
      useful, as the file handle is actually open on a character-mode raw
      device, not a block device, so dereferencing it is not going to yield
      useful results to a block device ioctl handler.
      Previously we just passed NULL; also not a value that can usefully
      be dereferenced, but at least if it does happen, we'll oops instead of
      silently pretending that the file is a block device, so NULL is the more
      defensive option here.  This patch reverts to that behaviour.
      Noticed by Al Viro.
      Signed-off-by: Stephen Tweedie <sct@redhat.com>
      Acked-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  3. 17 May, 2005 27 commits