1. 12 Feb, 2010 1 commit
    • Patrick McHardy's avatar
      ipv6: fib: fix crash when changing large fib while dumping it · 2bec5a36
      Patrick McHardy authored
      When the fib size exceeds what can be dumped in a single skb, the
      dump is suspended and resumed once the last skb has been received
      by userspace. When the fib is changed while the dump is suspended,
      the walker might contain stale pointers, causing a crash when the
      dump is resumed.
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      IP: [<ffffffffa01bce04>] fib6_walk_continue+0xbb/0x124 [ipv6]
      PGD 5347a067 PUD 65c7067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP
      RIP: 0010:[<ffffffffa01bce04>]
      [<ffffffffa01bce04>] fib6_walk_continue+0xbb/0x124 [ipv6]
      Call Trace:
       [<ffffffff8104aca3>] ? mutex_spin_on_owner+0x59/0x71
       [<ffffffffa01bd105>] inet6_dump_fib+0x11b/0x1b9 [ipv6]
       [<ffffffff81371af4>] netlink_dump+0x5b/0x19e
       [<ffffffff8134f288>] ? consume_skb+0x28/0x2a
       [<ffffffff81373b69>] netlink_recvmsg+0x1ab/0x2c6
       [<ffffffff81372781>] ? netlink_unicast+0xfa/0x151
       [<ffffffff813483e0>] __sock_recvmsg+0x6d/0x79
       [<ffffffff81348a53>] sock_recvmsg+0xca/0xe3
       [<ffffffff81066d4b>] ? autoremove_wake_function+0x0/0x38
       [<ffffffff811ed1f8>] ? radix_tree_lookup_slot+0xe/0x10
       [<ffffffff810b3ed7>] ? find_get_page+0x90/0xa5
       [<ffffffff810b5dc5>] ? filemap_fault+0x201/0x34f
       [<ffffffff810ef152>] ? fget_light+0x2f/0xac
       [<ffffffff813519e7>] ? verify_iovec+0x4f/0x94
       [<ffffffff81349a65>] sys_recvmsg+0x14d/0x223
      Store the serial number when beginning to walk the fib and reload
      pointers when continuing to walk after a change occured. Similar
      to other dumping functions, this might cause unrelated entries to
      be missed when entries are deleted.
      Tested-by: default avatarBen Greear <greearb@candelatech.com>
      Signed-off-by: default avatarPatrick McHardy <kaber@trash.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
  2. 10 Feb, 2010 3 commits
  3. 08 Feb, 2010 2 commits
    • Patrick McHardy's avatar
      netfilter: nf_conntrack: fix hash resizing with namespaces · d696c7bd
      Patrick McHardy authored
      As noticed by Jon Masters <jonathan@jonmasters.org>, the conntrack hash
      size is global and not per namespace, but modifiable at runtime through
      /sys/module/nf_conntrack/hashsize. Changing the hash size will only
      resize the hash in the current namespace however, so other namespaces
      will use an invalid hash size. This can cause crashes when enlarging
      the hashsize, or false negative lookups when shrinking it.
      Move the hash size into the per-namespace data and only use the global
      hash size to initialize the per-namespace value when instanciating a
      new namespace. Additionally restrict hash resizing to init_net for
      now as other namespaces are not handled currently.
      Cc: stable@kernel.org
      Signed-off-by: default avatarPatrick McHardy <kaber@trash.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • Eric Dumazet's avatar
      netfilter: nf_conntrack: per netns nf_conntrack_cachep · 5b3501fa
      Eric Dumazet authored
      nf_conntrack_cachep is currently shared by all netns instances, but
      because of SLAB_DESTROY_BY_RCU special semantics, this is wrong.
      If we use a shared slab cache, one object can instantly flight between
      one hash table (netns ONE) to another one (netns TWO), and concurrent
      reader (doing a lookup in netns ONE, 'finding' an object of netns TWO)
      can be fooled without notice, because no RCU grace period has to be
      observed between object freeing and its reuse.
      We dont have this problem with UDP/TCP slab caches because TCP/UDP
      hashtables are global to the machine (and each object has a pointer to
      its netns).
      If we use per netns conntrack hash tables, we also *must* use per netns
      conntrack slab caches, to guarantee an object can not escape from one
      namespace to another one.
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      [Patrick: added unique slab name allocation]
      Cc: stable@kernel.org
      Signed-off-by: default avatarPatrick McHardy <kaber@trash.net>
  4. 04 Feb, 2010 3 commits
    • Sridhar Samudrala's avatar
      packet: Add GSO/csum offload support. · bfd5f4a3
      Sridhar Samudrala authored
      This patch adds GSO/checksum offload to af_packet sockets using
      virtio_net_hdr. Based on Rusty's patch to add this support to tun.
      It allows GSO/checksum offload to be enabled when using raw socket
      backend with virtio_net.
      Adds PACKET_VNET_HDR socket option to prepend virtio_net_hdr in the
      receive path and process/skip virtio_net_hdr in the send path. This
      option is only allowed with SOCK_RAW sockets attached to ethernet
      type devices.
      v2 updates
      Michael's Comments
      - Perform length check in packet_snd() when GSO is off even when
        vnet_hdr is present.
      - Check for SKB_GSO_FCOE type and return -EINVAL
      - don't allow tx/rx ring when vnet_hdr is enabled.
      Herbert's Comments
      - Removed ethernet specific code.
      - protocol value is assumed to be passed in by the caller.
      Signed-off-by: default avatarSridhar Samudrala <sri@us.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • Jiri Pirko's avatar
      libphy: add phy_find_first function · f8f76db1
      Jiri Pirko authored
      Many drivers do this in them manually. Now they can use this function.
      Signed-off-by: default avatarJiri Pirko <jpirko@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • Jiri Pirko's avatar
      net: use helpers to access mc list V2 · 6683ece3
      Jiri Pirko authored
      This patch introduces the similar helpers as those already done for uc list.
      However multicast lists are no list_head lists but "mademanually". The three
      macros added by this patch will make the transition of mc_list to list_head
      smooth in two steps:
      1) convert all drivers to use these macros (with the original iterator of type
         "struct dev_mc_list")
      2) once all drivers are converted, convert list type and iterators to "struct
         netdev_hw_addr" in one patch.
      >From now on, drivers can (and should) use "netdev_for_each_mc_addr" to iterate
      over the addresses with iterator of type "struct netdev_hw_addr". Also macros
      "netdev_mc_count" and "netdev_mc_empty" to read list's length. This is the state
      which should be reached in all drivers.
      Signed-off-by: default avatarJiri Pirko <jpirko@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
  5. 03 Feb, 2010 4 commits
    • Alexey Dobriyan's avatar
      net: CONFIG_COMPAT redux · 1621e094
      Alexey Dobriyan authored
      Ifdef out
      	struct proto_ops::compat_ioctl
      	struct proto_ops::compat_setsockopt
      	struct proto_ops::compat_getsockopt
      to make structures smaller on COMPAT=n kernels.
      Signed-off-by: default avatarAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • Arnd Bergmann's avatar
      net: macvtap driver · 20d29d7a
      Arnd Bergmann authored
      In order to use macvlan with qemu and other tools that require
      a tap file descriptor, the macvtap driver adds a small backend
      with a character device with the same interface as the tun
      driver, with a minimum set of features.
      Macvtap interfaces are created in the same way as macvlan
      interfaces using ip link, but the netif is just used as a
      handle for configuration and accounting, while the data
      goes through the chardev. Each macvtap interface has its
      own character device, simplifying permission management
      significantly over the generic tun/tap driver.
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Stephen Hemminger <shemminger@linux-foundation.org>
      Cc: David S. Miller" <davem@davemloft.net>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Or Gerlitz <ogerlitz@voltaire.com>
      Cc: netdev@vger.kernel.org
      Cc: bridge@lists.linux-foundation.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • Arnd Bergmann's avatar
      macvlan: allow multiple driver backends · fc0663d6
      Arnd Bergmann authored
      This makes it possible to hook into the macvlan driver
      from another kernel module. In particular, the goal is
      to extend it with the macvtap backend that provides
      a tun/tap compatible interface directly on the macvlan
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • Arnd Bergmann's avatar
      net: maintain namespace isolation between vlan and real device · 8a83a00b
      Arnd Bergmann authored
      In the vlan and macvlan drivers, the start_xmit function forwards
      data to the dev_queue_xmit function for another device, which may
      potentially belong to a different namespace.
      To make sure that classification stays within a single namespace,
      this resets the potentially critical fields.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
  6. 02 Feb, 2010 3 commits
  7. 01 Feb, 2010 2 commits
    • Felix Fietkau's avatar
      mac80211: fix monitor mode tx radiotap header handling · 17ad353b
      Felix Fietkau authored
      When an injected frame gets buffered for a powersave STA or filtered
      and retransmitted, mac80211 attempts to parse the radiotap header
      again, which doesn't work because it's gone at that point.
      This patch adds a new flag for checking the availability of a radiotap
      header, so that it only attempts to parse it once, reusing the tx info
      on the next call to ieee80211_tx().
      This fixes severe issues with rekeying in AP mode.
      Signed-off-by: default avatarFelix Fietkau <nbd@openwrt.org>
      Cc: stable@kernel.org
      Signed-off-by: default avatarJohn W. Linville <linville@tuxdriver.com>
    • Luis R. Rodriguez's avatar
      cfg80211: add regulatory hint disconnect support · 09d989d1
      Luis R. Rodriguez authored
      This adds a new regulatory hint to be used when we know all
      devices have been disconnected and idle. This can happen
      when we suspend, for instance. When we disconnect we can
      no longer assume the same regulatory rules learned from
      a country IE or beacon hints are applicable so restore
      regulatory settings to an initial state.
      Since driver hints are cached on the wiphy that called
      the hint, those hints are not reproduced onto cfg80211
      as the wiphy will respect its own wiphy->regd regardless.
      Signed-off-by: default avatarLuis R. Rodriguez <lrodriguez@atheros.com>
      Signed-off-by: default avatarJohn W. Linville <linville@tuxdriver.com>
  8. 28 Jan, 2010 2 commits
  9. 26 Jan, 2010 2 commits
  10. 25 Jan, 2010 2 commits
  11. 24 Jan, 2010 2 commits
  12. 23 Jan, 2010 3 commits
  13. 22 Jan, 2010 3 commits
  14. 21 Jan, 2010 2 commits
  15. 20 Jan, 2010 1 commit
    • Sarah Sharp's avatar
      USB: Fix duplicate sysfs problem after device reset. · 04a723ea
      Sarah Sharp authored
      Borislav Petkov reports issues with duplicate sysfs endpoint files after a
      resume from a hibernate.  It turns out that the code to support alternate
      settings under xHCI has issues when a device with a non-default alternate
      setting is reset during the hibernate:
      [  427.681810] Restarting tasks ...
      [  427.681995] hub 1-0:1.0: state 7 ports 6 chg 0004 evt 0000
      [  427.682019] usb usb3: usb resume
      [  427.682030] ohci_hcd 0000:00:12.0: wakeup root hub
      [  427.682191] hub 1-0:1.0: port 2, status 0501, change 0000, 480 Mb/s
      [  427.682205] usb 1-2: usb wakeup-resume
      [  427.682226] usb 1-2: finish reset-resume
      [  427.682886] done.
      [  427.734658] ehci_hcd 0000:00:12.2: port 2 high speed
      [  427.734663] ehci_hcd 0000:00:12.2: GetStatus port 2 status 001005 POWER sig=se0 PE CONNECT
      [  427.746682] hub 3-0:1.0: hub_reset_resume
      [  427.746693] hub 3-0:1.0: trying to enable port power on non-switchable hub
      [  427.786715] usb 1-2: reset high speed USB device using ehci_hcd and address 2
      [  427.839653] ehci_hcd 0000:00:12.2: port 2 high speed
      [  427.839666] ehci_hcd 0000:00:12.2: GetStatus port 2 status 001005 POWER sig=se0 PE CONNECT
      [  427.847717] ohci_hcd 0000:00:12.0: GetStatus roothub.portstatus [1] = 0x00010100 CSC PPS
      [  427.915497] hub 1-2:1.0: remove_intf_ep_devs: if: ffff88022f9e8800 ->ep_devs_created: 1
      [  427.915774] hub 1-2:1.0: remove_intf_ep_devs: bNumEndpoints: 1
      [  427.915934] hub 1-2:1.0: if: ffff88022f9e8800: endpoint devs removed.
      [  427.916158] hub 1-2:1.0: create_intf_ep_devs: if: ffff88022f9e8800 ->ep_devs_created: 0, ->unregistering: 0
      [  427.916434] hub 1-2:1.0: create_intf_ep_devs: bNumEndpoints: 1
      [  427.916609]  ep_81: create, parent hub
      [  427.916632] ------------[ cut here ]------------
      [  427.916644] WARNING: at fs/sysfs/dir.c:477 sysfs_add_one+0x82/0x96()
      [  427.916649] Hardware name: System Product Name
      [  427.916653] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:12.2/usb1/1-2/1-2:1.0/ep_81'
      [  427.916658] Modules linked in: binfmt_misc kvm_amd kvm powernow_k8 cpufreq_ondemand cpufreq_powersave cpufreq_userspace freq_table cpufreq_conservative ipv6 vfat fat
      +8250_pnp 8250 pcspkr ohci_hcd serial_core k10temp edac_core
      [  427.916694] Pid: 278, comm: khubd Not tainted 2.6.33-rc2-00187-g08d869aa
      -dirty #13
      [  427.916699] Call Trace:
      The problem is caused by a mismatch between the USB core's view of the
      device state and the USB device and xHCI host's view of the device state.
      After the device reset and re-configuration, the device and the xHCI host
      think they are using alternate setting 0 of all interfaces.  However, the
      USB core keeps track of the old state, which may include non-zero
      alternate settings.  It uses intf->cur_altsetting to keep the endpoint
      sysfs files for the old state across the reset.
      The bandwidth allocation functions need to know what the xHCI host thinks
      the current alternate settings are, so original patch set
      intf->cur_altsetting to the alternate setting 0.  This caused duplicate
      endpoint files to be created.
      The solution is to not set intf->cur_altsetting before calling
      usb_set_interface() in usb_reset_and_verify_device().  Instead, we add a
      new flag to struct usb_interface to tell usb_hcd_alloc_bandwidth() to use
      alternate setting 0 as the currently installed alternate setting.
      Signed-off-by: default avatarSarah Sharp <sarah.a.sharp@linux.intel.com>
      Tested-by: default avatarBorislav Petkov <petkovbb@googlemail.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@suse.de>
  16. 19 Jan, 2010 3 commits
    • David S. Miller's avatar
      net: Unexport napi_gro_flush(). · 11380a4b
      David S. Miller authored
      Nothing outside of net/core/dev.c uses it.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • Johannes Berg's avatar
      mac80211: re-enable re-transmission of filtered frames · c6fcf6bc
      Johannes Berg authored
      In an earlier commit,
          mac80211: disable software retry for now
          Pavel Roskin reported a problem that seems to be due to
          software retry of already transmitted frames. It turns
          out that we've never done that correctly, but due to
          some recent changes it now crashes in the TX code. I've
          added a comment in the patch that explains the problem
          better and also points to possible solutions -- which
          I can't implement right now.
      I disabled software retry of failed/filtered frames
      because it was broken. With the work of the previous
      patches, it now becomes fairly easy to re-enable it
      by adding a flag indicating that the frame shouldn't
      be modified, but still running it through the transmit
      handlers to populate the control information.
      Signed-off-by: default avatarJohannes Berg <johannes@sipsolutions.net>
      Signed-off-by: default avatarJohn W. Linville <linville@tuxdriver.com>
    • Anton Vorontsov's avatar
      phylib: Move workqueue initialization to a proper place · 4f9c85a1
      Anton Vorontsov authored
      commit 541cd3ee
       ("phylib: Fix deadlock
      on resume") caused TI DaVinci EMAC ethernet driver to oops upon resume:
       PM: resume of devices complete after 237.098 msecs
       Restarting tasks ... done.
       kernel BUG at kernel/workqueue.c:354!
       Unable to handle kernel NULL pointer dereference at virtual address 00000000
       [<c002c598>] (__bug+0x0/0x2c) from [<c0052a54>] (queue_delayed_work_on+0x74/0xf8)
       [<c00529e0>] (queue_delayed_work_on+0x0/0xf8) from [<c0052b30>] (queue_delayed_work+0x2c/0x30)
      The oops pops up because TI DaVinci EMAC driver detaches PHY on
      suspend and attaches it back on resume. Attaching makes phylib call
      phy_start_machine() that initializes a workqueue. On the other hand,
      PHY's resume routine will call phy_start_machine() again, and that
      will cause the oops since we just destroyed the already scheduled
      This patch fixes the issue by moving workqueue initialization to
      p.s. We don't see this oops with ucc_geth and gianfar drivers because
      they perform a fine-grained suspend, i.e. they just stop the PHYs
      without detaching.
      Reported-by: default avatarSekhar Nori <nsekhar@ti.com>
      Signed-off-by: default avatarAnton Vorontsov <avorontsov@ru.mvista.com>
      Tested-by: default avatarSekhar Nori <nsekhar@ti.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
  17. 18 Jan, 2010 2 commits