1. 05 Jun, 2015 5 commits
  2. 04 Jun, 2015 3 commits
    • compat: cleanup coding in compat_get_bitmap() and compat_put_bitmap() · 9b7b819c
      Helge Deller authored
      In the functions compat_get_bitmap() and compat_put_bitmap() the
      variable nr_compat_longs stores how many compat_ulong_t words should be
      copied in a loop.
      The copy loop itself is this:
        if (nr_compat_longs-- > 0) {
            if (__get_user(um, umask)) return -EFAULT;
        } else {
            um = 0;
        }
      Since nr_compat_longs gets unconditionally decremented in each loop and
      since its type is unsigned this could theoretically lead to out of
      bounds accesses to userspace if nr_compat_longs wraps around to
      Although the callers currently do not trigger out-of-bounds accesses, we
      should better implement the loop in a safe way to completely avoid such
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
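      A minimal userspace sketch of the loop shape the message argues for,
      not the actual patch: the hypothetical helper copy_words() stands in
      for the kernel copy loop and only decrements the unsigned counter while
      it is still non-zero, so it cannot wrap around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the copy loop: fill `slots` output words,
 * reading at most `nr_src` words from `src` and zero-padding the rest.
 */
void copy_words(uint32_t *dst, size_t slots,
                const uint32_t *src, size_t nr_src)
{
    for (size_t i = 0; i < slots; i++) {
        if (nr_src > 0) {
            dst[i] = *src++;
            nr_src--;       /* decremented only while non-zero */
        } else {
            dst[i] = 0;     /* counter stays at 0 and never wraps */
        }
    }
}
```

      With this shape the source pointer is never dereferenced once the
      counter has reached zero, regardless of how many output slots remain.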
    • Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma · ff25ea8f
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
       "We have two small fixes:
         - pl330 termination hang fix by Krzysztof
         - hsu memory leak fix by Peter"
      * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
        dmaengine: hsu: Fix memory leak when stopping a running transfer
        dmaengine: pl330: Fix hang on dmaengine_terminate_all on certain boards
    • perf/x86/intel/pt: Fix a refactoring bug · b44a2b53
      Alexander Shishkin authored
      Commit 066450be ("perf/x86/intel/pt: Clean up the control flow
      in pt_pmu_hw_init()") changed attribute initialization so that
      only the first attribute gets initialized using
      sysfs_attr_init(), which upsets lockdep.
      This patch fixes the glitch so that all allocated attributes are
      properly initialized thus fixing the lockdep warning reported by
      Tvrtko and Imre.
      Reported-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Reported-by: Imre Deak <imre.deak@intel.com>
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: <linux-kernel@vger.kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 03 Jun, 2015 6 commits
  4. 02 Jun, 2015 7 commits
    • Merge tag 'please-pull-rusty' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux · 8cd9234c
      Linus Torvalds authored
      Pull ia64 fix from Tony Luck:
       "Fix some build warnings for ia64 - cpu_callin_map doesn't need to be
      * tag 'please-pull-rusty' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
        ia64: make cpu_callin_map non-volatile.
    • vfs: read file_handle only once in handle_to_path · 161f873b
      Sasha Levin authored
      We used to read file_handle twice.  Once to get the amount of extra
      bytes, and once to fetch the entire structure.
      This may be problematic since we do size verifications only after the
      first read, so if the number of extra bytes changes in userspace between
      the first and second calls, we'll have an incoherent view of
      Instead, read the constant size once, and copy that over to the final
      structure without having to re-read it again.
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
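      The "read once" pattern described above can be sketched in userspace
      as follows. This is an illustrative model only: struct handle,
      fetch_handle() and MAX_HANDLE_SZ are hypothetical names, and memcpy()
      stands in for a copy from user memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_HANDLE_SZ 128u      /* illustrative limit, not the kernel's */

struct handle {
    unsigned int bytes;         /* payload length claimed by the caller */
    int type;
    unsigned char data[];       /* payload follows the fixed header */
};

/* Returns a private copy, or NULL on a bad size or allocation failure. */
struct handle *fetch_handle(const struct handle *user)
{
    struct handle hdr;

    /* Read the fixed-size header exactly once and validate that copy. */
    memcpy(&hdr, user, sizeof(hdr));
    if (hdr.bytes == 0 || hdr.bytes > MAX_HANDLE_SZ)
        return NULL;

    struct handle *h = malloc(sizeof(*h) + hdr.bytes);
    if (!h)
        return NULL;

    /* Reuse the validated header; fetch only the payload from the caller. */
    *h = hdr;
    memcpy(h->data, user->data, hdr.bytes);
    return h;
}
```

      Because the size that was validated is also the size stored in the
      final structure, a concurrent change to the caller's header cannot
      make the two disagree.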
    • lib: Fix strnlen_user() to not touch memory after specified maximum · f18c34e4
      Jan Kara authored
      If the specified maximum length of the string is a multiple of the size
      of unsigned long, we would load one long beyond the specified maximum.  If
      that happens to be in the next page, we can hit a page fault although we
      were not expected to.
      Fix the off-by-one bug in the test whether we are at the end of the
      specified range.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
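      A simplified userspace illustration of the boundary condition, not the
      kernel's strnlen_user(): bounded_strnlen() is a hypothetical helper
      that scans a word at a time but only takes a whole-word load while a
      full word still fits below the maximum. Unlike the kernel function it
      returns the plain string length, or max if no NUL was found.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ONES  (~0UL / 0xff)     /* 0x0101...01 */
#define HIGHS (ONES * 0x80)     /* 0x8080...80 */

/* Never reads a byte at offset >= max. */
size_t bounded_strnlen(const char *s, size_t max)
{
    size_t off = 0;

    /* Word loads are allowed only while a full word fits below max. */
    while (max - off >= sizeof(unsigned long)) {
        unsigned long w;

        memcpy(&w, s + off, sizeof(w));
        if ((w - ONES) & ~w & HIGHS)    /* some byte in w is zero */
            break;
        off += sizeof(w);
    }

    /* Finish the tail (or locate the NUL inside the last word) bytewise. */
    while (off < max && s[off] != '\0')
        off++;
    return off;
}
```

      When max is an exact multiple of the word size, the first loop stops
      after the last word inside the range instead of loading one more.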
    • ia64: make cpu_callin_map non-volatile. · 5eda7861
      Rusty Russell authored
      cpumask_test_cpu() doesn't take volatile, unlike the obsoleted
      cpu_isset.  The only place ia64 really cares is the spin waiting for a
      bit; udelay() is probably a barrier but insert barrier() to be sure.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • dmaengine: hsu: Fix memory leak when stopping a running transfer · 42977082
      Peter Ujfalusi authored
      The vd->node is removed from the lists when the transfer is started, so
      vchan_get_all_descriptors() will not find it. This results in a memory leak.
      Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
      [andy: fix the typo to prevent a compilation error]
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Vinod Koul <vinod.koul@intel.com>
    • x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers · 425be567
      Andy Lutomirski authored
      The early_idt_handlers asm code generates an array of entry
      points spaced nine bytes apart.  It's not really clear from that
      code or from the places that reference it what's going on, and
      the code only works in the first place because GAS never
      generates two-byte JMP instructions when jumping to global
      Clean up the code to generate the correct array stride (member size)
      explicitly. This should be considerably more robust against
      screw-ups, as GAS will warn if a .fill directive has a negative
      count.  Using '. =' to advance would have been even more robust
      (it would generate an actual error if it tried to move
      backwards), but it would pad with nulls, confusing anyone who
      tries to disassemble the code.  The new scheme should be much
      clearer to future readers.
      While we're at it, improve the comments and rename the array and
      common code.
      Binutils may start relaxing jumps to non-weak labels.  If so,
      this change will fix our build, and we may need to backport this
      Before, on x86_64:
        0000000000000000 <early_idt_handlers>:
           0:   6a 00                   pushq  $0x0
           2:   6a 00                   pushq  $0x0
           4:   e9 00 00 00 00          jmpq   9 <early_idt_handlers+0x9>
                                5: R_X86_64_PC32        early_idt_handler-0x4
          48:   66 90                   xchg   %ax,%ax
          4a:   6a 08                   pushq  $0x8
          4c:   e9 00 00 00 00          jmpq   51 <early_idt_handlers+0x51>
                                4d: R_X86_64_PC32       early_idt_handler-0x4
         117:   6a 00                   pushq  $0x0
         119:   6a 1f                   pushq  $0x1f
         11b:   e9 00 00 00 00          jmpq   120 <early_idt_handler>
                                11c: R_X86_64_PC32      early_idt_handler-0x4
      After:
        0000000000000000 <early_idt_handler_array>:
           0:   6a 00                   pushq  $0x0
           2:   6a 00                   pushq  $0x0
           4:   e9 14 01 00 00          jmpq   11d <early_idt_handler_common>
          48:   6a 08                   pushq  $0x8
          4a:   e9 d1 00 00 00          jmpq   120 <early_idt_handler_common>
          4f:   cc                      int3
          50:   cc                      int3
         117:   6a 00                   pushq  $0x0
         119:   6a 1f                   pushq  $0x1f
         11b:   eb 03                   jmp    120 <early_idt_handler_common>
         11d:   cc                      int3
         11e:   cc                      int3
         11f:   cc                      int3
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Binutils <binutils@sourceware.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H.J. Lu <hjl.tools@gmail.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/ac027962af343b0c599cbfcf50b945ad2ef3d7a8.1432336324.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Revert "iommu/amd: Don't allocate with __GFP_ZERO in alloc_coherent" · 2d0ec7a1
      Joerg Roedel authored
      This reverts commit 5fc872c7
      The DMA-API does not strictly require that the memory
      returned by dma_alloc_coherent is zeroed out. For that
      another function (dma_zalloc_coherent) should be used. But
      all other x86 DMA-API implementations I checked zero out the
      memory, so that some drivers rely on it and break when it is
      It seems the (driver-)world is not yet ready for this
      change, so revert it.
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  5. 01 Jun, 2015 13 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c46a024e
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       1) Various VTI tunnel (mark handling, PMTU) bug fixes from Alexander
          Duyck and Steffen Klassert.
       2) Revert ethtool PHY query change, it wasn't correct.  The PHY address
          selected by the driver running the PHY to MAC connection decides
          what PHY address GET ethtool operations return information from.
       3) Fix handling of sequence number bits for encryption IV generation in
          ESP driver, from Herbert Xu.
       4) UDP can return -EAGAIN when we hit a bad checksum on receive, even
          when there are other packets in the receive queue, which is wrong.
          Just respect the error returned from the generic socket recv
          datagram helper.  From Eric Dumazet.
       5) Fix BNA driver firmware loading on big-endian systems, from Ivan
       6) Fix regression in that we were inheriting the congestion control of
          the listening socket for new connections, the intended behavior
          always was to use the default in this case.  From Neal Cardwell.
       7) Fix NULL deref in brcmfmac driver, from Arend van Spriel.
       8) OTP parsing fix in iwlwifi from Liad Kaufman.
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
        vti6: Add pmtu handling to vti6_xmit.
        Revert "net: core: 'ethtool' issue with querying phy settings"
        bnx2x: Move statistics implementation into semaphores
        xen: netback: read hotplug script once at start of day.
        xen: netback: fix printf format string warning
        Revert "netfilter: ensure number of counters is >0 in do_replace()"
        net: dsa: Properly propagate errors from dsa_switch_setup_one
        tcp: fix child sockets to use system default congestion control if not set
        udp: fix behavior of wrong checksums
        sfc: free multiple Rx buffers when required
        bna: fix soft lock-up during firmware initialization failure
        bna: remove unreasonable iocpf timer start
        bna: fix firmware loading on big-endian machines
        bridge: fix br_multicast_query_expired() bug
        via-rhine: Resigning as maintainer
        brcmfmac: avoid null pointer access when brcmf_msgbuf_get_pktid() fails
        mac80211: Fix mac80211.h docbook comments
        iwlwifi: nvm: fix otp parsing in 8000 hw family
        iwlwifi: pcie: fix tracking of cmd_in_flight
        ip_vti/ip6_vti: Preserve skb->mark after rcv_cb call
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 2459c609
      Linus Torvalds authored
      Pull Sparc fixes from David Miller:
       1) Setup the core/threads/sockets bitmaps correctly so that 'lscpus'
          and friends operate properly.  From Chris Hyser.
       2) The bit that normally means "Cached Virtually" on sun4v systems,
          actually changes meaning in M7 and later chips.  Fix from Khalid
       3) On some PCI-E systems we need to probe different OF properties to
          fill in the PCI slot information properly, from Eric Snowberg.
       4) Kill an extraneous memset after kzalloc(), from Christophe Jaillet.
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc: Resolve conflict between sparc v9 and M7 on usage of bit 9 of TTE
        sparc64: pci slots information is not populated in sysfs
        sparc: kernel: GRPCI2: Remove a useless memset
        sparc64: Setup sysfs to mark LDOM sockets, cores and threads correctly
    • drm/radeon: use proper ACR register for DCE3.2 · 091f0a70
      Alex Deucher authored
      We were using the DCE2 one by accident after the audio rework.
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · fec345ba
      Linus Torvalds authored
      Pull virtio fix from Michael Tsirkin:
       "Last-minute virtio fix for 4.1
        This tweaks an exported user-space header to fix build breakage for
        userspace using it"
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        include/uapi/linux/virtio_balloon.h: include linux/virtio_types.h
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · e453581d
      David S. Miller authored
      Pablo Neira Ayuso says:
      Netfilter fix for net
      The following patch reverts the ebtables chunk that enforces counters that was
      introduced in the recently applied d26e2c9f ('Revert "netfilter: ensure
      number of counters is >0 in do_replace()"') since this breaks ebtables.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'wireless-drivers-for-davem-2015-06-01' of... · cd842a67
      David S. Miller authored
      Merge tag 'wireless-drivers-for-davem-2015-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      Kalle Valo says:
      * fix OTP parsing 8260
      * fix powersave handling for 8260
      * fix null pointer crash
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vti6: Add pmtu handling to vti6_xmit. · ccd740cb
      Steffen Klassert authored
      We currently rely on the PMTU discovery of xfrm.
      However, if a packet is sent locally, the PMTU mechanism
      of xfrm tries to do local socket notification, which
      might not work for applications like ping that don't
      check for this. So add pmtu handling to vti6_xmit to
      report MTU changes immediately.
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Revert "net: core: 'ethtool' issue with querying phy settings" · 18ec898e
      David S. Miller authored
      This reverts commit f96dee13
      It isn't right, ethtool is meant to manage one PHY instance
      per netdevice at a time, and this is selected by the SET
      command.  Therefore by definition the GET command must only
      return the settings for the configured and selected PHY.
      Reported-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Move statistics implementation into semaphores · c6e36d8c
      Yuval Mintz authored
      Commit dff173de ("bnx2x: Fix statistics locking scheme") changed the
      bnx2x locking around statistics state into using a mutex - but the lock
      is being accessed via a timer which is forbidden.
      [If compiled with CONFIG_DEBUG_MUTEXES, logs show a warning about
      accessing the mutex in interrupt context]
      This moves the implementation into using a semaphore [with size '1']
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen: netback: read hotplug script once at start of day. · 31a41898
      Ian Campbell authored
      When we come to tear things down in netback_remove() and generate the
      uevent it is possible that the xenstore directory has already been
      removed (details below).
      In such cases netback_uevent() won't be able to read the hotplug
      script and will write a xenstore error node.
      A recent change to the hypervisor exposed this race such that we now
      sometimes lose it (where apparently we didn't ever before).
      Instead read the hotplug script configuration during setup and use it
      for the lifetime of the backend device.
      The apparently more obvious fix of moving the transition to
      state=Closed in netback_remove() to after the uevent does not work
      because it is possible that we are already in state=Closed (in
      reaction to the guest having disconnected as it shut down). Being
      already in Closed means the toolstack is at liberty to start tearing
      down the xenstore directories. In principle it might be possible to
      arrange to unregister the device sooner (e.g. on transition to Closing)
      such that xenstore would still be there but this state machine is
      fragile and prone to anger...
      A modern Xen system only relies on the hotplug uevent for driver
      domains, when the backend is in the same domain as the toolstack it
      will run the necessary setup/teardown directly in the correct sequence
      wrt xenstore changes.
      Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen: netback: fix printf format string warning · dc5e7a81
      Ian Campbell authored
      drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’:
      drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
              (txreq.offset&~PAGE_MASK) + txreq.size);
      PAGE_MASK's type can vary by arch, so a cast is needed.
      Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
      v2: Cast to unsigned long, since PAGE_MASK can vary by arch.
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
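      A small sketch of the kind of cast the message describes, assuming a
      stand-in mask: DEMO_PAGE_MASK and format_end_offset() are hypothetical
      names, and the mask is deliberately given type int to mimic an
      architecture where the expression would not already be unsigned long.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative stand-in; the real PAGE_MASK type differs between arches. */
#define DEMO_PAGE_MASK (~((1 << 12) - 1))      /* type: int */

int format_end_offset(char *buf, size_t len,
                      unsigned int offset, unsigned int size)
{
    /*
     * The expression's type follows its operands, so cast the result
     * explicitly to the type that "%lu" expects.
     */
    return snprintf(buf, len, "%lu",
                    (unsigned long)((offset & ~DEMO_PAGE_MASK) + size));
}
```

      The cast makes the argument type independent of how the mask happens
      to be defined, which is what silences the -Wformat warning.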
    • Revert "netfilter: ensure number of counters is >0 in do_replace()" · d26e2c9f
      Bernhard Thaler authored
      This partially reverts commit 1086bbe9 ("netfilter: ensure number of
      counters is >0 in do_replace()") in net/bridge/netfilter/ebtables.c.
      Setting rules with ebtables does not work any more with 1086bbe9 in place.
      There is an error message and no rules set in the end.
      ~# ebtables -t nat -A POSTROUTING --src 12:34:56:78:9a:bc -j DROP
      Unable to update the kernel. Two possible causes:
      1. Multiple ebtables programs were executing simultaneously. The ebtables
         userspace tool doesn't by default support multiple ebtables programs
      Reverting the ebtables part of 1086bbe9 makes this work again.
      Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • include/uapi/linux/virtio_balloon.h: include linux/virtio_types.h · 8a7b19d8
      Mikko Rapeli authored
      Fixes userspace compilation error:
      error: unknown type name ‘__virtio16’
        __virtio16 tag;
      Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  6. 31 May, 2015 6 commits
    • sparc: Resolve conflict between sparc v9 and M7 on usage of bit 9 of TTE · 494e5b6f
      Khalid Aziz authored
      Bit 9 of TTE is CV (Cacheable in V-cache) on sparc v9 processor while
      the same bit 9 is MCDE (Memory Corruption Detection Enable) on M7
      processor. This creates a conflicting usage of the same bit. Kernel
      sets TTE.cv bit on all pages for sun4v architecture which works well
      for sparc v9 but enables memory corruption detection on M7 processor
      which is not the intent. This patch adds code to determine if kernel
      is running on M7 processor and takes steps to not enable memory
      corruption detection in TTE erroneously.
      Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
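      The bit conflict can be modelled in a few lines. This is only an
      illustrative sketch: TTE_BIT9, enum chip and tte_cache_bits() are
      hypothetical names, and the real patch detects the processor and
      builds its page protections differently.

```c
#include <assert.h>
#include <stdint.h>

#define TTE_BIT9 (UINT64_C(1) << 9)   /* CV on sparc v9, MCDE on M7 */

enum chip { CHIP_V9_STYLE, CHIP_M7 }; /* hypothetical chip classes */

/* Return the TTE bits to use, never setting bit 9 on an M7. */
uint64_t tte_cache_bits(enum chip c, uint64_t base)
{
    if (c == CHIP_M7)
        return base & ~TTE_BIT9;      /* would mean MCDE: keep it clear */
    return base | TTE_BIT9;           /* means CV: cacheable in V-cache */
}
```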
    • sparc64: pci slots information is not populated in sysfs · f0c1a117
      Eric Snowberg authored
      Add PCI slot numbers within sysfs for PCIe hardware.  Larger
      PCIe systems with nested PCI bridges and slots further
      down on these bridges were not being populated within sysfs.
      This will add ACPI style PCI slot numbers for these systems
      since the OF 'slot-names' information is not available on
      all PCIe platforms.
      Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc: kernel: GRPCI2: Remove a useless memset · 8642ad1c
      Christophe Jaillet authored
      grpci2priv is allocated using kzalloc, so there is no need to memset it.
      Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: Properly propagate errors from dsa_switch_setup_one · 24595346
      Florian Fainelli authored
      While shuffling some code around, dsa_switch_setup_one() was introduced,
      and it was modified to return either an error code using ERR_PTR() or a
      NULL pointer when running out of memory or failing to set up a switch.
      This is a problem for its caller, dsa_switch_setup(), which uses IS_ERR()
      and expects to find an error code, not a NULL pointer, so we still try
      to proceed with dsa_switch_setup() and operate on invalid memory
      addresses. This can be easily reproduced by having e.g: the bcm_sf2
      driver built-in, but having no such switch, such that drv->setup will
      Fix this by using ERR_PTR() consistently, which is both more informative
      and spares the caller from having to use IS_ERR_OR_NULL().
      Fixes: df197195 ("net: dsa: split dsa_switch_setup into two functions")
      Reported-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Tested-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
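      Why a NULL return slips past an IS_ERR() check can be shown with
      simplified userspace stand-ins for the kernel helpers. The macros below
      only approximate include/linux/err.h, and struct sw and
      switch_setup_one() are hypothetical names for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct sw { int id; };

/* Every failure path returns ERR_PTR(), never NULL. */
struct sw *switch_setup_one(int id, int probe_ret)
{
    struct sw *s = malloc(sizeof(*s));

    if (!s)
        return ERR_PTR(-ENOMEM);
    if (probe_ret < 0) {
        free(s);
        return ERR_PTR(probe_ret);    /* propagate the real error code */
    }
    s->id = id;
    return s;
}
```

      In this model IS_ERR(NULL) is false, so a callee that mixed NULL and
      ERR_PTR() returns would let a caller checking only IS_ERR() carry on
      with a NULL pointer.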
    • tcp: fix child sockets to use system default congestion control if not set · 9f950415
      Neal Cardwell authored
      Linux 3.17 and earlier are explicitly engineered so that if the app
      doesn't specifically request a CC module on a listener before the SYN
      arrives, then the child gets the system default CC when the connection
      is established. See tcp_init_congestion_control() in 3.17 or earlier,
      which says "if no choice made yet assign the current value set as
      default". The change ("net: tcp: assign tcp cong_ops when tcp sk is
      created") altered these semantics, so that children got their parent
      listener's congestion control even if the system default had changed
      after the listener was created.
      This commit returns to those original semantics from 3.17 and earlier,
      since they are the original semantics from 2007 in 4d4d3d1e ("[TCP]:
      Congestion control initialization."), and some Linux congestion
      control workflows depend on that.
      In summary, if a listener socket specifically sets TCP_CONGESTION to
      "x", or the route locks the CC module to "x", then the child gets
      "x". Otherwise the child gets current system default from
      net.ipv4.tcp_congestion_control. That's the behavior in 3.17 and
      earlier, and this commit restores that.
      Fixes: 55d8694f ("net: tcp: assign tcp cong_ops when tcp sk is created")
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Daniel Borkmann <dborkman@redhat.com>
      Cc: Glenn Judd <glenn.judd@morganstanley.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
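      The selection rule summarized above can be modelled as a small decision
      function. This is an illustrative model rather than kernel code:
      struct listener, its fields and child_cc() are hypothetical, and a
      single flag covers both "set via TCP_CONGESTION" and "locked by the
      route".

```c
#include <assert.h>
#include <string.h>

struct listener {
    const char *cc;     /* congestion control currently on the listener */
    int explicit_cc;    /* non-zero if set by the app or locked by the route */
};

/* Pick the congestion control name for a newly established child socket. */
const char *child_cc(const struct listener *l, const char *current_default)
{
    if (l->explicit_cc)
        return l->cc;           /* an explicit choice is inherited */
    return current_default;     /* otherwise use today's system default */
}
```

      A listener created while the default was "cubic" therefore hands out
      "reno" children once the system default is switched to "reno", unless
      the choice was made explicitly.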
    • udp: fix behavior of wrong checksums · beb39db5
      Eric Dumazet authored
      We have two problems in the UDP stack related to bogus checksums:
      1) We return -EAGAIN to the application even if the receive queue is not empty.
         This breaks applications using edge-triggered epoll().
      2) Under UDP flood, we can loop forever without yielding to other
         processes, potentially hanging the host, especially on non-SMP.
      This patch is an attempt to make things better.
      We might in the future add extra support for rt applications
      wanting to better control time spent doing a recv() in a hostile
      environment. For example we could validate checksums before queuing
      packets in socket receive queue.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
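      The first problem can be illustrated with a toy receive queue. This is
      a sketch under stated assumptions, not the UDP code: struct pkt,
      struct rxq and recv_next() are hypothetical, and the yielding needed
      for the second problem is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct pkt { int id; int csum_ok; };

struct rxq {
    const struct pkt *pkts;
    size_t head;            /* index of the next packet to dequeue */
    size_t len;
};

/* Returns the id of the next valid packet, or -EAGAIN if none is queued. */
int recv_next(struct rxq *q)
{
    while (q->head < q->len) {
        const struct pkt *p = &q->pkts[q->head++];

        if (p->csum_ok)
            return p->id;
        /* Bad checksum: drop it and try the next queued packet. */
    }
    return -EAGAIN;         /* only once the queue is really empty */
}
```

      An edge-triggered epoll() user then sees -EAGAIN only when there is
      truly nothing left to read.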