1. 16 May, 2013 9 commits
  2. 08 May, 2013 1 commit
  3. 06 May, 2013 6 commits
    • tipc: potential divide by zero in tipc_link_recv_fragment() · 6bf15191
      Dan Carpenter authored
      The worry here is that fragm_sz could be zero since it comes from
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: add a bounds check in link_recv_changeover_msg() · cb4b102f
      Dan Carpenter authored
      The bearer_id here comes from skb->data and it can be a number from 0 to
      7.  The problem is that the ->links[] array has only 2 elements so I
      have added a range check.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
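The range check described in this commit can be sketched in user space roughly as follows. This is only an illustration: `struct link`, `struct node`, `MAX_BEARERS` and `lookup_link()` are hypothetical stand-ins, not the actual TIPC definitions or the actual patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BEARERS 2	/* the ->links[] array has only 2 elements */

/* Hypothetical stand-ins for the kernel structures. */
struct link { int id; };
struct node { struct link *links[MAX_BEARERS]; };

/* Return the link for bearer_id, or NULL if the id is out of range. */
static struct link *lookup_link(struct node *n, unsigned int bearer_id)
{
	if (bearer_id >= MAX_BEARERS)	/* an id read from the packet may be 0..7 */
		return NULL;
	return n->links[bearer_id];
}

/* Small self-check showing the intended behaviour. */
static void lookup_link_example(void)
{
	struct link l0 = { 0 };
	struct node n = { { &l0, NULL } };

	assert(lookup_link(&n, 0) == &l0);	/* in range */
	assert(lookup_link(&n, 7) == NULL);	/* rejected, no out-of-bounds read */
}
```

Taking the index as an unsigned value means a single comparison covers every out-of-range case.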
    • netpoll: inverted down_trylock() test · a3dbbc2b
      Dan Carpenter authored
      The return value is reversed from mutex_trylock().
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
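To make the reversed convention concrete, here is a small user-space model. The `fake_*` types and functions are hypothetical and only imitate the return-value conventions: down_trylock() returns 0 when the semaphore was acquired and 1 otherwise, while mutex_trylock() returns 1 when the mutex was acquired and 0 on contention.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space models, not the kernel implementations. */
struct fake_sem { int count; };
struct fake_mutex { bool locked; };

/* Models down_trylock(): 0 when acquired, 1 when it could not be. */
static int fake_down_trylock(struct fake_sem *s)
{
	if (s->count > 0) {
		s->count--;
		return 0;
	}
	return 1;
}

/* Models mutex_trylock(): 1 when acquired, 0 on contention. */
static int fake_mutex_trylock(struct fake_mutex *m)
{
	if (!m->locked) {
		m->locked = true;
		return 1;
	}
	return 0;
}

/* The "did I get the lock?" test has opposite polarity for the two. */
static void trylock_example(void)
{
	struct fake_sem s = { 1 };
	struct fake_mutex m = { false };

	assert(fake_down_trylock(&s) == 0);	/* acquired */
	assert(fake_mutex_trylock(&m) != 0);	/* acquired */
}
```

Code written with the mutex_trylock() habit, `if (down_trylock(&sem))`, therefore takes the success branch exactly when the semaphore was not acquired.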
    • rps_dev_flow_table_release(): no need to delay vfree() · 243198d0
      Al Viro authored
      The same story as with fib_trie patch - vfree() from RCU callbacks
      is legitimate now.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: no need to delay vfree() · 00203563
      Al Viro authored
      Now that vfree() can be called from interrupt contexts, there's no
      need to play games with schedule_work() to escape calling vfree()
      from RCU callbacks.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: frag, fix race conditions in LRU list maintenance · b56141ab
      Konstantin Khlebnikov authored
      This patch fixes a race between inet_frag_lru_move() and inet_frag_lru_add()
      which was introduced in commit 3ef0eb0d
      ("net: frag, move LRU list maintenance outside of rwlock").
      One CPU has already added a new fragment queue to the hash but not yet to the LRU.
      Another CPU finds it in the hash and tries to move it to the end of the LRU.
      This leads to a NULL pointer dereference inside list_move_tail().
      Another possible race condition is between inet_frag_lru_move() and
      inet_frag_lru_del(): a move can happen after deletion.
      This patch initializes the LRU list head before adding the fragment to the hash, and
      inet_frag_lru_move() doesn't touch it if it's empty.
      I saw this kernel oops two times in a couple of days.
      [119482.128853] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [119482.132693] IP: [<ffffffff812ede89>] __list_del_entry+0x29/0xd0
      [119482.136456] PGD 2148f6067 PUD 215ab9067 PMD 0
      [119482.140221] Oops: 0000 [#1] SMP
      [119482.144008] Modules linked in: vfat msdos fat 8021q fuse nfsd auth_rpcgss nfs_acl nfs lockd sunrpc ppp_async ppp_generic bridge slhc stp llc w83627ehf hwmon_vid snd_hda_codec_hdmi snd_hda_codec_realtek kvm_amd k10temp kvm snd_hda_intel snd_hda_codec edac_core radeon snd_hwdep ath9k snd_pcm ath9k_common snd_page_alloc ath9k_hw snd_timer snd soundcore drm_kms_helper ath ttm r8169 mii
      [119482.152692] CPU 3
      [119482.152721] Pid: 20, comm: ksoftirqd/3 Not tainted 3.9.0-zurg-00001-g9f95269 #132 To Be Filled By O.E.M. To Be Filled By O.E.M./RS880D
      [119482.161478] RIP: 0010:[<ffffffff812ede89>]  [<ffffffff812ede89>] __list_del_entry+0x29/0xd0
      [119482.166004] RSP: 0018:ffff880216d5db58  EFLAGS: 00010207
      [119482.170568] RAX: 0000000000000000 RBX: ffff88020882b9c0 RCX: dead000000200200
      [119482.175189] RDX: 0000000000000000 RSI: 0000000000000880 RDI: ffff88020882ba00
      [119482.179860] RBP: ffff880216d5db58 R08: ffffffff8155c7f0 R09: 0000000000000014
      [119482.184570] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88020882ba00
      [119482.189337] R13: ffffffff81c8d780 R14: ffff880204357f00 R15: 00000000000005a0
      [119482.194140] FS:  00007f58124dc700(0000) GS:ffff88021fcc0000(0000) knlGS:0000000000000000
      [119482.198928] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [119482.203711] CR2: 0000000000000000 CR3: 00000002155f0000 CR4: 00000000000007e0
      [119482.208533] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [119482.213371] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [119482.218221] Process ksoftirqd/3 (pid: 20, threadinfo ffff880216d5c000, task ffff880216d3a9a0)
      [119482.223113] Stack:
      [119482.228004]  ffff880216d5dbd8 ffffffff8155dcda 0000000000000000 ffff000200000001
      [119482.233038]  ffff8802153c1f00 ffff880000289440 ffff880200000014 ffff88007bc72000
      [119482.238083]  00000000000079d5 ffff88007bc72f44 ffffffff00000002 ffff880204357f00
      [119482.243090] Call Trace:
      [119482.248009]  [<ffffffff8155dcda>] ip_defrag+0x8fa/0xd10
      [119482.252921]  [<ffffffff815a8013>] ipv4_conntrack_defrag+0x83/0xe0
      [119482.257803]  [<ffffffff8154485b>] nf_iterate+0x8b/0xa0
      [119482.262658]  [<ffffffff8155c7f0>] ? inet_del_offload+0x40/0x40
      [119482.267527]  [<ffffffff815448e4>] nf_hook_slow+0x74/0x130
      [119482.272412]  [<ffffffff8155c7f0>] ? inet_del_offload+0x40/0x40
      [119482.277302]  [<ffffffff8155d068>] ip_rcv+0x268/0x320
      [119482.282147]  [<ffffffff81519992>] __netif_receive_skb_core+0x612/0x7e0
      [119482.286998]  [<ffffffff81519b78>] __netif_receive_skb+0x18/0x60
      [119482.291826]  [<ffffffff8151a650>] process_backlog+0xa0/0x160
      [119482.296648]  [<ffffffff81519f29>] net_rx_action+0x139/0x220
      [119482.301403]  [<ffffffff81053707>] __do_softirq+0xe7/0x220
      [119482.306103]  [<ffffffff81053868>] run_ksoftirqd+0x28/0x40
      [119482.310809]  [<ffffffff81074f5f>] smpboot_thread_fn+0xff/0x1a0
      [119482.315515]  [<ffffffff81074e60>] ? lg_local_lock_cpu+0x40/0x40
      [119482.320219]  [<ffffffff8106d870>] kthread+0xc0/0xd0
      [119482.324858]  [<ffffffff8106d7b0>] ? insert_kthread_work+0x40/0x40
      [119482.329460]  [<ffffffff816c32dc>] ret_from_fork+0x7c/0xb0
      [119482.334057]  [<ffffffff8106d7b0>] ? insert_kthread_work+0x40/0x40
      [119482.338661] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08
      [119482.343787] RIP  [<ffffffff812ede89>] __list_del_entry+0x29/0xd0
      [119482.348675]  RSP <ffff880216d5db58>
      [119482.353493] CR2: 0000000000000000
      Oops happened on this path:
      ip_defrag() -> ip_frag_queue() -> inet_frag_lru_move() -> list_move_tail() -> __list_del_entry()
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: David S. Miller <davem@davemloft.net>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
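The idea behind the fix (initialize the list head early, and skip the move when the entry is not linked) can be sketched in user space as below. This is a simplified model under stated assumptions: the list helpers are hand-written imitations of the kernel's, `lru_move()` is a hypothetical name, and the locking that the real code needs around the LRU list is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal doubly linked list, modelled on the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail_(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink the entry and leave it self-pointing, i.e. "empty". */
static void list_del_init_(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	init_list_head(entry);
}

/* Move entry to the tail of lru, but only if it is currently linked. */
static void lru_move(struct list_head *entry, struct list_head *lru)
{
	if (list_is_empty(entry))	/* not yet added, or already deleted */
		return;
	list_del_init_(entry);
	list_add_tail_(entry, lru);
}
```

Because an initialized but unlinked entry points at itself, the emptiness test covers both races from the commit message: a move that arrives before the add, and a move that arrives after the delete.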
  4. 05 May, 2013 1 commit
  5. 03 May, 2013 8 commits
  6. 02 May, 2013 5 commits
  7. 01 May, 2013 10 commits
    • libceph: create source file "net/ceph/snapshot.c" · 4f0dcb10
      Alex Elder authored
      This creates a new source file "net/ceph/snapshot.c" to contain
      utility routines related to ceph snapshot contexts.  The main
      motivation was to define ceph_create_snap_context() as a common way
      to create these structures, but I've moved the definitions of
      ceph_get_snap_context() and ceph_put_snap_context() there too.
      (The benefit of inlining those is very small, and I'd rather
      keep this collection of functions together.)
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • libceph: fix byte order mismatch · 9ef1ee5a
      Alex Elder authored
      A WATCH op includes an object version.  The version that's supplied
      is incorrectly byte-swapped in osd_req_op_watch_init() where it's first
      assigned (it's been this way since that code was first added).
      The result is that the version sent to the osd is wrong, because
      that value gets byte-swapped again in osd_req_encode_op().  This
      is the source of a sparse warning related to improper byte order in
      the assignment.
      The approach of using the version to avoid a race is deprecated
      (see http://tracker.ceph.com/issues/3871), and the watch parameter
      is no longer even examined by the osd.  So fix the assignment in
      osd_req_op_watch_init() so it no longer does the byte swap.
      This resolves:
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
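The double swap can be illustrated with a small user-space sketch. All names here (`swap64()`, `watch_init_buggy()`, `watch_init_fixed()`, `encode_op()`) are hypothetical, and `swap64()` models what a CPU-to-little-endian conversion does on a big-endian host; on a little-endian host that conversion is a no-op, so the two swaps would not change the value there.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 64-bit value. */
static uint64_t swap64(uint64_t v)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);
		v >>= 8;
	}
	return r;
}

/* Buggy init: converts to wire order too early. */
static uint64_t watch_init_buggy(uint64_t version)
{
	return swap64(version);
}

/* Fixed init: stores the value in CPU order. */
static uint64_t watch_init_fixed(uint64_t version)
{
	return version;
}

/* Encoding converts the stored value to wire order. */
static uint64_t encode_op(uint64_t stored)
{
	return swap64(stored);
}

static void byte_order_example(void)
{
	uint64_t ver = 0x0102030405060708ULL;

	/* One swap: correct wire value. */
	assert(encode_op(watch_init_fixed(ver)) == 0x0807060504030201ULL);
	/* Two swaps cancel out: the value goes out unconverted. */
	assert(encode_op(watch_init_buggy(ver)) == ver);
}
```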
    • libceph: support pages for class request data · 6c57b554
      Alex Elder authored
      Add the ability to provide an array of pages as outbound request
      data for object class method calls.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • libceph: fix two messenger bugs · a51b272e
      Alex Elder authored
      This patch makes four small changes in the ceph messenger.
      While getting copyup functionality working I found two bugs in the
      messenger.  Existing paths through the code did not trigger these
      problems, but they're fixed here:
          - In ceph_msg_data_pagelist_cursor_init(), the cursor's
            last_piece field was being checked against the length
            supplied.  This was OK until commit ccba6d98 ("libceph:
            implement multiple data items in a message").  That commit changed
            the cursor init routines to allow lengths to be supplied that
            exceeded the size of the current data item. Because of this,
            we have to use the assigned cursor resid field rather than the
            provided length in determining whether the cursor points to
            the last piece of a data item.
          - In ceph_msg_data_add_pages(), a BUG_ON() was erroneously
            catching attempts to add page data to a message if the message
            already had data assigned to it. That was OK until that same
            commit, at which point it was fine for messages to have
            multiple data items. It slipped through because that BUG_ON()
            call was present twice in that function. (You can never be too
      In addition two other minor things are changed:
          - In ceph_msg_data_cursor_init(), the local variable "data" was
            getting assigned twice.
          - In ceph_msg_data_advance(), it was assumed that the
            type-specific advance routine would set new_piece to true
            after it advanced past the last piece. That may have been
            fine, but since we check for that case we might as well set it
            explicitly in ceph_msg_data_advance().
      This resolves:
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • libceph: support raw data requests · 49719778
      Alex Elder authored
      Allow osd request ops that aren't otherwise structured (not class,
      extent, or watch ops) to specify "raw" data to be used to hold
      incoming data for the op.  Make use of this capability for the osd
      STAT op.
      Prefix the name of the private function osd_req_op_init() with "_",
      and expose a new function by that (earlier) name whose purpose is to
      initialize osd ops with (only) implied data.
      For now we'll just support the use of a page array for an osd op
      with incoming raw data.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • libceph: clean up osd data field access functions · 863c7eb5
      Alex Elder authored
      There are a bunch of functions defined to encapsulate getting the
      address of a data field for a particular op in an osd request.
      They're all defined the same way, so create a macro to take the
      place of all of them.
      Two of these are used outside the osd client code, so preserve them
      (but convert them to use the new macro internally).  Stop exporting
      the ones that aren't used elsewhere.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
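A rough sketch of the pattern this commit describes: one macro replaces a family of identically shaped accessor functions, and the few accessors that must stay as functions become thin wrappers around it. The structure layout and every name below (`OP_DATA`, `op_extent_data()`, and so on) are hypothetical simplifications, not the libceph definitions.

```c
#include <assert.h>

/* Hypothetical, simplified request layout. */
struct osd_data { int type; };

struct osd_op {
	struct { struct osd_data osd_data; } extent;
	struct {
		struct osd_data request_data;
		struct osd_data response_data;
	} cls;
};

struct osd_request {
	unsigned int num_ops;
	struct osd_op ops[2];
};

/* Address of data field `field` of op number `which`, viewed as op kind `kind`. */
#define OP_DATA(req, which, kind, field) \
	(assert((which) < (req)->num_ops), &(req)->ops[(which)].kind.field)

/* Accessors that must remain functions simply wrap the macro. */
static struct osd_data *op_extent_data(struct osd_request *req,
				       unsigned int which)
{
	return OP_DATA(req, which, extent, osd_data);
}

static struct osd_data *op_cls_response_data(struct osd_request *req,
					     unsigned int which)
{
	return OP_DATA(req, which, cls, response_data);
}
```

Internal callers can use the macro directly, so only accessors needed by other files have to exist as named functions.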
    • libceph: kill off osd data write_request parameters · 406e2c9f
      Alex Elder authored
      In the incremental move toward supporting distinct data items in an
      osd request some of the functions had "write_request" parameters to
      indicate, basically, whether the data belonged to in_data or the
      out_data.  Now that we maintain the data fields in the op structure
      there is no need to indicate the direction, so get rid of the
      "write_request" parameters.
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • libceph: change how "safe" callback is used · 26be8808
      Alex Elder authored
      An osd request currently has two callbacks.  They inform the
      initiator of the request when we've received confirmation from the
      target osd that a request was received, and when the osd indicates
      all changes described by the request are durable.
      The only time the second callback is used is in the ceph file system
      for a synchronous write.  There's a race that makes some handling of
      this case unsafe.  This patch addresses this problem.  The error
      handling for this callback is also kind of gross, and this patch
      changes that as well.
      In ceph_sync_write(), if a safe callback is requested we want to add
      the request on the ceph inode's unsafe items list.  Because items on
      this list must have their tid set (by ceph_osd_start_request()), the
      request is added *after* the call to that function returns.  The
      problem with this is that there's a race between starting the
      request and adding it to the unsafe items list; the request may
      already be complete before ceph_sync_write() even begins to put it
      on the list.
      To address this, we change the way the "safe" callback is used.
      Rather than just calling it when the request is "safe", we use it to
      notify the initiator of the bounds (start and end) of the period during
      which the request is *unsafe*.  So the initiator gets notified just
      before the request gets sent to the osd (when it is "unsafe"), and
      again when it's known the results are durable (it's no longer
      unsafe).  The first call will get made in __send_request(), just
      before the request message gets sent to the messenger for the first
      time.  That function is only called by __send_queued(), which is
      always called with the osd client's request mutex held.
      We then have this callback function insert the request on the ceph
      inode's unsafe list when we're told the request is unsafe.  This
      will avoid the race because this call will be made under protection
      of the osd client's request mutex.  It also nicely groups the setup
      and cleanup of the state associated with managing unsafe requests.
      The name of the "safe" callback field is changed to "unsafe" to
      better reflect its new purpose.  It has a Boolean "unsafe" parameter
      to indicate whether the request is becoming unsafe or is now safe.
      Because the "msg" parameter wasn't used, we drop that.
      This resolves the original problem reported in:
      Reported-by: Yan, Zheng <zheng.z.yan@intel.com>
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
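The reworked callback can be pictured with the following user-space sketch. It is only a model: the structure, the boolean standing in for membership on the inode's unsafe list, and the function names are all hypothetical, and the request mutex that makes the real sequence race-free is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct request;

/* Called with unsafe=true just before sending, unsafe=false once durable. */
typedef void (*unsafe_cb)(struct request *req, bool unsafe);

struct request {
	unsafe_cb unsafe_callback;
	bool on_unsafe_list;	/* stands in for the inode's unsafe list */
};

/* Initiator side: setup and cleanup of the unsafe state in one place. */
static void sync_write_unsafe_cb(struct request *req, bool unsafe)
{
	req->on_unsafe_list = unsafe;
}

/* Sender side: mark the request unsafe before handing it off. */
static void send_request(struct request *req)
{
	if (req->unsafe_callback)
		req->unsafe_callback(req, true);
	/* ... the message would be passed to the messenger here ... */
}

/* Reply side: the osd reported that all changes are durable. */
static void handle_durable_reply(struct request *req)
{
	if (req->unsafe_callback)
		req->unsafe_callback(req, false);
}
```

Since the request is put on the list from inside the send path, it cannot complete before it has been listed, which is the window the commit message describes.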
    • libceph: make method call data be a separate data item · 04017e29
      Alex Elder authored
      Right now the data for a method call is specified via a pointer and
      length, and it's copied--along with the class and method name--into
      a pagelist data item to be sent to the osd.  Instead, encode the
      data in a data item separate from the class and method names.
      This will allow large amounts of data to be supplied to methods
      without copying.  Only rbd uses the class functionality right now,
      and when it really needs this it will probably need to use a page
      array rather than a page list.  But this simple implementation
      demonstrates the functionality on the osd client, and that's enough
      for now.
      This resolves:
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
    • libceph: add, don't set data for a message · 90af3602
      Alex Elder authored
      Change the names of the functions that put data on a pagelist to
      reflect that we're adding to whatever's already there rather than
      just setting it to the one thing.  Currently only one data item is
      ever added to a message, but that's about to change.
      This resolves:
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Josh Durgin <josh.durgin@inktank.com>