1. 15 Oct, 2009 1 commit
  2. 14 Oct, 2009 1 commit
  3. 11 Oct, 2009 2 commits
  4. 09 Oct, 2009 1 commit
  5. 07 Oct, 2009 3 commits
  6. 06 Oct, 2009 4 commits
  7. 05 Oct, 2009 7 commits
  8. 04 Oct, 2009 2 commits
    • Revert "Seperate read and write statistics of in_flight requests" · 0f78ab98
      Jens Axboe authored
      This reverts commit a9327cac.
      
      Corrado Zoccolo <czoccolo@gmail.com> reports:
      
      "with 2.6.32-rc1 I started getting the following strange output from
      "iostat -kx 2":
      Linux 2.6.31bisect (et2) 	04/10/2009 	_i686_	(2 CPU)
      
      avg-cpu:  %user   %nice %system %iowait  %steal   %idle
                10,70    0,00    3,16   15,75    0,00   70,38
      
      Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
      sda              18,22     0,00    0,67    0,01    14,77     0,02    43,94     0,01   10,53 39043915,03 2629219,87
      sdb              60,89     9,68   50,79    3,04  1724,43    50,52    65,95     0,70   13,06 488437,47 2629219,87
      
      avg-cpu:  %user   %nice %system %iowait  %steal   %idle
                 2,72    0,00    0,74    0,00    0,00   96,53
      
      Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
      sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00   0,00 100,00
      sdb               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00   0,00 100,00
      
      avg-cpu:  %user   %nice %system %iowait  %steal   %idle
                 6,68    0,00    0,99    0,00    0,00   92,33
      
      Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
      sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00   0,00 100,00
      sdb               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00   0,00 100,00
      
      avg-cpu:  %user   %nice %system %iowait  %steal   %idle
                 4,40    0,00    0,73    1,47    0,00   93,40
      
      Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
      sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00   0,00 100,00
      sdb               0,00     4,00    0,00    3,00     0,00    28,00    18,67     0,06   19,50 333,33 100,00
      
      Global values for service time and utilization are garbage. For
      interval values, utilization is always 100%, and service time is
      higher than normal.
      
      I bisected it down to:
      [a9327cac] Seperate read and write statistics of in_flight requests
      and verified that reverting just that commit indeed solves the issue
      on 2.6.32-rc1."
      
      So until this is debugged, revert the bad commit.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      0f78ab98
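      The two columns that went wrong in the report are both derived from a
      device busy-time counter. The following is a minimal sketch, assuming the
      conventional definitions (busy time divided by the interval for %util,
      busy time divided by completed I/Os for svctm). The struct and function
      names are illustrative and are not taken from sysstat or the kernel.

```c
#include <stdint.h>

/* Hypothetical snapshot of the per-device counters a tool would sample. */
struct disk_sample {
    uint64_t ios_completed; /* reads + writes completed so far */
    uint64_t busy_ms;       /* cumulative ms the device had I/O in flight */
};

/* %util: share of the sampling interval during which the device was busy. */
double util_percent(const struct disk_sample *prev,
                    const struct disk_sample *cur,
                    double interval_ms)
{
    if (interval_ms <= 0.0)
        return 0.0;
    return 100.0 * (double)(cur->busy_ms - prev->busy_ms) / interval_ms;
}

/* svctm: busy time per completed I/O, in ms (0 when nothing completed). */
double svctm_ms(const struct disk_sample *prev,
                const struct disk_sample *cur)
{
    uint64_t ios = cur->ios_completed - prev->ios_completed;

    if (ios == 0)
        return 0.0;
    return (double)(cur->busy_ms - prev->busy_ms) / (double)ios;
}
```

      Under these assumptions, a device accounted as busy for a whole 2-second
      interval while completing 6 writes gives 100% utilization and
      2000 / 6 = 333.33 ms service time, which is consistent with the last sdb
      row in the report. Both columns depend on the same busy-time counter, so
      a mis-accounted counter distorts them together.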
  9. 03 Oct, 2009 2 commits
  10. 02 Oct, 2009 4 commits
  11. 01 Oct, 2009 6 commits
    • memcg: some modification to softlimit under hierarchical memory reclaim. · 4e649152
      KAMEZAWA Hiroyuki authored
      
      
      This patch cleans up and fixes memcg's uncharge soft limit path.

      Problems:
        Currently, res_counter_charge()/uncharge() handle soft limit information
        at charge/uncharge, and the soft limit check is done when the per-memcg
        event counter goes over its limit. The per-memcg event counter is updated
        only when memory usage is over the soft limit. With hierarchical memcg
        management, ancestors should be taken care of as well.

        Ancestors (the hierarchy) are handled in charge() but not in uncharge().
        This is not good.

        In detail:
        1. memcg's event counter is incremented only when the soft limit is hit.
           That's bad: it makes the event counter hard to reuse for other purposes.

        2. At uncharge, only the lowest-level res_counter is handled. This is a
           bug: because the ancestors' event counters are not incremented, the
           children must take care of them.

        3. res_counter_uncharge()'s third argument is NULL in most cases.
           Operations under res_counter->lock should be small, so avoiding the
           "if" statement is better.

      Fixes:
        * Removed the soft_limit_xx pointer and its checks in charge and uncharge.
          The do-check-only-when-necessary scheme works well enough without them.

        * Make the memcg event counter increment at every charge/uncharge.
          (The per-cpu area will be accessed soon anyway.)

        * All ancestors are checked at soft-limit-check. This is necessary because
          an ancestor's event counter may never be modified, so the ancestors
          must be checked at the same time.
      Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Paul Menage <menage@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4e649152
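      The scheme described in the fixes (count an event on every
      charge/uncharge, then check the whole ancestor chain at soft-limit-check
      time) can be sketched in user space as follows. This is a simplified
      model under stated assumptions: struct group and all function names are
      hypothetical and do not correspond to the kernel's res_counter or memcg
      code.

```c
#include <stddef.h>

/* Simplified, hypothetical model of a memory cgroup in a hierarchy. */
struct group {
    struct group *parent;     /* NULL for the root */
    unsigned long usage;      /* current charge, in pages */
    unsigned long soft_limit; /* soft limit, in pages */
    unsigned long events;     /* bumped on every charge/uncharge */
};

/* Usage above the soft limit, or 0 if the group is within it. */
unsigned long soft_limit_excess(const struct group *g)
{
    return g->usage > g->soft_limit ? g->usage - g->soft_limit : 0;
}

/* Charge: usage changes at every level, but only the group that was
 * charged directly counts the event. No soft limit test happens here. */
void charge(struct group *g, unsigned long pages)
{
    g->events++;
    for (; g != NULL; g = g->parent)
        g->usage += pages;
}

/* Uncharge mirrors charge, again without any soft limit test. */
void uncharge(struct group *g, unsigned long pages)
{
    g->events++;
    for (; g != NULL; g = g->parent)
        g->usage -= pages;
}

/* Soft limit check, run only when the event counter says it is due:
 * walk from the group up to the root and count every group in excess.
 * Ancestors are included because their own event counters may never
 * change. */
int check_soft_limits(const struct group *g)
{
    int over = 0;

    for (; g != NULL; g = g->parent)
        if (soft_limit_excess(g) > 0)
            over++;
    return over;
}
```

      The point of the design is that the hot charge/uncharge path stays
      branch-free with respect to soft limits, while the less frequent check
      covers the whole hierarchy in one pass.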
    • const: constify remaining file_operations · 828c0950
      Alexey Dobriyan authored
      
      
      [akpm@linux-foundation.org: fix KVM]
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Acked-by: Mike Frysinger <vapier@gentoo.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      828c0950
    • Add a tracepoint for block request remapping · b0da3f0d
      Jun'ichi Nomura authored
      
      
      Since 2.6.31 now has request-based device-mapper, it's useful to have
      a tracepoint for request-remapping as well as bio-remapping.
      This patch adds a tracepoint for request-remapping, trace_block_rq_remap().
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      b0da3f0d
    • block: allow large discard requests · 67efc925
      Christoph Hellwig authored
      
      
      Currently we set the bio size to the byte equivalent of the blocks to
      be trimmed when submitting the initial DISCARD ioctl.  That means it
      is subject to the max_hw_sectors limitation of the HBA which is
      much lower than the size of a DISCARD request we can support.
      Add a separate max_discard_sectors tunable to limit the size for discard
      requests.
      
      We limit the max discard request size in bytes to 32bit as that is the
      limit for bio->bi_size.  This could be much larger if we had a way to pass
      that information through the block layer.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      67efc925
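      The two limits mentioned above (the queue's max_discard_sectors and the
      32-bit byte size of a bio) combine into a single per-request cap. A
      minimal sketch, assuming 512-byte sectors; the function names are
      hypothetical helpers, not kernel APIs.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9 /* assumption: 512-byte sectors */

/* Largest discard request, in sectors: bounded by the queue's tunable and
 * by what still fits in a 32-bit byte count. */
uint64_t discard_chunk_limit(uint32_t max_discard_sectors)
{
    uint64_t cap = UINT32_MAX >> SECTOR_SHIFT;

    return max_discard_sectors < cap ? max_discard_sectors : cap;
}

/* Number of requests needed to discard nr_sects sectors. A limit of 0 is
 * treated here as "nothing can be issued" and yields 0. */
uint64_t discard_request_count(uint64_t nr_sects, uint32_t max_discard_sectors)
{
    uint64_t limit = discard_chunk_limit(max_discard_sectors);

    if (limit == 0)
        return 0;
    return (nr_sects + limit - 1) / limit; /* round up */
}
```

      With these definitions, even an unlimited tunable is capped at 8388607
      sectors per request (just under 4 GiB), which is the 32-bit restriction
      the message refers to.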
    • block: use normal I/O path for discard requests · c15227de
      Christoph Hellwig authored
      
      
      prepare_discard_fn() was being called in a place where memory allocation
      was effectively impossible.  This makes it inappropriate for all but
      the most trivial translations of Linux's DISCARD operation to the block
      command set.  Additionally adding a payload there makes the ownership
      of the bio backing unclear as it's now allocated by the device driver
      and not the submitter as usual.
      
      It is replaced with QUEUE_FLAG_DISCARD which is used to indicate whether
      the queue supports discard operations or not.  blkdev_issue_discard now
      allocates a one-page, sector-length payload which is the right thing
      for the common ATA and SCSI implementations.
      
      The mtd implementation of prepare_discard_fn() is replaced with simply
      checking for the request being a discard.
      
      Largely based on a previous patch from Matthew Wilcox <matthew@wil.cx>
      which did the prepare_discard_fn but not the different payload allocation
      yet.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      c15227de
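      The resulting control flow (test a capability flag on the queue, then
      let the submitter allocate the payload) can be modelled in user space
      as below. This is only a sketch: struct queue, the bit position of
      QUEUE_FLAG_DISCARD, issue_discard() and the use of calloc() for a
      sector-sized buffer are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdlib.h>

#define QUEUE_FLAG_DISCARD (1u << 0) /* bit position is arbitrary here */
#define SECTOR_SIZE 512              /* assumption: 512-byte sectors */

/* Hypothetical stand-in for a request queue. */
struct queue {
    unsigned int flags;
};

/* Submitter-side discard: refuse early if the queue does not advertise
 * discard support, otherwise allocate the payload in the submitter so its
 * ownership is unambiguous. Returns 0 or a negative errno value. */
int issue_discard(const struct queue *q, void **payload_out)
{
    void *payload;

    if (!(q->flags & QUEUE_FLAG_DISCARD))
        return -EOPNOTSUPP;

    payload = calloc(1, SECTOR_SIZE);
    if (payload == NULL)
        return -ENOMEM;

    *payload_out = payload; /* caller owns and frees the payload */
    return 0;
}
```

      Because the allocation happens in the submitter's context, it is not
      subject to the constraint that made prepare_discard_fn() unsuitable.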
    • Add missing blk_trace_remove_sysfs to be in pair with blk_trace_init_sysfs · 48c0d4d4
      Zdenek Kabelac authored
      Add missing blk_trace_remove_sysfs to be in pair with blk_trace_init_sysfs
      introduced in commit 1d54ad6d.
      Release kobject also in case the request_fn is NULL.

      Problem was noticed via kmemleak backtrace when some sysfs entries were
      not properly destroyed during device removal:
      
      unreferenced object 0xffff88001aa76640 (size 80):
        comm "lvcreate", pid 2120, jiffies 4294885144
        hex dump (first 32 bytes):
          01 00 00 00 00 00 00 00 f0 65 a7 1a 00 88 ff ff  .........e......
          90 66 a7 1a 00 88 ff ff 86 1d 53 81 ff ff ff ff  .f........S.....
        backtrace:
          [<ffffffff813f9cc6>] kmemleak_alloc+0x26/0x60
          [<ffffffff8111d693>] kmem_cache_alloc+0x133/0x1c0
          [<ffffffff81195891>] sysfs_new_dirent+0x41/0x120
          [<ffffffff81194b0c>] sysfs_add_file_mode+0x3c/0xb0
          [<ffffffff81197c81>] internal_create_group+0xc1/0x1a0
          [<ffffffff81197d93>] sysfs_create_group+0x13/0x20
          [<ffffffff810d8004>] blk_trace_init_sysfs+0x14/0x20
          [<ffffffff8123f45c>] blk_register_queue+0x3c/0xf0
          [<ffffffff812447e4>] add_disk+0x94/0x160
          [<ffffffffa00d8b08>] dm_create+0x598/0x6e0 [dm_mod]
          [<ffffffffa00de951>] dev_create+0x51/0x350 [dm_mod]
          [<ffffffffa00de823>] ctl_ioctl+0x1a3/0x240 [dm_mod]
          [<ffffffffa00de8f2>] dm_compat_ctl_ioctl+0x12/0x20 [dm_mod]
          [<ffffffff81177bfd>] compat_sys_ioctl+0xcd/0x4f0
          [<ffffffff81036ed8>] sysenter_dispatch+0x7/0x2c
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: Zdenek Kabelac <zkabelac@redhat.com>
      Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      48c0d4d4
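      The pairing rule behind this fix is that everything created on the
      register path must be undone on the unregister path, including when
      request_fn is NULL. The following toy model illustrates only that rule.
      The struct, its counters and both functions are hypothetical and do not
      reproduce the kernel's blk_register_queue() logic.

```c
#include <stdbool.h>

/* Toy model of a queue and the resources registration creates for it. */
struct queue {
    bool has_request_fn; /* false models a queue whose request_fn is NULL */
    int sysfs_entries;   /* live sysfs entries created for this queue */
    int kobj_refs;       /* references held on the device kobject */
};

void register_queue(struct queue *q)
{
    q->kobj_refs++;     /* in this model, taken unconditionally */
    q->sysfs_entries++; /* stands in for blk_trace_init_sysfs() */
    if (!q->has_request_fn)
        return;
    q->sysfs_entries++; /* stands in for elevator registration */
}

void unregister_queue(struct queue *q)
{
    if (q->has_request_fn)
        q->sysfs_entries--; /* undo elevator registration */
    q->sysfs_entries--;     /* stands in for blk_trace_remove_sysfs() */
    q->kobj_refs--;         /* released even when request_fn is NULL */
}
```

      If the last two steps of unregister_queue() were placed inside the
      has_request_fn branch, a queue without a request_fn would keep one sysfs
      entry and one reference alive, which is the kind of leak kmemleak
      reported.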
  12. 30 Sep, 2009 2 commits
  13. 29 Sep, 2009 1 commit
  14. 28 Sep, 2009 1 commit
    • tracing: Pushdown the bkl tracepoints calls · 925936eb
      Frederic Weisbecker authored
      
      
      Currently we are calling the bkl tracepoint callbacks just before the
      bkl lock/unlock operations, i.e. the tracepoint call is not inside a
      lock_kernel() function but inside a lock_kernel() macro. Hence the
      bkl trace event header must be included from smp_lock.h. This raises
      some nasty circular header dependencies:
      
      linux/smp_lock.h -> trace/events/bkl.h -> trace/define_trace.h
      -> trace/ftrace.h -> linux/ftrace_event.h -> linux/hardirq.h
      -> linux/smp_lock.h
      
      This results in incomplete event declarations, spurious event
      definitions and other kinds of funny behaviour.
      
      This is hardly fixable without ugly workarounds. So instead, we push
      the file name, line number and function name as lock_kernel()
      parameters, so that we only deal with the trace event header from
      lib/kernel_lock.c
      
      This adds two parameters to lock_kernel() and unlock_kernel() but
      it should be fine with respect to performance because this pair does
      not seem to be called in fast paths.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      925936eb
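      The pushdown pattern described here keeps the macro trivial and moves
      the tracing into an out-of-line function, so only one .c file needs the
      trace header. A user-space sketch of the idea follows. The names
      my_lock(), my_lock_at() and struct call_site are hypothetical, and a
      global record stands in for the real tracepoint.

```c
/* Where the most recent lock or unlock call came from. */
struct call_site {
    const char *func;
    const char *file;
    int line;
};

struct call_site last_lock_site; /* stand-in for the trace event */
int lock_depth;                  /* stand-in for the lock itself */

/* Out-of-line implementation: the only place that would need to include
 * the trace event header. */
void my_lock_at(const char *func, const char *file, int line)
{
    last_lock_site.func = func;
    last_lock_site.file = file;
    last_lock_site.line = line;
    lock_depth++;
}

void my_unlock_at(const char *func, const char *file, int line)
{
    last_lock_site.func = func;
    last_lock_site.file = file;
    last_lock_site.line = line;
    lock_depth--;
}

/* The macros only capture the caller's location, so the header that
 * defines them has no dependency on any tracing header. */
#define my_lock()   my_lock_at(__func__, __FILE__, __LINE__)
#define my_unlock() my_unlock_at(__func__, __FILE__, __LINE__)
```

      Call sites stay unchanged (they still write my_lock()), while the cost
      is a few extra arguments per call, which matches the trade-off the
      message accepts for a lock that is not on fast paths.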
  15. 27 Sep, 2009 2 commits
    • tty: Fix regressions caused by commit b50989dc · f278a2f7
      Dave Young authored
      The following commit made console open fail while booting:
      
      	commit b50989dc
      	Author: Alan Cox <alan@linux.intel.com>
      	Date:   Sat Sep 19 13:13:22 2009 -0700
      
      	tty: make the kref destructor occur asynchronously
      
      Because tty release routines now run in a workqueue, an error like the
      following is reported while booting:
      
      INIT open /dev/console Input/output error
      
      It also causes a hibernation regression, as reported at
      http://bugzilla.kernel.org/show_bug.cgi?id=14229

      The reason is that closing now has some latency, and opening a tty whose
      close has not yet finished returns -EIO.

      Fix it as per the following suggestion from Alan:
      
        Fun but it's actually not a bug and the fix is wrong in itself as
        the port may be closing but not yet being destructed, in which case
        it seems to do the wrong thing.  Opening a tty that is closing (and
        could be closing for long periods) is supposed to return -EIO.
      
        I suspect a better way to deal with this and keep the old console
        timing is to split tty->shutdown into two functions.
      
        tty->shutdown() - called synchronously just before we dump the tty
        onto the waitqueue for destruction
      
        tty->cleanup() - called when the destructor runs.
      
        We would then do the shutdown part which can occur in IRQ context
        fine, before queueing the rest of the release (from tty->magic = 0
        ...  the end) to occur asynchronously
      
        The USB update in -next would then need a call like
      
             if (tty->cleanup)
                     tty->cleanup(tty);
      
        at the top of the async function and the USB shutdown to be split
        between shutdown and cleanup as the USB resource cleanup and final
        tidy cannot occur synchronously as it needs to sleep.
      
        In other words the logic becomes
      
             final kref put
                     make object unfindable
      
             async
                     clean it up
      Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
      [ rjw: Rebased on top of 2.6.31-git, reworked the changelog. ]
      Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl>
      [ Changed serial naming to match new rules, dropped tty_shutdown as per
        comments from Alan Stern - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f278a2f7
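      The two-phase teardown in the quoted suggestion (final kref put makes
      the object unfindable synchronously, the rest is cleaned up
      asynchronously) can be sketched in user space like this. It is a
      simplified model: struct tty, struct tty_hooks, tty_put() and
      tty_release_async() are hypothetical names, and the caller is assumed
      to schedule the deferred step itself rather than using a workqueue.

```c
#include <stdbool.h>
#include <stddef.h>

struct tty;

/* Driver hooks, split as suggested: one synchronous, one deferred. */
struct tty_hooks {
    void (*shutdown)(struct tty *); /* runs at final put, must not sleep */
    void (*cleanup)(struct tty *);  /* runs later, may sleep; optional */
};

struct tty {
    const struct tty_hooks *hooks;
    int refs;
    bool findable;        /* can a new open() still look this tty up? */
    bool cleaned;         /* set by the demo cleanup hook */
    bool release_pending; /* deferred part not yet run */
};

/* Deferred half: resource cleanup and final tidy. */
void tty_release_async(struct tty *t)
{
    if (t->hooks->cleanup)
        t->hooks->cleanup(t);
    t->release_pending = false;
}

/* Drop one reference. On the final put, run the synchronous shutdown and
 * make the object unfindable at once. Returns true when the caller must
 * schedule tty_release_async(). */
bool tty_put(struct tty *t)
{
    if (--t->refs > 0)
        return false;
    if (t->hooks->shutdown)
        t->hooks->shutdown(t);
    t->findable = false;
    t->release_pending = true;
    return true;
}

/* Example hooks: no shutdown work, cleanup just records that it ran. */
void demo_cleanup(struct tty *t)
{
    t->cleaned = true;
}

const struct tty_hooks demo_hooks = { NULL, demo_cleanup };
```

      Because the object becomes unfindable before the deferred step runs, a
      later open cannot pick up a half-destroyed tty, which is what preserves
      the old console timing.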
    • const: mark struct vm_struct_operations · f0f37e2f
      Alexey Dobriyan authored
      
      
      * mark struct vm_area_struct::vm_ops as const
      * mark vm_ops in AGP code
      
      But leave TTM code alone, something is fishy there with global vm_ops
      being used.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f0f37e2f
  16. 26 Sep, 2009 1 commit