  1. 22 May, 2009 1 commit
  2. 15 Apr, 2009 1 commit
  3. 29 Dec, 2008 1 commit
  4. 18 Nov, 2008 1 commit
  5. 21 Oct, 2008 7 commits
  6. 09 Oct, 2008 9 commits
    • block: make partition array dynamic · 540eed56
      Tejun Heo authored
      disk->__part used to be statically allocated to the maximum possible
      number of partitions.  This patch makes partition array allocation
      dynamic.  The added overhead is minimal, as the only real change is
      one memory dereference becoming an RCU one.  This saves both a bit of
      memory and the cpu cycles spent iterating through unoccupied slots,
      and makes increasing the partition limit easier.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      540eed56
    • block: introduce partition 0 · b5d0b9df
      Tejun Heo authored
      genhd and partition code handled disk and partitions separately.  All
      information about the whole disk was in struct genhd and partitions in
      struct hd_struct.  However, the whole disk (part0) and other
      partitions have a lot in common, so the data structures end up having
      a good number of common fields and separate code paths end up doing
      the same thing.  Also, the partition array was indexed by partno - 1,
      which gets pretty confusing at times.
      
      This patch introduces partition 0 and makes the partition array
      indexed by partno.  Following patches will unify the handling of disk
      and parts piece-by-piece.
      
      This patch also implements disk_partitionable() which tests whether a
      disk is partitionable.  With the coming dynamic partition array change,
      the most common usage of disk_max_parts() will be testing whether a
      disk is partitionable and the number of max partitions will become
      much less important.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      b5d0b9df
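The indexing change described in this commit can be pictured with a small userspace model. This is only a sketch: `toy_part`, `toy_disk`, `TOY_MAX_PARTS` and the `toy_*` helpers are hypothetical names invented for illustration, not the kernel's `struct gendisk` / `struct hd_struct` code. It assumes that slot 0 holds the whole disk and that "partitionable" means more than one slot is available.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel's data structures. */
#define TOY_MAX_PARTS 16

struct toy_part {
    int partno;                  /* 0 denotes the whole disk */
    unsigned long long start;    /* first sector */
    unsigned long long nr_sects; /* length in sectors */
};

struct toy_disk {
    int max_parts;                        /* 1 means "not partitionable" */
    struct toy_part *part[TOY_MAX_PARTS]; /* indexed by partno directly */
};

/* Set up a disk whose slot 0 is the whole-disk "partition". */
static void toy_disk_init(struct toy_disk *disk, struct toy_part *whole,
                          int max_parts)
{
    assert(max_parts >= 1 && max_parts <= TOY_MAX_PARTS);
    for (int i = 0; i < TOY_MAX_PARTS; i++)
        disk->part[i] = NULL;
    disk->max_parts = max_parts;
    whole->partno = 0;
    disk->part[0] = whole;
}

/* Assumed meaning of "partitionable": room for more than part0. */
static int toy_disk_partitionable(const struct toy_disk *disk)
{
    return disk->max_parts > 1;
}

/* Lookup by partition number; no "partno - 1" adjustment is needed. */
static struct toy_part *toy_disk_get_part(const struct toy_disk *disk,
                                          int partno)
{
    if (partno < 0 || partno >= disk->max_parts)
        return NULL;
    return disk->part[partno];
}
```

With this layout, code that walks "the disk and all its partitions" can use a single loop starting at index 0 instead of treating the whole disk as a special case.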
    • block: fix disk->part[] dereferencing race · e71bf0d0
      Tejun Heo authored
      disk->part[] is protected by its matching bdev's lock.  However,
      non-critical accesses like collecting stats and printing out sysfs and
      proc information used to be performed without any locking.  As
      partitions can come and go dynamically, partitions can go away
      underneath those non-critical accesses.  As some of those accesses are
      writes, this theoretically can lead to silent corruption.
      
      This patch fixes the race by using RCU for the partition array and dev
      reference counter to hold partitions.
      
      * Rename disk->part[] to disk->__part[] to make sure no one outside
        genhd layer proper accesses it directly.
      
      * Use RCU for disk->__part[] dereferencing.
      
      * Implement disk_{get|put}_part() which can be used to get and put
        partitions from gendisk respectively.
      
      * Iterators are implemented to help iterate through all partitions
        safely.
      
      * Functions which require RCU readlock are marked with _rcu suffix.
      
      * Use disk_put_part() in __blkdev_put() instead of directly putting
        the contained kobject.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      e71bf0d0
    • block: don't depend on consecutive minor space · f331c029
      Tejun Heo authored
      * Implement disk_devt() and part_devt() and use them to directly
        access devt instead of computing it from ->major and ->first_minor.
      
        Note that all references to ->major and ->first_minor outside of
        the block layer are used to determine the devt of the disk (the
        part0), and as ->major and ->first_minor will continue to represent
        the devt for the disk, converting these users isn't strictly
        necessary.  However, convert them for consistency.
      
      * Implement disk_max_parts() to avoid directly dereferencing
        genhd->minors.
      
      * Update bdget_disk() such that it doesn't assume consecutive minor
        space.
      
      * Move devt computation from register_disk() to add_disk() and make it
        the only one (all other usages use the initially determined value).
      
      These changes clean up the code and will help with the disk->part
      dereference fix and extended block device numbers.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      f331c029
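The idea behind the accessors listed above can be sketched in userspace C. Everything named `devt_*` here is a hypothetical stand-in for the kernel helpers the commit mentions, and the struct layouts are assumptions made for illustration. The point is that the device number is computed once and stored, so callers never reconstruct it from a major number plus a consecutive minor offset. `makedev()`, `major()` and `minor()` are the glibc macros from `<sys/sysmacros.h>`.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/sysmacros.h>  /* makedev(), major(), minor() */

/* Hypothetical stand-ins; NOT the kernel's structures. */
struct devt_part {
    dev_t devt;             /* stored device number of this partition */
};

struct devt_disk {
    unsigned int major_nr;
    unsigned int first_minor_nr;
    int minors;             /* number of minors reserved for the disk */
    struct devt_part part0; /* the whole disk */
};

/* Compute the whole-disk devt once, when the disk is added. */
static void devt_add_disk(struct devt_disk *disk)
{
    assert(disk->minors >= 1);
    disk->part0.devt = makedev(disk->major_nr, disk->first_minor_nr);
}

/* Accessors: callers read the stored value instead of recomputing it. */
static dev_t devt_disk_devt(const struct devt_disk *disk)
{
    return disk->part0.devt;
}

static dev_t devt_part_devt(const struct devt_part *part)
{
    return part->devt;
}

/* Wrapper so that callers do not dereference ->minors directly. */
static int devt_disk_max_parts(const struct devt_disk *disk)
{
    return disk->minors;
}
```

Because each partition carries its own stored device number in this model, a partition may live under a different major or a non-adjacent minor without any caller noticing.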
    • block: make variable and argument names more consistent · cf771cb5
      Tejun Heo authored
      In hd_struct, @partno is used to denote partition number and a number
      of other places use @part to denote hd_struct.  Functions use @part
      and @index instead.  This causes confusion and makes it difficult to
      use consistent variable names for hd_struct.  Always use @partno if a
      variable represents partition number.
      
      Also, print out functions use @f or @part for seq_file argument.  Use
      @seqf uniformly instead.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      cf771cb5
    • block: update add_partition() error handling · 88e34126
      Tejun Heo authored
      d805dda4 tried to fix error case handling in add_partition() but had a
      few problems.
      
      * The disk->part[] entry is set early and left dangling if the
        operation fails.
      
      * Once the device is initialized, the last put_device() is responsible
        for freeing all the resources.  The failure path freed part_stats
        and p regardless of put_device(), causing a double free.
      
      * The holders subdir holds a reference to the disk device, so the
        failure path should remove it to release resources properly, which
        was missing.
      
      This patch fixes the above problems and, while at it, moves the
      partition slot busy check into add_partition() for completeness and
      inlines holders subdirectory creation.  Using a separate function for
      it just obfuscates the code.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Abdel Benamrouche <draconux@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      88e34126
    • block: allow deleting zero length partition · ec2cdedf
      Tejun Heo authored
      delete_partition() was a no-op for zero length partitions.  As the
      addition code allows creating zero length partitions and deletion is
      assumed to always succeed, this causes a memory leak for zero length
      partitions.  Allow zero length partitions to end their meaningless
      lives.
      
      While at it, allow deleting zero length partitions via the
      BLKPG_DEL_PARTITION ioctl too.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      ec2cdedf
    • Allow elevators to sort/merge discard requests · e17fc0a1
      David Woodhouse authored
      But blkdev_issue_discard() still emits requests which are interpreted as
      soft barriers, because naïve callers might otherwise issue subsequent
      writes to those same sectors, which might cross on the queue (if they're
      reallocated quickly enough).
      
      Callers still _can_ issue non-barrier discard requests, but they have to
      take care of queue ordering for themselves.
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      e17fc0a1
    • Add BLKDISCARD ioctl to allow userspace to discard sectors · d30a2605
      David Woodhouse authored
      We may well want mkfs tools to use this to mark the whole device as
      unwanted before they format it, for example.
      
      The ioctl takes a pair of uint64_ts, which are start offset and length
      in _bytes_. Although at the moment it might make sense for them both to
      be in 512-byte sectors, I don't want to limit the ABI to that.
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      d30a2605
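A userspace caller of the ioctl described above might look roughly like the following sketch. `BLKDISCARD` comes from `<linux/fs.h>` and takes a pointer to two `uint64_t` values (start offset and length, both in bytes), as the commit message states. The helper names and `EXAMPLE_SECTOR_BYTES` are hypothetical, and the 512-byte sector size is an assumption used only by the conversion helper. Note that issuing the ioctl on a real block device throws away data in that range.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* BLKDISCARD */

/* Assumption: callers of the helper below think in 512-byte sectors. */
#define EXAMPLE_SECTOR_BYTES 512ULL

/* Convert a sector-based extent into the byte-based pair the ABI expects. */
static void discard_range_from_sectors(uint64_t first_sector,
                                       uint64_t nr_sectors,
                                       uint64_t range[2])
{
    /* Guard against overflow when scaling to bytes. */
    assert(first_sector <= UINT64_MAX / EXAMPLE_SECTOR_BYTES);
    assert(nr_sectors <= UINT64_MAX / EXAMPLE_SECTOR_BYTES);
    range[0] = first_sector * EXAMPLE_SECTOR_BYTES; /* start, in bytes */
    range[1] = nr_sectors * EXAMPLE_SECTOR_BYTES;   /* length, in bytes */
}

/*
 * Ask the kernel to discard [start, start + len) on an open block device.
 * Returns 0 on success, or -1 with errno set (for example EBADF for a bad
 * descriptor). DESTRUCTIVE: data in the range is lost.
 */
static int discard_bytes(int fd, uint64_t start, uint64_t len)
{
    uint64_t range[2] = { start, len };

    return ioctl(fd, BLKDISCARD, range);
}
```

Keeping the conversion separate from the ioctl call reflects the commit's reasoning: the ABI itself is byte-based, so any sector arithmetic stays on the caller's side.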
  7. 25 Jul, 2008 1 commit
  8. 10 Oct, 2007 1 commit
  9. 07 May, 2007 1 commit
  10. 20 Feb, 2007 1 commit
    • [PATCH] lockdep: annotate BLKPG_DEL_PARTITION · 6d740cd5
      Peter Zijlstra authored
      >=============================================
      >[ INFO: possible recursive locking detected ]
      >2.6.19-1.2909.fc7 #1
      >---------------------------------------------
      >anaconda/587 is trying to acquire lock:
      > (&bdev->bd_mutex){--..}, at: [<c05fb380>] mutex_lock+0x21/0x24
      >
      >but task is already holding lock:
      > (&bdev->bd_mutex){--..}, at: [<c05fb380>] mutex_lock+0x21/0x24
      >
      >other info that might help us debug this:
      >1 lock held by anaconda/587:
      > #0:  (&bdev->bd_mutex){--..}, at: [<c05fb380>] mutex_lock+0x21/0x24
      >
      >stack backtrace:
      > [<c0405812>] show_trace_log_lvl+0x1a/0x2f
      > [<c0405db2>] show_trace+0x12/0x14
      > [<c0405e36>] dump_stack+0x16/0x18
      > [<c043bd84>] __lock_acquire+0x116/0xa09
      > [<c043c960>] lock_acquire+0x56/0x6f
      > [<c05fb1fa>] __mutex_lock_slowpath+0xe5/0x24a
      > [<c05fb380>] mutex_lock+0x21/0x24
      > [<c04d82fb>] blkdev_ioctl+0x600/0x76d
      > [<c04946b1>] block_ioctl+0x1b/0x1f
      > [<c047ed5a>] do_ioctl+0x22/0x68
      > [<c047eff2>] vfs_ioctl+0x252/0x265
      > [<c047f04e>] sys_ioctl+0x49/0x63
      > [<c0404070>] syscall_call+0x7/0xb
      
      Annotate BLKPG_DEL_PARTITION's bd_mutex locking and add a little comment
      clarifying the bd_mutex locking, because I confused myself and initially
      thought the lock order was wrong too.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6d740cd5
  11. 11 Feb, 2007 1 commit
    • [PARTITION]: Add whole_disk attribute. · d18d7682
      Fabio Massimo Di Nitto authored
      Some partitioning systems create special partitions that
      span the entire disk.  One example is Sun partitions, and
      this whole-disk partition exists to tell the firmware the
      extent of the entire device so it can load the boot block
      and do other things.
      
      Such partitions should not be treated as normal partitions,
      because all the other partitions overlap this whole-disk one.
      So we'd see multiple instances of the same UUID etc. which
      we do not want.  udev and friends can thus search for this
      'whole_disk' attribute and use it to decide to ignore the
      partition.
      Signed-off-by: Fabio Massimo Di Nitto <fabbione@ubuntu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d18d7682
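A tool in the position of "udev and friends" could test for the attribute along these lines. This is a sketch under assumptions: the function names are hypothetical, and the sysfs location `/sys/block/<disk>/<partition>/whole_disk` is assumed here for illustration; the commit message does not spell out the path, and sysfs layouts have changed between kernel versions.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Build the assumed sysfs path of the attribute for a partition.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int whole_disk_attr_path(char *buf, size_t len,
                                const char *disk, const char *part)
{
    assert(buf != NULL && disk != NULL && part != NULL);
    int n = snprintf(buf, len, "/sys/block/%s/%s/whole_disk", disk, part);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Returns 1 if the attribute file exists for the given partition,
 * 0 otherwise. Only the presence of the file is checked.
 */
static int is_whole_disk_partition(const char *disk, const char *part)
{
    char path[256];

    if (whole_disk_attr_path(path, sizeof(path), disk, part) != 0)
        return 0;
    return access(path, F_OK) == 0;
}
```

A scanner could skip any partition for which `is_whole_disk_partition()` returns 1, which avoids reporting the same UUID once for the overlapping whole-disk partition and again for the real one.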
  12. 08 Dec, 2006 2 commits
  13. 03 Oct, 2006 1 commit
  14. 14 Jul, 2006 1 commit
  15. 23 Mar, 2006 2 commits
  16. 11 Jan, 2006 1 commit
  17. 08 Jan, 2006 1 commit
    • [PATCH] Add block_device_operations.getgeo block device method · a885c8c4
      Christoph Hellwig authored
      HDIO_GETGEO is implemented in most block drivers, and all of them have to
      duplicate the code to copy the structure to userspace, as well as getting
      the start sector.  This patch moves that to common code [1] and adds a
      ->getgeo method to fill out the raw kernel hd_geometry structure.  For many
      drivers this means ->ioctl can go away now.
      
      [1] the s390 block drivers are odd in this respect.  xpram sets ->start
          to 4 always which seems more than odd, and the dasd driver shifts
          the start offset around, probably because of its non-standard
          sector size.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Jens Axboe <axboe@suse.de>
      Cc: <mike.miller@hp.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
      Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>
      Cc: Neil Brown <neilb@cse.unsw.edu.au>
      Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: James Bottomley <James.Bottomley@steeleye.com>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      a885c8c4
  18. 04 Nov, 2005 1 commit
  19. 23 Jun, 2005 1 commit
    • [PATCH] block: add unlocked_ioctl support for block devices · bb93e3a5
      Arnd Bergmann authored
      This patch allows block device drivers to convert their ioctl functions to
      unlocked_ioctl() like character devices and other subsystems.  All
      functions that were called with the BKL held before are still used that
      way, but I would not be surprised if it could be removed from the ioctl
      functions in drivers/block/ioctl.c themselves.
      
      As a side note, I found that compat_blkdev_ioctl() acquires the BKL as
      well, which looks like a bug.  I have checked that every user of
      disk->fops->compat_ioctl() in the current git tree gets the BKL itself, so
      it could easily be removed from compat_blkdev_ioctl().
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      bb93e3a5
  20. 16 May, 2005 1 commit
    • [PATCH] Fix root hole in raw device · 68f66feb
      Stephen Tweedie authored
      [Patch] Fix raw device ioctl pass-through
      
      Raw character devices are supposed to pass ioctls through to the block
      devices they are bound to.  Unfortunately, they are using the wrong
      function for this: ioctl_by_bdev(), instead of blkdev_ioctl().
      
      ioctl_by_bdev() performs a set_fs(KERNEL_DS) before calling the ioctl,
      redirecting the user-space buffer access to the kernel address space.
      This is, needless to say, a bad thing.
      
      This was noticed first on s390, where raw IO was non-functioning.  The
      s390 driver config does not actually allow raw IO to be enabled, which
      was the first part of the problem.  Secondly, the s390 kernel address
      space is distinct from user, causing legal raw ioctls to fail.  I've
      reproduced this on a kernel built with 4G:4G split on x86, which fails
      in the same way (-EFAULT if the address does not exist kernel-side;
      returns success without actually populating the user buffer if it does.)
      
      The patch below fixes both the config and address-space problems.  It's
      based closely on a patch by Jan Glauber <jang@de.ibm.com>, which has
      been tested on s390 at IBM.  I've tested it on x86 4G:4G (split address
      space) and x86_64 (common address space).
      
      Kernel-address-space access has been assigned CAN-2005-1264.
      Signed-off-by: Stephen Tweedie <sct@redhat.com>
      Signed-off-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      68f66feb
  21. 16 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4