    md: make devices disappear when they are no longer needed. · d3374825
    NeilBrown authored
    
    
    Currently md devices, once created, never disappear until the module
    is unloaded.  This is essentially because the gendisk holds a
    reference to the mddev, and the mddev holds a reference to the
    gendisk, i.e. a circular reference.
    
    If we drop the reference from mddev to gendisk, then we need to ensure
    that the mddev is destroyed when the gendisk is destroyed.  However it
    is not possible to hook into the gendisk destruction process to enable
    this.
    
    So we drop the reference from the gendisk to the mddev and destroy the
    gendisk when the mddev gets destroyed.  However this has a
    complication.
    Between the call
       __blkdev_get->get_gendisk->kobj_lookup->md_probe
    and the call
       __blkdev_get->md_open
    
    there is no obvious way to hold a reference on the mddev any more, so
    unless something is done, it will disappear and the gendisk will be
    destroyed prematurely.
    
    Also, once we decide to destroy the mddev, there will be an unlockable
    moment before the gendisk is unlinked (blk_unregister_region) during
    which a new reference to the gendisk can be created.  We need to
    ensure that this reference can not be used.  i.e. the ->open must
    fail.
    
    So:
     1/  in md_probe we set a flag in the mddev (hold_active) which
         indicates that the array should be treated as active, even
         though there are no references, and no appearance of activity.
         This is cleared by md_release when the device is closed if it
         is no longer needed.
         This ensures that the gendisk will survive between md_probe and
         md_open.
    
     2/  In md_open we check if the mddev we expect to open matches
         the gendisk that we did open.
         If there is a mismatch we return -ERESTARTSYS and modify
         __blkdev_get to retry from the top in that case.
         In the -ERESTARTSYS case we make sure to wait until
         the old gendisk (that we succeeded in opening) is really gone so
         we loop at most once.
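
    The two steps above can be pictured with a small userspace model.
    This is only a sketch, not the kernel code: every toy_* name is
    hypothetical, the successive lookups are passed in as an array
    rather than found through kobj_lookup, and the wait for the old
    gendisk to go away is not modelled.  ERESTARTSYS is defined locally
    because userspace errno.h does not normally expose it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef ERESTARTSYS
#define ERESTARTSYS 512 /* kernel-internal errno, defined here for the model */
#endif

/* Hypothetical stand-ins for gendisk and mddev. */
struct toy_gendisk {
    int minor;
};

struct toy_mddev {
    struct toy_gendisk *disk; /* the gendisk this mddev currently owns */
    int openers;              /* number of successful opens */
    bool hold_active;         /* treat as active despite having no references */
};

/* md_probe analogue: keep the array alive until the matching open arrives. */
static void toy_probe(struct toy_mddev *m)
{
    m->hold_active = true;
}

/* True when nothing keeps the array (and so its gendisk) alive. */
static bool toy_may_destroy(const struct toy_mddev *m)
{
    return m->openers == 0 && !m->hold_active;
}

/* md_open analogue: refuse a gendisk that does not belong to this mddev. */
static int toy_open(struct toy_mddev *expected, struct toy_gendisk *opened)
{
    if (expected->disk != opened)
        return -ERESTARTSYS;
    expected->openers++;
    return 0;
}

/* md_release analogue: on the last close the hold is dropped. */
static void toy_release(struct toy_mddev *m)
{
    assert(m->openers > 0);
    m->openers--;
    if (m->openers == 0)
        m->hold_active = false;
}

/* __blkdev_get analogue: retry from the top whenever open asks for it. */
static int toy_blkdev_get(struct toy_mddev *m,
                          struct toy_gendisk *const *lookups, size_t n,
                          int *attempts)
{
    *attempts = 0;
    for (size_t i = 0; i < n; i++) {
        toy_probe(m);
        (*attempts)++;
        int err = toy_open(m, lookups[i]);
        if (err != -ERESTARTSYS)
            return err;
    }
    return -ERESTARTSYS;
}
```

    In this model a stale gendisk followed by the current one makes
    toy_blkdev_get() succeed on its second attempt, and
    toy_may_destroy() stays false from toy_probe() until the last
    toy_release().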
    
    Some udev configurations will always open an md device when it first
    appears.   If we allow an md device that was just created by an open
    to disappear on an immediate close, then this can race with such udev
    configurations and result in an infinite loop of the device being
    opened and closed, then re-opened due to the 'ADD' event from the
    first open, and then closed and so on.
    So we make sure an md device, once created by an open, remains active
    at least until some md 'ioctl' has been made on it.  This means that
    all normal usage of md devices will allow them to disappear promptly
    when not needed, but the worst that an incorrect usage will do is
    cause an inactive md device to be left in existence (it can easily be
    removed).
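
    The "remain active until an ioctl" rule can be sketched in the same
    spirit.  Again this is a userspace model under stated assumptions,
    not the md code: the toy_* names and the single hold_until_ioctl
    flag are invented for illustration, and whether the array is
    actually configured is ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an md device that was created by an open. */
struct toy_array {
    int openers;           /* current number of opens */
    bool hold_until_ioctl; /* set at creation, cleared by the first ioctl */
};

/* The open that brings the device into existence also pins it. */
static void toy_first_open(struct toy_array *a)
{
    a->hold_until_ioctl = true;
    a->openers++;
}

/* Any later open just takes a reference. */
static void toy_open_again(struct toy_array *a)
{
    a->openers++;
}

/* Any md ioctl counts as real usage and releases the pin. */
static void toy_ioctl(struct toy_array *a)
{
    a->hold_until_ioctl = false;
}

/* Returns true if the device may disappear after this close. */
static bool toy_close(struct toy_array *a)
{
    assert(a->openers > 0);
    a->openers--;
    return a->openers == 0 && !a->hold_until_ioctl;
}
```

    An open followed directly by a close (the udev pattern) leaves the
    device in place, so no remove/add event cycle starts; an open, an
    ioctl and a close lets it go.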
    
    As an array can be stopped by writing to a sysfs attribute
      echo clear > /sys/block/mdXXX/md/array_state
    we need to use scheduled work for deleting the gendisk and other
    kobjects.  This allows us to wait for any pending gendisk deletion to
    complete by simply calling flush_scheduled_work().
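
    The deferral can be pictured as follows.  This is a minimal
    userspace sketch, not the kernel workqueue API: the toy_* names and
    the fixed-size queue are assumptions made for illustration.  The
    point it shows is that the sysfs write only queues the deletion,
    so the handler is not tearing down the object it is running on,
    and a later flush guarantees the deletion has finished.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_WORK 8

/* One deferred call: a function and its argument. */
struct toy_work {
    void (*fn)(void *arg);
    void *arg;
};

static struct toy_work toy_queue[TOY_MAX_WORK];
static size_t toy_pending;

/* Queue fn(arg) to run later; returns false if the queue is full. */
static bool toy_schedule_work(void (*fn)(void *arg), void *arg)
{
    assert(fn != NULL);
    if (toy_pending == TOY_MAX_WORK)
        return false;
    toy_queue[toy_pending].fn = fn;
    toy_queue[toy_pending].arg = arg;
    toy_pending++;
    return true;
}

/* Run everything queued so far; on return no deletion is pending. */
static void toy_flush_scheduled_work(void)
{
    for (size_t i = 0; i < toy_pending; i++)
        toy_queue[i].fn(toy_queue[i].arg);
    toy_pending = 0;
}

/* Hypothetical stand-in for the gendisk and its kobjects. */
struct toy_disk {
    bool registered;
};

/* The deferred deletion itself. */
static void toy_delete_disk(void *arg)
{
    struct toy_disk *d = arg;
    d->registered = false;
}

/* Analogue of writing "clear" to array_state: only schedule the deletion. */
static bool toy_array_state_store_clear(struct toy_disk *d)
{
    return toy_schedule_work(toy_delete_disk, d);
}
```

    After toy_array_state_store_clear() returns the disk is still
    registered; it is gone once toy_flush_scheduled_work() has run.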
    
    
    
    Signed-off-by: NeilBrown <neilb@suse.de>