1. 31 Dec, 2008 1 commit
    • shrink struct dentry · c2452f32
      Nick Piggin authored
      struct dentry is one of the most critical structures in the kernel. So it's
      sad to see it going neglected.
      With CONFIG_PROFILING turned on (which is probably the common case at least
      for distros and kernel developers), sizeof(struct dentry) == 208 here
      (64-bit). This gives 19 objects per slab.
      I packed d_mounted into a hole, and took another 4 bytes off the inline
      name length to take the padding out from the end of the structure. This
      shrinks it to 200 bytes. I could have gone the other way and increased the
      length to 40, but I'm aiming for a magic number, read on...
      I then got rid of the d_cookie pointer. This shrinks it to 192 bytes. Rant:
      why was this ever a good idea? The cookie system should increase its hash
      size or use a tree or something if lookups are a problem. Also the "fast
      dcookie lookups" in oprofile should be moved into the dcookie code -- how
      can oprofile possibly care about the dcookie_mutex? It gets dropped after
      get_dcookie() returns so it can't be providing any sort of protection.
      At 192 bytes, 21 objects fit into a 4K page, saving about 3MB on my system
      with ~140 000 entries allocated. 192 is also a multiple of 64, so we get
      nice cacheline alignment on 64 and 32 byte line systems -- any given dentry
      will now require 3 cachelines to touch all fields whereas previously it
      would require 4.
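The arithmetic in this message can be checked with a small user-space sketch. It assumes a 4K page that holds only whole objects with no per-slab metadata inside the page, and a 64-byte cache line; the helper names are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions: 4K pages holding only whole objects, 64-byte cache lines. */
enum { PAGE_BYTES = 4096, CACHELINE_BYTES = 64 };

/* Whole objects of a given size that fit in one page. */
static size_t objs_per_page(size_t obj_size)
{
	return PAGE_BYTES / obj_size;
}

/* Pages needed to hold nr_objs objects, rounded up. */
static size_t pages_needed(size_t nr_objs, size_t obj_size)
{
	size_t per_page = objs_per_page(obj_size);

	return (nr_objs + per_page - 1) / per_page;
}

/* Cache lines spanned by an object that starts on a line boundary. */
static size_t cachelines_spanned(size_t obj_size)
{
	return (obj_size + CACHELINE_BYTES - 1) / CACHELINE_BYTES;
}

/* Bytes saved when nr_objs objects shrink from old_size to new_size. */
static size_t bytes_saved(size_t nr_objs, size_t old_size, size_t new_size)
{
	return (pages_needed(nr_objs, old_size) -
		pages_needed(nr_objs, new_size)) * PAGE_BYTES;
}
```

Under these assumptions, objs_per_page() gives 19 for 208 bytes and 21 for 192 bytes, and bytes_saved(140000, 208, 192) comes to 702 pages, a little under 3MB, which is in line with the figures quoted above.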
      I know the inline name size was chosen quite carefully, however with the
      reduction in cacheline footprint, it should actually be just about as fast
      to do a name lookup for a 36 character name as it was before the patch (and
      faster for other sizes). The memory footprint savings for names which are
      <= 32 or > 36 bytes long should more than make up for the memory cost for
      33-36 byte names.
      Performance is a feature...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  2. 23 Oct, 2008 3 commits
  3. 24 Aug, 2008 1 commit
  4. 28 Jul, 2008 1 commit
  5. 23 Jun, 2008 1 commit
  6. 19 May, 2008 1 commit
  7. 30 Apr, 2008 1 commit
  8. 22 Apr, 2008 2 commits
  9. 21 Apr, 2008 1 commit
  10. 14 Feb, 2008 1 commit
  11. 21 Oct, 2007 1 commit
    • [PATCH] audit: watching subtrees · 74c3cbe3
      Al Viro authored
      New kind of audit rule predicates: "object is visible in given subtree".
      The part that can be sanely implemented, that is.  Limitations:
      	* if you have hardlink from outside of tree, you'd better watch
      it too (or just watch the object itself, obviously)
      	* if you mount something under a watched tree, tell audit
      that new chunk should be added to watched subtrees
      	* if you umount something in a watched tree and it's still mounted
      elsewhere, you will get matches on events happening there.  New command
      tells audit to recalculate the trees, trimming such sources of false
      positives.
      Note that it's _not_ about path - if something is mounted in several places
      (multiple mount, bindings, different namespaces, etc.), the match does
      _not_ depend on which one we are using for access.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  12. 08 May, 2007 1 commit
    • VFS: delay the dentry name generation on sockets and pipes · c23fbb6b
      Eric Dumazet authored
      1) Introduces a new method in 'struct dentry_operations'.  This method
         called d_dname() might be called from d_path() to build a pathname for
         special filesystems.  It is called without locks.
         Future patches (if we succeed in having one common dentry for all
         pipes/sockets) may need to change prototype of this method, but we now
         use : char *d_dname(struct dentry *dentry, char *buffer, int buflen);
      2) Adds a dynamic_dname() helper function that eases d_dname() implementations
      3) Defines d_dname method for sockets : No more sprintf() at socket
         creation.  This is delayed up to the moment someone does an access to
      4) Defines d_dname method for pipes : No more sprintf() at pipe
         creation.  This is delayed up to the moment someone does an access to
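The helper described in point 2 can be sketched in user space as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the 64-byte temporary buffer is arbitrary, and the "pipe:[%lu]" format is used as an example name. The name is written at the end of the caller's buffer on the assumption that the caller assembles paths from right to left.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a dynamic_dname()-style helper: format the
 * name into a temporary buffer, copy it to the END of the caller's buffer,
 * and return a pointer to its first character.  Returns NULL when the
 * formatted name (including the trailing NUL) does not fit.
 */
static char *sketch_dynamic_dname(char *buffer, int buflen, const char *fmt, ...)
{
	char temp[64];
	va_list args;
	int sz;

	va_start(args, fmt);
	sz = vsnprintf(temp, sizeof(temp), fmt, args) + 1; /* + 1 for the NUL */
	va_end(args);

	if (sz <= 0 || sz > (int)sizeof(temp) || sz > buflen)
		return NULL;

	buffer += buflen - sz;
	return memcpy(buffer, temp, sz);
}

/* Hypothetical d_dname()-style callback: build a pipe's name on demand. */
static char *sketch_pipe_dname(unsigned long ino, char *buffer, int buflen)
{
	return sketch_dynamic_dname(buffer, buflen, "pipe:[%lu]", ino);
}
```

Nothing is formatted when the object is created; the string is produced only when a caller asks for the name, which is the cost the message says is being deferred.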
      A benchmark consisting of 1.000.000 calls to pipe()/close()/close() gives a
      *nice* speedup on my Pentium(M) 1.6 GHz:
      3.090 s instead of 3.450 s
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Acked-by: Christoph Hellwig <hch@infradead.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 11 Oct, 2006 1 commit
    • [PATCH] VFS: Destroy the dentries contributed by a superblock on unmounting · c636ebdb
      David Howells authored
      The attached patch destroys all the dentries attached to a superblock in one go
      by:
       (1) Destroying the tree rooted at s_root.
       (2) Destroying every entry in the anon list, one at a time.
       (3) Each entry in the anon list has its subtree consumed from the leaves
           inwards.
      This reduces the amount of work generic_shutdown_super() does, and avoids
      iterating through the dentry_unused list.
      Note that locking is almost entirely absent in the shrink_dcache_for_umount*()
      functions added by this patch.  This is because:
       (1) at the point the filesystem calls generic_shutdown_super(), it is not
           permitted to further touch the superblock's set of dentries, and nor may
           it remove aliases from inodes;
       (2) the dcache memory shrinker now skips dentries that are being unmounted;
       (3) the superblock no longer has any external references through which the VFS
           can reach it.
      Given these points, the only locking we need to do is when we remove dentries
      from the unused list and the name hashes, which we do a directory's worth at a
      time.
      We also don't need to guard against reference counts going to zero unexpectedly
      and removing bits of the tree we're working on as nothing else can call dput().
      A cut down version of dentry_iput() has been folded into the
      shrink_dcache_for_umount_subtree() function.  Apart from not needing to unlock
      things, it also doesn't need to check for inotify watches.
      In this version of the patch, the complaint about a dentry still being in use
      has been expanded from a single BUG_ON() and now gives much more information.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: NeilBrown <neilb@suse.de>
      Acked-by: Ian Kent <raven@themaw.net>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  14. 22 Sep, 2006 1 commit
    • NFS: Add dentry materialisation op · 770bfad8
      David Howells authored
      The attached patch adds a new directory cache management function that prepares
      a disconnected anonymous dentry to be connected into the dentry tree. The
      anonymous dentry takes over the name and parentage of another dentry.
      The following changes were made in [try #2]:
       (*) d_materialise_dentry() now switches the parentage of the two nodes around
           correctly when one or other of them is self-referential.
      The following changes were made in [try #7]:
       (*) d_instantiate_unique() has had the interior part split out as function
           __d_instantiate_unique(). Callers of this latter function must be holding
           the appropriate locks.
       (*) _d_rehash() has been added as a wrapper around __d_rehash() to call it
           with the most obvious hash list (the one from the name). d_rehash() now
           calls _d_rehash().
       (*) d_materialise_dentry() is now __d_materialise_dentry() and is static.
       (*) d_materialise_unique() added to perform the combination of d_find_alias(),
           d_materialise_dentry() and d_add_unique() that the NFS client was doing
           twice, all within a single dcache_lock critical section. This reduces the
           number of times two different spinlocks were being accessed.
      The following further changes were made:
       (*) Add the dentries onto their parents' d_subdirs lists.
      Signed-Off-By: David Howells <dhowells@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  15. 03 Jul, 2006 1 commit
  16. 23 Jun, 2006 1 commit
    • [PATCH] VFS: Permit filesystem to override root dentry on mount · 454e2398
      David Howells authored
      Extend the get_sb() filesystem operation to take an extra argument that
      permits the VFS to pass in the target vfsmount that defines the mountpoint.
      The filesystem is then required to manually set the superblock and root dentry
      pointers.  For most filesystems, this should be done with simple_set_mnt()
      which will set the superblock pointer and then set the root dentry to the
      superblock's s_root (as per the old default behaviour).
      The get_sb() op now returns an integer as there's now no need to return the
      superblock pointer.
      This patch permits a superblock to be implicitly shared amongst several mount
      points, such as can be done with NFS to avoid potential inode aliasing.  In
      such a case, simple_set_mnt() would not be called, and instead the mnt_root
      and mnt_sb would be set directly.
      The patch also makes the following changes:
       (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
           pointer argument and return an integer, so most filesystems have to change
           very little.
       (*) If one of the convenience functions is not used, then get_sb() should
           normally call simple_set_mnt() to instantiate the vfsmount. This will
           always return 0, and so can be tail-called from get_sb().
       (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
           dcache upon superblock destruction rather than shrink_dcache_anon().
           This is required because the superblock may now have multiple trees that
           aren't actually bound to s_root, but that still need to be cleaned up. The
           currently called functions assume that the whole tree is rooted at s_root,
           and that anonymous dentries are not the roots of trees which results in
           dentries being left unculled.
           However, with the way NFS superblock sharing is currently set to be
           implemented, these assumptions are violated: the root of the filesystem is
           simply a dummy dentry and inode (the real inode for '/' may well be
           inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
           with child trees.
           [*] Anonymous until discovered from another tree.
       (*) The documentation has been adjusted, including the additional bit of
           changing ext2_* into foo_* in the documentation.
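The calling convention described in this message can be modelled in user space. The sketch below is not kernel code: every type and function name is a simplified stand-in, the real get_sb() takes more arguments than shown, and the reference taken on the root is an assumption. It only illustrates a get_sb()-style op that returns an integer and tail-calls a simple_set_mnt()-style helper.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types; only the fields named in the
 * commit message are modelled. */
struct sketch_dentry { int d_count; };
struct sketch_super_block { struct sketch_dentry *s_root; };
struct sketch_vfsmount {
	struct sketch_super_block *mnt_sb;
	struct sketch_dentry *mnt_root;
};

/*
 * Models the described behaviour of simple_set_mnt(): set the superblock
 * pointer, then set the root dentry to the superblock's s_root.
 * Always returns 0.
 */
static int sketch_simple_set_mnt(struct sketch_vfsmount *mnt,
				 struct sketch_super_block *sb)
{
	mnt->mnt_sb = sb;
	mnt->mnt_root = sb->s_root;
	sb->s_root->d_count++; /* assumption: the mount pins its root */
	return 0;
}

/* Hypothetical get_sb()-style op: an int result instead of a superblock. */
static int sketch_foo_get_sb(struct sketch_super_block *sb,
			     struct sketch_vfsmount *mnt)
{
	if (!sb || !sb->s_root)
		return -1; /* placeholder; a real filesystem returns -errno */

	/* The helper always returns 0, so it can be tail-called. */
	return sketch_simple_set_mnt(mnt, sb);
}
```

A filesystem that shares one superblock among several mount points would skip the helper and fill in mnt_sb and mnt_root itself, as the message describes for NFS.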
      [akpm@osdl.org: convert ipath_fs, do other stuff]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  17. 22 Jun, 2006 1 commit
    • [PATCH] Fix dcache race during umount · 0feae5c4
      NeilBrown authored
      The race is that the shrink_dcache_memory shrinker could get called while a
      filesystem is being unmounted, and could try to prune a dentry belonging to
      that filesystem.
      If it does, then it will call in to iput on the inode while the dentry is
      no longer able to be found by the umounting process.  If iput takes a
      while, generic_shutdown_super could get all the way through
      shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without
      ever waiting on this particular inode.
      Eventually the superblock gets freed anyway and if the iput tried to touch
      it (which some filesystems certainly do), it will lose.  The promised
      "Self-destruct in 5 seconds" doesn't lead to a nice day.
      The race is closed by holding s_umount while calling prune_one_dentry on
      someone else's dentry.  As a down_read_trylock is used,
      shrink_dcache_memory will no longer try to prune the dentry of a filesystem
      that is being unmounted, and unmount will not be able to start until any
      such active prune_one_dentry completes.
      This requires that prune_dcache *knows* which filesystem (if any) it is
      doing the prune on behalf of so that it can be careful of other
      filesystems.  shrink_dcache_memory isn't called on behalf of any
      filesystem, and so is careful of everything.
      shrink_dcache_anon is now passed a super_block rather than the s_anon list
      out of the superblock, so it can get the s_anon list itself, and can pass
      the superblock down to prune_dcache.
      If prune_dcache finds a dentry that it cannot free, it leaves it where it
      is (at the tail of the list) and exits, on the assumption that some other
      thread will be removing that dentry soon.  To try to make sure that some
      work gets done, a limited number of dentries which are untouchable are
      skipped over while choosing the dentry to work on.
      I believe this race was first found by Kirill Korotaev.
      Cc: Jan Blunck <jblunck@suse.de>
      Acked-by: Kirill Korotaev <dev@openvz.org>
      Cc: Olaf Hering <olh@suse.de>
      Acked-by: Balbir Singh <balbir@in.ibm.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Balbir Singh <balbir@in.ibm.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  18. 31 Mar, 2006 1 commit
  19. 25 Mar, 2006 1 commit
    • [PATCH] inotify: lock avoidance with parent watch status in dentry · c32ccd87
      Nick Piggin authored
      Previous inotify work avoidance is good when inotify is completely unused,
      but it breaks down if even a single watch is in place anywhere in the
      system.  Robin Holt notices that udev is one such culprit - it slows down a
      512-thread application on a 512 CPU system from 6 seconds to 22 minutes.
      Solve this by adding a flag in the dentry that tells inotify whether or not
      its parent inode has a watch on it.  Event queueing to parent will skip
      taking locks if this flag is cleared.  Setting and clearing of this flag on
      all child dentries versus event delivery: this is no worse in terms of race
      cases, and that was shown to be equivalent to always performing the check.
      The essential behaviour is that activity occurring _after_ a watch has been
      added and _before_ it has been removed, will generate events.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Cc: Robert Love <rml@novell.com>
      Cc: John McCutchan <ttb@tentacle.dhs.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  20. 07 Feb, 2006 1 commit
  21. 03 Feb, 2006 1 commit
  22. 08 Jan, 2006 1 commit
    • [PATCH] shrink dentry struct · 5160ee6f
      Eric Dumazet authored
      A long time ago, the dentry struct was carefully tuned so that on 32-bit
      UP, sizeof(struct dentry) was exactly 128, i.e. a power of 2 and a multiple
      of the memory cache line size.
      Then RCU was added and the dentry struct grew by two pointers, with nice
      results for SMP, but not so good on UP, because it broke the above tuning
      (128 + 8 = 136 bytes).
      This patch reverts this unwanted side effect by using a union (d_u),
      where d_rcu and d_child are placed so that these two fields can share their
      memory needs.
      At the time d_free() is called (and d_rcu is really used), d_child is known
      to be empty and not touched by the dentry freeing.
      Lockless lookups only access d_name, d_parent, d_lock, d_op, d_flags (so
      the previous content of d_child is not needed if said dentry was unhashed
      but still accessed by a CPU because of RCU constraints)
      As dentry cache easily contains millions of entries, a size reduction is
      worth the extra complexity of the ugly C union.
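The space saving from the union can be illustrated with stand-in types. The struct names below are hypothetical and model only the two members discussed; the sketch assumes data pointers and function pointers have the same size, as on common platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's list_head and rcu_head (two pointers each). */
struct sketch_list_head { struct sketch_list_head *next, *prev; };
struct sketch_rcu_head {
	struct sketch_rcu_head *next;
	void (*func)(struct sketch_rcu_head *head);
};

/* Before: each member has its own storage. */
struct sketch_dentry_separate {
	struct sketch_list_head d_child;
	struct sketch_rcu_head d_rcu;
};

/* After: the two members overlap, since they are never live together. */
struct sketch_dentry_shared {
	union {
		struct sketch_list_head d_child; /* while linked under a parent */
		struct sketch_rcu_head d_rcu;    /* only once being freed */
	} d_u;
};

/* Bytes recovered by overlapping the two members. */
static size_t sketch_union_saving(void)
{
	return sizeof(struct sketch_dentry_separate) -
	       sizeof(struct sketch_dentry_shared);
}
```

With 4-byte pointers the saving is 8 bytes, which matches the 136 to 128 byte reduction described in the message.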
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Maneesh Soni <maneesh@in.ibm.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Neil Brown <neilb@cse.unsw.edu.au>
      Cc: James Morris <jmorris@namei.org>
      Cc: Stephen Smalley <sds@epoch.ncsc.mil>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  23. 07 Nov, 2005 1 commit
  24. 07 Sep, 2005 1 commit
    • [PATCH] struct dentry: place d_hash close to d_parent and d_name to speedup lookups · 3f4bb1f4
      Eric Dumazet authored
      dentry cache uses sophisticated RCU technology (and prefetching if
      available) but touches 2 cache lines per dentry during hlist lookup.
      This patch moves d_hash into the same cache line as the d_parent and d_name
      fields so that :
      1) One cache line is needed instead of two.
      2) the hlist_for_each_rcu() prefetching has a chance to bring all the
         needed data in advance, not only the part that includes d_hash.next.
      I also changed one old comment that was wrong for 64-bit.
      A further optimisation would be to split the dentry into two parts, one that
      is mostly read and one that is written (d_count/d_lock), to avoid false
      sharing on SMP/NUMA, but this would need different field placement depending
      on whether the platform is 32-bit or 64-bit.
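The effect of moving d_hash next to d_parent and d_name can be checked with offsetof() on stand-in layouts. Everything here is hypothetical: the structs are not the real struct dentry, the 64-byte line size is an assumption, and each object is assumed to start on a cache line boundary.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CACHELINE 64 /* assumed cache line size in bytes */

struct sketch_hlist_node { struct sketch_hlist_node *next, **pprev; };
struct sketch_qstr { unsigned int hash, len; const char *name; };

/* Hash link far away from the fields compared during lookup. */
struct sketch_dentry_before {
	struct sketch_hlist_node d_hash;
	char other_fields[SKETCH_CACHELINE]; /* stand-in for unrelated members */
	struct sketch_dentry_before *d_parent;
	struct sketch_qstr d_name;
};

/* Hash link placed right next to d_parent and d_name. */
struct sketch_dentry_after {
	char other_fields[SKETCH_CACHELINE];
	struct sketch_hlist_node d_hash;
	struct sketch_dentry_after *d_parent;
	struct sketch_qstr d_name;
};

/* Index of the cache line holding the first byte of a member. */
#define SKETCH_LINE_OF(type, member) \
	(offsetof(type, member) / SKETCH_CACHELINE)

/* Index of the cache line holding the last byte of a member. */
#define SKETCH_LAST_LINE_OF(type, member) \
	((offsetof(type, member) + sizeof(((type *)0)->member) - 1) / \
	 SKETCH_CACHELINE)

/* Compile-time check: in the "after" layout one line covers all three. */
_Static_assert(SKETCH_LINE_OF(struct sketch_dentry_after, d_hash) ==
	       SKETCH_LAST_LINE_OF(struct sketch_dentry_after, d_name),
	       "d_hash, d_parent and d_name should share one cache line");
```

In the "before" layout the hash chain walk touches one line for d_hash and another for d_parent and d_name; in the "after" layout a single line, and therefore a single prefetch, covers all three.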
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  25. 16 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      Let it rip!