1. 13 Apr, 2009 1 commit
  2. 07 Apr, 2009 1 commit
    • mm: add /proc controls for pdflush threads · fafd688e
      Peter W Morreale authored
      Add /proc entries to give the admin the ability to control the minimum and
      maximum number of pdflush threads.  This allows finer control of pdflush
      on both large and small machines.
      The rationale is simply that one size does not fit all.  Admins on large and/or
      small systems may want to tune the min/max pdflush thread count to best
      suit their needs.  Right now the min/max is hardcoded to 2/8.  While
      probably a fair estimate for smaller machines, large machines with large
      numbers of CPUs and large numbers of filesystems/block devices may benefit
      from larger numbers of threads working on different block devices.
      Even if the background flushing algorithm is radically changed, it is
      still likely that multiple threads will be involved, and admins would still
      want finer control over the min/max without having to recompile the kernel.
      The patch adds '/proc/sys/vm/nr_pdflush_threads_min' and
      '/proc/sys/vm/nr_pdflush_threads_max' with r/w permissions.
      The minimum value for nr_pdflush_threads_min is 1 and the maximum value is
      the current value of nr_pdflush_threads_max.  This minimum is required
      since additional thread creation is performed in a pdflush thread itself.
      The minimum value for nr_pdflush_threads_max is the current value of
      nr_pdflush_threads_min and the maximum value can be 1000.
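      The bounds described above can be sketched as a pair of setters that
      each validate against the other tunable's current value.  This is only
      an illustrative model: the struct and function names are hypothetical
      and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two tunables; all names are illustrative. */
#define PDFLUSH_ABS_MIN 1     /* one thread must exist to create the others */
#define PDFLUSH_ABS_MAX 1000  /* upper bound stated in the commit message */

struct pdflush_limits {
    int min_threads;  /* models nr_pdflush_threads_min */
    int max_threads;  /* models nr_pdflush_threads_max */
};

/* Accept a new minimum only if 1 <= value <= current maximum. */
static bool set_pdflush_min(struct pdflush_limits *l, int value)
{
    assert(l != NULL);
    if (value < PDFLUSH_ABS_MIN || value > l->max_threads)
        return false;
    l->min_threads = value;
    return true;
}

/* Accept a new maximum only if current minimum <= value <= 1000. */
static bool set_pdflush_max(struct pdflush_limits *l, int value)
{
    assert(l != NULL);
    if (value < l->min_threads || value > PDFLUSH_ABS_MAX)
        return false;
    l->max_threads = value;
    return true;
}
```

      Because each bound depends on the other's current value, raising the
      minimum above the old maximum takes two writes: the maximum first,
      then the minimum.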
      Documentation/sysctl/vm.txt is also updated.
      [akpm@linux-foundation.org: fix comment, fix whitespace, use __read_mostly]
      Signed-off-by: Peter W Morreale <pmorreale@novell.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 06 Apr, 2009 1 commit
  4. 03 Apr, 2009 1 commit
  5. 02 Apr, 2009 1 commit
  6. 01 Apr, 2009 1 commit
  7. 12 Feb, 2009 1 commit
  8. 11 Feb, 2009 1 commit
  9. 16 Jan, 2009 1 commit
  10. 15 Jan, 2009 1 commit
  11. 14 Jan, 2009 2 commits
  12. 08 Jan, 2009 1 commit
    • NOMMU: Make mmap allocation page trimming behaviour configurable. · dd8632a1
      Paul Mundt authored
      NOMMU mmap allocates a piece of memory for an mmap that's rounded up in size to
      the nearest power-of-2 number of pages.  Currently it then discards the excess
      pages back to the page allocator, making that memory available for use by other
      things.  This can, however, cause a greater amount of fragmentation.
      To counter this, a sysctl is added in order to fine-tune the trimming
      behaviour.  The default behaviour remains to trim pages aggressively, while
      this can either be disabled completely or set to a higher page-granular
      watermark in order to have finer-grained control.
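      A minimal sketch of the trimming decision follows.  It assumes that a
      watermark of 0 disables trimming and that any other value N trims only
      when at least N excess pages exist (so 1 trims aggressively).  The
      function and parameter names are hypothetical, not the patch's own.

```c
#include <assert.h>

/* Round a page count up to the next power of two (npages must be >= 1). */
static unsigned long round_up_pow2(unsigned long npages)
{
    unsigned long p = 1;

    assert(npages >= 1);
    while (p < npages)
        p <<= 1;
    return p;
}

/*
 * Return how many pages the mapping keeps after the optional trim.
 * trim_watermark is a hypothetical stand-in for the new sysctl:
 *   0 -> never trim, keep the whole power-of-2 allocation
 *   N -> trim only when at least N excess pages exist (1 = aggressive)
 */
static unsigned long pages_kept(unsigned long requested,
                                unsigned long trim_watermark)
{
    unsigned long allocated = round_up_pow2(requested);
    unsigned long excess = allocated - requested;

    if (trim_watermark != 0 && excess >= trim_watermark)
        return requested;   /* excess pages go back to the page allocator */
    return allocated;       /* keep the excess to limit fragmentation */
}
```

      Under these assumptions a 5-page request keeps 5 pages with a watermark
      of 1, but keeps all 8 allocated pages with a watermark of 0 or 4.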
      vm region vm_top bits taken from an earlier patch by David Howells.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Mike Frysinger <vapier.adi@gmail.com>
  13. 06 Jan, 2009 1 commit
    • mm: add dirty_background_bytes and dirty_bytes sysctls · 2da02997
      David Rientjes authored
      This change introduces two new sysctls to /proc/sys/vm:
      dirty_background_bytes and dirty_bytes.
      dirty_background_bytes is the counterpart to dirty_background_ratio and
      dirty_bytes is the counterpart to dirty_ratio.
      With growing memory capacities of individual machines, it's no longer
      sufficient to specify dirty thresholds as a percentage of the amount of
      dirtyable memory over the entire system.
      dirty_background_bytes and dirty_bytes specify quantities of memory, in
      bytes, that represent the dirty limits for the entire system.  If either
      of these values is set, its value represents the amount of dirty memory
      that is needed to commence either background or direct writeback.
      When a `bytes' or `ratio' file is written, its counterpart becomes a
      function of the written value.  For example, if dirty_bytes is written to
      be 8192, 8K of memory is required to commence direct writeback.
      dirty_ratio is then functionally equivalent to 8K / the amount of
      dirtyable memory:
      	dirtyable_memory = free pages + mapped pages + file cache
      	dirty_background_bytes = dirty_background_ratio * dirtyable_memory
      	dirty_background_ratio = dirty_background_bytes / dirtyable_memory
      	dirty_bytes = dirty_ratio * dirtyable_memory
      	dirty_ratio = dirty_bytes / dirtyable_memory
      Only one of dirty_background_bytes and dirty_background_ratio may be
      specified at a time, and only one of dirty_bytes and dirty_ratio may be
      specified.  When one sysctl is written, the other appears as 0 when read.
      The `bytes' files operate on a page size granularity since dirty limits
      are compared with ZVC values, which are in page units.
      Prior to this change, the minimum dirty_ratio was 5 as implemented by
      get_dirty_limits() although /proc/sys/vm/dirty_ratio would show any user
      written value between 0 and 100.  This restriction is maintained, but
      dirty_bytes has a lower limit of only one page.
      Also prior to this change, the dirty_background_ratio could not equal or
      exceed dirty_ratio.  This restriction is maintained in addition to
      restricting dirty_background_bytes.  If either background threshold equals
      or exceeds that of the dirty threshold, it is implicitly set to half the
      dirty threshold.
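      The rules above can be sketched as one function that derives both
      thresholds in pages.  This is an illustrative model, not the body of
      get_dirty_limits(): the struct, the function names and the 4096-byte
      page size are assumptions, and a `bytes' value of 0 is taken to mean
      that the ratio is in effect.

```c
#include <assert.h>

#define PAGE_SIZE_BYTES 4096UL  /* assumed page size for this sketch */

struct dirty_params {
    unsigned long dirty_bytes;            /* 0 means "use dirty_ratio" */
    unsigned long dirty_ratio;            /* percent of dirtyable memory */
    unsigned long dirty_background_bytes; /* 0 means "use the ratio" */
    unsigned long dirty_background_ratio; /* percent of dirtyable memory */
};

/* The `bytes' files work at page granularity, so round up to whole pages. */
static unsigned long bytes_to_pages(unsigned long bytes)
{
    return (bytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}

/* Compute both thresholds in pages from whichever sysctl of each pair is set. */
static void dirty_limits(const struct dirty_params *p,
                         unsigned long dirtyable_pages,
                         unsigned long *background, unsigned long *dirty)
{
    if (p->dirty_bytes) {
        *dirty = bytes_to_pages(p->dirty_bytes);
    } else {
        /* the minimum dirty_ratio of 5 is kept for the ratio path only */
        unsigned long ratio = p->dirty_ratio < 5 ? 5 : p->dirty_ratio;
        *dirty = ratio * dirtyable_pages / 100;
    }

    if (p->dirty_background_bytes)
        *background = bytes_to_pages(p->dirty_background_bytes);
    else
        *background = p->dirty_background_ratio * dirtyable_pages / 100;

    /* a background threshold at or above the dirty threshold is halved */
    if (*background >= *dirty)
        *background = *dirty / 2;
}
```

      With 1000 dirtyable pages, ratios of 20 and 10 give thresholds of 200
      and 100 pages.  Writing 8192 to dirty_bytes instead gives a dirty
      threshold of 2 pages, and the background threshold falls to 1 page.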
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Andrea Righi <righi.andrea@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 18 Dec, 2008 1 commit
    • trace: add a way to enable or disable the stack tracer · f38f1d2a
      Steven Rostedt authored
      Impact: enhancement to stack tracer
      The stack tracer is currently either on when configured in or
      off when it is not. It cannot be disabled when it is configured in,
      other than by disabling the function tracer that it uses.
      This patch adds a way to enable or disable the stack tracer at
      run time. It defaults off on bootup, but a kernel parameter 'stacktrace'
      has been added to enable it on bootup.
      A new sysctl has been added "kernel.stack_tracer_enabled" to let
      the user enable or disable the stack tracer at run time.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 04 Dec, 2008 1 commit
    • sparc64: Add tsb-ratio sysctl. · 0871420f
      David S. Miller authored
      Add a sysctl to tweak the RSS limit used to decide when to grow
      the TSB for an address space.
      In order to avoid expensive divides and multiplies, only simple
      positive and negative powers of two are supported.
      The function computed takes the number of TSB translations that will
      fit at one time in the TSB of a given size, and either adds or
      subtracts a percentage of entries.  This final value is the
      RSS limit.
      See tsb_size_to_rss_limit().
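      A rough sketch of such a shift-based adjustment is shown below.  It
      assumes the sysctl value is a signed shift count, where a negative
      value subtracts a power-of-two fraction of the entries and a
      non-negative value adds one.  The function name and the bounds check
      are illustrative; this is not the body of tsb_size_to_rss_limit().

```c
#include <assert.h>

/*
 * num_entries: translations that fit in a TSB of the given size.
 * tsb_ratio:   assumed signed shift count standing in for the sysctl.
 * Only shifts are used, so no divide or multiply is needed.
 */
static unsigned long rss_limit(unsigned long num_entries, int tsb_ratio)
{
    assert(tsb_ratio > -32 && tsb_ratio < 32);  /* keep shift counts valid */

    if (tsb_ratio < 0)
        return num_entries - (num_entries >> -tsb_ratio);
    return num_entries + (num_entries >> tsb_ratio);
}
```

      For a TSB holding 512 translations, a value of -2 in this model gives
      an RSS limit of 384 (25% subtracted) and a value of 2 gives 640 (25%
      added).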
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 01 Dec, 2008 1 commit
    • epoll: introduce resource usage limits · 7ef9964e
      Davide Libenzi authored
      It has been thought that the per-user file descriptors limit would also
      limit the resources that a normal user can request via the epoll
      interface.  Vegard Nossum reported a very simple program (a modified
      version attached) that can make a normal user request a pretty large
      amount of kernel memory, well within its maximum number of fds.  To
      solve this problem, default limits are now imposed, and /proc based
      configuration has been introduced.  A new directory has been created,
      named /proc/sys/fs/epoll/, and inside it there are two configuration files:
        max_user_instances = Maximum number of devices - per user
        max_user_watches   = Maximum number of "watched" fds - per user
      The current default for "max_user_watches" limits the memory used by epoll
      to store "watches" to 1/32 of the amount of low RAM.  As an example, a
      256MB 32-bit machine will have "max_user_watches" set to roughly 90000.
      That should be enough to not break existing heavy epoll users.  The
      default value for "max_user_instances" is set to 128, which should be
      enough too.
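      The arithmetic behind that default can be sketched as follows.  The
      per-watch memory cost is not given in this message, so it is a
      parameter here, and the 90-byte figure used in the note below is purely
      an assumption chosen to match "roughly 90000".

```c
#include <assert.h>

/*
 * Hypothetical helper: number of watches that fit in 1/32 of low memory.
 * bytes_per_watch is an assumed per-watch cost, not a value from the patch.
 */
static unsigned long long default_max_user_watches(unsigned long long lowmem_bytes,
                                                   unsigned long long bytes_per_watch)
{
    assert(bytes_per_watch > 0);
    return (lowmem_bytes / 32) / bytes_per_watch;
}
```

      With 256MB of low RAM and an assumed 90 bytes per watch, this yields
      93206, in line with the "roughly 90000" quoted above.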
      This also changes the userspace, because a new error code can now come out
      from EPOLL_CTL_ADD (-ENOSPC).  The EMFILE from epoll_create() was already
      listed, so that should be ok.
      [akpm@linux-foundation.org: use get_current_user()]
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: <stable@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Reported-by: Vegard Nossum <vegardno@ifi.uio.no>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 13 Nov, 2008 1 commit
  18. 04 Nov, 2008 1 commit
  19. 27 Oct, 2008 1 commit
    • ftrace: ftrace dump on oops control · 944ac425
      Steven Rostedt authored
      Impact: add (default-off) dump-trace-on-oops flag
      Currently, ftrace is set up to dump its contents to the console if the
      kernel panics or oopses. This can be annoying if you have trace data in
      the buffers and you experience an oops, but the trace data is old or
      Usually you want ftrace to dump its contents when you are debugging
      your system and you have set up ftrace to trace the events leading to
      an oops.
      This patch adds a control variable called "ftrace_dump_on_oops" that will
      enable the ftrace dump to console on oops. This variable is default off
      but a developer can enable it either through the kernel command line
      by adding "ftrace_dump_on_oops" or at run time by setting (or disabling)
         Replaced /** with /* as Randy explained that kernel-doc does
          not yet handle variables.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  20. 20 Oct, 2008 3 commits
  21. 16 Oct, 2008 3 commits
  22. 09 Oct, 2008 1 commit
  23. 29 Sep, 2008 1 commit
    • Configure out file locking features · bfcd17a6
      Thomas Petazzoni authored
      This patch adds the CONFIG_FILE_LOCKING option, which allows support for
      advisory locks to be removed. With the option turned off, the flock()
      system call, the F_GETLK, F_SETLK and F_SETLKW operations of fcntl()
      and NFS support are disabled. These features are not necessarily needed
      on embedded systems. It saves ~11 KB of kernel code and data:
         text    data     bss     dec     hex filename
      1125436  118764  212992 1457192  163c28 vmlinux.old
      1114299  118564  212992 1445855  160fdf vmlinux
       -11137    -200       0  -11337   -2C49 +/-
      This patch was originally written by Matt Mackall
      <mpm@selenic.com>, and is part of the Linux Tiny project.
      Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: Matt Mackall <mpm@selenic.com>
      Cc: matthew@wil.cx
      Cc: linux-fsdevel@vger.kernel.org
      Cc: mpm@selenic.com
      Cc: akpm@linux-foundation.org
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
  24. 12 Sep, 2008 2 commits
  25. 04 Sep, 2008 1 commit
  26. 27 Jul, 2008 1 commit
  27. 26 Jul, 2008 5 commits
    • [PATCH] sanitize ->permission() prototype · e6305c43
      Al Viro authored
      * kill nameidata * argument; map the 3 bits in ->flags anybody cares
        about to new MAY_... ones and pass with the mask.
      * kill redundant gfs2_iop_permission()
      * sanitize ecryptfs_permission()
      * fix remaining places where ->permission() instances might barf on new
        MAY_... found in mask.
      The obvious next target in that direction is permission(9)
      folded fix for nfs_permission() breakage from Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • [PATCH] sanitize proc_sysctl · 9043476f
      Al Viro authored
      * keep references to ctl_table_head and ctl_table in /proc/sys inodes
      * grab the former during operations, use the latter for access to
        entry if that succeeds
      * have ->d_compare() check if table should be seen for one who does lookup;
        that allows us to avoid flipping inodes - if we have the same name resolve
        to different things, we'll just keep several dentries and ->d_compare()
        will reject the wrong ones.
      * have ->lookup() and ->readdir() scan the table of our inode first, then
        walk all ctl_table_header and scan ->attached_by for those that are
        attached to our directory.
      * implement ->getattr().
      * get rid of insane amounts of tree-walking
      * get rid of the need to know dentry in ->permission() and of the contortions
        induced by that.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • [PATCH] sysctl: keep track of tree relationships · ae7edecc
      Al Viro authored
      In a sense, that's the heart of the series.  It's based on the following
      property of the trees we are actually asked to add: they can be split into
      stem that is already covered by registered trees and crown that is entirely
      new.  IOW, if a/b and a/c/d are introduced by our tree, then a/c is also
      introduced by it.
      That allows us to associate a tree and table entry with each node in the union;
      while directory nodes might be covered by many trees, only one will cover
      the node by its crown.  And that will allow much saner logic for /proc/sys
      in the next patches.  This patch introduces the data structures needed to
      keep track of that.
      When adding a sysctl table, we find a "parent" one.  Which is to say,
      find the deepest node on its stem that already is present in one of the
      tables from our table set or its ancestor sets.  That table will be our
      parent and that node in it - attachment point.  Add our table to list
      anchored in parent, have it refer to the parent and contents of attachment
      point.  Also remember where its crown lives.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • [PATCH] allow delayed freeing of ctl_table_header · f7e6ced4
      Al Viro authored
      Refcount the sucker; instead of freeing it by the end of unregistration
      just drop the refcount and free only when it hits zero.  Make sure that
      we _always_ make ->unregistering non-NULL in start_unregistering().
      That allows anybody to get a reference to such puppy, preventing its
      freeing and reuse.  It does *not* block unregistration.  Anybody who
      holds such a reference can
      	* try to grab a "use" reference (ctl_head_grab()); that will
      succeed if and only if it hadn't entered unregistration yet.  If it
      succeeds, we can use it in all normal ways until we release the "use"
      reference (with ctl_head_finish()).  Note that this relies on having
      ->unregistering become non-NULL in all cases when one starts to unregister
      the sucker.
      	* keep pointers to ctl_table entries; they *can* be freed if
      the entire thing is unregistered.  However, if ctl_head_grab() succeeds,
      we know that unregistration had not happened (and will not happen until
      ctl_head_finish()) and such pointers can be used safely.
      IOW, now we can have inodes under /proc/sys keep references to ctl_table
      entries, protecting them with references to ctl_table_header and
      grabbing the latter for the duration of operations that require access
      to ctl_table.  That won't cause deadlocks, since unregistration will not
      be stopped by merely keeping a reference to ctl_table_header.
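      The two kinds of reference can be modelled roughly as below.  This is
      a single-threaded sketch with hypothetical names.  The real code needs
      locking, waits for active users during unregistration, and makes
      ->unregistering a non-NULL pointer rather than the boolean used here.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-threaded model of a refcounted header; names are illustrative. */
struct header_sketch {
    int count;           /* plain references: only prevent freeing and reuse */
    int used;            /* "use" references: header is being accessed */
    bool unregistering;  /* set as soon as unregistration starts */
    bool freed;          /* stands in for actually freeing the memory */
};

/* Take a plain reference; this never blocks unregistration. */
static void head_get(struct header_sketch *h)
{
    h->count++;
}

/* Drop a plain reference; free only when the last one goes away. */
static void head_put(struct header_sketch *h)
{
    assert(h->count > 0);
    if (--h->count == 0)
        h->freed = true;
}

/* Try to take a "use" reference; fails once unregistration has begun. */
static bool head_grab(struct header_sketch *h)
{
    if (h->unregistering)
        return false;
    h->used++;
    return true;
}

/* Release a "use" reference taken by a successful head_grab(). */
static void head_finish(struct header_sketch *h)
{
    assert(h->used > 0);
    h->used--;
}

/* Mark the header as going away and drop the registration's reference. */
static void head_unregister(struct header_sketch *h)
{
    assert(h->used == 0);  /* real code would wait for users to finish */
    h->unregistering = true;
    head_put(h);
}
```

      In this model a holder of a plain reference can still call head_grab()
      after unregistration; the call simply fails, and the memory stays valid
      until that holder calls head_put().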
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • [PATCH] beginning of sysctl cleanup - ctl_table_set · 73455092
      Al Viro authored
      New object: set of sysctls [currently - root and per-net-ns].
      Contains: pointer to parent set, list of tables and "should I see this set?"
      method (->is_seen(set)).
      Current lists of tables are subsumed by that; net-ns contains such a beast.
      ->lookup() for ctl_table_root returns pointer to ctl_table_set instead of
      that to ->list of that ctl_table_set.
      [folded compile fixes by rdd for configs without sysctl]
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  28. 25 Jul, 2008 1 commit
    • printk ratelimiting rewrite · 717115e1
      Dave Young authored
      All ratelimit users share the same jiffies and burst params, so some messages
      (callbacks) will be lost.
      For example:
      a calls printk_ratelimit(5 * HZ, 1)
      b calls printk_ratelimit(5 * HZ, 1) before the 5*HZ timeout of a, then b
      will be suppressed.
      - rewrite __ratelimit, and use a ratelimit_state as parameter.  Thanks for
        hints from andrew.
      - Add WARN_ON_RATELIMIT, update rcupreempt.h
      - remove __printk_ratelimit
      - use __ratelimit in net_ratelimit
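      The idea of giving each caller its own state can be sketched as
      follows.  The struct layout and function name are illustrative only
      and are not the kernel's ratelimit_state or __ratelimit(); time is
      passed in as a plain tick count instead of being read from jiffies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-caller rate limit state; field names are illustrative. */
struct ratelimit_state_sketch {
    unsigned long interval;  /* window length in ticks */
    unsigned long burst;     /* messages allowed per window */
    unsigned long begin;     /* tick at which the current window started */
    unsigned long printed;   /* messages emitted in the current window */
};

/* Return true if the caller may emit a message at time `now`. */
static bool ratelimit_ok(struct ratelimit_state_sketch *rs, unsigned long now)
{
    assert(rs != NULL);

    if (rs->interval == 0)
        return true;                      /* no limiting configured */

    if (now - rs->begin >= rs->interval) {
        rs->begin = now;                  /* start a new window */
        rs->printed = 0;
    }

    if (rs->printed < rs->burst) {
        rs->printed++;
        return true;
    }
    return false;                         /* suppressed within this window */
}
```

      With separate states for a and b, a message from b inside a's window
      is no longer suppressed; only repeated messages from the same caller
      are.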
      Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: Dave Young <hidave.darkstar@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  29. 24 Jul, 2008 2 commits