1. 02 Apr, 2009 1 commit
    • modules: sysctl to block module loading · 3d43321b
      Kees Cook authored
      
      
      Implement a sysctl file that disables module-loading system-wide since
      there is no longer a viable way to remove CAP_SYS_MODULE after the system
      bounding capability set was removed in 2.6.25.
      
      Value can only be set to "1", and is tested only if standard capability
      checks allow CAP_SYS_MODULE.  Given existing /dev/mem protections, this
      should allow administrators a one-way method to block module loading
      after initial boot-time module loading has finished.
      Signed-off-by: Kees Cook <kees.cook@canonical.com>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: James Morris <jmorris@namei.org>
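
      The one-way behaviour described in this commit can be modelled as a latch. The following is a minimal Python sketch, not kernel code: the class `ModulesDisabledLatch` and its method names are hypothetical, and the CAP_SYS_MODULE check is reduced to a boolean argument.

```python
class ModulesDisabledLatch:
    """Toy model of a one-way 'block module loading' switch."""

    def __init__(self):
        self.disabled = 0  # 0 = module loading allowed, 1 = blocked

    def write(self, value):
        # Only 1 is accepted; once set, the flag can never be cleared.
        if value != 1:
            raise ValueError("value can only be set to 1")
        self.disabled = 1

    def may_load_module(self, has_cap_sys_module):
        # The capability check comes first; the latch is consulted only
        # if that check passes.
        if not has_cap_sys_module:
            return False
        return self.disabled == 0


latch = ModulesDisabledLatch()
print(latch.may_load_module(True))   # loading allowed before the latch is set
latch.write(1)
print(latch.may_load_module(True))   # loading blocked afterwards
```

      Rejecting every value other than 1 is what makes the switch one-way: no later write can re-enable loading.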
  2. 24 Mar, 2009 1 commit
    • dynamic debug: combine dprintk and dynamic printk · e9d376f0
      Jason Baron authored
      
      
      This patch combines Greg Banks's dprintk() work with the existing dynamic
      printk patchset; we are now calling it 'dynamic debug'.
      
      The new feature of this patchset is a richer debugfs control file interface
      (an example output from my system is at the bottom), which allows fine-grained
      control over the debug output. The output can be controlled by function,
      file, module, format string, and line number.
      
      For example, to enable all debug messages in module 'nf_conntrack':
      
      echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control
      
      to disable them:
      
      echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control
      
      A further explanation can be found in the documentation patch.
      Signed-off-by: Greg Banks <gnb@sgi.com>
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
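
      The control-file queries shown in this commit can be assembled programmatically. This is a small Python sketch under assumptions: `build_query` and `write_query` are hypothetical helpers, the keyword spellings other than `module` are taken from the dynamic debug documentation rather than from this message, and the control path is the one used in the example above.

```python
def build_query(flags, module=None, func=None, file=None, line=None, fmt=None):
    """Build one dynamic debug control query, e.g. 'module nf_conntrack +p'."""
    parts = []
    for keyword, value in (("module", module), ("func", func),
                           ("file", file), ("line", line), ("format", fmt)):
        if value is not None:
            parts.append(f"{keyword} {value}")
    parts.append(flags)
    return " ".join(parts)


def write_query(query, control="/mnt/debugfs/dynamic_debug/control"):
    # Mirrors `echo -n '<query>' > control`; needs root and a mounted debugfs.
    with open(control, "w") as f:
        f.write(query)


print(build_query("+p", module="nf_conntrack"))
```

      Only `build_query` is exercised here; `write_query` touches the real control file and is left for the reader to call on a suitable system.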
  3. 17 Mar, 2009 1 commit
  4. 05 Mar, 2009 1 commit
    • percpu, module: implement reserved allocation and use it for module percpu variables · edcb4639
      Tejun Heo authored
      
      
      Impact: add reserved allocation functionality and use it for module
      	percpu variables
      
      This patch implements reserved allocation from the first chunk.  When
      setting up the first chunk, arch can ask to set aside a certain number
      of bytes right after the core static area, which is then available only
      through a separate reserved allocator.  This will be used primarily
      for module static percpu variables on architectures with limited
      relocation range to ensure that the module percpu symbols are inside
      the relocatable range.
      
      If reserved area is requested, the first chunk becomes reserved and
      isn't available for regular allocation.  If the first chunk also
      includes piggy-back dynamic allocation area, a separate chunk mapping
      the same region is created to serve dynamic allocation.  The first one
      is called static first chunk and the second dynamic first chunk.
      Although they share the page map, their different area map
      initializations guarantee they serve disjoint areas according to their
      purposes.
      
      If arch doesn't set up a reserved area, reserved allocation is handled
      like any other allocation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
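
      The split of the first chunk into disjoint reserved and dynamic areas can be pictured with a toy model. This Python sketch is an illustration only: a simple bump allocator stands in for the real area map, the layout [static | reserved | dynamic] is an assumption drawn from the description above, and all names are hypothetical.

```python
class AreaAllocator:
    """Bump allocator over the byte range [start, end) of a chunk."""

    def __init__(self, start, end):
        self.start, self.end, self.next = start, end, start

    def alloc(self, size):
        if self.next + size > self.end:
            return None  # area exhausted
        offset = self.next
        self.next += size
        return offset


def setup_first_chunk(static_size, reserved_size, dyn_size):
    """Return (reserved, dynamic) allocators for the first chunk.

    Both allocators refer to the same underlying region but cover
    disjoint offset ranges.
    """
    reserved_start = static_size
    dyn_start = static_size + reserved_size
    dynamic = AreaAllocator(dyn_start, dyn_start + dyn_size)
    if reserved_size == 0:
        # No reserved area: reserved requests are served like any other.
        return dynamic, dynamic
    return AreaAllocator(reserved_start, dyn_start), dynamic


reserved, dynamic = setup_first_chunk(static_size=4096, reserved_size=1024,
                                      dyn_size=2048)
print(reserved.alloc(64), dynamic.alloc(64))
```

      Because each allocator is initialised with its own offset range, a reserved request can never be satisfied from the dynamic area or vice versa.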
  5. 20 Feb, 2009 2 commits
    • percpu: implement new dynamic percpu allocator · fbf59bc9
      Tejun Heo authored
      
      
      Impact: new scalable dynamic percpu allocator which allows dynamic
              percpu areas to be accessed the same way as static ones
      
      Implement scalable dynamic percpu allocator which can be used for both
      static and dynamic percpu areas.  This will allow static and dynamic
      areas to share faster direct access methods.  This feature is optional
      and enabled only when CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is defined by
      arch.  Please read comment on top of mm/percpu.c for details.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
    • module: reorder module pcpu related functions · 6b588c18
      Tejun Heo authored
      
      
      Impact: cleanup
      
      Move percpu_modinit() upwards.  This is to ease further changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
  6. 02 Feb, 2009 1 commit
    • modules: Use a better scheme for refcounting · 720eba31
      Eric Dumazet authored
      
      
      Current refcounting for modules (done if CONFIG_MODULE_UNLOAD=y) is
      using a lot of memory.
      
      Each 'struct module' contains an [NR_CPUS] array of full cache lines.
      
      This patch uses existing infrastructure (percpu_modalloc() &
      percpu_modfree()) to allocate percpu space for the refcount storage.
      
      Instead of wasting NR_CPUS*128 bytes (on i386), we now use
      nr_cpu_ids*sizeof(local_t) bytes.
      
      On a typical distro, where NR_CPUS=8, shipping 2000 modules, we reduce
      size of module files by about 2 Mbytes (1 KB per module).
      
      Instead of having all refcounters in the same memory node - with TLB misses
      because of vmalloc() - this new implementation allows better
      NUMA properties, since each CPU will use storage on its preferred node,
      thanks to percpu storage.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
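
      The savings quoted in this commit follow from simple arithmetic. The Python sketch below reproduces it; the 4-byte size for local_t on i386 and the example value of nr_cpu_ids are assumptions, not figures from the message.

```python
CACHE_LINE = 128      # bytes per per-CPU slot in the old scheme (i386 figure from the text)
SIZEOF_LOCAL_T = 4    # assumption: local_t is 4 bytes on i386


def old_refcount_bytes(nr_cpus):
    """Space embedded in each struct module by the old [NR_CPUS] array."""
    return nr_cpus * CACHE_LINE


def new_refcount_bytes(nr_cpu_ids):
    """Per-module percpu space used by the new scheme."""
    return nr_cpu_ids * SIZEOF_LOCAL_T


nr_cpus, modules = 8, 2000
per_module = old_refcount_bytes(nr_cpus)
print(per_module)                    # bytes per module in the old scheme
print(per_module * modules / 1e6)    # total across all modules, in MB
print(new_refcount_bytes(2))         # example: a machine with 2 possible CPUs
```

      With NR_CPUS=8 the old array costs 1024 bytes per module, which is where the "1 KB per module" and "about 2 Mbytes" figures come from.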
  7. 14 Jan, 2009 1 commit
  8. 07 Jan, 2009 1 commit
    • async: Asynchronous function calls to speed up kernel boot · 22a9d645
      Arjan van de Ven authored
      
      
      Right now, most of the kernel boot is strictly synchronous, such that
      various hardware delays are done sequentially.
      
      In order to make the kernel boot faster, this patch introduces
      infrastructure to allow doing some of the initialization steps
      asynchronously, which will hide significant portions of the hardware delays
      in practice.
      
      In order to not change device order and other similar observables, this
      patch does NOT do full parallel initialization.
      
      Rather, it operates more in the way an out of order CPU does; the work may
      be done out of order and asynchronous, but the observable effects
      (instruction retiring for the CPU) are still done in the original sequence.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
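
      The "out of order execution, in order retirement" idea can be illustrated in user space. This Python sketch is an analogy only: a thread pool stands in for the kernel's asynchronous infrastructure, and `probe`, `boot` and the device names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def probe(name, delay):
    """Stand-in for a slow hardware initialisation step."""
    time.sleep(delay)          # the hardware delay overlaps with other probes
    return name


def boot(devices):
    registered = []
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        # Start every probe at once; they may finish in any order.
        futures = [pool.submit(probe, name, delay) for name, delay in devices]
        # Commit the observable effect (registration) in submission order.
        for future in futures:
            registered.append(future.result())
    return registered


print(boot([("sda", 0.03), ("sdb", 0.01), ("sdc", 0.02)]))
```

      The delays overlap, so total time is close to the slowest probe, while the registration order stays the same as in a fully sequential boot.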
  9. 06 Jan, 2009 3 commits
  10. 04 Jan, 2009 4 commits
  11. 08 Dec, 2008 1 commit
    • tracing/function-graph-tracer: introduce __notrace_funcgraph to filter special functions · 8b96f011
      Frederic Weisbecker authored
      
      
      Impact: trace more functions
      
      When the function graph tracer is configured, three more files are
      excluded from tracing just to prevent four functions from being traced.
      This impacts the normal function tracer too.
      
      arch/x86/kernel/process_64.c and process_32.c:
      
      I had crashes when I let this file be traced. After some debugging, I saw
      that the "current" task pointer was changed inside __switch_to(), i.e.
      "write_pda(pcurrent, next_p);" inside process_64.c. Since the tracer stores
      the original return address of the function inside current, we had
      crashes. Only __switch_to() has to be excluded from tracing.
      
      kernel/module.c and kernel/extable.c:
      
      Because of a function used internally by the function graph tracer:
      __kernel_text_address()
      
      To let the other functions inside these files be traced, this patch
      introduces the __notrace_funcgraph function prefix, which is __notrace if
      the function graph tracer is configured and nothing if not.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  12. 16 Nov, 2008 2 commits
  13. 15 Nov, 2008 1 commit
    • ftrace: pass module struct to arch dynamic ftrace functions · 31e88909
      Steven Rostedt authored
      
      
      Impact: allow archs more flexibility on dynamic ftrace implementations
      
      Dynamic ftrace has largely been developed on x86. Since x86 does not
      have the same limitations as other architectures, the ftrace interaction
      between the generic code and the architecture specific code was not
      flexible enough to handle some of the issues that other architectures
      have.
      
      Most notably, module trampolines. Because some archs have a limited
      branch distance when calling core kernel code from modules, the module
      load code must create a trampoline that makes the larger jump into
      core kernel code.
      
      The problem arises when this happens to a call to mcount. Ftrace checks
      all code before modifying it and makes sure the current code is what
      it expects. Right now, there is not enough information to handle modifying
      module trampolines.
      
      This patch changes the API between generic dynamic ftrace code and
      the arch dependent code. There are now two functions for modifying code:
      
        ftrace_make_nop(mod, rec, addr) - convert the code at rec->ip into
             a nop, where the original text is calling addr. (mod is the
             module struct if called by module init)
      
        ftrace_make_caller(rec, addr) - convert the code at rec->ip, which
             should be a nop, into a caller to addr.
      
      The record "rec" now has a new field called "arch" where the architecture
      can add any special attributes to each call site record.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
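
      The "check before modify" rule and the two conversion functions can be sketched on a byte buffer. This Python model is purely illustrative: the opcode bytes, the absolute 4-byte target encoding, and the function names are hypothetical stand-ins, not the arch code or the real instruction encoding.

```python
CALL = b"\xe8"        # hypothetical 1-byte opcode followed by a 4-byte target
NOP5 = b"\x90" * 5    # hypothetical 5-byte nop sequence


def encode_call(addr):
    return CALL + addr.to_bytes(4, "little")


def _replace(text, ip, expected, new):
    # Refuse to patch unless the bytes at ip are exactly what we expect.
    if bytes(text[ip:ip + len(expected)]) != expected:
        raise ValueError("unexpected code at call site")
    text[ip:ip + len(new)] = new


def make_nop(text, ip, addr):
    """Turn a call to addr at ip into a nop."""
    _replace(text, ip, encode_call(addr), NOP5)


def make_caller(text, ip, addr):
    """Turn the nop at ip into a call to addr."""
    _replace(text, ip, NOP5, encode_call(addr))


text = bytearray(encode_call(0x1000) + b"\xc3")
make_nop(text, 0, 0x1000)
print(text.hex())
```

      A trampoline breaks the expectation in `_replace`, because the call site no longer targets `addr` directly; that is the situation in which extra per-module information is needed.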
  14. 23 Oct, 2008 1 commit
  15. 21 Oct, 2008 2 commits
    • Remove stop_machine during module load v2 · d72b3751
      Andi Kleen authored
      
      
      Remove stop_machine during module load v2
      
      module loading currently does a stop_machine on each module load to insert
      the module into the global module lists.  Especially on larger systems this
      can be quite expensive.
      
      It does that to handle concurrent lockless module list readers
      like kallsyms.
      
      I don't think stop_machine() is actually needed to insert something
      into a list though. There are no concurrent writers because the
      module mutex is taken. And the RCU list functions know how to insert
      a node into a list with the right memory ordering so that concurrent
      readers don't go off into the wood.
      
      So remove the stop_machine for the module list insert and just
      do a list_add_rcu() instead.
      
      Module removal will still do a stop_machine of course, it needs
      that for other reasons.
      
      v2: Revised readers based on Paul's comments. All readers that only
          rely on disabled preemption need to be changed to list_for_each_rcu().
          Done that. The others are ok because they have the modules mutex.
          Also added a possible missing preempt disable for print_modules().
      
      [cc Paul McKenney for review. It's not RCU, but quite similar.]
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: simplify load_module. · 5e458cc0
      Rusty Russell authored
      
      
      Linus' recent catch of stack overflow in load_module led me to look
      at the code.  A couple of helpers to get a section address and get
      objects from a section can help clean things up a little.
      
      (And in case you're wondering, the stack size also dropped from 328 to
      284 bytes).
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  16. 16 Oct, 2008 3 commits
    • Make the taint flags reliable · 25ddbb18
      Andi Kleen authored
      
      
      It's somewhat unlikely that it happens, but right now a race window
      between interrupts or machine checks or oopses could corrupt the tainted
      bitmap because it is modified in a non atomic fashion.
      
      Convert the taint variable to an unsigned long and use only atomic bit
      operations on it.
      
      Unfortunately this means the intvec sysctl functions cannot be used on it
      anymore.
      
      It turned out the taint sysctl handler could actually be simplified a bit
      (since it only increases capabilities) so this patch actually removes
      code.
      
      [akpm@linux-foundation.org: remove unneeded include]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • driver core: basic infrastructure for per-module dynamic debug messages · 346e15be
      Jason Baron authored
      
      
      Base infrastructure to enable per-module debug messages.
      
      I've introduced CONFIG_DYNAMIC_PRINTK_DEBUG, which when enabled centralizes
      control of debugging statements on a per-module basis in one debugfs file,
      currently <debugfs>/dynamic_printk/modules. When CONFIG_DYNAMIC_PRINTK_DEBUG
      is not set, debugging statements can still be enabled as before, often by
      defining 'DEBUG' for the proper compilation unit. Thus, this patch set has no
      effect when CONFIG_DYNAMIC_PRINTK_DEBUG is not set.
      
      The infrastructure currently ties into all pr_debug() and dev_dbg() calls. That
      is, if CONFIG_DYNAMIC_PRINTK_DEBUG is set, all pr_debug() and dev_dbg() calls
      can be dynamically enabled/disabled on a per-module basis.
      
      Future plans include extending this functionality to subsystems that define
      their own debug levels and flags.
      
      Usage:
      
      Dynamic debugging is controlled by the debugfs file, 
      <debugfs>/dynamic_printk/modules. This file contains a list of the modules that
      can be enabled. The format of the file is as follows:
      
      	<module_name> <enabled=0/1>
      		.
      		.
      		.
      
      	<module_name> : Name of the module in which the debug call resides
      	<enabled=0/1> : whether the messages are enabled or not
      
      For example:
      
      	snd_hda_intel enabled=0
      	fixup enabled=1
      	driver enabled=0
      
      Enable a module:
      
      	$ echo "set enabled=1 <module_name>" > dynamic_printk/modules
      
      Disable a module:
      
      	$ echo "set enabled=0 <module_name>" > dynamic_printk/modules
      
      Enable all modules:
      
      	$ echo "set enabled=1 all" > dynamic_printk/modules
      
      Disable all modules:
      
      	$ echo "set enabled=0 all" > dynamic_printk/modules
      
      Finally, passing "dynamic_printk" at the command line enables
      debugging for all modules. This mode can be turned off via the above
      disable command.
      
      [gkh: minor cleanups and tweaks to make the build work quietly]
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
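
      The file format and the "set enabled=..." commands described in this commit are easy to handle from a script. The Python sketch below parses the sample listing given above and builds command strings; `parse_modules` and `set_command` are hypothetical helper names, and nothing here touches the real debugfs file.

```python
def parse_modules(text):
    """Parse '<module_name> enabled=<0|1>' lines into a dict of booleans."""
    state = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2 or not fields[1].startswith("enabled="):
            continue  # skip blank or unrecognised lines
        state[fields[0]] = fields[1] == "enabled=1"
    return state


def set_command(module, enabled):
    """Build the string that would be written to dynamic_printk/modules."""
    return f"set enabled={1 if enabled else 0} {module}"


sample = "snd_hda_intel enabled=0\nfixup enabled=1\ndriver enabled=0\n"
print(parse_modules(sample))
print(set_command("all", True))
```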
    • modules: fix module "notes" kobject leak · e9432093
      Alexey Dobriyan authored
      
      
      Fix "notes" kobject leak
      
      It happens every rmmod if KALLSYMS=y and SYSFS=y.
      
      	# modprobe foo
      
      kobject: 'foo' (ffffffffa00743d0): kobject_add_internal: parent: 'module', set: 'module'
      kobject: 'holders' (ffff88017e7c5770): kobject_add_internal: parent: 'foo', set: '<NULL>'
      kobject: 'foo' (ffffffffa00743d0): kobject_uevent_env
      kobject: 'foo' (ffffffffa00743d0): fill_kobj_path: path = '/module/foo'
      kobject: 'notes' (ffff88017fa9b668): kobject_add_internal: parent: 'foo', set: '<NULL>'
      	  ^^^^^
      
      	# rmmod foo
      
      kobject: 'holders' (ffff88017e7c5770): kobject_cleanup
      kobject: 'holders' (ffff88017e7c5770): auto cleanup kobject_del
      kobject: 'holders' (ffff88017e7c5770): calling ktype release
      kobject: (ffff88017e7c5770): dynamic_kobj_release
      kobject: 'holders': free name
      kobject: 'foo' (ffffffffa00743d0): kobject_cleanup
      kobject: 'foo' (ffffffffa00743d0): does not have a release() function, it is broken and must be fixed.
      kobject: 'foo' (ffffffffa00743d0): auto cleanup 'remove' event
      kobject: 'foo' (ffffffffa00743d0): kobject_uevent_env
      kobject: 'foo' (ffffffffa00743d0): fill_kobj_path: path = '/module/foo'
      kobject: 'foo' (ffffffffa00743d0): auto cleanup kobject_del
      kobject: 'foo': free name
      
      	[whooops]
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: stable <stable@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  17. 14 Oct, 2008 3 commits
    • ftrace: remove old pointers to mcount · fed1939c
      Steven Rostedt authored
      
      
      When a mcount pointer is recorded into a table, it is used to add or
      remove calls to mcount (replacing them with nops). If the code is removed
      via removing a module, the pointers still exist.  When modifying the code,
      a check is always made to make sure the code being replaced is the code
      expected. In other words, the code being replaced is compared to what
      it is expected to be before being replaced.
      
      There is a very small chance that the code being replaced just happens
      to look like code that calls mcount (very small since the call to mcount
      is relative). To remove this chance, this patch adds ftrace_release to
      allow module unloading to remove the pointers to mcount within the module.
      
      Another change for init calls is made to not trace calls marked with
      __init. The tracing can not be started until after init is done anyway.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: enable mcount recording for modules · 90d595fe
      Steven Rostedt authored
      
      
      This patch enables the loading of the __mcount_section of modules and
      changing all the callers of mcount into nops.
      
      The modification is done before the init_module function is called, so
      again, we do not need to use kstop_machine to make these changes.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: Kernel Tracepoints · 97e1c18e
      Mathieu Desnoyers authored
      Implementation of kernel tracepoints. Inspired by the Linux Kernel
      Markers. Allows complete typing verification by declaring both tracing
      statement inline functions and probe registration/unregistration static
      inline functions within the same macro "DEFINE_TRACE". No format string
      is required. See the tracepoint Documentation and Samples patches for
      usage examples.
      
      Taken from the documentation patch :
      
      "A tracepoint placed in code provides a hook to call a function (probe)
      that you can provide at runtime. A tracepoint can be "on" (a probe is
      connected to it) or "off" (no probe is attached). When a tracepoint is
      "off" it has no effect, except for adding a tiny time penalty (checking
      a condition for a branch) and space penalty (adding a few bytes for the
      function call at the end of the instrumented function and adds a data
      structure in a separate section).  When a tracepoint is "on", the
      function you provide is called each time the tracepoint is executed, in
      the execution context of the caller. When the function provided ends its
      execution, it returns to the caller (continuing from the tracepoint
      site).
      
      You can put tracepoints at important locations in the code. They are
      lightweight hooks that can pass an arbitrary number of parameters, which
      prototypes are described in a tracepoint declaration placed in a header
      file."
      
      Addition and removal of tracepoints is synchronized by RCU using the
      scheduler (and preempt_disable) as guarantees to find a quiescent state
      (this is really RCU "classic"). The update side uses rcu_barrier_sched()
      with call_rcu_sched() and the read/execute side uses
      "preempt_disable()/preempt_enable()".
      
      We make sure the previous array containing probes, which has been
      scheduled for deletion by the rcu callback, is indeed freed before we
      proceed to the next update. It therefore limits the rate of modification
      of a single tracepoint to one update per RCU period. The objective here
      is to permit fast batch add/removal of probes on _different_
      tracepoints.
      
      Changelog :
      - Use #name ":" #proto as string to identify the tracepoint in the
        tracepoint table. This will make sure no type mismatch happens due to
        connection of a probe with the wrong type to a tracepoint declared with
        the same name in a different header.
      - Add tracepoint_entry_free_old.
      - Change __TO_TRACE to get rid of the 'i' iterator.
      
      Masami Hiramatsu <mhiramat@redhat.com> :
      Tested on x86-64.
      
      Performance impact of a tracepoint : same as markers, except that it
      adds about 70 bytes of instructions in an unlikely branch of each
      instrumented function (the for loop, the stack setup and the function
      call). It currently adds a memory read, a test and a conditional branch
      at the instrumentation site (in the hot path). Immediate values will
      eventually change this into a load immediate, test and branch, which
      removes the memory read which will make the i-cache impact smaller
      (changing the memory read for a load immediate removes 3-4 bytes per
      site on x86_32 (depending on mov prefixes), or 7-8 bytes on x86_64, it
      also saves the d-cache hit).
      
      About the performance impact of tracepoints (which is comparable to
      markers), even without immediate values optimizations, tests done by
      Hideo Aoki on ia64 show no regression. His test case was using hackbench
      on a kernel where scheduler instrumentation (about 5 events in the
      scheduler code) was added.
      
      Quoting Hideo Aoki about Markers :
      
      I evaluated overhead of kernel marker using linux-2.6-sched-fixes git
      tree, which includes several markers for LTTng, using an ia64 server.
      
      While the immediate trace mark feature isn't implemented on ia64, there
      is no major performance regression. So, I think that we don't have any
      issues to propose merging marker point patches into Linus's tree from
      the viewpoint of performance impact.
      
      I prepared two kernels to evaluate. The first one was compiled without
      CONFIG_MARKERS. The second one was compiled with CONFIG_MARKERS enabled.
      
      I downloaded the original hackbench from the following URL:
      http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c
      
      
      
      I ran hackbench 5 times in each condition and calculated the average and
      difference between the kernels.
      
          The parameter of hackbench: every 50 from 50 to 800
          The number of CPUs of the server: 2, 4, and 8
      
      Below are the results. As you can see, no major performance regression
      was found in any case. Even if the number of processes increases, the
      differences between the marker-enabled kernel and the marker-disabled
      kernel don't increase. Moreover, if the number of CPUs increases, the
      differences don't increase either.
      
      Curiously, the marker-enabled kernel is better than the marker-disabled
      kernel in more than half of the cases, although I guess it comes from the
      difference of memory access pattern.
      
      * 2 CPUs
      
      Number of | without      | with         | diff     | diff    |
      processes | Marker [Sec] | Marker [Sec] |   [Sec]  |   [%]   |
      --------------------------------------------------------------
             50 |      4.811   |       4.872  |  +0.061  |  +1.27  |
            100 |      9.854   |      10.309  |  +0.454  |  +4.61  |
            150 |     15.602   |      15.040  |  -0.562  |  -3.6   |
            200 |     20.489   |      20.380  |  -0.109  |  -0.53  |
            250 |     25.798   |      25.652  |  -0.146  |  -0.56  |
            300 |     31.260   |      30.797  |  -0.463  |  -1.48  |
            350 |     36.121   |      35.770  |  -0.351  |  -0.97  |
            400 |     42.288   |      42.102  |  -0.186  |  -0.44  |
            450 |     47.778   |      47.253  |  -0.526  |  -1.1   |
            500 |     51.953   |      52.278  |  +0.325  |  +0.63  |
            550 |     58.401   |      57.700  |  -0.701  |  -1.2   |
            600 |     63.334   |      63.222  |  -0.112  |  -0.18  |
            650 |     68.816   |      68.511  |  -0.306  |  -0.44  |
            700 |     74.667   |      74.088  |  -0.579  |  -0.78  |
            750 |     78.612   |      79.582  |  +0.970  |  +1.23  |
            800 |     85.431   |      85.263  |  -0.168  |  -0.2   |
      --------------------------------------------------------------
      
      * 4 CPUs
      
      Number of | without      | with         | diff     | diff    |
      processes | Marker [Sec] | Marker [Sec] |   [Sec]  |   [%]   |
      --------------------------------------------------------------
             50 |      2.586   |       2.584  |  -0.003  |  -0.1   |
            100 |      5.254   |       5.283  |  +0.030  |  +0.56  |
            150 |      8.012   |       8.074  |  +0.061  |  +0.76  |
            200 |     11.172   |      11.000  |  -0.172  |  -1.54  |
            250 |     13.917   |      14.036  |  +0.119  |  +0.86  |
            300 |     16.905   |      16.543  |  -0.362  |  -2.14  |
            350 |     19.901   |      20.036  |  +0.135  |  +0.68  |
            400 |     22.908   |      23.094  |  +0.186  |  +0.81  |
            450 |     26.273   |      26.101  |  -0.172  |  -0.66  |
            500 |     29.554   |      29.092  |  -0.461  |  -1.56  |
            550 |     32.377   |      32.274  |  -0.103  |  -0.32  |
            600 |     35.855   |      35.322  |  -0.533  |  -1.49  |
            650 |     39.192   |      38.388  |  -0.804  |  -2.05  |
            700 |     41.744   |      41.719  |  -0.025  |  -0.06  |
            750 |     45.016   |      44.496  |  -0.520  |  -1.16  |
            800 |     48.212   |      47.603  |  -0.609  |  -1.26  |
      --------------------------------------------------------------
      
      * 8 CPUs
      
      Number of | without      | with         | diff     | diff    |
      processes | Marker [Sec] | Marker [Sec] |   [Sec]  |   [%]   |
      --------------------------------------------------------------
             50 |      2.094   |       2.072  |  -0.022  |  -1.07  |
            100 |      4.162   |       4.273  |  +0.111  |  +2.66  |
            150 |      6.485   |       6.540  |  +0.055  |  +0.84  |
            200 |      8.556   |       8.478  |  -0.078  |  -0.91  |
            250 |     10.458   |      10.258  |  -0.200  |  -1.91  |
            300 |     12.425   |      12.750  |  +0.325  |  +2.62  |
            350 |     14.807   |      14.839  |  +0.032  |  +0.22  |
            400 |     16.801   |      16.959  |  +0.158  |  +0.94  |
            450 |     19.478   |      19.009  |  -0.470  |  -2.41  |
            500 |     21.296   |      21.504  |  +0.208  |  +0.98  |
            550 |     23.842   |      23.979  |  +0.137  |  +0.57  |
            600 |     26.309   |      26.111  |  -0.198  |  -0.75  |
            650 |     28.705   |      28.446  |  -0.259  |  -0.9   |
            700 |     31.233   |      31.394  |  +0.161  |  +0.52  |
            750 |     34.064   |      33.720  |  -0.344  |  -1.01  |
            800 |     36.320   |      36.114  |  -0.206  |  -0.57  |
      --------------------------------------------------------------
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
      Acked-by: 'Peter Zijlstra' <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
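
      The on/off behaviour of a tracepoint described in this commit can be mimicked in a few lines. This Python sketch is a conceptual model only: the `Tracepoint` class, the `sched_switch` name and the probe are hypothetical, and replacing an immutable tuple wholesale is merely a loose analogy for the RCU-protected probe array.

```python
class Tracepoint:
    """Toy tracepoint: 'on' when at least one probe is attached."""

    def __init__(self, name):
        self.name = name
        self.probes = ()   # immutable tuple, replaced wholesale on update

    def register(self, probe):
        self.probes = self.probes + (probe,)

    def unregister(self, probe):
        self.probes = tuple(p for p in self.probes if p is not probe)

    def __call__(self, *args):
        probes = self.probes       # snapshot, loosely like an RCU read side
        if not probes:             # "off": only this check is paid
            return
        for probe in probes:       # "on": probes run in the caller's context
            probe(*args)


sched_switch = Tracepoint("sched_switch")
seen = []


def record(prev, nxt):
    seen.append((prev, nxt))


sched_switch("a", "b")             # off, nothing recorded
sched_switch.register(record)
sched_switch("b", "c")             # on, probe is called
sched_switch.unregister(record)
sched_switch("c", "d")             # off again
print(seen)
```

      Unlike the kernel version, this model does no type checking between the tracepoint and its probes; that is what the macro-based declaration in the patch provides.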
  18. 10 Oct, 2008 1 commit
    • Staging: add TAINT_CRAP for all drivers/staging code · 061b1bd3
      Greg Kroah-Hartman authored
      
      
      We need to add a flag for all code that is in the drivers/staging/
      directory to prevent all other kernel developers from worrying about
      issues here, and to notify users that the drivers might not be as good
      as they are normally used to.
      
      Based on code from Andreas Gruenbacher and Jeff Mahoney to provide a
      TAINT flag for the support level of a kernel module in the Novell
      enterprise kernel release.
      
      This is the kernel portion of this feature, the ability for the flag to
      be set needs to be done in the build process and will happen in a
      follow-up patch.
      
      Cc: Andreas Gruenbacher <agruen@suse.de>
      Cc: Jeff Mahoney <jeffm@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  19. 25 Aug, 2008 1 commit
    • [module] Don't let gcc inline load_module() · ffb4ba76
      Linus Torvalds authored
      
      
      'load_module()' is a complex function that contains all the ELF section
      logic, and inlining it is utterly insane.  But gcc will do it, simply
      because there is only one call-site.  As a result, all the stack space
      that is allocated for all the work to load the module will still be
      active when we actually call the module init sequence, and the deep call
      chain makes stack overflows happen.
      
      And stack overflows are really hard to debug, because they not only
      corrupt random pages below the stack, but also corrupt the thread_info
      structure that is allocated under the stack.
      
      In this case, Alan Brunelle reported some crazy oopses at bootup, after
      loading the processor module that ends up doing complex ACPI stuff and
      has quite a deep callchain.  This should fix it, and is the sane thing
      to do regardless.
      
      Cc: Alan D. Brunelle <Alan.Brunelle@hp.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  20. 12 Aug, 2008 1 commit
    • modules: extend initcall_debug functionality to the module loader · 59f9415f
      Arjan van de Ven authored
      
      
      The kernel has this really nice facility where if you put "initcall_debug"
      on the kernel commandline, it'll print which function it's going to
      execute just before calling an initcall, and then after the call completes
      it will
      
      1) print if it had an error code,
      
      2) check for a few simple bugs (like leaving irqs off), and
      
      3) print how long the init call took in milliseconds.
      
      While trying to optimize the boot speed of my laptop, I have been loving
      number 3 to figure out what to optimize... and then I wished that
      the same thing was done for module loading.
      
      This patch makes the module loader use this exact same functionality; it's
      a logical extension in my view (since modules are just sort of late
      binding initcalls anyway) and so far I've found it quite useful in finding
      where things are too slow in my boot.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      59f9415f
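
      The behaviour listed above (announce the call, report the return code,
      report the elapsed milliseconds) can be sketched as a userspace wrapper.
      This is an assumption-laden illustration rather than the kernel
      implementation: do_one_initcall_sketch(), initcall_debug_sketch and the
      two sample init functions are hypothetical names, the message wording is
      made up, timing uses C11 timespec_get(), and the check for simple bugs
      such as irqs left off has no userspace equivalent and is omitted.

```c
#include <stdio.h>
#include <time.h>

typedef int (*initcall_fn)(void);

/* Hypothetical flag standing in for the "initcall_debug" boot option. */
int initcall_debug_sketch = 1;

/* Two hypothetical init routines: one succeeds, one fails. */
int ok_init_sketch(void)  { return 0; }
int bad_init_sketch(void) { return -5; }

/* Run one init function, optionally reporting its result and duration. */
int do_one_initcall_sketch(const char *name, initcall_fn fn)
{
    struct timespec start, end;
    long long nsecs;
    int ret;

    if (!initcall_debug_sketch)
        return fn();

    printf("calling  %s\n", name);
    timespec_get(&start, TIME_UTC);
    ret = fn();
    timespec_get(&end, TIME_UTC);

    /* Elapsed wall-clock time, converted from nanoseconds to milliseconds. */
    nsecs = (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
          + (end.tv_nsec - start.tv_nsec);
    printf("initcall %s returned %d after %lld msecs\n",
           name, ret, nsecs / 1000000LL);
    return ret;
}
```

      Routing a module's init function through the same wrapper used for
      built-in initcalls is what makes the two cases report identically.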
  21. 27 Jul, 2008 2 commits
  22. 22 Jul, 2008 5 commits
    • Rusty Russell's avatar
      modules: Take a shortcut for checking if an address is in a module · 3a642e99
      Rusty Russell authored
      
      
      This patch keeps track of the boundaries of module allocation, in
      order to speed up module_text_address().
      
      Inspired by Arjan's version, which required arch-specific defines:
      
      	Various pieces of the kernel (lockdep, latencytop, etc) tend
      	to store backtraces, sometimes at a relatively high
      	frequency. In itself this isn't a big performance deal (after
      	all you're using diagnostics features), but there have been
      	some complaints from people who have over 100 modules loaded
      	that this is a tad too slow.
      
      	This is due to the new backtracer code which looks at every
      	slot on the stack to see if it's a kernel/module text address,
      	so that's 1024 slots.  1024 times 100 modules... that's a lot
      	of list walking.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      3a642e99
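
      The shortcut can be sketched as follows: remember the lowest and highest
      addresses ever handed out for modules, and reject anything outside that
      range before walking the module list. This is a simplified userspace
      model under stated assumptions. struct mod_region, mod_region_add() and
      is_module_address_sketch() are hypothetical names, and the bounds here
      only ever grow (they are not shrunk on unload).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one loaded module's address range. */
struct mod_region {
    uintptr_t base;
    size_t size;
    struct mod_region *next;
};

static struct mod_region *mod_list;
static uintptr_t mod_addr_min = UINTPTR_MAX;  /* lowest base seen so far */
static uintptr_t mod_addr_max = 0;            /* highest end seen so far */

/* Register a region and widen the tracked bounds to cover it. */
void mod_region_add(struct mod_region *r)
{
    if (r->base < mod_addr_min)
        mod_addr_min = r->base;
    if (r->base + r->size > mod_addr_max)
        mod_addr_max = r->base + r->size;
    r->next = mod_list;
    mod_list = r;
}

/* Return true if addr falls inside any registered region. */
bool is_module_address_sketch(uintptr_t addr)
{
    const struct mod_region *r;

    /* Fast path: outside the overall bounds, so no list walk is needed. */
    if (addr < mod_addr_min || addr >= mod_addr_max)
        return false;

    /* Slow path: only reached for addresses that might be in a module. */
    for (r = mod_list; r; r = r->next)
        if (addr >= r->base && addr - r->base < r->size)
            return true;
    return false;
}
```

      For a backtracer probing every stack slot, most values are not module
      addresses at all, so they are rejected by the two comparisons alone.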
    • Denys Vlasenko's avatar
      module: turn longs into ints for module sizes · 2f0f2a33
      Denys Vlasenko authored
      
      
      This shrinks module.o and each *.ko file.
      
      And finally, structure members which hold length of module
      code (four such members there) and count of symbols
      are converted from longs to ints.
      
      We cannot possibly have a module where 32 bits won't
      be enough to hold such counts.
      
      For one, module loading checks module size for sanity
      before loading, so such an insanely big module will fail
      that test first.
      Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      2f0f2a33
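
      The size effect of such a conversion can be checked with sizeof. The
      two structs below are illustrative only: they model four length members
      and one symbol count, as the message describes, and the member names are
      assumptions rather than a copy of struct module.

```c
#include <stddef.h>

/* Before: lengths and counts held in longs (member names are illustrative). */
struct sizes_as_longs {
    unsigned long init_size, core_size;
    unsigned long init_text_size, core_text_size;
    unsigned long num_syms;
};

/* After: the same members held in ints. */
struct sizes_as_ints {
    unsigned int init_size, core_size;
    unsigned int init_text_size, core_text_size;
    unsigned int num_syms;
};

/* Bytes saved per structure instance by the conversion. */
size_t bytes_saved_per_struct(void)
{
    return sizeof(struct sizes_as_longs) - sizeof(struct sizes_as_ints);
}
```

      On a target where long is 8 bytes and int is 4 bytes, these five members
      shrink from 40 to 20 bytes. Where long and int have the same width,
      nothing is saved.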
    • Denys Vlasenko's avatar
      Shrink struct module: CONFIG_UNUSED_SYMBOLS ifdefs · f7f5b675
      Denys Vlasenko authored
      
      
      module.c and module.h contain code for finding
      exported symbols which are declared with EXPORT_UNUSED_SYMBOL,
      and this code is compiled in even if CONFIG_UNUSED_SYMBOLS is not set
      and thus there can be no EXPORT_UNUSED_SYMBOLs in modules anyway
      (because EXPORT_UNUSED_SYMBOL(x) is compiled out to nothing then).
      
      This patch adds the required #ifdefs.
      Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      f7f5b675
    • Rusty Russell's avatar
      module: generic each_symbol iterator function · dafd0940
      Rusty Russell authored
      
      
      Introduce an each_symbol() iterator to avoid duplicating the knowledge
      about the 5 different sections containing symbols.  Currently only
      used by find_symbol(), but will be used by symbol_put_addr() too.
      
      (Includes NULL ptr deref fix by Jiri Kosina <jkosina@suse.cz>)
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jiri Kosina <jkosina@suse.cz>
      dafd0940
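
      The iterator idea can be sketched like this: one function knows how to
      walk every symbol section, and callers such as a lookup routine pass a
      callback instead of repeating the loops. This is a simplified model, not
      the kernel's each_symbol(). All names and types below (struct
      sym_sketch, struct symsearch_sketch, each_symbol_sketch(),
      find_symbol_sketch()) are hypothetical, and the gpl_only flag is just an
      example of per-section data a callback might want.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical exported-symbol record. */
struct sym_sketch {
    const char *name;
    unsigned long value;
};

/* One symbol section: a [start, stop) range plus per-section data. */
struct symsearch_sketch {
    const struct sym_sketch *start, *stop;
    bool gpl_only;
};

/* Callback type: return true to stop the walk early. */
typedef bool (*sym_visitor)(const struct symsearch_sketch *sec,
                            const struct sym_sketch *s, void *data);

/* Visit every symbol in every section until the callback returns true. */
bool each_symbol_sketch(const struct symsearch_sketch *secs, size_t nsecs,
                        sym_visitor fn, void *data)
{
    size_t i;
    const struct sym_sketch *s;

    for (i = 0; i < nsecs; i++)
        for (s = secs[i].start; s < secs[i].stop; s++)
            if (fn(&secs[i], s, data))
                return true;
    return false;
}

/* Lookup state shared between find_symbol_sketch() and its callback. */
struct find_arg {
    const char *name;
    const struct sym_sketch *found;
};

static bool match_name(const struct symsearch_sketch *sec,
                       const struct sym_sketch *s, void *data)
{
    struct find_arg *arg = data;

    (void)sec;  /* per-section data is unused in this simple lookup */
    if (strcmp(s->name, arg->name) != 0)
        return false;
    arg->found = s;
    return true;
}

/* A lookup built on the iterator; returns NULL if the name is absent. */
const struct sym_sketch *find_symbol_sketch(const struct symsearch_sketch *secs,
                                            size_t nsecs, const char *name)
{
    struct find_arg arg = { name, NULL };

    each_symbol_sketch(secs, nsecs, match_name, &arg);
    return arg.found;
}
```

      A second user, such as a lookup by address, would only need a different
      callback and could reuse the section walk unchanged.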
    • Rusty Russell's avatar
      module: don't use stop_machine for waiting rmmod · da39ba5e
      Rusty Russell authored
      
      
      rmmod has a little-used "-w" option, meaning that instead of failing if the
      module is in use, it should block until the module becomes unused.
      
      In this case, we don't need to use stop_machine: Max Krasnyansky
      indicated that avoiding it would be useful for SystemTap, which
      loads/unloads new modules frequently.
      
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      da39ba5e
  23. 22 May, 2008 1 commit
    • Denis V. Lunev's avatar
      modules: proper cleanup of kobject without CONFIG_SYSFS · 34e4e2fe
      Denis V. Lunev authored
      
      
      kobject: '<NULL>' (ffffffffa0104050): is not initialized, yet kobject_put() is being called.
      ------------[ cut here ]------------
      WARNING: at /home/den/src/linux-netns26/lib/kobject.c:583 kobject_put+0x53/0x55()
      Modules linked in: ipv6 nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ide_cd_mod cdrom button [last unloaded: pktgen]
      comm: rmmod Tainted: G        W 2.6.26-rc3 #585
      Call Trace:
        [<ffffffff802359ab>] warn_on_slowpath+0x58/0x7a
        [<ffffffff80236aca>] ? printk+0x67/0x69
        [<ffffffff80236aca>] ? printk+0x67/0x69
        [<ffffffff80324289>] kobject_put+0x53/0x55
        [<ffffffff8025e2ee>] free_module+0x87/0xfa
        [<ffffffff8025fee5>] sys_delete_module+0x178/0x1e1
        [<ffffffff804b1e70>] ? lockdep_sys_exit_thunk+0x35/0x67
        [<ffffffff804b1dff>] ? trace_hardirqs_on_thunk+0x35/0x3a
        [<ffffffff8020c0bb>] system_call_after_swapgs+0x7b/0x80
      ---[ end trace 8f5aafa7f6406cf8 ]---
      
      mod->mkobj.kobj is not initialized without CONFIG_SYSFS. Do not call
      kobject_put in this case.
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      34e4e2fe