  1. Jul 26, 2011
  2. Jul 21, 2011
  3. Jul 01, 2011
  4. May 31, 2011
    • oprofile: Fix locking dependency in sync_start() · 130c5ce7
      Robert Richter authored
      
      This fixes the A->B/B->A locking dependency, see the warning below.
      
      The function task_exit_notify() is called with
      (task_exit_notifier).rwsem held and then calls sync_buffer(), which
      locks buffer_mutex. In sync_start() the buffer_mutex was taken to
      prevent notifier functions from being started before sync_start() has
      finished. But when registering the notifier,
      (task_exit_notifier).rwsem is locked too, now in a different order
      than in sync_buffer(). In theory this causes a locking dependency,
      which does not occur in practice since task_exit_notify() is always
      called after the notifier is registered, which means the lock is
      already released.
      
      However, after checking the notifier functions it turned out that the
      buffer_mutex in sync_start() is unnecessary. This is because
      sync_buffer() may be called from the notifiers even if sync_start()
      has not finished yet: the buffers are already allocated but empty.
      There is no need to protect this with the mutex.
      
      So we fix this theoretical locking dependency by removing buffer_mutex
      in sync_start(). This is similar to the implementation before commit:
      
       750d857c oprofile: fix crash when accessing freed task structs
      
      which introduced the locking dependency.
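
The A->B/B->A pattern described above can be illustrated with a small userspace model. This is only a sketch of the idea behind the warning, not lockdep's implementation: the enum values, struct order_graph and record_acquire() are hypothetical names invented for this example. It records "lock X was taken while lock Y was held" edges and refuses an edge that would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes standing in for the two locks in the report. */
enum lock_class { TASK_EXIT_RWSEM, BUFFER_MUTEX, NR_LOCK_CLASSES };

/* order[a][b] is true once "b was acquired while a was held" has been seen. */
struct order_graph {
    bool order[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

static void order_graph_init(struct order_graph *g)
{
    memset(g, 0, sizeof(*g));
}

/* Depth-first search: is `to` reachable from `from` via recorded edges? */
static bool reachable(const struct order_graph *g, int from, int to,
                      bool *visited)
{
    if (from == to)
        return true;
    visited[from] = true;
    for (int next = 0; next < NR_LOCK_CLASSES; next++) {
        if (g->order[from][next] && !visited[next] &&
            reachable(g, next, to, visited))
            return true;
    }
    return false;
}

/*
 * Record that `acquired` is taken while `held` is held.
 * Returns false if the new edge would close a cycle (an existing chain
 * already leads from `acquired` back to `held`), true otherwise.
 */
static bool record_acquire(struct order_graph *g, int held, int acquired)
{
    bool visited[NR_LOCK_CLASSES] = { false };

    assert(held != acquired);
    if (reachable(g, acquired, held, visited))
        return false;
    g->order[held][acquired] = true;
    return true;
}
```

In this model, the old sync_start() contributes the edge buffer_mutex -> rwsem (registering the notifier while holding the mutex), and task_exit_notify() then asks for rwsem -> buffer_mutex, which is rejected. With the mutex removed from sync_start(), only the second edge exists and no cycle can form.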
      
      Lockdep warning:
      
      oprofiled/4447 is trying to acquire lock:
       (buffer_mutex){+.+...}, at: [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]
      
      but task is already holding lock:
       ((task_exit_notifier).rwsem){++++..}, at: [<ffffffff81058026>] __blocking_notifier_call_chain+0x39/0x67
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 ((task_exit_notifier).rwsem){++++..}:
             [<ffffffff8106557f>] lock_acquire+0xf8/0x11e
             [<ffffffff81463a2b>] down_write+0x44/0x67
             [<ffffffff810581c0>] blocking_notifier_chain_register+0x52/0x8b
             [<ffffffff8105a6ac>] profile_event_register+0x2d/0x2f
             [<ffffffffa00013c1>] sync_start+0x47/0xc6 [oprofile]
             [<ffffffffa00001bb>] oprofile_setup+0x60/0xa5 [oprofile]
             [<ffffffffa00014e3>] event_buffer_open+0x59/0x8c [oprofile]
             [<ffffffff810cd3b9>] __dentry_open+0x1eb/0x308
             [<ffffffff810cd59d>] nameidata_to_filp+0x60/0x67
             [<ffffffff810daad6>] do_last+0x5be/0x6b2
             [<ffffffff810dbc33>] path_openat+0xc7/0x360
             [<ffffffff810dbfc5>] do_filp_open+0x3d/0x8c
             [<ffffffff810ccfd2>] do_sys_open+0x110/0x1a9
             [<ffffffff810cd09e>] sys_open+0x20/0x22
             [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b
      
      -> #0 (buffer_mutex){+.+...}:
             [<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
             [<ffffffff8106557f>] lock_acquire+0xf8/0x11e
             [<ffffffff814634f0>] mutex_lock_nested+0x63/0x309
             [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]
             [<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile]
             [<ffffffff81467b96>] notifier_call_chain+0x37/0x63
             [<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67
             [<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16
             [<ffffffff8105a718>] profile_task_exit+0x1a/0x1c
             [<ffffffff81039e8f>] do_exit+0x2a/0x6fc
             [<ffffffff8103a5e4>] do_group_exit+0x83/0xae
             [<ffffffff8103a626>] sys_exit_group+0x17/0x1b
             [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b
      
      other info that might help us debug this:
      
      1 lock held by oprofiled/4447:
       #0:  ((task_exit_notifier).rwsem){++++..}, at: [<ffffffff81058026>] __blocking_notifier_call_chain+0x39/0x67
      
      stack backtrace:
      Pid: 4447, comm: oprofiled Not tainted 2.6.39-00007-gcf4d8d4 #10
      Call Trace:
       [<ffffffff81063193>] print_circular_bug+0xae/0xbc
       [<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
       [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
       [<ffffffff8106557f>] lock_acquire+0xf8/0x11e
       [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
       [<ffffffff81062627>] ? mark_lock+0x42f/0x552
       [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
       [<ffffffff814634f0>] mutex_lock_nested+0x63/0x309
       [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
       [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]
       [<ffffffff81058026>] ? __blocking_notifier_call_chain+0x39/0x67
       [<ffffffff81058026>] ? __blocking_notifier_call_chain+0x39/0x67
       [<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile]
       [<ffffffff81467b96>] notifier_call_chain+0x37/0x63
       [<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67
       [<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16
       [<ffffffff8105a718>] profile_task_exit+0x1a/0x1c
       [<ffffffff81039e8f>] do_exit+0x2a/0x6fc
       [<ffffffff81465031>] ? retint_swapgs+0xe/0x13
       [<ffffffff8103a5e4>] do_group_exit+0x83/0xae
       [<ffffffff8103a626>] sys_exit_group+0x17/0x1b
       [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b
      
      Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com>
      Cc: Carl Love <carll@us.ibm.com>
      Cc: <stable@kernel.org> # .36+
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      130c5ce7
    • oprofile: Free potentially owned tasks in case of errors · 6ac6519b
      Robert Richter authored
      
      After registering the task free notifier we possibly have tasks in our
      dying_tasks list. Free them after unregistering the notifier in case
      of an error.
      
      Cc: <stable@kernel.org> # .36+
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      6ac6519b
  5. May 24, 2011
  6. Feb 15, 2011
    • oprofile, s390: Rework hwsampler implementation · a0d76247
      Robert Richter authored
      
      This patch is a rework of the hwsampler oprofile implementation that
      has been applied recently. Now there are fewer non-architectural
      changes. The only changes are:
      
      * introduction of oprofile_add_ext_hw_sample(), and
      * removal of section attributes of oprofile_timer_init/_exit().
      
      To set up hwsampler for oprofile we need to modify start()/stop()
      callbacks and additional hwsampler control files in oprofilefs. We no
      longer reinitialize the timer or hwsampler mode by calling init/exit()
      on restart; instead hwsampler_running is used to switch the mode
      directly in oprofile_hwsampler_start/_stop(). For locking reasons
      there is also hwsampler_file that reflects the value in oprofilefs.

      The overall diffstat of the oprofile s390 hwsampler implementation
      shows the low impact to non-architectural code:
      
       arch/Kconfig                         |    3 +
       arch/s390/Kconfig                    |    1 +
       arch/s390/oprofile/Makefile          |    2 +-
       arch/s390/oprofile/hwsampler.c       | 1256 ++++++++++++++++++++++++++++++++++
       arch/s390/oprofile/hwsampler.h       |  113 +++
       arch/s390/oprofile/hwsampler_files.c |  162 +++++
       arch/s390/oprofile/init.c            |    6 +-
       drivers/oprofile/cpu_buffer.c        |   24 +-
       drivers/oprofile/timer_int.c         |    4 +-
       include/linux/oprofile.h             |    7 +
       10 files changed, 1567 insertions(+), 11 deletions(-)
      
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      a0d76247
    • oprofile, s390: Enhance OProfile to support System z's hardware sampling feature · 997dbb49
      Heinz Graalfs authored
      
      OProfile is enhanced to export all files for controlling System z's
      hardware sampling, and to invoke hwsampler exported functions to
      initialize and use System z's hardware sampling.
      
      The patch invokes hwsampler_setup() during oprofile init and exports
      the following hwsampler files under oprofilefs if hwsampler's setup
      succeeded:
      
      A new directory for hardware sampling based files
      
       /dev/oprofile/hwsampling/
      
      The userland daemon must explicitly write to the following files
      to disable (or enable) hardware based sampling
      
       /dev/oprofile/hwsampling/hwsampler
      
      to modify the actual sampling rate
      
       /dev/oprofile/hwsampling/hw_interval
      
      to modify the amount of sampling memory (measured in 4K pages)
      
       /dev/oprofile/hwsampling/hw_sdbt_blocks
      
      The following files are read only and show
      the possible minimum sampling rate
      
       /dev/oprofile/hwsampling/hw_min_interval
      
      the possible maximum sampling rate
      
       /dev/oprofile/hwsampling/hw_max_interval
      
      The patch splits the oprofile_timer_[init/exit] function so that it
      can also be called from user context (oprofilefs) to avoid a kernel
      oops.
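
As a rough illustration of how a userland tool might drive the control files listed above, here is a minimal sketch of two helpers that write and read an unsigned value as decimal text. The helper names are hypothetical, and the decimal-text format is an assumption: the commit message does not say which values each file accepts.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write `value` as decimal text to the file `name` below `dir`.
 * Returns 0 on success, -1 on any failure.
 * Assumption: the control files take plain decimal numbers.
 */
static int ctl_write_ulong(const char *dir, const char *name,
                           unsigned long value)
{
    char path[512];
    FILE *f;
    int n;

    assert(dir && name);
    n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    f = fopen(path, "w");
    if (!f)
        return -1;
    if (fprintf(f, "%lu\n", value) < 0) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Read a decimal value back from the file `name` below `dir`. */
static int ctl_read_ulong(const char *dir, const char *name,
                          unsigned long *value)
{
    char path[512];
    FILE *f;
    int n, ok;

    assert(dir && name && value);
    n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;
    ok = (fscanf(f, "%lu", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

On a machine that exposes the feature, `dir` would be /dev/oprofile/hwsampling and `name` one of the files named in the commit message, for example hw_interval; the read-only files would only be used with the read helper.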
      
      Applied with following changes:
      * whitespace changes in Makefile and timer_int.c
      
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Maran Pakkirisamy <maranp@linux.vnet.ibm.com>
      Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      997dbb49
    • oprofile: Introduce new oprofile sample add function (oprofile_add_ext_hw_sample) · 54ebbe7b
      Heinz Graalfs authored
      
      This patch introduces a new oprofile sample add function
      (oprofile_add_ext_hw_sample) that can also take task_struct as an
      argument, which is used by the hwsampler kernel module when copying
      hardware samples to OProfile buffers.
      
      Applied with following changes:
      * removed #include <linux/module.h>
      * whitespace changes
      * removed conditional compilation (CONFIG_HAVE_HWSAMPLER)
      * modified order of functions
      * fix missing function definition in header file
      
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Maran Pakkirisamy <maranp@linux.vnet.ibm.com>
      Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      54ebbe7b
  7. Oct 29, 2010
  8. Oct 25, 2010
    • fs: do not assign default i_ino in new_inode · 85fe4025
      Christoph Hellwig authored
      
      Instead of always assigning an increasing inode number in new_inode
      move the call to assign it into those callers that actually need it.
      For now the set of callers that need it is estimated conservatively, that is
      the call is added to all filesystems that do not assign an i_ino
      by themselves.  For a few more filesystems we can avoid assigning
      any inode number given that they aren't user visible, and for others
      it could be done lazily when an inode number is actually needed,
      but that's left for later patches.
      
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      85fe4025
  9. Oct 15, 2010
    • llseek: automatically add .llseek fop · 6038f373
      Arnd Bergmann authored
      
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
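
The difference between the two variants discussed above can be shown with a small userspace model. This is a sketch, not kernel code: struct fake_file and the model_* functions are hypothetical stand-ins, written on the understanding that no_llseek rejects the seek with -ESPIPE while noop_llseek succeeds and leaves the file position alone.

```c
#include <assert.h>
#include <errno.h>

/* Minimal stand-in for the kernel's struct file; only the offset matters. */
struct fake_file {
    long long f_pos;
};

/* Models no_llseek: every seek is refused with -ESPIPE. */
static long long model_no_llseek(struct fake_file *file, long long offset,
                                 int whence)
{
    (void)file;
    (void)offset;
    (void)whence;
    return -ESPIPE;
}

/*
 * Models noop_llseek: the call succeeds, the position is not changed,
 * and the current position is returned.
 */
static long long model_noop_llseek(struct fake_file *file, long long offset,
                                   int whence)
{
    assert(file);
    (void)offset;
    (void)whence;
    return file->f_pos;
}
```

A driver whose read and write paths never look at the offset keeps working with the no-op variant for callers that seek out of habit, whereas the strict variant surfaces such callers as errors.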
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
      
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      6038f373
    • oprofile: make !CONFIG_PM function stubs static inline · cd254f29
      Robert Richter authored
      
      Make !CONFIG_PM function stubs static inline and remove section
      attribute.
      
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      cd254f29
    • oprofile: fix linker errors · b3b3a9b6
      Anand Gadiyar authored
      
      Commit e9677b3c (oprofile, ARM: Use oprofile_arch_exit() to
      cleanup on failure) caused oprofile_perf_exit to be called
      in the cleanup path of oprofile_perf_init. The __exit tag
      for oprofile_perf_exit should therefore be dropped.
      
      The same has to be done for exit_driverfs as well, as this
      function is called from oprofile_perf_exit. Else, we get
      the following two linker errors.
      
        LD      .tmp_vmlinux1
      `oprofile_perf_exit' referenced in section `.init.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o
      make: *** [.tmp_vmlinux1] Error 1
      
        LD      .tmp_vmlinux1
      `exit_driverfs' referenced in section `.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o
      make: *** [.tmp_vmlinux1] Error 1
      
      Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      b3b3a9b6
    • oprofile: include platform_device.h to fix build break · 277dd984
      Anand Gadiyar authored
      
      oprofile_perf.c needs to include platform_device.h
      Otherwise we get the following build break.
      
        CC      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.o
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:192: warning: 'struct platform_device' declared inside parameter list
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:192: warning: its scope is only this definition or declaration, which is probably not what you want
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:201: warning: 'struct platform_device' declared inside parameter list
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:210: error: variable 'oprofile_driver' has initializer but incomplete type
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:211: error: unknown field 'driver' specified in initializer
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:211: error: extra brace group at end of initializer
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:211: error: (near initialization for 'oprofile_driver')
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:213: warning: excess elements in struct initializer
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:213: warning: (near initialization for 'oprofile_driver')
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:214: error: unknown field 'resume' specified in initializer
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:214: warning: excess elements in struct initializer
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:214: warning: (near initialization for 'oprofile_driver')
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:215: error: unknown field 'suspend' specified in initializer
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:215: warning: excess elements in struct initializer
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:215: warning: (near initialization for 'oprofile_driver')
      arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c: In function 'init_driverfs':
      
      Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
      Cc: Matt Fleming <matt@console-pimps.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      277dd984
  10. Oct 12, 2010
    • oprofile: disable write access to oprofilefs while profiler is running · 7df01d96
      Robert Richter authored
      
      Oprofile counters are set up when profiling is disabled. Thus, writing
      to oprofilefs has no immediate effect. Changes are updated only after
      oprofile is reenabled.
      
      To keep userland and kernel states synchronized, we now allow
      configuration of oprofile only if profiling is disabled. A write to
      oprofilefs checks whether the profiler is running and, if so, is
      rejected with -EBUSY. The change should be backward compatible with
      the current oprofile userland daemon.
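
The rule described above amounts to a guard at the top of every configuration write. A minimal userspace sketch follows; struct profiler and profiler_set_counter() are hypothetical names for illustration, not the functions touched by the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical profiler state; the kernel keeps its own flag. */
struct profiler {
    bool running;
    unsigned long counter_value; /* stands in for one oprofilefs setting */
};

/*
 * Accept a configuration write only while profiling is stopped.
 * Returns 0 on success or -EBUSY if the profiler is running, in which
 * case the stored value is left untouched.
 */
static int profiler_set_counter(struct profiler *p, unsigned long value)
{
    assert(p);
    if (p->running)
        return -EBUSY;
    p->counter_value = value;
    return 0;
}
```

Because a rejected write changes nothing, the value userland last wrote successfully is always the value the kernel will use on the next start.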
      
      Acked-by: Maynard Johnson <maynardj@us.ibm.com>
      Cc: William Cohen <wcohen@redhat.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      7df01d96
  11. Oct 11, 2010
  12. Oct 04, 2010
  13. Aug 31, 2010
    • oprofile: don't call arch exit code from init code on failure · 979048e1
      Will Deacon authored
      
      oprofile_init calls oprofile_arch_init to initialise the architecture-specific
      backend code. If this backend code returns failure, oprofile_arch_exit is
      called immediately, making it difficult to allocate and free resources
      correctly.
      
      This patch removes the oprofile_arch_exit call from oprofile_init,
      meaning that all architectures must ensure that oprofile_arch_init
      cleans up any mess it's made before returning an error. As far as
      I can tell, this only affects the code for ARM.
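
The convention the patch asks of each architecture, that an init function undoes its own partial work before returning an error, is commonly written with goto-based unwinding. The following userspace sketch uses a hypothetical struct backend with two allocations; the fail_second_step parameter exists only to make the error path easy to exercise.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical backend state with two resources acquired in order. */
struct backend {
    void *counters;
    void *buffers;
};

/*
 * Initialise the backend. On failure, everything acquired so far is
 * released before returning, so the caller must not call backend_exit().
 */
static int backend_init(struct backend *b, bool fail_second_step)
{
    assert(b);
    b->counters = malloc(64);
    if (!b->counters)
        return -ENOMEM;

    /* fail_second_step simulates the second allocation failing. */
    b->buffers = fail_second_step ? NULL : malloc(64);
    if (!b->buffers)
        goto err_free_counters;

    return 0;

err_free_counters:
    free(b->counters);
    b->counters = NULL;
    return -ENOMEM;
}

/* Tear down a backend that was initialised successfully. */
static void backend_exit(struct backend *b)
{
    assert(b);
    free(b->buffers);
    free(b->counters);
    b->buffers = NULL;
    b->counters = NULL;
}
```

With this split, the exit function can assume a fully initialised backend, which is the simplification the commit message is after.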
      
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Matt Fleming <matt@console-pimps.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      979048e1
  14. Aug 25, 2010
    • oprofile: fix crash when accessing freed task structs · 750d857c
      Robert Richter authored
      
      This patch fixes a crash during shutdown reported below. The crash is
      caused by accessing already freed task structs. The fix changes the
      order for registering and unregistering notifier callbacks.
      
      All notifiers must be initialized before buffers start working. To
      stop buffer synchronization we cancel all workqueues, unregister the
      notifier callback and then flush all buffers. After all of this we
      finally can free all tasks listed.
      
      This should avoid accessing freed tasks.
      
      On 22.07.10 01:14:40, Benjamin Herrenschmidt wrote:
      
      > So the initial observation is a spinlock bad magic followed by a crash
      > in the spinlock debug code:
      >
      > [ 1541.586531] BUG: spinlock bad magic on CPU#5, events/5/136
      > [ 1541.597564] Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6d03
      >
      > Backtrace looks like:
      >
      >       spin_bug+0x74/0xd4
      >       ._raw_spin_lock+0x48/0x184
      >       ._spin_lock+0x10/0x24
      >       .get_task_mm+0x28/0x8c
      >       .sync_buffer+0x1b4/0x598
      >       .wq_sync_buffer+0xa0/0xdc
      >       .worker_thread+0x1d8/0x2a8
      >       .kthread+0xa8/0xb4
      >       .kernel_thread+0x54/0x70
      >
      > So we are accessing a freed task struct in the work queue when
      > processing the samples.
      
      Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: stable@kernel.org
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      750d857c
  15. Jul 26, 2010
  16. May 03, 2010
  17. Apr 23, 2010
    • oprofile: remove double ring buffering · cb6e943c
      Andi Kleen authored
      
      oprofile used a double buffer scheme for its cpu event buffer
      to avoid races on reading with the old locked ring buffer.
      
      But that is obsolete now with the new ring buffer, so simply
      use a single buffer. This greatly simplifies the code and avoids
      a lot of sample drops on large runs, especially with call graph.
      
      Based on suggestions from Steven Rostedt
      
      For stable kernels from v2.6.32, but not earlier.
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: stable <stable@kernel.org>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      cb6e943c
  18. Mar 31, 2010
    • ring-buffer: Add place holder recording of dropped events · 66a8cb95
      Steven Rostedt authored
      
      Currently, when the ring buffer drops events, it does not record
      the fact that it did so. It does inform the writer that the event
      was dropped by returning a NULL event, but it does not put in any
      place holder where the event was dropped.
      
      This is not a trivial thing to add because the ring buffer mostly
      runs in overwrite (flight recorder) mode. That is, when the ring
      buffer is full, new data will overwrite old data.
      
      In a producer/consumer mode, where new data is simply dropped when
      the ring buffer is full, it is trivial to add the placeholder
      for dropped events. When there's more room to write new data, then
      a special event can be added to notify the reader about the dropped
      events.
      
      But in overwrite mode, any new write can overwrite events. A
      placeholder cannot be inserted into the ring buffer since there may
      never be room. A reader could also come in at any time and miss the
      placeholder.
      
      Luckily, the way the ring buffer works, the read side can find out
      if events were lost or not, and how many events. Every time a write
      takes place, if it overwrites the head page (the next read) it
      updates an "overrun" variable that keeps track of the number of
      lost events. When a reader swaps out a page from the ring buffer,
      it can record this number, perform the swap, and then check to
      see if the number changed, and take the diff if it has, which would be
      the number of events dropped. This can be stored by the reader
      and returned to callers of the reader.
      
      Since the reader page swap will fail if the writer moved the head
      page since the time the reader page set up the swap, this gives room
      to record the overruns without worrying about races. If the reader
      sets up the pages, records the overrun, then performs the swap,
      if the swap succeeds, then the overrun variable has not been
      updated since the setup before the swap.
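
The bookkeeping described above can be sketched in a few lines. This is a single-threaded userspace model with hypothetical names (struct ring_model, writer_overwrite(), reader_swap()); it deliberately leaves out the race handling, where in the description above a swap fails if the writer moved the head page in the meantime.

```c
#include <assert.h>

/* Hypothetical model of the counters involved; not the kernel's layout. */
struct ring_model {
    unsigned long overrun;      /* bumped by the writer per overwritten event */
    unsigned long last_overrun; /* value the reader saw at its previous swap */
};

/* Writer side: `events` unread events were overwritten. */
static void writer_overwrite(struct ring_model *rb, unsigned long events)
{
    assert(rb);
    rb->overrun += events;
}

/*
 * Reader side: record the overrun count, "swap" the page, and report how
 * many events were lost since the previous swap as the difference.
 */
static unsigned long reader_swap(struct ring_model *rb)
{
    unsigned long seen, lost;

    assert(rb);
    seen = rb->overrun;             /* recorded while setting up the swap */
    /* The page swap itself would happen here. */
    lost = seen - rb->last_overrun; /* events dropped since last read */
    rb->last_overrun = seen;
    return lost;
}
```

The reader never needs a placeholder inside the buffer: the difference between two snapshots of the counter carries the same information.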
      
      For binary readers of the ring buffer, a flag is set in the header
      of each sub page (sub buffer) of the ring buffer. This flag is embedded
      in the size field of the data on the sub buffer, in the 31st bit. The
      size field can be 32 or 64 bits depending on the architecture, but
      only 27 bits need to be used for the actual size (less, actually).
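
      A minimal sketch of how such a flag can share a field with the
      size. The names MISSED_EVENTS_FLAG and SIZE_MASK are hypothetical,
      and reading "the 31st bit" as bit index 31 (counting from 0) is an
      assumption of this example.

```python
MISSED_EVENTS_FLAG = 1 << 31   # hypothetical name; assumes bit index 31
SIZE_MASK = (1 << 27) - 1      # low 27 bits hold the actual size


def pack_size_field(size, missed_events):
    """Combine the data size and the dropped-events flag into one field."""
    if size & ~SIZE_MASK:
        raise ValueError("size does not fit in 27 bits")
    return size | (MISSED_EVENTS_FLAG if missed_events else 0)


def unpack_size_field(field):
    """Return (size, missed_events) from a packed size field."""
    return field & SIZE_MASK, bool(field & MISSED_EVENTS_FLAG)
```

      A binary reader that masks the field before using it as a size
      keeps working whether or not the flag is set.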
      
      We could add a new field in the sub buffer header to also record the
      number of events dropped since the last read, but this will change the
      format of the binary ring buffer a bit too much. Perhaps this change can
      be made if the information on the number of events dropped is considered
      important enough.
      
      Note, the notification of dropped events is only used by consuming reads
      or peeking at the ring buffer. Iterating over the ring buffer does not
      keep this information because the necessary data is only available when
      a page swap is made, and the iterator does not swap out pages.
      
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      66a8cb95
  19. Mar 30, 2010
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
      Tejun Heo authored
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include
      those headers directly instead of assuming availability.  As this
      conversion needs to touch a large number of source files, the
      following script is used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have a fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
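
      The first two steps can be sketched roughly as follows. This is
      not the slabh-sweep.py script itself: the regular expressions, the
      function names and the alphabetical-only ordering check are
      simplifications assumed for illustration.

```python
import bisect
import re

# Illustrative, incomplete identifier lists (assumptions of this sketch).
SLAB_RE = re.compile(r"\b(kmalloc|kzalloc|kcalloc|kfree|kmem_cache_\w+)\b")
GFP_RE = re.compile(r"\b(GFP_[A-Z]+|__GFP_[A-Z]+|alloc_pages?|__get_free_pages?)\b")


def needed_include(source):
    """Pick slab.h if slab is used, gfp.h if only gfp is used, else None."""
    if SLAB_RE.search(source):
        return "linux/slab.h"  # slab.h pulls in gfp.h itself
    if GFP_RE.search(source):
        return "linux/gfp.h"
    return None


def insert_include(includes, header):
    """Add header to an include block, alphabetically if the block is sorted."""
    line = f"#include <{header}>"
    out = list(includes)
    if line in out:
        return out
    if out == sorted(out):
        bisect.insort(out, line)  # keep the existing alphabetical order
    else:
        out.append(line)          # no recognizable order: put it at the end
    return out
```

      Only alphabetical order is detected here. The Christmas tree
      orderings mentioned above would need a comparison on line length
      instead.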
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from the tests in
      step 7, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  20. Mar 02, 2010
  21. Oct 29, 2009
    • percpu: make percpu symbols in oprofile unique · b3e9f672
      Tejun Heo authored
      
      This patch updates percpu related symbols in oprofile such that percpu
      symbols are unique and don't clash with local symbols.  This serves
      two purposes of decreasing the possibility of global percpu symbol
      collision and allowing dropping per_cpu__ prefix from percpu symbols.
      
      * drivers/oprofile/cpu_buffer.c: s/cpu_buffer/op_cpu_buffer/
      
      Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
      which cause name clashes" patch.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Robert Richter <robert.richter@amd.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      b3e9f672
  22. Oct 09, 2009
    • oprofile: warn on freeing event buffer too early · c0868934
      Robert Richter authored
      
      A race shouldn't happen since all workqueues or handlers are canceled
      or flushed before the event buffer is freed. A warning is triggered
      now if the buffer is freed too early.
      
      Also, this patch adds some comments about event buffer protection,
      reworks some code and adds code to clear buffer_pos during alloc and
      free of the event buffer.
      
      Cc: David Rientjes <rientjes@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      c0868934
    • oprofile: fix race condition in event_buffer free · 066b3aa8
      David Rientjes authored
      
      Looking at the 2.6.31-rc9 code, it appears there is a race condition
      in the event_buffer cleanup code path (shutdown). This could lead to
      a kernel panic as some CPUs may be operating on the event buffer AFTER
      it has been freed. The attached patch solves the problem and makes
      sure CPUs check if the buffer is not NULL before they access it as
      some may have been spinning on the mutex while the buffer was being
      freed.
      
      The race may happen if the buffer is freed during pending reads. But
      it is not clear why there are races in add_event_entry() since all
      workqueues or handlers are canceled or flushed before the event buffer
      is freed.
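
      The check described above can be sketched as follows. This is a
      simplified model rather than the oprofile code: the EventBuffer
      class and its method names are hypothetical, and the lock stands in
      for buffer_mutex.

```python
import threading


class EventBuffer:
    """Toy event buffer whose storage may be freed while writers are pending."""

    def __init__(self):
        self.mutex = threading.Lock()  # plays the role of buffer_mutex
        self.buf = None                # None models a freed (NULL) buffer
        self.pos = 0

    def alloc(self, size):
        with self.mutex:
            self.buf = [0] * size
            self.pos = 0

    def free(self):
        # Take the mutex so that no writer is inside add() during the free.
        with self.mutex:
            self.buf = None
            self.pos = 0

    def add(self, value):
        """Store value; return False if the buffer is gone or full."""
        with self.mutex:
            # A writer may have waited on the mutex while free() ran,
            # so the buffer has to be re-checked after the lock is taken.
            if self.buf is None or self.pos == len(self.buf):
                return False
            self.buf[self.pos] = value
            self.pos += 1
            return True
```

      A writer that arrives after the free sees the missing buffer and
      backs off instead of touching freed memory.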
      
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      066b3aa8
  23. Sep 23, 2009
  24. Sep 22, 2009
  25. Jul 20, 2009