1. 31 Mar, 2010 1 commit
    • Steven Rostedt's avatar
      tracing: Show the lost events in the trace_pipe output · bc21b478
      Steven Rostedt authored
      
      
      Now that the ring buffer can keep track of where events are lost.
      Use this information to the output of trace_pipe:
      
             hackbench-3588  [001]  1326.701660: lock_acquire: ffffffff816591e0 read rcu_read_lock
             hackbench-3588  [001]  1326.701661: lock_acquire: ffff88003f4091f0 &(&dentry->d_lock)->rlock
             hackbench-3588  [001]  1326.701664: lock_release: ffff88003f4091f0 &(&dentry->d_lock)->rlock
      CPU:1 [LOST 673 EVENTS]
             hackbench-3588  [001]  1326.702711: kmem_cache_free: call_site=ffffffff81102b85 ptr=ffff880026d96738
             hackbench-3588  [001]  1326.702712: lock_release: ffff88003e1480a8 &mm->mmap_sem
             hackbench-3588  [001]  1326.702713: lock_acquire: ffff88003e1480a8 &mm->mmap_sem
      
      Even works with the function graph tracer:
      
       2) ! 170.098 us  |                                            }
       2)   4.036 us    |                                            rcu_irq_exit();
       2)   3.657 us    |                                            idle_cpu();
       2) ! 190.301 us  |                                          }
      CPU:2 [LOST 2196 EVENTS]
       2)   0.853 us    |                            } /* cancel_dirty_page */
       2)               |                            remove_from_page_cache() {
       2)   1.578 us    |                              _raw_spin_lock_irq();
       2)               |                              __remove_from_page_cache() {
      
      Note, it does not work with the iterator "trace" file, since it requires
      the use of consuming the page from the ring buffer to determine how many
      events were lost, which the iterator does not do.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      bc21b478
  2. 10 Mar, 2010 2 commits
    • Frederic Weisbecker's avatar
      perf: Drop the obsolete profile naming for trace events · 97d5a220
      Frederic Weisbecker authored
      
      
      Drop the obsolete "profile" naming used by perf for trace events.
      Perf can now do more than simple events counting, so generalize
      the API naming.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      97d5a220
    • Frederic Weisbecker's avatar
      perf: Take a hot regs snapshot for trace events · c530665c
      Frederic Weisbecker authored
      
      
      We are taking a wrong regs snapshot when a trace event triggers.
      Either we use get_irq_regs(), which gives us the interrupted
      registers if we are in an interrupt, or we use task_pt_regs()
      which gives us the state before we entered the kernel, assuming
      we are lucky enough to be no kernel thread, in which case
      task_pt_regs() returns the initial set of regs when the kernel
      thread was started.
      
      What we want is different. We need a hot snapshot of the regs,
      so that we can get the instruction pointer to record in the
      sample, the frame pointer for the callchain, and some other
      things.
      
      Let's use the new perf_fetch_caller_regs() for that.
      
      Comparison with perf record -e lock: -R -a -f -g
      Before:
      
              perf  [kernel]                   [k] __do_softirq
                     |
                     --- __do_softirq
                        |
                        |--55.16%-- __open
                        |
                         --44.84%-- __write_nocancel
      
      After:
      
                  perf  [kernel]           [k] perf_tp_event
                     |
                     --- perf_tp_event
                        |
                        |--41.07%-- lock_acquire
                        |          |
                        |          |--39.36%-- _raw_spin_lock
                        |          |          |
                        |          |          |--7.81%-- hrtimer_interrupt
                        |          |          |          smp_apic_timer_interrupt
                        |          |          |          apic_timer_interrupt
      
      The old case was producing unreliable callchains. Now having
      right frame and instruction pointers, we have the trace we
      want.
      
      Also syscalls and kprobe events already have the right regs,
      let's use them instead of wasting a retrieval.
      
      v2: Follow the rename perf_save_regs() -> perf_fetch_caller_regs()
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Archs <linux-arch@vger.kernel.org>
      c530665c
  3. 28 Jan, 2010 1 commit
    • Xiao Guangrong's avatar
      perf: Factorize trace events raw sample buffer operations · 430ad5a6
      Xiao Guangrong authored
      
      
      Introduce ftrace_perf_buf_prepare() and ftrace_perf_buf_submit() to
      gather the common code that operates on raw events sampling buffer.
      This cleans up redundant code between regular trace events, syscall
      events and kprobe events.
      
      Changelog v1->v2:
      - Rename function name as per Masami and Frederic's suggestion
      - Add __kprobes for ftrace_perf_buf_prepare() and make
        ftrace_perf_buf_submit() inline as per Masami's suggestion
      - Export ftrace_perf_buf_prepare since modules will use it
      Signed-off-by: default avatarXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <4B60E92D.9000808@cn.fujitsu.com>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      430ad5a6
  4. 06 Jan, 2010 2 commits
    • Lai Jiangshan's avatar
      tracing: Remove show_format and related macros from TRACE_EVENT · 0fa0edaf
      Lai Jiangshan authored
      
      
      The previous patches added the use of print_fmt string and changes
      the trace_define_field() function to also create the fields and
      format output for the event format files.
      
         text	   data	    bss	    dec	    hex	filename
      5857201	1355780	9336808	16549789	 fc879d	vmlinux
      5884589	1351684	9337896	16574169	 fce6d9	vmlinux-orig
      
      The above shows the size of the vmlinux after this patch set
      compared to the vmlinux-orig which is before the patch set.
      
      This saves us 27k on text, 1k on bss and adds just 4k of data.
      
      The total savings of 24k in size.
      Signed-off-by: default avatarLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4B273D4D.40604@cn.fujitsu.com>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      0fa0edaf
    • Lai Jiangshan's avatar
      tracing: Add print_fmt field · 509e760c
      Lai Jiangshan authored
      
      
      This is part of a patch set that removes the show_format method
      in the ftrace event macros.
      
      The print_fmt field is added to hold the string that shows
      the print_fmt in the event format files. This patch only adds
      the field but it is currently not used. Later patches will use
      this field to enable us to remove the show_format field
      and function.
      Signed-off-by: default avatarLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4B273D3E.2000704@cn.fujitsu.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      509e760c
  5. 28 Dec, 2009 1 commit
    • Li Zefan's avatar
      perf events: Remove CONFIG_EVENT_PROFILE · 07b139c8
      Li Zefan authored
      
      
      Quoted from Ingo:
      
      | This reminds me - i think we should eliminate CONFIG_EVENT_PROFILE -
      | it's an unnecessary Kconfig complication. If both PERF_EVENTS and
      | EVENT_TRACING is enabled we should expose generic tracepoints.
      |
      | Nor is it limited to event 'profiling', so it has become a misnomer as
      | well.
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <4B2F1557.2050705@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      07b139c8
  6. 13 Dec, 2009 3 commits
  7. 09 Dec, 2009 1 commit
    • Steven Rostedt's avatar
      tracing: Buffer the output of seq_file in case of filled buffer · a63ce5b3
      Steven Rostedt authored
      
      
      If the seq_read fills the buffer it will call s_start again on the next
      itertation with the same position. This causes a problem with the
      function_graph tracer because it consumes the iteration in order to
      determine leaf functions.
      
      What happens is that the iterator stores the entry, and the function
      graph plugin will look at the next entry. If that next entry is a return
      of the same function and task, then the function is a leaf and the
      function_graph plugin calls ring_buffer_read which moves the ring buffer
      iterator forward (the trace iterator still points to the function start
      entry).
      
      The copying of the trace_seq to the seq_file buffer will fail if the
      seq_file buffer is full. The seq_read will not show this entry.
      The next read by userspace will cause seq_read to again call s_start
      which will reuse the trace iterator entry (the function start entry).
      But the function return entry was already consumed. The function graph
      plugin will think that this entry is a nested function and not a leaf.
      
      To solve this, the trace code now checks the return status of the
      seq_printf (trace_print_seq). If the writing to the seq_file buffer
      fails, we set a flag in the iterator (leftover) and we do not reset
      the trace_seq buffer. On the next call to s_start, we check the leftover
      flag, and if it is set, we just reuse the trace_seq buffer and do not
      call into the plugin print functions.
      
      Before this patch:
      
       2)               |      fput() {
       2)               |        __fput() {
       2)   0.550 us    |          inotify_inode_queue_event();
       2)               |          __fsnotify_parent() {
       2)   0.540 us    |          inotify_dentry_parent_queue_event();
      
      After the patch:
      
       2)               |      fput() {
       2)               |        __fput() {
       2)   0.550 us    |          inotify_inode_queue_event();
       2)   0.548 us    |          __fsnotify_parent();
       2)   0.540 us    |          inotify_dentry_parent_queue_event();
      
      [
        Updated the patch to fix a missing return 0 from the trace_print_seq()
        stub when CONFIG_TRACING is disabled.
      Reported-by: default avatarIngo Molnar <mingo@elte.hu>
      ]
      Reported-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      a63ce5b3
  8. 22 Nov, 2009 1 commit
    • Frederic Weisbecker's avatar
      tracing: Use the perf recursion protection from trace event · ce71b9df
      Frederic Weisbecker authored
      
      
      When we commit a trace to perf, we first check if we are
      recursing in the same buffer so that we don't mess-up the buffer
      with a recursing trace. But later on, we do the same check from
      perf to avoid commit recursion. The recursion check is desired
      early before we touch the buffer but we want to do this check
      only once.
      
      Then export the recursion protection from perf and use it from
      the trace events before submitting a trace.
      
      v2: Put appropriate Reported-by tag
      Reported-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <1258864015-10579-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ce71b9df
  9. 08 Nov, 2009 1 commit
    • Frederic Weisbecker's avatar
      tracing, perf_events: Protect the buffer from recursion in perf · 444a2a3b
      Frederic Weisbecker authored
      
      
      While tracing using events with perf, if one enables the
      lockdep:lock_acquire event, it will infect every other perf
      trace events.
      
      Basically, you can enable whatever set of trace events through
      perf but if this event is part of the set, the only result we
      can get is a long list of lock_acquire events of rcu read lock,
      and only that.
      
      This is because of a recursion inside perf.
      
      1) When a trace event is triggered, it will fill a per cpu
         buffer and submit it to perf.
      
      2) Perf will commit this event but will also protect some data
         using rcu_read_lock
      
      3) A recursion appears: rcu_read_lock triggers a lock_acquire
         event that will fill the per cpu event and then submit the
         buffer to perf.
      
      4) Perf detects a recursion and ignores it
      
      5) Perf continues its work on the previous event, but its buffer
         has been overwritten by the lock_acquire event, it has then
         been turned into a lock_acquire event of rcu read lock
      
      Such scenario also happens with lock_release with
      rcu_read_unlock().
      
      We could turn the rcu_read_lock() into __rcu_read_lock() to drop
      the lock debugging from perf fast path, but that would make us
      lose the rcu debugging and that doesn't prevent from other
      possible kind of recursion from perf in the future.
      
      This patch adds a recursion protection based on a counter on the
      perf trace per cpu buffers to solve the problem.
      
      -v2: Fixed lost whitespace, added reviewed-by tag
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: default avatarMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <1257477185-7838-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      444a2a3b
  10. 15 Oct, 2009 1 commit
  11. 17 Sep, 2009 2 commits
    • Frederic Weisbecker's avatar
      tracing: Allocate the ftrace event profile buffer dynamically · 20ab4425
      Frederic Weisbecker authored
      
      
      Currently the trace event profile buffer is allocated in the stack. But
      this may be too much for the stack, as the events can have large
      statically defined field size and can also grow with dynamic arrays.
      
      Allocate two per cpu buffer for all profiled events. The first cpu
      buffer is used to host every non-nmi context traces. It is protected
      by disabling the interrupts while writing and committing the trace.
      
      The second buffer is reserved for nmi. So that there is no race between
      them and the first buffer.
      
      The whole write/commit section is rcu protected because we release
      these buffers while deactivating the last profiling trace event.
      
      v2: Move the buffers from trace_event to be global, as pointed by
          Steven Rostedt.
      
      v3: Fix the syscall events to handle the profiling buffer races
          by disabling interrupts, now that the buffers are globals.
      Suggested-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      20ab4425
    • Frederic Weisbecker's avatar
      tracing: Factorize the events profile accounting · e5e25cf4
      Frederic Weisbecker authored
      
      
      Factorize the events enabling accounting in a common tracing core
      helper. This reduces the size of the profile_enable() and
      profile_disable() callbacks for each trace events.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Acked-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      e5e25cf4
  12. 12 Sep, 2009 1 commit
  13. 11 Sep, 2009 2 commits
    • Steven Rostedt's avatar
      tracing: add lock depth to entries · 637e7e86
      Steven Rostedt authored
      
      
      This patch adds the lock depth of the big kernel lock to the generic
      entry header. This way we can see the depth of the lock and help
      in removing the BKL.
      
      Example:
      
       #                  _------=> CPU#
       #                 / _-----=> irqs-off
       #                | / _----=> need-resched
       #                || / _---=> hardirq/softirq
       #                ||| / _--=> preempt-depth
       #                |||| /_--=> lock-depth
       #                |||||/     delay
       #  cmd     pid   |||||| time  |   caller
       #     \   /      ||||||   \   |   /
         <idle>-0       2.N..3 5902255250us+: lock_acquire: read rcu_read_lock
         <idle>-0       2.N..3 5902255253us+: lock_release: rcu_read_lock
         <idle>-0       2dN..3 5902255257us+: lock_acquire: xtime_lock
         <idle>-0       2dN..4 5902255259us : lock_acquire: clocksource_lock
         <idle>-0       2dN..4 5902255261us+: lock_release: clocksource_lock
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      637e7e86
    • Steven Rostedt's avatar
      tracing: move tgid out of generic entry and into userstack · 48659d31
      Steven Rostedt authored
      
      
      The userstack trace required the recording of the tgid entry.
      Unfortunately, it was added to the generic entry where it wasted
      4 bytes of every entry and was only used by one entry.
      
      This patch moves it out of the generic field and moves it into the
      only user (userstack_entry).
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      48659d31
  14. 04 Sep, 2009 1 commit
    • Steven Rostedt's avatar
      tracing: pass around ring buffer instead of tracer · e77405ad
      Steven Rostedt authored
      
      
      The latency tracers (irqsoff and wakeup) can swap trace buffers
      on the fly. If an event is happening and has reserved data on one of
      the buffers, and the latency tracer swaps the global buffer with the
      max buffer, the result is that the event may commit the data to the
      wrong buffer.
      
      This patch changes the API to the trace recording to be recieve the
      buffer that was used to reserve a commit. Then this buffer can be passed
      in to the commit.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      e77405ad
  15. 31 Aug, 2009 1 commit
    • Li Zefan's avatar
      tracing/filters: Defer pred allocation · 8e254c1d
      Li Zefan authored
      
      
      init_preds() allocates about 5392 bytes of memory (on x86_32) for
      a TRACE_EVENT. With my config, at system boot total memory occupied
      is:
      
      	5392 * (642 + 15) == 3459KB
      
      642 == cat available_events | wc -l
      15 == number of dirs in events/ftrace
      
      That's quite a lot, so we'd better defer memory allocation util
      it's needed, that's when filter is used.
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      LKML-Reference: <4A9B8EA5.6020700@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      8e254c1d
  16. 26 Aug, 2009 2 commits
    • Frederic Weisbecker's avatar
      tracing: Restore the const qualifier for field names and types definition · aeaeae11
      Frederic Weisbecker authored
      
      
      Restore the const qualifier in field's name and type parameters of
      trace_define_field that was lost while solving a conflict.
      
      Fields names and types are defined as builtin constant strings in
      static TRACE_EVENTs. But kprobes allocates these dynamically.
      
      That said, we still want to always pass these strings as const char *
      in trace_define_fields() to avoid any further accidental writes on
      the pointed strings.
      Reported-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      aeaeae11
    • Masami Hiramatsu's avatar
      tracing: Ftrace dynamic ftrace_event_call support · bd1a5c84
      Masami Hiramatsu authored
      Add dynamic ftrace_event_call support to ftrace. Trace engines can add
      new ftrace_event_call to ftrace on the fly. Each operator function of
      the call takes an ftrace_event_call data structure as an argument,
      because these functions may be shared among several ftrace_event_calls.
      
      Changes from v13:
       - Define remove_subsystem_dir() always (revirt a2ca5e03
      
      ), because
         trace_remove_event_call() uses it.
       - Modify syscall tracer because of ftrace_event_call change.
      
      [fweisbec@gmail.com: Fixed conflict against latest tracing/core]
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Przemysław Pawełczyk <przemyslaw@pawelczyk.it>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      LKML-Reference: <20090813203453.31965.71901.stgit@localhost.localdomain>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      bd1a5c84
  17. 25 Aug, 2009 2 commits
    • Li Zefan's avatar
      tracing/filters: Support filtering for char * strings · 87a342f5
      Li Zefan authored
      
      
      Usually, char * entries are dangerous in traces because the string
      can be released whereas a pointer to it can still wait to be read from
      the ring buffer.
      
      But sometimes we can assume it's safe, like in case of RO data
      (eg: __file__ or __line__, used in bkl trace event). If these RO data
      are in a module and so is the call to the trace event, then it's safe,
      because the ring buffer will be flushed once this module get unloaded.
      
      To allow char * to be treated as a string:
      
      	TRACE_EVENT(...,
      
      		TP_STRUCT__entry(
      			__field_ext(const char *, name, FILTER_PTR_STRING)
      			...
      		)
      
      		...
      	);
      
      The filtering will not dereference "char *" unless the developer
      explicitly sets FILTER_PTR_STR in __field_ext.
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A7B9287.90205@cn.fujitsu.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      87a342f5
    • Li Zefan's avatar
      tracing/filters: Add __field_ext() to TRACE_EVENT · 43b51ead
      Li Zefan authored
      
      
      Add __field_ext(), so a field can be assigned to a specific
      filter_type, which matches a corresponding filter function.
      
      For example, a later patch will allow this:
      	__field_ext(const char *, str, FILTER_PTR_STR);
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A7B9272.60507095
      
      @cn.fujitsu.com>
      
      [
        Fixed a -1 to FILTER_OTHER
        Forward ported to latest kernel.
      ]
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      43b51ead
  18. 19 Aug, 2009 3 commits
    • Li Zefan's avatar
      tracing/syscalls: Add filtering support · 540b7b8d
      Li Zefan authored
      
      
      Add filtering support for syscall events:
      
       # echo 'mode == 0666' > events/syscalls/sys_enter_open
       # echo 'ret == 0' > events/syscalls/sys_exit_open
       # echo 1 > events/syscalls/sys_enter_open
       # echo 1 > events/syscalls/sys_exit_open
       # cat trace
       ...
         modprobe-3084 [001] 117.463140: sys_open(filename: 917d3e8, flags: 0, mode: 1b6)
         modprobe-3084 [001] 117.463176: sys_open -> 0x0
             less-3086 [001] 117.510455: sys_open(filename: 9c6bdb8, flags: 8000, mode: 1b6)
         sendmail-2574 [001] 122.145840: sys_open(filename: b807a365, flags: 0, mode: 1b6)
       ...
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAFCB.1040006@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      540b7b8d
    • Li Zefan's avatar
      tracing/events: Add trace_define_common_fields() · e647d6b3
      Li Zefan authored
      
      
      Extract duplicate code. Also prepare for the later patch.
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAFB8.1010304@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      e647d6b3
    • Li Zefan's avatar
      tracing/events: Add ftrace_event_call param to define_fields() · 14be96c9
      Li Zefan authored
      
      
      This parameter is needed by syscall events to add define_fields()
      handler.
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAF90.6060801@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      14be96c9
  19. 11 Aug, 2009 2 commits
    • Frederic Weisbecker's avatar
      tracing: Add ftrace event call parameter to its field descriptor handler · e8f9f4d7
      Frederic Weisbecker authored
      
      
      Add the struct ftrace_event_call as a parameter of its show_format()
      callback. This way we can use it from the syscall trace events to
      retrieve the syscall name from the ftrace event call parameter and
      describe its fields using the syscalls metadata.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      e8f9f4d7
    • Jason Baron's avatar
      tracing: Add ftrace_event_call void * 'data' field · 69fd4f0e
      Jason Baron authored
      
      
      add an optional void * pointer to 'ftrace_event_call' that is
      passed in for regfunc and unregfunc.
      
      This prepares for syscall tracepoints creation by passing the name of
      the syscall we want to trace and then retrieve its number through our
      arch syscall table.
      Signed-off-by: default avatarJason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      69fd4f0e
  20. 09 Aug, 2009 1 commit
    • Frederic Weisbecker's avatar
      perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Frederic Weisbecker authored
      
      
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests ftrace events record sampling. In this case
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the
      perfcounter event buffer, as a sample.
      
      Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
      record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
       .  0040:  e0 b1 31 81 ff ff ff ff                          .......
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback brings
         some costs however if someone wants no such sampling to
         occur, and needs to be fixed in the future. For that we need
         to have an instant access to the perf counter attribute.
         This is a matter of a flag to add in the struct ftrace_event.
      
       - Take care of the events recursivity! Don't ever try to record
         a lock event for example, it seems some locking is used in
         the profiling fast path and lead to a tracing recursivity.
         That will be fixed using raw spinlock or recursivity
         protection.
      
       - [...]
      
       - Profit! :-)
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      f413cdb8
  21. 05 Aug, 2009 1 commit
  22. 20 Jul, 2009 1 commit
    • Li Zefan's avatar
      tracing/filters: improve subsystem filter · 1f9963cb
      Li Zefan authored
      
      
      Currently a subsystem filter should be applicable to all events
      under the subsystem, and if it failed, all the event filters
      will be cleared. Those behaviors make subsys filter much less
      useful:
      
        # echo 'vec == 1' > irq/softirq_entry/filter
        # echo 'irq == 5' > irq/filter
        bash: echo: write error: Invalid argument
        # cat irq/softirq_entry/filter
        none
      
      I'd expect it set the filter for irq_handler_entry/exit, and
      not touch softirq_entry/exit.
      
      The basic idea is, try to see if the filter can be applied
      to which events, and then just apply to the those events:
      
        # echo 'vec == 1' > softirq_entry/filter
        # echo 'irq == 5' > filter
        # cat irq_handler_entry/filter
        irq == 5
        # cat softirq_entry/filter
        vec == 1
      
      Changelog for v2:
      - do some cleanups to address Frederic's comments.
      Inspired-by: default avatarSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A63D485.7030703@cn.fujitsu.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      1f9963cb
  23. 01 Jun, 2009 1 commit
  24. 26 May, 2009 2 commits
    • Steven Rostedt's avatar
      tracing: add __print_symbolic to trace events · 0f4fc29d
      Steven Rostedt authored
      
      
      This patch adds __print_symbolic which is similar to __print_flags but
      works for an enumeration type instead. That is, there is only a one to one
      mapping between the values and the symbols. When a match is made, then
      it is printed, otherwise the hex value is outputed.
      
      [ Impact: add interface for showing symbol names in events ]
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      0f4fc29d
    • Steven Rostedt's avatar
      tracing: add __print_flags for events · be74b73a
      Steven Rostedt authored
      
      
      Developers have been asking for the ability in the ftrace event tracer
      to display names of bits in a flags variable.
      
      Instead of printing out c2, it would be easier to read FOO|BAR|GOO,
      assuming that FOO is bit 1, BAR is bit 6 and GOO is bit 7.
      
      Some examples where this would be useful are the state flags in a context
      switch, kmalloc flags, and even permision flags in accessing files.
      
      [
        v2 changes include:
      
        Frederic Weisbecker's idea of using a mask instead of bits,
        thus we can output GFP_KERNEL instead of GPF_WAIT|GFP_IO|GFP_FS.
      
        Li Zefan's idea of allowing the caller of __print_flags to add their
        own delimiter (or no delimiter) where we can get for file permissions
        rwx instead of r|w|x.
      ]
      
      [
        v3 changes:
      
         Christoph Hellwig's idea of using an array instead of va_args.
      ]
      
      [ Impact: better displaying of flags in trace output ]
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      be74b73a
  25. 08 May, 2009 1 commit
  26. 06 May, 2009 1 commit
  27. 29 Apr, 2009 2 commits
    • Tom Zanussi's avatar
      tracing/filters: a better event parser · 8b372562
      Tom Zanussi authored
      
      
      Replace the current event parser hack with a better one.  Filters are
      no longer specified predicate by predicate, but all at once and can
      use parens and any of the following operators:
      
      numeric fields:
      
      ==, !=, <, <=, >, >=
      
      string fields:
      
      ==, !=
      
      predicates can be combined with the logical operators:
      
      &&, ||
      
      examples:
      
      "common_preempt_count > 4" > filter
      
      "((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter
      
      If there was an error, the erroneous string along with an error
      message can be seen by looking at the filter e.g.:
      
      ((sig >= 10 && sig < 15) || dsig == 17) && comm != bash
      ^
      parse_error: Field not found
      
      Currently the caret for an error always appears at the beginning of
      the filter; a real position should be used, but the error message
      should be useful even without it.
      
      To clear a filter, '0' can be written to the filter file.
      
      Filters can also be set or cleared for a complete subsystem by writing
      the same filter as would be written to an individual event to the
      filter file at the root of the subsytem.  Note however, that if any
      event in the subsystem lacks a field specified in the filter being
      set, the set will fail and all filters in the subsytem are
      automatically cleared.  This change from the previous version was made
      because using only the fields that happen to exist for a given event
      would most likely result in a meaningless filter.
      
      Because the logical operators are now implemented as predicates, the
      maximum number of predicates in a filter was increased from 8 to 16.
      
      [ Impact: add new, extended trace-filter implementation ]
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905899.6416.121.camel@tropicana>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      8b372562
    • Tom Zanussi's avatar
      tracing/filters: distinguish between signed and unsigned fields · a118e4d1
      Tom Zanussi authored
      
      
      The new filter comparison ops need to be able to distinguish between
      signed and unsigned field types, so add an is_signed flag/param to the
      event field struct/trace_define_fields().  Also define a simple macro,
      is_signed_type() to determine the signedness at compile time, used in the
      trace macros.  If the is_signed_type() macro won't work with a specific
      type, a new slightly modified version of TRACE_FIELD() called
      TRACE_FIELD_SIGN(), allows the signedness to be set explicitly.
      
      [ Impact: extend trace-filter code for new feature ]
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905893.6416.120.camel@tropicana>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a118e4d1