1. 10 Aug, 2009 1 commit
  2. 09 Aug, 2009 2 commits
    • Frederic Weisbecker's avatar
      perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Frederic Weisbecker authored
      
      
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests ftrace events record sampling. In this case
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the
      perfcounter event buffer, as a sample.
      
      Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
      record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
       .  0040:  e0 b1 31 81 ff ff ff ff                          .......
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback brings
         some costs however if someone wants no such sampling to
         occur, and needs to be fixed in the future. For that we need
         to have an instant access to the perf counter attribute.
         This is a matter of a flag to add in the struct ftrace_event.
      
       - Take care of the events recursivity! Don't ever try to record
         a lock event for example, it seems some locking is used in
         the profiling fast path and lead to a tracing recursivity.
         That will be fixed using raw spinlock or recursivity
         protection.
      
       - [...]
      
       - Profit! :-)
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      f413cdb8
    • Peter Zijlstra's avatar
      perf_counter, ftrace: Fix perf_counter integration · 3a659305
      Peter Zijlstra authored
      
      
      Adds possible second part to the assign argument of TP_EVENT().
      
        TP_perf_assign(
      	__perf_count(foo);
      	__perf_addr(bar);
        )
      
      Which, when specified make the swcounter increment with @foo instead
      of the usual 1, and report @bar for PERF_SAMPLE_ADDR (data address
      associated with the event) when this triggers a counter overflow.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      3a659305
  3. 10 Jun, 2009 1 commit
    • Steven Rostedt's avatar
      tracing: do not translate event helper macros in print format · 6ff9a64d
      Steven Rostedt authored
      
      
      By moving the macro that creates the print format code above the
      defining of the event macro helpers (__get_str, __print_symbolic,
      and __get_dynamic_array), we get a little cleaner print format.
      
      Instead of:
      
        (char *)((void *)REC + REC->__data_loc_name)
      
      we get:
      
         __get_str(name)
      
      Instead of:
      
         ({ static const struct trace_print_flags symbols[] = { { HI_SOFTIRQ, "HI" }, {
      
      we get:
      
         __print_symbolic(REC->vec, { HI_SOFTIRQ, "HI" }, {
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      6ff9a64d
  4. 03 Jun, 2009 1 commit
    • Steven Whitehouse's avatar
      tracing: fix multiple use of __print_flags and __print_symbolic · 56d8bd3f
      Steven Whitehouse authored
      
      
      Here is an updated patch to include the extra call to
      trace_seq_init() as requested. This is vs. the latest
      -tip tree and fixes the use of multiple __print_flags
      and __print_symbolic in a single tracer. Also tested
      to ensure its working now:
      
      mount.gfs2-2534  [000]   235.850587: gfs2_glock_queue: 8.7 glock 1:2 dequeue PR
      mount.gfs2-2534  [000]   235.850591: gfs2_demote_rq: 8.7 glock 1:0 demote EX to NL flags:DI
      mount.gfs2-2534  [000]   235.850591: gfs2_glock_queue: 8.7 glock 1:0 dequeue EX
      glock_workqueue-2529  [000]   235.850666: gfs2_glock_state_change: 8.7 glock 1:0 state EX => NL tgt:NL dmt:NL flags:lDpI
      glock_workqueue-2529  [000]   235.850672: gfs2_glock_put: 8.7 glock 1:0 state NL => IV flags:I
      Signed-off-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
      LKML-Reference: <1244037123.29604.603.camel@localhost.localdomain>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      56d8bd3f
  5. 01 Jun, 2009 3 commits
    • Li Zefan's avatar
      tracing/events: introduce __dynamic_array() · 7fcb7c47
      Li Zefan authored
      
      
      __string() is limited:
      
        - it's a char array, but we may want to define array with other types
        - a source string should be available, but we may just know the string size
      
      We introduce __dynamic_array() to break those limitations, and __string()
      becomes a wrapper of it. As a side effect, now __get_str() can be used
      in TP_fast_assign but not only TP_print.
      
      Take XFS for example, we have the string length in the dirent, but the
      string itself is not NULL-terminated, so __dynamic_array() can be used:
      
      TRACE_EVENT(xfs_dir2,
      	TP_PROTO(struct xfs_da_args *args),
      	TP_ARGS(args),
      
      	TP_STRUCT__entry(
      		__field(int, namelen)
      		__dynamic_array(char, name, args->namelen + 1)
      		...
      	),
      
      	TP_fast_assign(
      		char *name = __get_str(name);
      
      		if (args->namelen)
      			memcpy(name, args->name, args->namelen);
      		name[args->namelen] = '\0';
      
      		__entry->namelen = args->namelen;
      	),
      
      	TP_printk("name %.*s namelen %d",
      		  __entry->namelen ? __get_str(name) : NULL
      		  __entry->namelen)
      );
      
      [ Impact: allow defining dynamic size arrays ]
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A2384D2.3080403@cn.fujitsu.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      7fcb7c47
    • Li Zefan's avatar
      tracing/events: put TP_fast_assign into braces · a9c1c3ab
      Li Zefan authored
      
      
      Currently TP_fast_assign has a limitation that we can't define local
      variables in it.
      
      Here's one use case when we introduce __dynamic_array():
      
      TP_fast_assign(
      	type *p = __get_dynamic_array(item);
      
      	foo(p);
      	bar(p);
      ),
      
      [ Impact: allow defining local variables in TP_fast_assign ]
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A2384B1.90100@cn.fujitsu.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      a9c1c3ab
    • Li Zefan's avatar
      tracing/events: fix a typo in __string() format output · 6e25db44
      Li Zefan authored
      
      
      "tsize" should be "\tsize". Also remove the space before "__str_loc".
      
      Before:
       # cat tracing/events/irq/irq_handler_entry/format
              ...
              field:int irq;  offset:12;      size:4;
              field: __str_loc name;  offset:16;tsize:2;
              ...
      
      After:
       # cat tracing/events/irq/irq_handler_entry/format
      	...
              field:int irq;  offset:12;      size:4;
              field:__str_loc name;   offset:16;      size:2;
      	...
      
      [ Impact: standardize __string field description in events format file ]
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      6e25db44
  6. 27 May, 2009 1 commit
  7. 26 May, 2009 2 commits
    • Steven Rostedt's avatar
      tracing: add __print_symbolic to trace events · 0f4fc29d
      Steven Rostedt authored
      
      
      This patch adds __print_symbolic which is similar to __print_flags but
      works for an enumeration type instead. That is, there is only a one to one
      mapping between the values and the symbols. When a match is made, then
      it is printed, otherwise the hex value is outputed.
      
      [ Impact: add interface for showing symbol names in events ]
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      0f4fc29d
    • Steven Rostedt's avatar
      tracing: add __print_flags for events · be74b73a
      Steven Rostedt authored
      
      
      Developers have been asking for the ability in the ftrace event tracer
      to display names of bits in a flags variable.
      
      Instead of printing out c2, it would be easier to read FOO|BAR|GOO,
      assuming that FOO is bit 1, BAR is bit 6 and GOO is bit 7.
      
      Some examples where this would be useful are the state flags in a context
      switch, kmalloc flags, and even permision flags in accessing files.
      
      [
        v2 changes include:
      
        Frederic Weisbecker's idea of using a mask instead of bits,
        thus we can output GFP_KERNEL instead of GPF_WAIT|GFP_IO|GFP_FS.
      
        Li Zefan's idea of allowing the caller of __print_flags to add their
        own delimiter (or no delimiter) where we can get for file permissions
        rwx instead of r|w|x.
      ]
      
      [
        v3 changes:
      
         Christoph Hellwig's idea of using an array instead of va_args.
      ]
      
      [ Impact: better displaying of flags in trace output ]
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      be74b73a
  8. 25 May, 2009 1 commit
  9. 29 Apr, 2009 1 commit
    • Tom Zanussi's avatar
      tracing/filters: distinguish between signed and unsigned fields · a118e4d1
      Tom Zanussi authored
      
      
      The new filter comparison ops need to be able to distinguish between
      signed and unsigned field types, so add an is_signed flag/param to the
      event field struct/trace_define_fields().  Also define a simple macro,
      is_signed_type() to determine the signedness at compile time, used in the
      trace macros.  If the is_signed_type() macro won't work with a specific
      type, a new slightly modified version of TRACE_FIELD() called
      TRACE_FIELD_SIGN(), allows the signedness to be set explicitly.
      
      [ Impact: extend trace-filter code for new feature ]
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905893.6416.120.camel@tropicana>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a118e4d1
  10. 24 Apr, 2009 1 commit
  11. 22 Apr, 2009 3 commits
    • Frederic Weisbecker's avatar
      tracing/events: protect __get_str() · 6a74aa40
      Frederic Weisbecker authored
      
      
      The __get_str() macro is used in a code part then its content should be
      protected with parenthesis.
      
      [ Impact: make macro definition more robust ]
      Reported-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      6a74aa40
    • Frederic Weisbecker's avatar
      tracing/events: provide string with undefined size support · 9cbf1176
      Frederic Weisbecker authored
      
      
      This patch provides the support for dynamic size strings on
      event tracing.
      
      The key concept is to use a structure with an ending char array field of
      undefined size and use such ability to allocate the minimal size on the
      ring buffer to make one or more string entries fit inside, as opposite
      to a fixed length strings with upper bound.
      
      The strings themselves are represented using fields which have an offset
      value from the beginning of the entry.
      
      This patch provides three new macros:
      
      __string(item, src)
      
      This one declares a string to the structure inside TP_STRUCT__entry.
      You need to provide the name of the string field and the source that will
      be copied inside.
      This will also add the dynamic size of the string needed for the ring
      buffer entry allocation.
      A stack allocated structure is used to temporarily store the offset
      of each strings, avoiding double calls to strlen() on each event
      insertion.
      
      __get_str(field)
      
      This one will give you a pointer to the string you have created. This
      is an abstract helper to resolve the absolute address given the field
      name which is a relative address from the beginning of the trace_structure.
      
      __assign_str(dst, src)
      
      Use this macro to automatically perform the string copy from src to
      dst. src must be a variable to assign and dst is the name of a __string
      field.
      
      Example on how to use it:
      
      TRACE_EVENT(my_event,
      	TP_PROTO(char *src1, char *src2),
      
      	TP_ARGS(src1, src2),
      	TP_STRUCT__entry(
      		__string(str1, src1)
      		__string(str2, src2)
      	),
      	TP_fast_assign(
      		__assign_str(str1, src1);
      		__assign_str(str2, src2);
      	),
      	TP_printk("%s %s", __get_str(src1), __get_str(src2))
      )
      
      Of course you can mix-up any __field or __array inside this
      TRACE_EVENT. The position of the __string or __assign_str
      doesn't matter.
      
      Changes in v2:
      
      Address the suggestion of Steven Rostedt: drop the opening_string() macro
      and redefine __ending_string() to get the size of the string to be copied
      instead of overwritting the whole ring buffer allocation.
      
      Changes in v3:
      
      Address other suggestions of Steven Rostedt and Peter Zijlstra with
      some changes: drop the __ending_string and the need to have only one
      string field.
      Use offsets instead of absolute addresses.
      
      [ Impact: allow more compact memory usage for string tracing ]
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      9cbf1176
    • Li Zefan's avatar
      tracing/events: make struct trace_entry->type to be int type · 7a4f453b
      Li Zefan authored
      
      
      struct trace_entry->type is unsigned char, while trace event's id is
      int type, thus for a event with id >= 256, it's entry->type is cast
      to (id % 256), and then we can't see the trace output of this event.
      
       # insmod trace-events-sample.ko
       # echo foo_bar > /mnt/tracing/set_event
       # cat /debug/tracing/events/trace-events-sample/foo_bar/id
       256
       # cat /mnt/tracing/trace_pipe
                 <...>-3548  [001]   215.091142: Unknown type 0
                 <...>-3548  [001]   216.089207: Unknown type 0
                 <...>-3548  [001]   217.087271: Unknown type 0
                 <...>-3548  [001]   218.085332: Unknown type 0
      
      [ Impact: fix output for trace events with id >= 256 ]
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49EEDB0E.5070207@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      7a4f453b
  12. 17 Apr, 2009 1 commit
  13. 14 Apr, 2009 3 commits
    • Steven Rostedt's avatar
      tracing/events: add support for modules to TRACE_EVENT · 6d723736
      Steven Rostedt authored
      
      
      Impact: allow modules to add TRACE_EVENTS on load
      
      This patch adds the final hooks to allow modules to use the TRACE_EVENT
      macro. A notifier and a data structure are used to link the TRACE_EVENTs
      defined in the module to connect them with the ftrace event tracing system.
      
      It also adds the necessary automated clean ups to the trace events when a
      module is removed.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      6d723736
    • Steven Rostedt's avatar
      tracing/events: move the ftrace event tracing code to core · f42c85e7
      Steven Rostedt authored
      
      
      This patch moves the ftrace creation into include/trace/ftrace.h and
      simplifies the work of developers in adding new tracepoints.
      Just the act of creating the trace points in include/trace and including
      define_trace.h will create the events in the debugfs/tracing/events
      directory.
      
      This patch removes the need of include/trace/trace_events.h
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      f42c85e7
    • Steven Rostedt's avatar
      tracing: consolidate trace and trace_event headers · ea20d929
      Steven Rostedt authored
      
      
      Impact: clean up
      
      Neil Horman (et. al.) criticized the way the trace events were broken up
      into two files. The reason for that was that ftrace needed to separate out
      the declarations from where the #include <linux/tracepoint.h> was used.
      It then dawned on me that the tracepoint.h header only needs to define the
      TRACE_EVENT macro if it is not already defined.
      
      The solution is simply to test if TRACE_EVENT is defined, and if it is not
      then the linux/tracepoint.h header can define it. This change consolidates
      all the <traces>.h and <traces>_event_types.h into the <traces>.h file.
      Reported-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-by: default avatarTheodore Tso <tytso@mit.edu>
      Reported-by: default avatarJiaying Zhang <jiayingz@google.com>
      Cc: Zhaolei <zhaolei@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      ea20d929
  14. 13 Apr, 2009 3 commits
    • Tom Zanussi's avatar
      tracing/filters: allow on-the-fly filter switching · 0a19e53c
      Tom Zanussi authored
      
      
      This patch allows event filters to be safely removed or switched
      on-the-fly while avoiding the use of rcu or the suspension of tracing of
      previous versions.
      
      It does it by adding a new filter_pred_none() predicate function which
      does nothing and by never deallocating either the predicates or any of
      the filter_pred members used in matching; the predicate lists are
      allocated and initialized during ftrace_event_calls initialization.
      
      Whenever a filter is removed or replaced, the filter_pred_* functions
      currently in use by the affected ftrace_event_call are immediately
      switched over to to the filter_pred_none() function, while the rest of
      the filter_pred members are left intact, allowing any currently
      executing filter_pred_* functions to finish up, using the values they're
      currently using.
      
      In the case of filter replacement, the new predicate values are copied
      into the old predicates after the above step, and the filter_pred_none()
      functions are replaced by the filter_pred_* functions for the new
      filter.  In this case, it is possible though very unlikely that a
      previous filter_pred_* is still running even after the
      filter_pred_none() switch and the switch to the new filter_pred_*.  In
      that case, however, because nothing has been deallocated in the
      filter_pred, the worst that can happen is that the old filter_pred_*
      function sees the new values and as a result produces either a false
      positive or a false negative, depending on the values it finds.
      
      So one downside to this method is that rarely, it can produce a bad
      match during the filter switch, but it should be possible to live with
      that, IMHO.
      
      The other downside is that at least in this patch the predicate lists
      are always pre-allocated, taking up memory from the start.  They could
      probably be allocated on first-use, and de-allocated when tracing is
      completely stopped - if this patch makes sense, I could create another
      one to do that later on.
      
      Oh, and it also places a restriction on the size of __arrays in events,
      currently set to 128, since they can't be larger than the now embedded
      str_val arrays in the filter_pred struct.
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: paulmck@linux.vnet.ibm.com
      LKML-Reference: <1239610670.6660.49.camel@tropicana>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      0a19e53c
    • Tom Zanussi's avatar
      tracing/filters: use ring_buffer_discard_commit() in filter_check_discard() · eb02ce01
      Tom Zanussi authored
      
      
      This patch changes filter_check_discard() to make use of the new
      ring_buffer_discard_commit() function and modifies the current users to
      call the old commit function in the non-discard case.
      
      It also introduces a version of filter_check_discard() that uses the
      global trace buffer (filter_current_check_discard()) for those cases.
      
      v2 changes:
      
      - fix compile error noticed by Ingo Molnar
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      LKML-Reference: <1239178554.10295.36.camel@tropicana>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      eb02ce01
    • Steven Rostedt's avatar
      tracing/filters: use ring_buffer_discard_commit for discarded events · 77d9f465
      Steven Rostedt authored
      
      
      The ring_buffer_discard_commit makes better usage of the ring_buffer
      when an event has been discarded. It tries to remove it completely if
      possible.
      
      This patch converts the trace event filtering to use
      ring_buffer_discard_commit instead of the ring_buffer_event_discard.
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      77d9f465
  15. 23 Mar, 2009 2 commits
    • Frederic Weisbecker's avatar
      tracing/events: don't discard an event after commit · b118415b
      Frederic Weisbecker authored
      
      
      When we want to filter an event, the filter test is done after
      the event is commited to the ring-buffer to be discarded later if
      needed.
      
      But a reader could be reading this event while we are trying to discard
      it. Other kind of racy events can even happen because the event is
      commited and can be read and/or consumed.
      
      What we want is to discard the event before committing it.
      Reported-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <1237763919-21505-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      b118415b
    • Frederic Weisbecker's avatar
      tracing/events: don't use wake up for events · 07edf712
      Frederic Weisbecker authored
      
      
      Impact: fix hard-lockup with sched switch events
      
      Some ftrace events, such as sched wakeup, can be traced
      while the runqueue lock is hold. Since they are using
      trace_current_buffer_unlock_commit(), they call wake_up()
      which can try to grab the runqueue lock too, resulting in
      a deadlock.
      
      Now for all event, we call a new helper:
      trace_nowake_buffer_unlock_commit() which do pretty the same than
      trace_current_buffer_unlock_commit() except than it doesn't call
      trace_wake_up().
      Reported-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1237759847-21025-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      07edf712
  16. 22 Mar, 2009 2 commits
    • Tom Zanussi's avatar
      tracing: add per-event filtering · 7ce7e424
      Tom Zanussi authored
      
      
      This patch adds per-event filtering to the event tracing subsystem.
      
      It adds a 'filter' debugfs file to each event directory.  This file can
      be written to to set filters; reading from it will display the current
      set of filters set for that event.
      
      Basically, any field listed in the 'format' file for an event can be
      filtered on (including strings, but not yet other array types) using
      either matching ('==') or non-matching ('!=') 'predicates'.  A
      'predicate' can be either a single expression:
      
       # echo pid != 0 > filter
      
       # cat filter
       pid != 0
      
      or a compound expression of up to 8 sub-expressions combined using '&&'
      or '||':
      
       # echo comm == Xorg > filter
       # echo "&& sig != 29" > filter
      
       # cat filter
       comm == Xorg
       && sig != 29
      
      Only events having field values matching an expression will be available
      in the trace output; non-matching events are discarded.
      
      Note that a compound expression is built up by echoing each
      sub-expression separately - it's not the most efficient way to do
      things, but it keeps the parser simple and assumes that compound
      expressions will be relatively uncommon.  In any case, a subsequent
      patch introducing a way to set filters for entire subsystems should
      mitigate any need to do this for lots of events.
      
      Setting a filter without an '&&' or '||' clears the previous filter
      completely and sets the filter to the new expression:
      
       # cat filter
       comm == Xorg
       && sig != 29
      
       # echo comm != Xorg
      
       # cat filter
       comm != Xorg
      
      To clear a filter, echo 0 to the filter file:
      
       # echo 0 > filter
       # cat filter
       none
      
      The limit of 8 predicates for a compound expression is arbitrary - for
      efficiency, it's implemented as an array of pointers to predicates, and
      8 seemed more than enough for any filter...
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1237710665.7703.48.camel@charm-linux>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      7ce7e424
    • Tom Zanussi's avatar
      tracing: add run-time field descriptions for event filtering · cf027f64
      Tom Zanussi authored
      
      
      This patch makes the field descriptions defined for event tracing
      available at run-time, for the event-filtering mechanism introduced
      in a subsequent patch.
      
      The common event fields are prepended with 'common_' in the format
      display, allowing them to be distinguished from the other fields
      that might internally have same name and can therefore be
      unambiguously used in filters.
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1237710639.7703.46.camel@charm-linux>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      cf027f64
  17. 20 Mar, 2009 2 commits
  18. 10 Mar, 2009 3 commits
  19. 09 Mar, 2009 4 commits
    • Steven Rostedt's avatar
      tracing: remove obsolete TRACE_EVENT_FORMAT macro · 157587d7
      Steven Rostedt authored
      
      
      Impact: clean up
      
      The TRACE_EVENT_FORMAT macro is no longer used by trace points
      and only the DECLARE_TRACE, TRACE_FORMAT or TRACE_EVENT macros should
      be used by them. Although the TRACE_EVENT_FORMAT macro is still used
      by the internal tracing utility, it should not be used in core
      kernel code.
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      157587d7
    • Steven Rostedt's avatar
      tracing: new format for specialized trace points · da4d0302
      Steven Rostedt authored
      
      
      Impact: clean up and enhancement
      
      The TRACE_EVENT_FORMAT macro looks quite ugly and is limited in its
      ability to save data as well as to print the record out. Working with
      Ingo Molnar, we came up with a new format that is much more pleasing to
      the eye of C developers. This new macro is more C style than the old
      macro, and is more obvious to what it does.
      
      Here's the example. The only updated macro in this patch is the
      sched_switch trace point.
      
      The old method looked like this:
      
       TRACE_EVENT_FORMAT(sched_switch,
              TP_PROTO(struct rq *rq, struct task_struct *prev,
                      struct task_struct *next),
              TP_ARGS(rq, prev, next),
              TP_FMT("task %s:%d ==> %s:%d",
                    prev->comm, prev->pid, next->comm, next->pid),
              TRACE_STRUCT(
                      TRACE_FIELD(pid_t, prev_pid, prev->pid)
                      TRACE_FIELD(int, prev_prio, prev->prio)
                      TRACE_FIELD_SPECIAL(char next_comm[TASK_COMM_LEN],
                                          next_comm,
                                          TP_CMD(memcpy(TRACE_ENTRY->next_comm,
                                                       next->comm,
                                                       TASK_COMM_LEN)))
                      TRACE_FIELD(pid_t, next_pid, next->pid)
                      TRACE_FIELD(int, next_prio, next->prio)
              ),
              TP_RAW_FMT("prev %d:%d ==> next %s:%d:%d")
              );
      
      The above method is hard to read and requires two format fields.
      
      The new method:
      
       /*
        * Tracepoint for task switches, performed by the scheduler:
        *
        * (NOTE: the 'rq' argument is not used by generic trace events,
        *        but used by the latency tracer plugin. )
        */
       TRACE_EVENT(sched_switch,
      
      	TP_PROTO(struct rq *rq, struct task_struct *prev,
      		 struct task_struct *next),
      
      	TP_ARGS(rq, prev, next),
      
      	TP_STRUCT__entry(
      		__array(	char,	prev_comm,	TASK_COMM_LEN	)
      		__field(	pid_t,	prev_pid			)
      		__field(	int,	prev_prio			)
      		__array(	char,	next_comm,	TASK_COMM_LEN	)
      		__field(	pid_t,	next_pid			)
      		__field(	int,	next_prio			)
      	),
      
      	TP_printk("task %s:%d [%d] ==> %s:%d [%d]",
      		__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
      		__entry->next_comm, __entry->next_pid, __entry->next_prio),
      
      	TP_fast_assign(
      		memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
      		__entry->prev_pid	= prev->pid;
      		__entry->prev_prio	= prev->prio;
      		memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
      		__entry->next_pid	= next->pid;
      		__entry->next_prio	= next->prio;
      	)
       );
      
      This macro is called TRACE_EVENT, it is broken up into 5 parts:
      
       TP_PROTO:        the proto type of the trace point
       TP_ARGS:         the arguments of the trace point
       TP_STRUCT_entry: the structure layout of the entry in the ring buffer
       TP_printk:       the printk format
       TP_fast_assign:  the method used to write the entry into the ring buffer
      
      The structure is the definition of how the event will be saved in the
      ring buffer. The printk is used by the internal tracing in case of
      an oops, and the kernel needs to print out the format of the record
      to the console. This the TP_printk gives a means to show the records
      in a human readable format. It is also used to print out the data
      from the trace file.
      
      The TP_fast_assign is executed directly. It is basically like a C function,
      where the __entry is the handle to the record.
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      da4d0302
    • Steven Rostedt's avatar
      tracing: use generic __stringify · 9cc26a26
      Steven Rostedt authored
      
      
      Impact: clean up
      
      This removes the custom made STR(x) macros in the tracer and uses
      the generic __stringify macro instead.
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      9cc26a26
    • Steven Rostedt's avatar
      tracing: replace TP<var> with TP_<var> · 2939b046
      Steven Rostedt authored
      
      
      Impact: clean up
      
      The macros TPPROTO, TPARGS, TPFMT, TPRAWFMT, and TPCMD all look a bit
      ugly. This patch adds an underscore to their names.
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      2939b046
  20. 04 Mar, 2009 1 commit
    • Peter Zijlstra's avatar
      tracing: add lockdep tracepoints for lock acquire/release · efed792d
      Peter Zijlstra authored
      
      
      Augment the traces with lock names when lockdep is available:
      
       1)               |  down_read_trylock() {
       1)               |    _spin_lock_irqsave() {
       1)               |      /* lock_acquire: &sem->wait_lock */
       1)   4.201 us    |    }
       1)               |    _spin_unlock_irqrestore() {
       1)               |      /* lock_release: &sem->wait_lock */
       1)   3.523 us    |    }
       1)               |  /* lock_acquire: try read &mm->mmap_sem */
       1) + 13.386 us   |  }
       1)   1.635 us    |  find_vma();
       1)               |  handle_mm_fault() {
       1)               |    __do_fault() {
       1)               |      filemap_fault() {
       1)               |        find_lock_page() {
       1)               |          find_get_page() {
       1)               |            /* lock_acquire: read rcu_read_lock */
       1)               |            /* lock_release: rcu_read_lock */
       1)   5.697 us    |          }
       1)   8.158 us    |        }
       1) + 11.079 us   |      }
       1)               |      _spin_lock() {
       1)               |        /* lock_acquire: __pte_lockptr(page) */
       1)   3.949 us    |      }
       1)   1.460 us    |      page_add_file_rmap();
       1)               |      _spin_unlock() {
       1)               |        /* lock_release: __pte_lockptr(page) */
       1)   3.115 us    |      }
       1)               |      unlock_page() {
       1)   1.421 us    |        page_waitqueue();
       1)   1.220 us    |        __wake_up_bit();
       1)   6.519 us    |      }
       1) + 34.328 us   |    }
       1) + 37.452 us   |  }
       1)               |  up_read() {
       1)               |  /* lock_release: &mm->mmap_sem */
       1)               |    _spin_lock_irqsave() {
       1)               |      /* lock_acquire: &sem->wait_lock */
       1)   3.865 us    |    }
       1)               |    _spin_unlock_irqrestore() {
       1)               |      /* lock_release: &sem->wait_lock */
       1)   8.562 us    |    }
       1) + 17.370 us   |  }
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: =?ISO-8859-1?Q?T=F6r=F6k?= Edwin <edwintorok@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1236166375.5330.7209.camel@laptop>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      efed792d
  21. 03 Mar, 2009 1 commit
  22. 02 Mar, 2009 1 commit
    • Steven Rostedt's avatar
      tracing: add format file to describe event struct fields · 981d081e
      Steven Rostedt authored
      
      
      This patch adds the "format" file to the trace point event directory.
      This is based off of work by Tom Zanussi, in which a file is exported
      to be tread from user land such that a user space app may read the
      binary record stored in the ring buffer.
      
       # cat /debug/tracing/events/sched/sched_switch/format
              field:pid_t prev_pid;   offset:12;      size:4;
              field:int prev_prio;    offset:16;      size:4;
              field special:char next_comm[TASK_COMM_LEN];    offset:20;      size:16;
              field:pid_t next_pid;   offset:36;      size:4;
              field:int next_prio;    offset:40;      size:4;
      
      Idea-from: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      981d081e