Skip to content
Snippets Groups Projects
  1. Oct 29, 2010
  2. Oct 28, 2010
  3. Oct 26, 2010
    • Eric Paris's avatar
      IMA: fix the ToMToU logic · bade72d6
      Eric Paris authored
      
      Current logic looks like this:
      
              rc = ima_must_measure(NULL, inode, MAY_READ, FILE_CHECK);
              if (rc < 0)
                      goto out;
      
              if (mode & FMODE_WRITE) {
                      if (inode->i_readcount)
                              send_tomtou = true;
                      goto out;
              }
      
              if (atomic_read(&inode->i_writecount) > 0)
                      send_writers = true;
      
      Lets assume we have a policy which states that all files opened for read
      by root must be measured.
      
      Lets assume the file has permissions 777.
      
      Lets assume that root has the given file open for read.
      
      Lets assume that a non-root process opens the file write.
      
      The non-root process will get to ima_counts_get() and will check the
      ima_must_measure().  Since it is not supposed to measure it will goto
      out.
      
      We should check the i_readcount no matter what since we might be causing
      a ToMToU voilation!
      
      This is close to correct, but still not quite perfect.  The situation
      could have been that root, which was interested in the mesurement opened
      and closed the file and another process which is not interested in the
      measurement is the one holding the i_readcount ATM.  This is just overly
      strict on ToMToU violations, which is better than not strict enough...
      
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Acked-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      bade72d6
    • Eric Paris's avatar
      IMA: explicit IMA i_flag to remove global lock on inode_delete · 196f5181
      Eric Paris authored
      
      Currently for every removed inode IMA must take a global lock and search
      the IMA rbtree looking for an associated integrity structure.  Instead
      we explicitly mark an inode when we add an integrity structure so we
      only have to take the global lock and do the removal if it exists.
      
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Acked-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      196f5181
    • Eric Paris's avatar
      IMA: drop refcnt from ima_iint_cache since it isn't needed · 64c62f06
      Eric Paris authored
      
      Since finding a struct ima_iint_cache requires a valid struct inode, and
      the struct ima_iint_cache is supposed to have the same lifetime as a
      struct inode (technically they die together but don't need to be created
      at the same time) we don't have to worry about the ima_iint_cache
      outliving or dieing before the inode.  So the refcnt isn't useful.  Just
      get rid of it and free the structure when the inode is freed.
      
      Signed-off-by: default avatarEric Paris <eapris@redhat.com>
      Acked-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      64c62f06
    • Eric Paris's avatar
      IMA: only allocate iint when needed · bc7d2a3e
      Eric Paris authored
      
      IMA always allocates an integrity structure to hold information about
      every inode, but only needed this structure to track the number of
      readers and writers currently accessing a given inode.  Since that
      information was moved into struct inode instead of the integrity struct
      this patch stops allocating the integrity stucture until it is needed.
      Thus greatly reducing memory usage.
      
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Acked-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      bc7d2a3e
    • Eric Paris's avatar
      IMA: move read counter into struct inode · a178d202
      Eric Paris authored
      
      IMA currently allocated an inode integrity structure for every inode in
      core.  This stucture is about 120 bytes long.  Most files however
      (especially on a system which doesn't make use of IMA) will never need
      any of this space.  The problem is that if IMA is enabled we need to
      know information about the number of readers and the number of writers
      for every inode on the box.  At the moment we collect that information
      in the per inode iint structure and waste the rest of the space.  This
      patch moves those counters into the struct inode so we can eventually
      stop allocating an IMA integrity structure except when absolutely
      needed.
      
      This patch does the minimum needed to move the location of the data.
      Further cleanups, especially the location of counter updates, may still
      be possible.
      
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Acked-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a178d202
    • Eric Paris's avatar
      IMA: use i_writecount rather than a private counter · b9593d30
      Eric Paris authored
      
      IMA tracks the number of struct files which are holding a given inode
      readonly and the number which are holding the inode write or r/w.  It
      needs this information so when a new reader or writer comes in it can
      tell if this new file will be able to invalidate results it already made
      about existing files.
      
      aka if a task is holding a struct file open RO, IMA measured the file
      and recorded those measurements and then a task opens the file RW IMA
      needs to note in the logs that the old measurement may not be correct.
      It's called a "Time of Measure Time of Use" (ToMToU) issue.  The same is
      true is a RO file is opened to an inode which has an open writer.  We
      cannot, with any validity, measure the file in question since it could
      be changing.
      
      This patch attempts to use the i_writecount field to track writers.  The
      i_writecount field actually embeds more information in it's value than
      IMA needs but it should work for our purposes and allow us to shrink the
      struct inode even more.
      
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Acked-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b9593d30
    • Eric Paris's avatar
      IMA: use inode->i_lock to protect read and write counters · ad16ad00
      Eric Paris authored
      
      Currently IMA used the iint->mutex to protect the i_readcount and
      i_writecount.  This patch uses the inode->i_lock since we are going to
      start using in inode objects and that is the most appropriate lock.
      
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Acked-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ad16ad00
    • Eric Paris's avatar
      IMA: convert internal flags from long to char · 15aac676
      Eric Paris authored
      
      The IMA flags is an unsigned long but there is only 1 flag defined.
      Lets save a little space and make it a char.  This packs nicely next to
      the array of u8's.
      
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Acked-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      15aac676
    • Eric Paris's avatar
      IMA: use unsigned int instead of long for counters · 497f3233
      Eric Paris authored
      
      Currently IMA uses 2 longs in struct inode.  To save space (and as it
      seems impossible to overflow 32 bits) we switch these to unsigned int.
      The switch to unsigned does require slightly different checks for
      underflow, but it isn't complex.
      
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Acked-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      497f3233
    • Eric Paris's avatar
      IMA: drop the inode opencount since it isn't needed for operation · b575156d
      Eric Paris authored
      
      The opencount was used to help debugging to make sure that everything
      which created a struct file also correctly made the IMA calls.  Since we
      moved all of that into the VFS this isn't as necessary.  We should be
      able to get the same amount of debugging out of just the reader and
      write count.
      
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Acked-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b575156d
    • Eric Paris's avatar
      IMA: use rbtree instead of radix tree for inode information cache · 85491641
      Eric Paris authored
      
      The IMA code needs to store the number of tasks which have an open fd
      granting permission to write a file even when IMA is not in use.  It
      needs this information in order to be enabled at a later point in time
      without losing it's integrity garantees.
      
      At the moment that means we store a little bit of data about every inode
      in a cache.  We use a radix tree key'd on the inode's memory address.
      Dave Chinner pointed out that a radix tree is a terrible data structure
      for such a sparse key space.  This patch switches to using an rbtree
      which should be more efficient.
      
      Bug report from Dave:
      
       "I just noticed that slabtop was reporting an awfully high usage of
        radix tree nodes:
      
         OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
        4200331 2778082  66%    0.55K 144839       29   2317424K radix_tree_node
        2321500 2060290  88%    1.00K  72581       32   2322592K xfs_inode
        2235648 2069791  92%    0.12K  69864       32    279456K iint_cache
      
        That is, 2.7M radix tree nodes are allocated, and the cache itself is
        consuming 2.3GB of RAM.  I know that the XFS inodei caches are indexed
        by radix tree node, but for 2 million cached inodes that would mean a
        density of 1 inode per radix tree node, which for a system with 16M
        inodes in the filsystems is an impossibly low density.  The worst I've
        seen in a production system like kernel.org is about 20-25% density,
        which would mean about 150-200k radix tree nodes for that many inodes.
        So it's not the inode cache.
      
        So I looked up what the iint_cache was.  It appears to used for
        storing per-inode IMA information, and uses a radix tree for indexing.
        It uses the *address* of the struct inode as the indexing key.  That
        means the key space is extremely sparse - for XFS the struct inode
        addresses are approximately 1000 bytes apart, which means the closest
        the radix tree index keys get is ~1000.  Which means that there is a
        single entry per radix tree leaf node, so the radix tree is using
        roughly 550 bytes for every 120byte structure being cached.  For the
        above example, it's probably wasting close to 1GB of RAM...."
      
      Reported-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Acked-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      85491641
  4. Oct 25, 2010
  5. Oct 20, 2010
  6. Oct 15, 2010
    • Arnd Bergmann's avatar
      llseek: automatically add .llseek fop · 6038f373
      Arnd Bergmann authored
      
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
      
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      6038f373
  7. Sep 26, 2010
  8. Sep 10, 2010
    • David Howells's avatar
      KEYS: Fix bug in keyctl_session_to_parent() if parent has no session keyring · 3d96406c
      David Howells authored
      
      Fix a bug in keyctl_session_to_parent() whereby it tries to check the ownership
      of the parent process's session keyring whether or not the parent has a session
      keyring [CVE-2010-2960].
      
      This results in the following oops:
      
        BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
        IP: [<ffffffff811ae4dd>] keyctl_session_to_parent+0x251/0x443
        ...
        Call Trace:
         [<ffffffff811ae2f3>] ? keyctl_session_to_parent+0x67/0x443
         [<ffffffff8109d286>] ? __do_fault+0x24b/0x3d0
         [<ffffffff811af98c>] sys_keyctl+0xb4/0xb8
         [<ffffffff81001eab>] system_call_fastpath+0x16/0x1b
      
      if the parent process has no session keyring.
      
      If the system is using pam_keyinit then it mostly protected against this as all
      processes derived from a login will have inherited the session keyring created
      by pam_keyinit during the log in procedure.
      
      To test this, pam_keyinit calls need to be commented out in /etc/pam.d/.
      
      Reported-by: default avatarTavis Ormandy <taviso@cmpxchg8b.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Acked-by: default avatarTavis Ormandy <taviso@cmpxchg8b.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3d96406c
    • David Howells's avatar
      KEYS: Fix RCU no-lock warning in keyctl_session_to_parent() · 9d1ac65a
      David Howells authored
      
      There's an protected access to the parent process's credentials in the middle
      of keyctl_session_to_parent().  This results in the following RCU warning:
      
        ===================================================
        [ INFO: suspicious rcu_dereference_check() usage. ]
        ---------------------------------------------------
        security/keys/keyctl.c:1291 invoked rcu_dereference_check() without protection!
      
        other info that might help us debug this:
      
        rcu_scheduler_active = 1, debug_locks = 0
        1 lock held by keyctl-session-/2137:
         #0:  (tasklist_lock){.+.+..}, at: [<ffffffff811ae2ec>] keyctl_session_to_parent+0x60/0x236
      
        stack backtrace:
        Pid: 2137, comm: keyctl-session- Not tainted 2.6.36-rc2-cachefs+ #1
        Call Trace:
         [<ffffffff8105606a>] lockdep_rcu_dereference+0xaa/0xb3
         [<ffffffff811ae379>] keyctl_session_to_parent+0xed/0x236
         [<ffffffff811af77e>] sys_keyctl+0xb4/0xb6
         [<ffffffff81001eab>] system_call_fastpath+0x16/0x1b
      
      The code should take the RCU read lock to make sure the parents credentials
      don't go away, even though it's holding a spinlock and has IRQ disabled.
      
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9d1ac65a
  9. Sep 07, 2010
Loading