Skip to content
Snippets Groups Projects
  1. Apr 28, 2011
  2. Apr 04, 2011
  3. Mar 16, 2011
  4. Mar 08, 2011
  5. Mar 01, 2011
  6. Feb 17, 2011
  7. Jan 14, 2011
  8. Jan 13, 2011
  9. Nov 15, 2010
    • Vivek Goyal's avatar
      blk-cgroup: Allow creation of hierarchical cgroups · bdc85df7
      Vivek Goyal authored
      
      o Allow hierarchical cgroup creation for blkio controller
      
      o Currently we disallow it as both the io controller policies (throttling
        as well as proportion bandwidth) do not support hierarhical accounting
        and control. But the flip side is that blkio controller can not be used with
        libvirt as libvirt creates a cgroup hierarchy deeper than 1 level.
      
        <top-level-cgroup-dir>/<controller>/libvirt/qemu/<virtual-machine-groups>
      
      o So this patch will allow creation of cgroup hierarhcy but at the backend
        everything will be treated as flat. So if somebody created a an hierarchy
        like as follows.
      
      			root
      			/  \
      		     test1 test2
      			|
      		     test3
      
        CFQ and throttling will practically treat all groups at same level.
      
      				pivot
      			     /  |   \  \
      			root  test1 test2  test3
      
      o Once we have actual support for hierarchical accounting and control
        then we can introduce another cgroup tunable file "blkio.use_hierarchy"
        which will be 0 by default but if user wants to enforce hierarhical
        control then it can be set to 1. This way there should not be any
        ABI problems down the line.
      
      o The only not so pretty part is introduction of extra file "use_hierarchy"
        down the line. Kame-san had mentioned that hierarhical accounting is
        expensive in memory controller hence they keep it off by default. I
        suspect same will be the case for IO controller also as for each IO
        completion we shall have to account IO through hierarchy up to the root.
        if yes, then it probably is not a very bad idea to introduce this extra
        file so that it will be used only when somebody needs it and some people
        might enable hierarchy only in part of the hierarchy.
      
      o This is how basically memory controller also uses "use_hierarhcy" and
        they also allowed creation of hierarchies when actual backend support
        was not available.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Acked-by: default avatarBalbir Singh <balbir@linux.vnet.ibm.com>
      Reviewed-by: default avatarGui Jianfeng <guijianfeng@cn.fujitsu.com>
      Reviewed-by: default avatarCiju Rajan K <ciju@linux.vnet.ibm.com>
      Tested-by: default avatarCiju Rajan K <ciju@linux.vnet.ibm.com>
      Signed-off-by: default avatarJens Axboe <jaxboe@fusionio.com>
      bdc85df7
  10. Nov 01, 2010
  11. Oct 27, 2010
    • Daniel Lezcano's avatar
      cgroup: add clone_children control file · 97978e6d
      Daniel Lezcano authored
      The ns_cgroup is a control group interacting with the namespaces.  When a
      new namespace is created, a corresponding cgroup is automatically created
      too.  The cgroup name is the pid of the process who did 'unshare' or the
      child of 'clone'.
      
      This cgroup is tied with the namespace because it prevents a process to
      escape the control group and use the post_clone callback, so the child
      cgroup inherits the values of the parent cgroup.
      
      Unfortunately, the more we use this cgroup and the more we are facing
      problems with it:
      
      (1) when a process unshares, the cgroup name may conflict with a
          previous cgroup with the same pid, so unshare or clone return -EEXIST
      
      (2) the cgroup creation is out of control because there may have an
          application creating several namespaces where the system will
          automatically create several cgroups in his back and let them on the
          cgroupfs (eg.  a vrf based on the network namespace).
      
      (3) the mix of (1) and (2) force an administrator to regularly check
          and clean these cgroups.
      
      This patchset removes the ns_cgroup by adding a new flag to the cgroup and
      the cgroupfs mount option.  It enables the copy of the parent cgroup when
      a child cgroup is created.  We can then safely remove the ns_cgroup as
      this flag brings a compatibility.  We have now to manually create and add
      the task to a cgroup, which is consistent with the cgroup framework.
      
      This patch:
      
      Sent as an answer to a previous thread around the ns_cgroup.
      
      https://lists.linux-foundation.org/pipermail/containers/2009-June/018627.html
      
      
      
      It adds a control file 'clone_children' for a cgroup.  This control file
      is a boolean specifying if the child cgroup should be a clone of the
      parent cgroup or not.  The default value is 'false'.
      
      This flag makes the child cgroup to call the post_clone callback of all
      the subsystem, if it is available.
      
      At present, the cpuset is the only one which had implemented the
      post_clone callback.
      
      The option can be set at mount time by specifying the 'clone_children'
      mount option.
      
      Signed-off-by: default avatarDaniel Lezcano <daniel.lezcano@free.fr>
      Signed-off-by: default avatarSerge E. Hallyn <serge.hallyn@canonical.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Acked-by: default avatarPaul Menage <menage@google.com>
      Reviewed-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Cc: Jamal Hadi Salim <hadi@cyberus.ca>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Acked-by: default avatarBalbir Singh <balbir@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      97978e6d
  12. Sep 16, 2010
  13. Aug 23, 2010
  14. Aug 04, 2010
    • Justin P. Mattock's avatar
      Documentation: update broken web addresses. · 0ea6e611
      Justin P. Mattock authored
      
      Below you will find an updated version from the original series bunching all patches into one big patch
      updating broken web addresses that are located in Documentation/*
      Some of the addresses date as far far back as 1995 etc... so searching became a bit difficult,
      the best way to deal with these is to use web.archive.org to locate these addresses that are outdated.
      Now there are also some addresses pointing to .spec files some are located, but some(after searching
      on the companies site)where still no where to be found. In this case I just changed the address
      to the company site this way the users can contact the company and they can locate them for the users.
      
      Signed-off-by: default avatarJustin P. Mattock <justinmattock@gmail.com>
      Signed-off-by: default avatarThomas Weber <weber@corscience.de>
      Signed-off-by: default avatarMike Frysinger <vapier.adi@gmail.com>
      Cc: Paulo Marques <pmarques@grupopie.com>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Michael Neuling <mikey@neuling.org>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      0ea6e611
  15. May 27, 2010
  16. Apr 26, 2010
    • Vivek Goyal's avatar
      blk-cgroup: config options re-arrangement · afc24d49
      Vivek Goyal authored
      
      This patch fixes few usability and configurability issues.
      
      o All the cgroup based controller options are configurable from
        "Genral Setup/Control Group Support/" menu. blkio is the only exception.
        Hence make this option visible in above menu and make it configurable from
        there to bring it inline with rest of the cgroup based controllers.
      
      o Get rid of CONFIG_DEBUG_CFQ_IOSCHED.
      
        This option currently does two things.
      
        - Enable printing of cgroup paths in blktrace
        - Enables CONFIG_DEBUG_BLK_CGROUP, which in turn displays additional stat
          files in cgroup.
      
        If we are using group scheduling, blktrace data is of not really much use
        if cgroup information is not present. To get this data, currently one has to
        also enable CONFIG_DEBUG_CFQ_IOSCHED, which in turn brings the overhead of
        all the additional debug stat files which is not desired.
      
        Hence, this patch moves printing of cgroup paths under
        CONFIG_CFQ_GROUP_IOSCHED.
      
        This allows us to get rid of CONFIG_DEBUG_CFQ_IOSCHED completely. Now all
        the debug stat files are controlled only by CONFIG_DEBUG_BLK_CGROUP which
        can be enabled through config menu.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Acked-by: default avatarDivyesh Shah <dpshah@google.com>
      Reviewed-by: default avatarGui Jianfeng <guijianfeng@cn.fujitsu.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      afc24d49
  17. Apr 24, 2010
  18. Apr 22, 2010
  19. Apr 13, 2010
  20. Apr 09, 2010
    • Divyesh Shah's avatar
      blkio: Add more debug-only per-cgroup stats · 812df48d
      Divyesh Shah authored
      
      1) group_wait_time - This is the amount of time the cgroup had to wait to get a
        timeslice for one of its queues from when it became busy, i.e., went from 0
        to 1 request queued. This is different from the io_wait_time which is the
        cumulative total of the amount of time spent by each IO in that cgroup waiting
        in the scheduler queue. This stat is a great way to find out any jobs in the
        fleet that are being starved or waiting for longer than what is expected (due
        to an IO controller bug or any other issue).
      2) empty_time - This is the amount of time a cgroup spends w/o any pending
         requests. This stat is useful when a job does not seem to be able to use its
         assigned disk share by helping check if that is happening due to an IO
         controller bug or because the job is not submitting enough IOs.
      3) idle_time - This is the amount of time spent by the IO scheduler idling
         for a given cgroup in anticipation of a better request than the exising ones
         from other queues/cgroups.
      
      All these stats are recorded using start and stop events. When reading these
      stats, we do not add the delta between the current time and the last start time
      if we're between the start and stop events. We avoid doing this to make sure
      that these numbers are always monotonically increasing when read. Since we're
      using sched_clock() which may use the tsc as its source, it may induce some
      inconsistency (due to tsc resync across cpus) if we included the current delta.
      
      Signed-off-by: default avatarDivyesh <Shah&lt;dpshah@google.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      812df48d
    • Divyesh Shah's avatar
      blkio: Add io_queued and avg_queue_size stats · cdc1184c
      Divyesh Shah authored
      
      These stats are useful for getting a feel for the queue depth of the cgroup,
      i.e., how filled up its queues are at a given instant and over the existence of
      the cgroup. This ability is useful when debugging problems in the wild as it
      helps understand the application's IO pattern w/o having to read through the
      userspace code (coz its tedious or just not available) or w/o the ability
      to run blktrace (since you may not have root access and/or not want to disturb
      performance).
      
      Signed-off-by: default avatarDivyesh <Shah&lt;dpshah@google.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      cdc1184c
    • Divyesh Shah's avatar
      blkio: Add io_merged stat · 812d4026
      Divyesh Shah authored
      
      This includes both the number of bios merged into requests belonging to this
      cgroup as well as the number of requests merged together.
      In the past, we've observed different merging behavior across upstream kernels,
      some by design some actual bugs. This stat helps a lot in debugging such
      problems when applications report decreased throughput with a new kernel
      version.
      
      This needed adding an extra elevator function to capture bios being merged as I
      did not want to pollute elevator code with blkiocg knowledge and hence needed
      the accounting invocation to come from CFQ.
      
      Signed-off-by: default avatarDivyesh <Shah&lt;dpshah@google.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      812d4026
    • Divyesh Shah's avatar
      blkio: Changes to IO controller additional stats patches · 84c124da
      Divyesh Shah authored
      
      that include some minor fixes and addresses all comments.
      
      Changelog: (most based on Vivek Goyal's comments)
      o renamed blkiocg_reset_write to blkiocg_reset_stats
      o more clarification in the documentation on io_service_time and io_wait_time
      o Initialize blkg->stats_lock
      o rename io_add_stat to blkio_add_stat and declare it static
      o use bool for direction and sync
      o derive direction and sync info from existing rq methods
      o use 12 for major:minor string length
      o define io_service_time better to cover the NCQ case
      o add a separate reset_stats interface
      o make the indexed stats a 2d array to simplify macro and function pointer code
      o blkio.time now exports in jiffies as before
      o Added stats description in patch description and
        Documentation/cgroup/blkio-controller.txt
      o Prefix all stats functions with blkio and make them static as applicable
      o replace IO_TYPE_MAX with IO_TYPE_TOTAL
      o Moved #define constant to top of blk-cgroup.c
      o Pass dev_t around instead of char *
      o Add note to documentation file about resetting stats
      o use BLK_CGROUP_MODULE in addition to BLK_CGROUP config option in #ifdef
        statements
      o Avoid struct request specific knowledge in blk-cgroup. blk-cgroup.h now has
        rq_direction() and rq_sync() functions which are used by CFQ and when using
        io-controller at a higher level, bio_* functions can be added.
      
      Signed-off-by: default avatarDivyesh <Shah&lt;dpshah@google.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      84c124da
  21. Mar 25, 2010
  22. Mar 24, 2010
  23. Mar 19, 2010
  24. Mar 16, 2010
  25. Mar 12, 2010
Loading