1. 17 Mar, 2016 1 commit
    • John Stultz's avatar
      timer: convert timer_slack_ns from unsigned long to u64 · da8b44d5
      John Stultz authored
      This patchset introduces a /proc/<pid>/timerslack_ns interface which
      would allow controlling processes to be able to set the timerslack value
      on other processes in order to save power by avoiding wakeups (Something
      Android currently does via out-of-tree patches).
      
      The first patch tries to fix the internal timer_slack_ns usage which was
      defined as a long, which limits the slack range to ~4 seconds on 32bit
      systems.  It converts it to a u64, which provides the same basically
      unlimited slack (500 years) on both 32bit and 64bit machines.
      
      The second patch introduces the /proc/<pid>/timerslack_ns interface
      which allows the full 64bit slack range for a task to be read or set on
      both 32bit and 64bit machines.
      
      With these two patches, on a 32bit machine, after setting the slack on
      bash to 10 seconds:
      
      $ time sleep 1
      
      real    0m10.747s
      user    0m0.001s
      sys     0m0.005s
      
      The first patch is a little ugly, since I had to chase the slack delta
      arguments through a number of functions converting them to u64s.  Let me
      know if it makes sense to break that up more or not.
      
      Other than that things are fairly straightforward.
      
      This patch (of 2):
      
      The timer_slack_ns value in the task struct is currently an unsigned
      long.  This means that on 32bit applications, the maximum slack is just
      over 4 seconds.  However, on 64bit machines, it's much, much larger (~500
      years).
      
      This disparity could make application development a little confusing,
      so this patch converts timer_slack_ns (as well as the default_slack)
      to a u64.  This means both 32bit and 64bit systems
      have the same effective internal slack range.
      
      Now the existing ABI via PR_GET_TIMERSLACK and PR_SET_TIMERSLACK specifies
      the interface as an unsigned long, so we preserve that limitation on
      32bit systems, where SET_TIMERSLACK can only set the slack to an unsigned
      long value, and GET_TIMERSLACK will return ULONG_MAX if the slack is
      actually larger than what can be stored by an unsigned long.
      
      This patch also modifies hrtimer functions which specified the slack
      delta as an unsigned long.
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Oren Laadan <orenl@cellrox.com>
      Cc: Ruchi Kandoi <kandoiruchi@google.com>
      Cc: Rom Lemarchand <romlem@android.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Android Kernel Team <kernel-team@android.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      da8b44d5
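The slack ranges quoted in da8b44d5 follow from the width of the integer that holds the nanosecond count. The sketch below uses Python purely as a calculator and as a toy model of the preserved prctl behaviour. The helper names `max_slack_seconds` and `prctl_get_timerslack` are made up for illustration and are not kernel functions.

```python
NSEC_PER_SEC = 1_000_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year; an approximation


def max_slack_seconds(bits: int) -> float:
    """Largest slack, in seconds, that fits in an unsigned `bits`-wide nanosecond count."""
    return (2**bits - 1) / NSEC_PER_SEC


def prctl_get_timerslack(slack_ns: int, ulong_bits: int) -> int:
    """Toy model of the preserved ABI: report ULONG_MAX when the u64 slack does not fit."""
    ulong_max = 2**ulong_bits - 1
    return min(slack_ns, ulong_max)


print(f"32-bit unsigned long: about {max_slack_seconds(32):.2f} s")
print(f"u64: about {max_slack_seconds(64) / SECONDS_PER_YEAR:.0f} years")

ten_seconds = 10 * NSEC_PER_SEC
print(prctl_get_timerslack(ten_seconds, 32) == 2**32 - 1)    # clamped on 32bit
print(prctl_get_timerslack(ten_seconds, 64) == ten_seconds)  # exact on 64bit
```

The 32-bit limit comes out just under 4.3 seconds, matching the "just over 4 seconds" in the message. The 64-bit limit is several hundred years, so the "~500 years" figure is an order-of-magnitude statement.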
  2. 03 Nov, 2014 2 commits
  3. 12 May, 2013 6 commits
    • Colin Cross's avatar
      freezer: add new freezable helpers using freezer_do_not_count() · dd5ec0f4
      Colin Cross authored
      Freezing tasks will wake up almost every userspace task from
      where it is blocking and force it to run until it hits a
      call to try_to_freeze(), generally on the exit path from the syscall
      it is blocking in.  On resume each task will run again, usually
      restarting the syscall and running until it hits the same
      blocking call as it was originally blocked in.
      
      To allow tasks to avoid running on every suspend/resume cycle,
      this patch adds additional freezable wrappers around blocking calls
      that call freezer_do_not_count().  Combined with the previous patch,
      these tasks will not run during suspend or resume unless they wake
      up for another reason, in which case they will run until they hit
      the try_to_freeze() in freezer_count(), and then continue processing
      the wakeup after tasks are thawed.
      
      Additional patches will convert the most common locations that
      userspace blocks in to use freezable helpers.
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Colin Cross <ccross@android.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      dd5ec0f4
    • Colin Cross's avatar
      freezer: convert freezable helpers to static inline where possible · 8ee492d6
      Colin Cross authored
      Some of the freezable helpers have to be macros because their
      condition argument needs to get evaluated every time through
      the wait loop.  Convert the others to static inline to make
      future changes easier.
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Colin Cross <ccross@android.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      8ee492d6
    • Colin Cross's avatar
      freezer: convert freezable helpers to freezer_do_not_count() · b0123586
      Colin Cross authored
      Freezing tasks will wake up almost every userspace task from
      where it is blocking and force it to run until it hits a
      call to try_to_freeze(), generally on the exit path from the syscall
      it is blocking in.  On resume each task will run again, usually
      restarting the syscall and running until it hits the same
      blocking call as it was originally blocked in.
      
      Convert the existing wait_event_freezable* wrappers to use
      freezer_do_not_count().  Combined with a previous patch,
      these tasks will not run during suspend or resume unless they wake
      up for another reason, in which case they will run until they hit
      the try_to_freeze() in freezer_count(), and then continue processing
      the wakeup after tasks are thawed.
      
      This results in a small change in behavior: previously a race
      between freezing and a normal wakeup would be won by the wakeup,
      now the task will freeze and then handle the wakeup after thawing.
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Colin Cross <ccross@android.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      b0123586
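The effect described in dd5ec0f4 and b0123586 can be illustrated with a toy model. This is only a sketch, not kernel code: the `Task` and `Freezer` classes are hypothetical, and the `skip` field stands in for the PF_FREEZER_SKIP flag. It shows that a task blocked inside a freezable helper is left asleep by the freezer, and that it freezes on its own if it wakes while freezing is still in effect.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    skip: bool = False    # stands in for PF_FREEZER_SKIP
    frozen: bool = False
    wakeups: int = 0      # how often the freezer had to wake this task


class Freezer:
    def __init__(self):
        self.freezing = False

    def freeze_tasks(self, tasks):
        """Start freezing and wake every task that is not marked as skippable."""
        self.freezing = True
        for task in tasks:
            if task.skip:
                continue           # "frozen enough": leave it blocked
            task.wakeups += 1      # forced to run until it reaches try_to_freeze()
            task.frozen = True


def freezer_do_not_count(task: Task) -> None:
    """Mark the task before it blocks inside a freezable helper."""
    task.skip = True


def freezer_count(task: Task, freezer: Freezer) -> None:
    """Run when the blocking call returns: clear the mark, then freeze if required."""
    task.skip = False
    if freezer.freezing:
        task.frozen = True         # models the try_to_freeze() call


freezer = Freezer()
plain = Task("plain-sleeper")
wrapped = Task("freezable-helper-sleeper")
freezer_do_not_count(wrapped)

freezer.freeze_tasks([plain, wrapped])
print(plain.wakeups, wrapped.wakeups)

freezer_count(wrapped, freezer)    # a real wakeup arrives during suspend
print(wrapped.frozen)
```

In this model only the plain sleeper is woken by the freezer. The wrapped task runs solely because of its own wakeup and then freezes before handling it, which is the behaviour change noted at the end of b0123586.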
    • Mandeep Singh Baines's avatar
      lockdep: check that no locks held at freeze time · 0f9548ca
      Mandeep Singh Baines authored
      We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
      deadlock if the lock is later acquired in the suspend or hibernate path
      (e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
      cgroup_freezer if a lock is held inside a frozen cgroup that is later
      acquired by a process outside that group.
      
      History:
      This patch was originally applied as 6aa97070 and reverted in
      dbf520a9 because NFS was freezing with locks held.  It was
      deemed better to keep the bad freeze point in NFS to allow laptops
      to suspend consistently.  The previous patch in this series converts
      NFS to call _unsafe versions of the freezable helpers so that
      lockdep doesn't complain about them until a more correct fix
      can be applied.
      
      [akpm@linux-foundation.org: export debug_check_no_locks_held]
      Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Colin Cross <ccross@android.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      0f9548ca
    • Colin Cross's avatar
      freezer: add unsafe versions of freezable helpers for CIFS · 5853cc2a
      Colin Cross authored
      CIFS calls wait_event_freezekillable with a VFS lock held,
      which is unsafe and will cause lockdep warnings when 6aa97070
      "lockdep: check that no locks held at freeze time" is reapplied
      (it was reverted in dbf520a9).  CIFS shouldn't be doing this, but
      it has long-running syscalls that must hold a lock but also
      shouldn't block suspend.  Until CIFS freeze handling is rewritten
      to use a signal to exit out of the critical section, add a new
      wait_event_freezekillable_unsafe helper that will not run the
      lockdep test when 6aa97070 is reapplied, and call it from CIFS.
      
      In practice the likely result of holding the lock while freezing
      is that a second task blocked on the lock will never freeze,
      aborting suspend, but it is possible to manufacture a case using
      the cgroup freezer, the lock, and the suspend freezer to create
      a deadlock.  Silencing the lockdep warning here will allow
      problems to be found in other drivers that may have a more
      serious deadlock risk, and prevent new problems from being added.
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Colin Cross <ccross@android.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      5853cc2a
    • Colin Cross's avatar
      freezer: add unsafe versions of freezable helpers for NFS · 416ad3c9
      Colin Cross authored
      NFS calls the freezable helpers with locks held, which is unsafe
      and will cause lockdep warnings when 6aa97070 "lockdep: check
      that no locks held at freeze time" is reapplied (it was reverted
      in dbf520a9).  NFS shouldn't be doing this, but it has
      long-running syscalls that must hold a lock but also shouldn't
      block suspend.  Until NFS freeze handling is rewritten to use a
      signal to exit out of the critical section, add new *_unsafe
      versions of the helpers that will not run the lockdep test when
      6aa97070 is reapplied, and call them from NFS.
      
      In practice the likely result of holding the lock while freezing
      is that a second task blocked on the lock will never freeze,
      aborting suspend, but it is possible to manufacture a case using
      the cgroup freezer, the lock, and the suspend freezer to create
      a deadlock.  Silencing the lockdep warning here will allow
      problems to be found in other drivers that may have a more
      serious deadlock risk, and prevent new problems from being added.
      Signed-off-by: Colin Cross <ccross@android.com>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      416ad3c9
  4. 31 Mar, 2013 1 commit
    • Paul Walmsley's avatar
      Revert "lockdep: check that no locks held at freeze time" · dbf520a9
      Paul Walmsley authored
      This reverts commit 6aa97070.
      
      Commit 6aa97070 ("lockdep: check that no locks held at freeze time")
      causes problems with NFS root filesystems.  The failures were noticed on
      OMAP2 and 3 boards during kernel init:
      
        [ BUG: swapper/0/1 still has locks held! ]
        3.9.0-rc3-00344-ga937536b #1 Not tainted
        -------------------------------------
        1 lock held by swapper/0/1:
         #0:  (&type->s_umount_key#13/1){+.+.+.}, at: [<c011e84c>] sget+0x248/0x574
      
        stack backtrace:
          rpc_wait_bit_killable
          __wait_on_bit
          out_of_line_wait_on_bit
          __rpc_execute
          rpc_run_task
          rpc_call_sync
          nfs_proc_get_root
          nfs_get_root
          nfs_fs_mount_common
          nfs_try_mount
          nfs_fs_mount
          mount_fs
          vfs_kern_mount
          do_mount
          sys_mount
          do_mount_root
          mount_root
          prepare_namespace
          kernel_init_freeable
          kernel_init
      
      Although the rootfs mounts, the system is unstable.  Here's a transcript
      from a PM test:
      
        http://www.pwsan.com/omap/testlogs/test_v3.9-rc3/20130317194234/pm/37xxevm/37xxevm_log.txt
      
      Here's what the test log should look like:
      
        http://www.pwsan.com/omap/testlogs/test_v3.8/20130218214403/pm/37xxevm/37xxevm_log.txt
      
      Mailing list discussion is here:
      
        http://lkml.org/lkml/2013/3/4/221
      
      Deal with this for v3.9 by reverting the problem commit, until folks can
      figure out the right long-term course of action.
      Signed-off-by: Paul Walmsley <paul@pwsan.com>
      Cc: Mandeep Singh Baines <msb@chromium.org>
      Cc: Jeff Layton <jlayton@redhat.com>
      Cc: Shawn Guo <shawn.guo@linaro.org>
      Cc: <maciej.rutecki@gmail.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ben Chan <benchan@chromium.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dbf520a9
  5. 27 Feb, 2013 1 commit
  6. 09 Feb, 2013 1 commit
    • Li Fei's avatar
      suspend: enable freeze timeout configuration through sys · 957d1282
      Li Fei authored
      At present, the freezing timeout is fixed at 20s.  That is pointlessly
      long when one thread has been frozen while holding a mutex and another
      thread is trying to take that mutex, because this freezing attempt is
      bound to fail.  If no new wakeup event is registered in the meantime,
      the system wastes up to 20s on the doomed attempt.
      
      With this patch, the timeout can be configured to a smaller value, so
      such a doomed attempt is aborted sooner and the next freezing attempt
      can be triggered sooner, which saves power.
      In the normal case on a mobile phone, freezing processes takes very
      little time.  On some platforms, it takes only about 20ms to freeze
      user space processes and 10ms to freeze freezable kernel threads.
      Signed-off-by: Liu Chuansheng <chuansheng.liu@intel.com>
      Signed-off-by: Li Fei <fei.li@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      957d1282
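Commit 957d1282 does not name the file it adds. The sketch below assumes it is `/sys/power/pm_freeze_timeout` with a value in milliseconds, as described in the kernel's sysfs power ABI documentation; treat both the path and the unit as assumptions to verify on the target kernel. The helper names are made up for illustration, and writing the real file normally requires root.

```python
from pathlib import Path

# Assumed location and unit (milliseconds); not stated in the commit message.
PM_FREEZE_TIMEOUT = Path("/sys/power/pm_freeze_timeout")


def read_freeze_timeout_ms(path: Path = PM_FREEZE_TIMEOUT) -> int:
    """Return the configured freezing timeout in milliseconds."""
    return int(path.read_text().strip())


def write_freeze_timeout_ms(ms: int, path: Path = PM_FREEZE_TIMEOUT) -> None:
    """Set the freezing timeout; rejects negative values before touching the file."""
    if ms < 0:
        raise ValueError("timeout must be non-negative")
    path.write_text(f"{ms}\n")


try:
    if PM_FREEZE_TIMEOUT.exists():
        print("current freeze timeout (ms):", read_freeze_timeout_ms())
except OSError as exc:
    # Not every environment exposes or permits reading the file.
    print("could not read freeze timeout:", exc)
```

Both helpers take the path as a parameter so they can be exercised against an ordinary file before being pointed at sysfs.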
  7. 23 Nov, 2012 1 commit
  8. 26 Oct, 2012 1 commit
    • Oleg Nesterov's avatar
      freezer: change ptrace_stop/do_signal_stop to use freezable_schedule() · 5d8f72b5
      Oleg Nesterov authored
      try_to_freeze_tasks() and cgroup_freezer rely on scheduler locks
      to ensure that a task doing STOPPED/TRACED -> RUNNING transition
      can't escape freezing. This mostly works, but ptrace_stop() does
      not necessarily call schedule(), it can change task->state back to
      RUNNING and check freezing() without any lock/barrier in between.
      
      We could add the necessary barrier, but this patch changes
      ptrace_stop() and do_signal_stop() to use freezable_schedule().
      This fixes the race: freezer_count() and freezer_should_skip()
      carefully avoid it.
      
      And this simplifies the code, try_to_freeze_tasks/update_if_frozen
      no longer need to use task_is_stopped_or_traced() checks with the
      non trivial assumptions. We can rely on the mechanism which was
      specially designed to mark the sleeping task as "frozen enough".
      
      v2: As Tejun pointed out, we can also change get_signal_to_deliver()
      and move try_to_freeze() up before 'relock' label.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      5d8f72b5
  9. 16 Oct, 2012 1 commit
    • Tejun Heo's avatar
      freezer: add missing mb's to freezer_count() and freezer_should_skip() · dd67d32d
      Tejun Heo authored
      A task is considered frozen enough between freezer_do_not_count() and
      freezer_count() and freezers use freezer_should_skip() to test this
      condition.  This supposedly works because freezer_count() always calls
      try_to_freeze() after clearing %PF_FREEZER_SKIP.
      
      However, there currently is nothing which guarantees that
      freezer_count() sees %true freezing() after clearing %PF_FREEZER_SKIP
      when freezing is in progress, and vice-versa.  A task can escape the
      freezing condition in effect by freezer_count() seeing !freezing() and
      freezer_should_skip() seeing %PF_FREEZER_SKIP.
      
      This patch adds smp_mb()'s to freezer_count() and
      freezer_should_skip() such that either %true freezing() is visible to
      freezer_count() or !PF_FREEZER_SKIP is visible to
      freezer_should_skip().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: stable@vger.kernel.org
      dd67d32d
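The race fixed by dd67d32d has the classic "store, then load the other side's variable" shape. The sketch below enumerates every interleaving of the four memory operations in a deliberately simplified model. It assumes that smp_mb() can be modelled as "each side's store happens before its load", and that a missing barrier can be modelled as letting those two operations swap. Real hardware memory models are more subtle, so this is an illustration of the argument, not a proof about any CPU.

```python
from itertools import permutations

OPS = ["T_clear_skip", "T_read_freezing", "F_set_freezing", "F_read_skip"]


def outcomes(allow_store_load_reordering: bool) -> set:
    """Return every reachable (task_saw_freezing, freezer_saw_skip) pair."""
    results = set()
    for order in permutations(OPS):
        if not allow_store_load_reordering:
            # With a full barrier between them, each side's store precedes its load.
            if order.index("T_clear_skip") > order.index("T_read_freezing"):
                continue
            if order.index("F_set_freezing") > order.index("F_read_skip"):
                continue
        skip, freezing = True, False  # task starts inside the do-not-count section
        task_saw_freezing = freezer_saw_skip = None
        for op in order:
            if op == "T_clear_skip":
                skip = False
            elif op == "T_read_freezing":
                task_saw_freezing = freezing
            elif op == "F_set_freezing":
                freezing = True
            else:  # F_read_skip
                freezer_saw_skip = skip
        results.add((task_saw_freezing, freezer_saw_skip))
    return results


# The task sees !freezing() while the freezer still sees PF_FREEZER_SKIP.
ESCAPE = (False, True)

print("escape possible with barriers:   ", ESCAPE in outcomes(False))
print("escape possible without barriers:", ESCAPE in outcomes(True))
```

With program order enforced, the escaping outcome would need each load to precede the other side's store, which forms a cycle, so it never appears. Once the store and load on each side may swap, it becomes reachable.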
  10. 29 Jan, 2012 1 commit
    • Rafael J. Wysocki's avatar
      PM / Hibernate: Fix s2disk regression related to freezing workqueues · 181e9bde
      Rafael J. Wysocki authored
      Commit 2aede851
      
        PM / Hibernate: Freeze kernel threads after preallocating memory
      
      introduced a mechanism by which kernel threads were frozen after
      the preallocation of hibernate image memory to avoid problems with
      frozen kernel threads not responding to memory freeing requests.
      However, it overlooked the s2disk code path in which the
      SNAPSHOT_CREATE_IMAGE ioctl was run directly after SNAPSHOT_FREE,
      which caused freeze_workqueues_begin() to BUG(), because it saw
      that workqueues had been already frozen.
      
      Although in principle this issue might be addressed by removing
      the relevant BUG_ON() from freeze_workqueues_begin(), that would
      reintroduce the very problem that commit 2aede851
      attempted to avoid into that particular code path.  For this reason,
      to fix the issue at hand, introduce thaw_kernel_threads() and make
      the SNAPSHOT_FREE ioctl execute it.
      
      Special thanks to Srivatsa S. Bhat for detailed analysis of the
      problem.
      Reported-and-tested-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: stable@kernel.org
      181e9bde
  11. 26 Dec, 2011 1 commit
  12. 08 Dec, 2011 1 commit
  13. 06 Dec, 2011 1 commit
  14. 24 Nov, 2011 1 commit
  15. 23 Nov, 2011 2 commits
    • Oleg Nesterov's avatar
      freezer: fix wait_event_freezable/__thaw_task races · 24b7ead3
      Oleg Nesterov authored
      wait_event_freezable() and friends stop the waiting if try_to_freeze()
      fails. This is not right, we can race with __thaw_task() and in this
      case
      
      	- wait_event_freezable() returns the wrong ERESTARTSYS
      
      	- wait_event_freezable_timeout() can return the positive
      	  value while condition == F
      
      Change the code to always check __retval/condition before return.
      
      Note: with or without this patch the timeout logic looks strange,
      probably we should recalc timeout if try_to_freeze() returns T.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      24b7ead3
    • Tejun Heo's avatar
      freezer: kill unused set_freezable_with_signal() · 34b087e4
      Tejun Heo authored
      There's no in-kernel user of set_freezable_with_signal() left.  Mixing
      TIF_SIGPENDING with kernel threads can lead to nasty corner cases as
      kernel threads never travel signal delivery path on their own.
      
      e.g. the current implementation is buggy in the cancelation path of
      __thaw_task().  It calls recalc_sigpending_and_wake() in an attempt to
      clear TIF_SIGPENDING but the function never clears it regardless of
      sigpending state.  This means that signallable freezable kthreads may
      continue executing with !freezing() && stuck TIF_SIGPENDING, which can
      be troublesome.
      
      This patch removes set_freezable_with_signal() along with
      PF_FREEZER_NOSIG and recalc_sigpending*() calls in freezer.  User
      tasks get TIF_SIGPENDING, kernel tasks get woken up and the spurious
      sigpending is dealt with in the usual signal delivery path.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      34b087e4
  16. 21 Nov, 2011 9 commits
    • Tejun Heo's avatar
      freezer: remove unused @sig_only from freeze_task() · 839e3407
      Tejun Heo authored
      After "freezer: make freezing() test freeze conditions in effect
      instead of TIF_FREEZE", freezing() returns authoritative answer on
      whether the current task should freeze or not and freeze_task()
      doesn't need or use @sig_only.  Remove it.
      
      While at it, rewrite function comment for freeze_task() and rename
      @sig_only to @user_only in try_to_freeze_tasks().
      
      This patch doesn't cause any functional change.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      839e3407
    • Tejun Heo's avatar
      freezer: fix set_freezable[_with_signal]() race · 96ee6d85
      Tejun Heo authored
      A kthread doing set_freezable*() may race with on-going PM freeze and
      the freezer might think all tasks are frozen while the new freezable
      kthread is merrily proceeding to execute code paths which aren't
      supposed to be executing during PM freeze.
      
      Reimplement set_freezable[_with_signal]() using __set_freezable() such
      that freezable PF flags are modified under freezer_lock and
      try_to_freeze() is called afterwards.  This eliminates race condition
      against freezing.
      
      Note: Separated out from larger patch to resolve fix order dependency
            Oleg pointed out.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      96ee6d85
    • Tejun Heo's avatar
      freezer: remove should_send_signal() and update frozen() · 948246f7
      Tejun Heo authored
      should_send_signal() is only used in freezer.c.  Exporting them only
      increases chance of abuse.  Open code the two users and remove it.
      
      Update frozen() to return bool.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      948246f7
    • Tejun Heo's avatar
      freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZE · a3201227
      Tejun Heo authored
      Using TIF_FREEZE for freezing worked when there was only single
      freezing condition (the PM one); however, now there is also the
      cgroup_freezer and single bit flag is getting clumsy.
      thaw_processes() is already testing whether cgroup freezing is in
      effect to avoid thawing tasks which were frozen by both PM and cgroup
      freezers.
      
      This is racy (nothing prevents race against cgroup freezing) and
      fragile.  A much simpler way is to test actual freeze conditions from
      freezing() - ie. directly test whether PM or cgroup freezing is in
      effect.
      
      This patch adds variables to indicate whether and what type of
      freezing conditions are in effect and reimplements freezing() such
      that it directly tests whether any of the two freezing conditions is
      active and the task should freeze.  On fast path, freezing() is still
      very cheap - it only tests system_freezing_cnt.
      
      This makes the clumsy dancing around TIF_FREEZE unnecessary and
      freeze/thaw operations more usual - updating state variables for the
      new state and nudging target tasks so that they notice the new state
      and comply.  As long as the nudging happens after state update, it's
      race-free.
      
      * This allows use of freezing() in freeze_task().  Replace the open
        coded tests with freezing().
      
      * p != current test is added to warning printing conditions in
        try_to_freeze_tasks() failure path.  This is necessary as freezing()
        is now true for the task which initiated freezing too.
      
      -v2: Oleg pointed out that re-freezing FROZEN cgroup could increment
           system_freezing_cnt.  Fixed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Paul Menage <paul@paulmenage.org>  (for the cgroup portions)
      a3201227
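The scheme described in a3201227 can be sketched as a small state model. Only `system_freezing_cnt` and `freezing()` are names taken from the commit message. The `pm_freezing` flag, the `freezing_cgroups` set and the `cgroup_freeze`/`cgroup_thaw` helpers are hypothetical stand-ins for "variables to indicate whether and what type of freezing conditions are in effect".

```python
from dataclasses import dataclass, field


@dataclass
class FreezeState:
    system_freezing_cnt: int = 0  # counter named in the commit message
    pm_freezing: bool = False     # hypothetical flag for the PM freezer
    freezing_cgroups: set = field(default_factory=set)  # hypothetical


def freezing(state: FreezeState, task_cgroup: str) -> bool:
    """Should a task in `task_cgroup` freeze right now?"""
    if state.system_freezing_cnt == 0:  # fast path: no freezer is active
        return False
    return state.pm_freezing or task_cgroup in state.freezing_cgroups


def cgroup_freeze(state: FreezeState, cgroup: str) -> None:
    """Update the state first (nudging tasks would follow); idempotent per cgroup."""
    if cgroup not in state.freezing_cgroups:
        state.freezing_cgroups.add(cgroup)
        state.system_freezing_cnt += 1


def cgroup_thaw(state: FreezeState, cgroup: str) -> None:
    """Drop the cgroup's freezing condition if it is currently set."""
    if cgroup in state.freezing_cgroups:
        state.freezing_cgroups.remove(cgroup)
        state.system_freezing_cnt -= 1


state = FreezeState()
cgroup_freeze(state, "batch")
cgroup_freeze(state, "batch")  # re-freezing must not bump the counter again
print(state.system_freezing_cnt, freezing(state, "batch"), freezing(state, "ui"))
```

The second `cgroup_freeze()` call mirrors the -v2 note: re-freezing an already frozen cgroup must leave the counter unchanged, otherwise a later thaw would leave the fast path permanently disabled.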
    • Tejun Heo's avatar
      cgroup_freezer: prepare for removal of TIF_FREEZE · 22b4e111
      Tejun Heo authored
      TIF_FREEZE will be removed soon and freezing() will directly test
      whether any freezing condition is in effect.  Make the following
      changes in preparation.
      
      * Rename cgroup_freezing_or_frozen() to cgroup_freezing() and make it
        return bool.
      
      * Make cgroup_freezing() access task_freezer() under rcu read lock
        instead of task_lock().  This makes the state dereferencing racy
        against task moving to another cgroup; however, it was already racy
        without this change as ->state dereference wasn't synchronized.
        This will be later dealt with using attach hooks.
      
      * freezer->state is now set before trying to push tasks into the
        target state.
      
      -v2: Oleg pointed out that freeze_change_state() was setting
           freezer->state incorrectly to CGROUP_FROZEN instead of
           CGROUP_FREEZING.  Fixed.
      
      -v3: Matt pointed out that setting CGROUP_FROZEN used to always invoke
           try_to_freeze_cgroup() regardless of the current state.  Patch
           updated such that the actual freeze/thaw operations are always
           performed on invocation.  This shouldn't make any difference
           unless something is broken.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Paul Menage <paul@paulmenage.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      22b4e111
    • Tejun Heo's avatar
      freezer: clean up freeze_processes() failure path · 03afed8b
      Tejun Heo authored
      freeze_processes() failure path is rather messy.  Freezing is canceled
      for workqueues and tasks which aren't frozen yet but frozen tasks are
      left alone and should be thawed by the caller and of course some
      callers (xen and kexec) didn't do it.
      
      This patch updates __thaw_task() to handle cancelation correctly and
      makes freeze_processes() and freeze_kernel_threads() call
      thaw_processes() on failure instead so that the system is fully thawed
      on failure.  Unnecessary [suspend_]thaw_processes() calls are removed
      from kernel/power/hibernate.c, suspend.c and user.c.
      
      While at it, restructure error checking if clause in suspend_prepare()
      to be less weird.
      
      -v2: Srivatsa spotted missing removal of suspend_thaw_processes() in
           suspend_prepare() and error in commit message.  Updated.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      03afed8b
    • Tejun Heo's avatar
      freezer: rename thaw_process() to __thaw_task() and simplify the implementation · a5be2d0d
      Tejun Heo authored
      thaw_process() now has only internal users - system and cgroup
      freezers.  Remove the unnecessary return value, rename, unexport and
      collapse __thaw_process() into it.  This will help further updates to
      the freezer code.
      
      -v3: oom_kill grew a use of thaw_process() while this patch was
           pending.  Convert it to use __thaw_task() for now.  In the longer
           term, this should be handled by allowing tasks to die if killed
           even if it's frozen.
      
      -v2: minor style update as suggested by Matt.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Paul Menage <menage@google.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      a5be2d0d
    • Tejun Heo's avatar
      freezer: implement and use kthread_freezable_should_stop() · 8a32c441
      Tejun Heo authored
      Writeback and thinkpad_acpi have been using thaw_process() to prevent
      deadlock between the freezer and kthread_stop(); unfortunately, this
      is inherently racy - nothing prevents freezing from happening between
      thaw_process() and kthread_stop().
      
      This patch implements kthread_freezable_should_stop() which enters
      refrigerator if necessary but is guaranteed to return if
      kthread_stop() is invoked.  Both thaw_process() users are converted to
      use the new function.
      
      Note that this deadlock condition exists for many of freezable
      kthreads.  They need to be converted to use the new should_stop or
      freezable workqueue.
      
      Tested with synthetic test case.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Henrique de Moraes Holschuh <ibm-acpi@hmh.eng.br>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      8a32c441
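The guarantee described in 8a32c441 (enter the refrigerator if necessary, but always return once a stop is requested) can be modelled in user space with a condition variable. This is a rough analogy using Python threads, not the kernel implementation. The `FreezableWorker` class and its method names are invented for the example.

```python
import threading
import time


class FreezableWorker:
    """Toy kernel thread whose main loop polls freezable_should_stop()."""

    def __init__(self):
        self._cond = threading.Condition()
        self._freezing = False
        self._stop_requested = False
        self.iterations = 0
        self._thread = threading.Thread(target=self._run, daemon=True)

    def freezable_should_stop(self) -> bool:
        with self._cond:
            # Sit in the "refrigerator" while freezing is in effect,
            # but wake up as soon as a stop is requested.
            self._cond.wait_for(lambda: self._stop_requested or not self._freezing)
            return self._stop_requested

    def _run(self):
        while not self.freezable_should_stop():
            self.iterations += 1
            time.sleep(0.001)  # stand-in for real work

    def set_freezing(self, freezing: bool) -> None:
        with self._cond:
            self._freezing = freezing
            self._cond.notify_all()

    def start(self) -> None:
        self._thread.start()

    def stop(self, timeout: float = 2.0) -> bool:
        """Request a stop and wait; returns True if the worker exited in time."""
        with self._cond:
            self._stop_requested = True
            self._cond.notify_all()
        self._thread.join(timeout)
        return not self._thread.is_alive()


worker = FreezableWorker()
worker.set_freezing(True)  # freezing is in effect before the worker runs
worker.start()
print("worker exited after stop request:", worker.stop())
```

Because the freeze check and the stop check share one lock and one wait, no window exists in which the worker can freeze after the stop request has been made. That window is the race the commit attributes to the old thaw_process() plus kthread_stop() pattern.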
    • Tejun Heo's avatar
      freezer: unexport refrigerator() and update try_to_freeze() slightly · a0acae0e
      Tejun Heo authored
      There is no reason to export two functions for entering the
      refrigerator.  Calling refrigerator() instead of try_to_freeze()
      doesn't save anything noticeable or remove any race condition.
      
      * Rename refrigerator() to __refrigerator() and make it return bool
        indicating whether it scheduled out for freezing.
      
      * Update try_to_freeze() to return bool and relay the return value of
        __refrigerator() if freezing().
      
      * Convert all refrigerator() users to try_to_freeze().
      
      * Update documentation accordingly.
      
      * While at it, add might_sleep() to try_to_freeze().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Samuel Ortiz <samuel@sortiz.org>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Christoph Hellwig <hch@infradead.org>
      a0acae0e
  17. 04 Nov, 2011 1 commit
    • Oleg Nesterov's avatar
      PM / Freezer: Reimplement wait_event_freezekillable using freezer_do_not_count/freezer_count · 6f35c4ab
      Oleg Nesterov authored
      Commit 27920651 "PM / Freezer: Make fake_signal_wake_up() wake
      TASK_KILLABLE tasks too" updated fake_signal_wake_up() used by freezer
      to wake up KILLABLE tasks.  Sending unsolicited wakeups to tasks in
      killable sleep is dangerous as there are code paths which depend on
      tasks not waking up spuriously from KILLABLE sleep.
      
      For example, sys_read() or page fault can sleep in TASK_KILLABLE assuming
      that wait/down/whatever _killable can only fail if we can not return
      to the usermode.  TASK_TRACED is another obvious example.
      
      The offending commit was to resolve freezer hang during system PM
      operations caused by KILLABLE sleeps in network filesystems.
      wait_event_freezekillable(), which depends on the spurious KILLABLE
      wakeup, was added by f06ac72e "cifs, freezer: add
      wait_event_freezekillable and have cifs use it" to be used to
      implement killable & freezable sleeps in network filesystems.
      
      To prepare for reverting 27920651, this patch reimplements
      wait_event_freezekillable() using freezer_do_not_count/freezer_count()
      so that it doesn't depend on the spurious KILLABLE wakeup.  This isn't
      very nice but should do for now.
      
      [tj: Refreshed patch to apply to linus/master and updated commit
          description on Rafael's request.]
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      6f35c4ab
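The bracketing idea described in the commit above can be sketched as a userspace model. It assumes that freezer_do_not_count() tells the freezer to skip the task while it sleeps, and that freezer_count() re-enables counting and then freezes the task if a freeze is pending. Every name below (`struct sleeper_model`, the `model_*` functions) is hypothetical and exists only for illustration; the real macro lives in the kernel and is not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical model of a sleeping task as the freezer would see it. */
struct sleeper_model {
    bool freezer_skip;     /* set while the freezer should not count us */
    bool freeze_requested; /* a freeze request is pending */
    bool blocked_freezer;  /* did the freezer have to wait for us? */
    int  times_frozen;     /* simulated freezes */
};

static void model_freezer_do_not_count(struct sleeper_model *t)
{
    t->freezer_skip = true;
}

static void model_freezer_count(struct sleeper_model *t)
{
    t->freezer_skip = false;
    if (t->freeze_requested) { /* honor a request that arrived while asleep */
        t->times_frozen++;
        t->freeze_requested = false;
    }
}

/* True if the freezer would still have to wait for this task. */
static bool model_freezer_must_wait(const struct sleeper_model *t)
{
    return t->freeze_requested && !t->freezer_skip;
}

typedef int (*model_wait_fn)(struct sleeper_model *t);

/* Sample "killable sleep" during which a freeze request arrives. */
static int model_sleep_during_freeze(struct sleeper_model *t)
{
    t->freeze_requested = true;
    t->blocked_freezer = model_freezer_must_wait(t);
    return 0; /* the awaited condition became true */
}

/* Killable wait bracketed so that no spurious wakeup is needed. */
static int model_wait_event_freezekillable(struct sleeper_model *t,
                                           model_wait_fn wait)
{
    int ret;

    model_freezer_do_not_count(t);
    ret = wait(t);
    model_freezer_count(t);
    return ret;
}
```

In this model the sleeper is never woken by the freezer: the freeze request is simply not counted against it during the sleep and is acted on once the wait returns.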
  18. 27 Oct, 2011 1 commit
  19. 19 Oct, 2011 2 commits
  20. 16 Oct, 2011 1 commit
    • Rafael J. Wysocki's avatar
      PM / Hibernate: Freeze kernel threads after preallocating memory · 2aede851
      Rafael J. Wysocki authored
      There is a problem with the current ordering of hibernate code which
      leads to deadlocks in some filesystems' memory shrinkers.  Namely,
      some filesystems use freezable kernel threads that are inactive when
      the hibernate memory preallocation is carried out.  Those same
      filesystems use memory shrinkers that may be triggered by the
      hibernate memory preallocation.  If those memory shrinkers wait for
      the frozen kernel threads, the hibernate process deadlocks (this
      happens with XFS, for one example).
      
      Apparently, it is not technically viable to redesign the filesystems
      in question to avoid the situation described above, so the only
      possible solution to this issue is to defer the freezing of kernel
      threads until the hibernate memory preallocation is done, which is
      what this change implements.
      
      Unfortunately, this requires the memory preallocation to be done
      before the "prepare" stage of device freeze, so after this change the
      only way drivers can allocate additional memory for their freeze
      routines in a clean way is to use PM notifiers.
      Reported-by: Christoph <cr2005@u-club.de>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      2aede851
  21. 16 Feb, 2011 1 commit
  22. 26 Mar, 2010 1 commit
    • Matt Helsley's avatar
      Freezer: Fix buggy resume test for tasks frozen with cgroup freezer · 5a7aadfe
      Matt Helsley authored
      When the cgroup freezer is used to freeze tasks we do not want to thaw
      those tasks during resume. Currently we test the cgroup freezer
      state of the resuming tasks to see if the cgroup is FROZEN.  If so
      then we don't thaw the task. However, the FREEZING state also indicates
      that the task should remain frozen.
      
      This also avoids a problem pointed out by Oren Ladaan: the freezer state
      transition from FREEZING to FROZEN is updated lazily when userspace reads
      or writes the freezer.state file in the cgroup filesystem. This means that
      resume will thaw tasks in cgroups which should be in the FROZEN state if
      there is no read/write of the freezer.state file to trigger this
      transition before suspend.
      
      NOTE: Another "simple" solution would be to always update the cgroup
      freezer state during resume. However it's a bad choice for several reasons:
      Updating the cgroup freezer state is somewhat expensive because it requires
      walking all the tasks in the cgroup and checking if they are each frozen.
      Worse, this could easily make resume run in N^2 time where N is the number
      of tasks in the cgroup. Finally, updating the freezer state from this code
      path requires trickier locking because of the way locks must be ordered.
      
      Instead of updating the freezer state we rely on the fact that lazy
      updates only manage the transition from FREEZING to FROZEN. We know that
      a cgroup with the FREEZING state may actually be FROZEN so test for that
      state too. This makes sense in the resume path even for partially-frozen
      cgroups -- those that really are FREEZING but not FROZEN.
      Reported-by: Oren Ladaan <orenl@cs.columbia.edu>
      Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
      Cc: stable@kernel.org
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      5a7aadfe
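The difference between the old and the new resume test in the commit above can be sketched as two predicates. The enum and function names are hypothetical stand-ins chosen for illustration; they are not the kernel's identifiers.

```c
#include <stdbool.h>

/* Hypothetical model of the cgroup freezer states named in the commit. */
enum model_resume_state { MODEL_THAWED, MODEL_FREEZING, MODEL_FROZEN };

/* Old test: only a cgroup already marked FROZEN keeps its tasks frozen. */
static bool model_buggy_skip_thaw(enum model_resume_state s)
{
    return s == MODEL_FROZEN;
}

/*
 * Fixed test: FREEZING may really mean FROZEN because the transition is
 * updated lazily, so both states keep the task frozen across resume.
 */
static bool model_fixed_skip_thaw(enum model_resume_state s)
{
    return s == MODEL_FREEZING || s == MODEL_FROZEN;
}
```

The two predicates differ only for FREEZING, which is exactly the case where the old test would thaw tasks that should have stayed frozen.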
  23. 30 Oct, 2008 1 commit
  24. 20 Oct, 2008 1 commit
    • Matt Helsley's avatar
      container freezer: implement freezer cgroup subsystem · dc52ddc0
      Matt Helsley authored
      This patch implements a new freezer subsystem in the control groups
      framework.  It provides a way to stop and resume execution of all tasks in
      a cgroup by writing in the cgroup filesystem.
      
      The freezer subsystem in the container filesystem defines a file named
      freezer.state.  Writing "FROZEN" to the state file will freeze all tasks
      in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
      the cgroup.  Reading will return the current state.
      
      * Examples of usage:
      
         # mkdir /containers
         # mount -t cgroup -ofreezer freezer /containers
         # mkdir /containers/0
         # echo $some_pid > /containers/0/tasks
      
      To get the status of the freezer subsystem:
      
         # cat /containers/0/freezer.state
         RUNNING
      
      To freeze all tasks in the container:
      
         # echo FROZEN > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         FREEZING
         # cat /containers/0/freezer.state
         FROZEN
      
      To unfreeze all tasks in the container:
      
         # echo RUNNING > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         RUNNING
      
      This is the basic mechanism which should do the right thing for user space
      tasks in a simple scenario.
      
      It's important to note that freezing can be incomplete.  In that case we
      return EBUSY.  This means that some tasks in the cgroup are busy doing
      something that prevents us from completely freezing the cgroup at this
      time.  After EBUSY, the cgroup will remain partially frozen -- reflected
      by freezer.state reporting "FREEZING" when read.  The state will remain
      "FREEZING" until one of these things happens:
      
      	1) Userspace cancels the freezing operation by writing "RUNNING" to
      		the freezer.state file
      	2) Userspace retries the freezing operation by writing "FROZEN" to
      		the freezer.state file (writing "FREEZING" is not legal
      		and returns EIO)
      	3) The tasks that blocked the cgroup from entering the "FROZEN"
      		state disappear from the cgroup's set of tasks.
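
The three exits from "FREEZING" listed above can be read as a small state machine. The following userspace model is a sketch of that reading, not the subsystem's code: `struct model_cgroup`, its `unfreezable_tasks` counter and both functions are hypothetical, and the update on read is an assumption about when the state is re-evaluated.

```c
#include <errno.h>
#include <string.h>

enum model_freezer_state { MODEL_RUNNING, MODEL_FREEZING, MODEL_FROZEN };

/* Hypothetical model of one cgroup as seen through freezer.state. */
struct model_cgroup {
    enum model_freezer_state state;
    int unfreezable_tasks; /* tasks currently too busy to be frozen */
};

/* Models a write to freezer.state; returns 0 or a negative errno. */
static int model_write_state(struct model_cgroup *cg, const char *buf)
{
    if (strcmp(buf, "RUNNING") == 0) {      /* case 1: cancel the freeze */
        cg->state = MODEL_RUNNING;
        return 0;
    }
    if (strcmp(buf, "FROZEN") == 0) {       /* case 2: (re)try the freeze */
        if (cg->unfreezable_tasks > 0) {
            cg->state = MODEL_FREEZING;     /* partially frozen */
            return -EBUSY;
        }
        cg->state = MODEL_FROZEN;
        return 0;
    }
    return -EIO;                            /* "FREEZING" is not writable */
}

/* Models a read of freezer.state. */
static const char *model_read_state(struct model_cgroup *cg)
{
    /* case 3: the blocking tasks are gone, so the freeze is complete */
    if (cg->state == MODEL_FREEZING && cg->unfreezable_tasks == 0)
        cg->state = MODEL_FROZEN;

    switch (cg->state) {
    case MODEL_FREEZING:
        return "FREEZING";
    case MODEL_FROZEN:
        return "FROZEN";
    default:
        return "RUNNING";
    }
}
```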
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: export thaw_process]
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
      Acked-by: Serge E. Hallyn <serue@us.ibm.com>
      Tested-by: Matt Helsley <matthltc@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dc52ddc0
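The shell session in the commit above could also be driven from a program. The helpers below are a minimal sketch that assumes only what the commit message states: freezer.state accepts "FROZEN" and "RUNNING", reads back the current state, and a freeze may fail with EBUSY and need a retry. The function names and the retry limit are hypothetical, and the path to freezer.state is supplied by the caller.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Write one keyword to a freezer.state file. Returns 0 or -errno. */
static int freezer_write_state(const char *state_path, const char *state)
{
    FILE *f = fopen(state_path, "w");
    int err = 0;

    if (!f)
        return -errno;
    if (fputs(state, f) == EOF)
        err = -errno;
    /* Buffered data is flushed here, so a rejected write shows up now. */
    if (fclose(f) == EOF && err == 0)
        err = -errno;
    return err;
}

/* Read the current state into buf without the trailing newline. */
static int freezer_read_state(const char *state_path, char *buf, size_t len)
{
    FILE *f = fopen(state_path, "r");

    if (!f)
        return -errno;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -EIO;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Ask for FROZEN, retrying while the cgroup reports a partial freeze. */
static int freezer_freeze(const char *state_path, int max_attempts)
{
    char state[32];

    for (int i = 0; i < max_attempts; i++) {
        int err = freezer_write_state(state_path, "FROZEN");

        if (err && err != -EBUSY)
            return err;
        err = freezer_read_state(state_path, state, sizeof(state));
        if (err)
            return err;
        if (strcmp(state, "FROZEN") == 0)
            return 0;
        /* Still "FREEZING": some task could not be frozen yet, retry. */
    }
    return -EBUSY;
}
```

With the layout from the example session, a caller would pass /containers/0/freezer.state as the path. The loop retries immediately; a real caller would probably sleep between attempts, or give up and write "RUNNING" to cancel the freeze.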