linux.git
ring-buffer: Convert reader_lock from raw_spin_lock into spin_lock
Steven Rostedt [Tue, 27 Sep 2011 17:56:50 +0000 (13:56 -0400)]
ring-buffer: Convert reader_lock from raw_spin_lock into spin_lock

The reader_lock is mostly taken in normal context with interrupts enabled.
But because ftrace_dump() can happen anywhere, it is used as a spin lock,
and in some cases a check of in_nmi() is performed to determine whether the
ftrace_dump() was initiated from an NMI; if it was, the lock is not taken.

But having the lock as a raw_spin_lock() causes issues with the real-time
kernel as the lock is held during allocation and freeing of the buffer.
As memory locks convert into mutexes on RT, keeping the reader_lock as a
raw_spin_lock causes problems.

Converting the reader_lock is not straightforward, as we must still deal
with ftrace_dump() happening not only from an NMI but also from
true interrupt context in PREEMPT_RT.

Two wrapper functions are created to take and release the reader lock:

  int read_buffer_lock(cpu_buffer, unsigned long *flags)
  void read_buffer_unlock(cpu_buffer, unsigned long flags, int locked)

The read_buffer_lock() disables interrupts, updates the flags, and
returns 1 if it actually took the lock. The only time it returns 0 is
when a ftrace_dump() happens in an unsafe context.

The read_buffer_unlock() checks the locked argument and only unlocks
the spin lock if it was successfully taken.

Instead of doing this only in the specific paths that an NMI might call
into, all instances of the reader_lock are converted to the wrapper
functions, which makes the code simpler to read and less error prone.
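
A minimal sketch of what the wrappers could look like (the guard
conditions and field names are assumptions, not the literal patch):

  static inline int
  read_buffer_lock(struct ring_buffer_per_cpu *cpu_buffer,
                   unsigned long *flags)
  {
          /* A ftrace_dump() from NMI context must not take a
           * sleeping lock: tell the caller we skipped it. */
          if (in_nmi())
                  return 0;

          spin_lock_irqsave(&cpu_buffer->reader_lock, *flags);
          return 1;
  }

  static inline void
  read_buffer_unlock(struct ring_buffer_per_cpu *cpu_buffer,
                     unsigned long flags, int locked)
  {
          if (locked)
                  spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
  }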

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Clark Williams <clark@redhat.com>
Link: http://lkml.kernel.org/r/1317146210.26514.33.camel@gandalf.stny.rr.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ftrace-crap.patch
Thomas Gleixner [Fri, 9 Sep 2011 14:55:53 +0000 (16:55 +0200)]
ftrace-crap.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched-clear-pf-thread-bound-on-fallback-rq.patch
Thomas Gleixner [Fri, 4 Nov 2011 19:48:36 +0000 (20:48 +0100)]
sched-clear-pf-thread-bound-on-fallback-rq.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched: Have migrate_disable ignore bounded threads
Peter Zijlstra [Tue, 27 Sep 2011 12:40:25 +0000 (08:40 -0400)]
sched: Have migrate_disable ignore bounded threads

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Clark Williams <williams@redhat.com>
Link: http://lkml.kernel.org/r/20110927124423.567944215@goodmis.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched: Do not compare cpu masks in scheduler
Peter Zijlstra [Tue, 27 Sep 2011 12:40:24 +0000 (08:40 -0400)]
sched: Do not compare cpu masks in scheduler

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Clark Williams <williams@redhat.com>
Link: http://lkml.kernel.org/r/20110927124423.128129033@goodmis.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched: Postpone actual migration disable to schedule
Steven Rostedt [Tue, 27 Sep 2011 12:40:23 +0000 (08:40 -0400)]
sched: Postpone actual migration disable to schedule

The migrate_disable() can cause a bit of overhead for the RT kernel, as
changing the affinity is expensive to do at every lock encountered.
As a running task can not migrate, the actual disabling of migration
does not need to occur until the task is about to schedule out.

In most cases, a task that disables migration will enable it before it
schedules out, making this change a tremendous performance improvement.
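
A hypothetical sketch of the deferral (the function and field names are
assumptions for illustration, not the actual patch):

  void migrate_disable(void)
  {
          /* Cheap fast path: only bump a counter, no affinity change. */
          current->migrate_disable++;
  }

  /* Called from schedule() just before the context switch. */
  static void update_migrate_disable(struct task_struct *p)
  {
          /* Pin the task only now that it really leaves the CPU. */
          if (p->migrate_disable)
                  cpumask_copy(&p->cpus_allowed,
                               cpumask_of(task_cpu(p)));
  }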

[ Frank Rowand: UP compile fix ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Clark Williams <williams@redhat.com>
Link: http://lkml.kernel.org/r/20110927124422.779693167@goodmis.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched: teach migrate_disable about atomic contexts
Peter Zijlstra [Fri, 2 Sep 2011 12:29:27 +0000 (14:29 +0200)]
sched: teach migrate_disable about atomic contexts

 <NMI>  [<ffffffff812dafd8>] spin_bug+0x94/0xa8
 [<ffffffff812db07f>] do_raw_spin_lock+0x43/0xea
 [<ffffffff814fa9be>] _raw_spin_lock_irqsave+0x6b/0x85
 [<ffffffff8106ff9e>] ? migrate_disable+0x75/0x12d
 [<ffffffff81078aaf>] ? pin_current_cpu+0x36/0xb0
 [<ffffffff8106ff9e>] migrate_disable+0x75/0x12d
 [<ffffffff81115b9d>] pagefault_disable+0xe/0x1f
 [<ffffffff81047027>] copy_from_user_nmi+0x74/0xe6
 [<ffffffff810489d7>] perf_callchain_user+0xf3/0x135

Now clearly we can't go around taking locks from NMI context; cure
this by short-circuiting migrate_disable() when we're in an atomic
context already.
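
A rough sketch of the short-circuit (the exact checks in the patch may
differ):

  void migrate_disable(void)
  {
          /* Taking task_rq_lock() from NMI/atomic context is fatal;
           * fall back to a bare counter increment. */
          if (in_atomic() || irqs_disabled()) {
                  current->migrate_disable++;
                  return;
          }
          /* ... normal path: pin the CPU under the proper locks ... */
  }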

Add some extra debugging to avoid things like:

  preempt_disable()
  migrate_disable();

  preempt_enable();
  migrate_enable();

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1314967297.1301.14.camel@twins
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/n/tip-wbot4vsmwhi8vmbf83hsclk6@git.kernel.org
sched, rt: Fix migrate_enable() thinko
Mike Galbraith [Tue, 23 Aug 2011 14:12:43 +0000 (16:12 +0200)]
sched, rt: Fix migrate_enable() thinko

Assigning mask = tsk_cpus_allowed(p) after p->migrate_disable = 0 ensures
that we won't see a mask change.. no push/pull, we stack tasks on one CPU.

Also add a couple fields to sched_debug for the next guy.

[ Build fix from Stratos Psomadakis <psomas@gentoo.org> ]

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1314108763.6689.4.camel@marge.simson.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched: Generic migrate_disable
Peter Zijlstra [Thu, 11 Aug 2011 13:14:58 +0000 (15:14 +0200)]
sched: Generic migrate_disable

Make migrate_disable() be a preempt_disable() for !rt kernels. This
allows generic code to use it but still enforces that these code
sections stay relatively small.

A preemptible migrate_disable() accessible for general use would allow
people to grow arbitrary per-cpu crap instead of cleaning these things
up.
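
A sketch of the mapping this describes (the !rt side follows directly
from the text above; the RT declarations are placeholders):

  #ifdef CONFIG_PREEMPT_RT_FULL
  /* real, preemptible implementation lives in the scheduler */
  extern void migrate_disable(void);
  extern void migrate_enable(void);
  #else
  /* !rt: collapse to preemption control, keeping sections small */
  # define migrate_disable()      preempt_disable()
  # define migrate_enable()       preempt_enable()
  #endif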

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-275i87sl8e1jcamtchmehonm@git.kernel.org
sched: Optimize migrate_disable
Peter Zijlstra [Thu, 11 Aug 2011 13:03:35 +0000 (15:03 +0200)]
sched: Optimize migrate_disable

Change from task_rq_lock() to raw_spin_lock(&rq->lock) to avoid a few
atomic ops. See comment on why it should be safe.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-cbz6hkl5r5mvwtx5s3tor2y6@git.kernel.org
migrate-disable-rt-variant.patch
Thomas Gleixner [Sun, 17 Jul 2011 17:48:20 +0000 (19:48 +0200)]
migrate-disable-rt-variant.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
tracing: Show padding as unsigned short
Steven Rostedt [Wed, 16 Nov 2011 18:19:35 +0000 (13:19 -0500)]
tracing: Show padding as unsigned short

RT added a two-byte migrate-disable counter to the trace events and
used two bytes of the existing padding to make the change. The
structures and all were updated correctly, but the display in the event
formats was not:

cat /debug/tracing/events/sched/sched_switch/format

name: sched_switch
ID: 51
format:
	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;
	field:unsigned short common_migrate_disable;	offset:8;	size:2;	signed:0;
	field:int common_padding;	offset:10;	size:2;	signed:0;

The field for common_padding has the correct size and offset, but the
use of "int" might confuse some parsers (and people that are reading
it). This needs to be changed to "unsigned short".

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1321467575.4181.36.camel@frodo
Cc: stable-rt@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ftrace-migrate-disable-tracing.patch
Thomas Gleixner [Sun, 17 Jul 2011 19:56:42 +0000 (21:56 +0200)]
ftrace-migrate-disable-tracing.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
hotplug: Call cpu_unplug_begin() before DOWN_PREPARE
Yong Zhang [Sun, 16 Oct 2011 10:56:44 +0000 (18:56 +0800)]
hotplug: Call cpu_unplug_begin() before DOWN_PREPARE

cpu_unplug_begin() should be called before CPU_DOWN_PREPARE, because
at CPU_DOWN_PREPARE cpu_active is cleared and the sched_domain is
rebuilt. Otherwise the 'sync_unplug' thread will be running on the cpu
on which it was created and not bound to the cpu which is about to go
down.

I found this via an incorrect smp_processor_id() warning from
sync_unplug/1; the trace below shows it:
(echo 0 > /sys/devices/system/cpu/cpu1/online)
  bash-1664  [000]    83.136620: _cpu_down: Bind sync_unplug to cpu 1
  bash-1664  [000]    83.136623: sched_wait_task: comm=sync_unplug/1 pid=1724 prio=120
  bash-1664  [000]    83.136624: _cpu_down: Wake sync_unplug
  bash-1664  [000]    83.136629: sched_wakeup: comm=sync_unplug/1 pid=1724 prio=120 success=1 target_cpu=000

Wants to be folded back....

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Link: http://lkml.kernel.org/r/1318762607-2261-3-git-send-email-yong.zhang0@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
hotplug-use-migrate-disable.patch
Thomas Gleixner [Sun, 17 Jul 2011 17:35:29 +0000 (19:35 +0200)]
hotplug-use-migrate-disable.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched-migrate-disable.patch
Thomas Gleixner [Thu, 16 Jun 2011 11:26:08 +0000 (13:26 +0200)]
sched-migrate-disable.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
hotplug: Reread hotplug_pcp on pin_current_cpu() retry
Yong Zhang [Thu, 28 Jul 2011 03:16:00 +0000 (11:16 +0800)]
hotplug: Reread hotplug_pcp on pin_current_cpu() retry

When a retry happens, it is likely that the task has been migrated to
another cpu (unless the unplug failed), but it still dereferences the
original hotplug_pcp per-cpu data.

Update the pointer to hotplug_pcp in the retry path, so it points to
the current cpu.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110728031600.GA338@windriver.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
hotplug: sync_unplug: No "\n" in task name
Yong Zhang [Sun, 16 Oct 2011 10:56:43 +0000 (18:56 +0800)]
hotplug: sync_unplug: No "\n" in task name

Otherwise the output will look a little odd.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Link: http://lkml.kernel.org/r/1318762607-2261-2-git-send-email-yong.zhang0@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
hotplug: Lightweight get online cpus
Thomas Gleixner [Wed, 15 Jun 2011 10:36:06 +0000 (12:36 +0200)]
hotplug: Lightweight get online cpus

get_online_cpus() is a heavyweight function which involves a global
mutex. migrate_disable() wants a simpler construct which only prevents
a CPU from going down while a task is in a migrate-disabled section.

Implement a per cpu lockless mechanism, which serializes only in the
real unplug case on a global mutex. That serialization affects only
tasks on the cpu which should be brought down.
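
A hypothetical sketch of the lockless fast path (structure and names
assumed for illustration):

  struct hotplug_pcp {
          struct task_struct *unplug;     /* set only during real unplug */
          int refcount;
  };
  static DEFINE_PER_CPU(struct hotplug_pcp, hotplug_pcp);

  void pin_current_cpu(void)
  {
          struct hotplug_pcp *hp = &get_cpu_var(hotplug_pcp);

          if (!hp->unplug) {
                  /* common case: no unplug running, stay lockless */
                  hp->refcount++;
                  put_cpu_var(hotplug_pcp);
                  return;
          }
          put_cpu_var(hotplug_pcp);
          /* real unplug in progress: fall back to the global mutex */
  }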

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
stomp-machine-raw-lock.patch
Thomas Gleixner [Wed, 29 Jun 2011 09:01:51 +0000 (11:01 +0200)]
stomp-machine-raw-lock.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
stomp-machine-mark-stomper-thread.patch
Thomas Gleixner [Sun, 17 Jul 2011 17:53:19 +0000 (19:53 +0200)]
stomp-machine-mark-stomper-thread.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
stop_machine: convert stop_machine_run() to PREEMPT_RT
Ingo Molnar [Fri, 3 Jul 2009 13:30:27 +0000 (08:30 -0500)]
stop_machine: convert stop_machine_run() to PREEMPT_RT

Instead of playing with non-preemption, introduce explicit
startup serialization. This is more robust and cleaner as
well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched: ttwu: Return success when only changing the saved_state value
Thomas Gleixner [Tue, 13 Dec 2011 20:42:19 +0000 (21:42 +0100)]
sched: ttwu: Return success when only changing the saved_state value

When a task blocks on a rt lock, it saves the current state in
p->saved_state, so a lock related wake up will not destroy the
original state.

When a real wakeup happens while the task is already running due to a
lock wakeup, we update p->saved_state to TASK_RUNNING, but we do not
return success, which might cause another wakeup in the waitqueue code
while the task remains on the waitqueue list. Return success in that
case as well.
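
Sketched against try_to_wake_up(), the change amounts to something like
this (surrounding logic abbreviated, names assumed):

  if (!(p->state & state)) {
          /* Task runs due to a lock wakeup; if the real wakeup
           * condition matches the saved state, report success. */
          if (p->saved_state & state) {
                  p->saved_state = TASK_RUNNING;
                  success = 1;
          }
          goto out;
  }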

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
sched: Disable CONFIG_RT_GROUP_SCHED on RT
Thomas Gleixner [Mon, 18 Jul 2011 15:03:52 +0000 (17:03 +0200)]
sched: Disable CONFIG_RT_GROUP_SCHED on RT

Carsten reported problems when running:

taskset 01 chrt -f 1 sleep 1

from within rc.local on an F15 machine. The task stays runnable but
never gets on the run queue because some of the run queues have
rt_throttled=1, which does not go away. It works fine from an ssh login
shell. Disabling CONFIG_RT_GROUP_SCHED solves it as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched-disable-ttwu-queue.patch
Thomas Gleixner [Tue, 13 Sep 2011 14:42:35 +0000 (16:42 +0200)]
sched-disable-ttwu-queue.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
cond-resched-lock-rt-tweak.patch
Thomas Gleixner [Sun, 17 Jul 2011 20:51:33 +0000 (22:51 +0200)]
cond-resched-lock-rt-tweak.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched-no-work-when-pi-blocked.patch
Thomas Gleixner [Sun, 17 Jul 2011 18:46:52 +0000 (20:46 +0200)]
sched-no-work-when-pi-blocked.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
cond-resched-softirq-fix.patch
Thomas Gleixner [Thu, 14 Jul 2011 07:56:44 +0000 (09:56 +0200)]
cond-resched-softirq-fix.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched-cond-resched.patch
Thomas Gleixner [Tue, 7 Jun 2011 09:25:03 +0000 (11:25 +0200)]
sched-cond-resched.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched: Break out from load_balancing on rq_lock contention
Peter Zijlstra [Tue, 16 Mar 2010 21:31:44 +0000 (14:31 -0700)]
sched: Break out from load_balancing on rq_lock contention

Also limit NEW_IDLE pull

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched-might-sleep-do-not-account-rcu-depth.patch
Thomas Gleixner [Tue, 7 Jun 2011 07:19:06 +0000 (09:19 +0200)]
sched-might-sleep-do-not-account-rcu-depth.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched-prevent-idle-boost.patch
Thomas Gleixner [Mon, 6 Jun 2011 18:07:38 +0000 (20:07 +0200)]
sched-prevent-idle-boost.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched-rt-mutex-wakeup.patch
Thomas Gleixner [Sat, 25 Jun 2011 07:21:04 +0000 (09:21 +0200)]
sched-rt-mutex-wakeup.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched-mmdrop-delayed.patch
Thomas Gleixner [Mon, 6 Jun 2011 10:20:33 +0000 (12:20 +0200)]
sched-mmdrop-delayed.patch

Needs thread context (pgd_lock) -> ifdeffed. Workqueues won't work
with RT.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched-limit-nr-migrate.patch
Thomas Gleixner [Mon, 6 Jun 2011 10:12:51 +0000 (12:12 +0200)]
sched-limit-nr-migrate.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sched-delay-put-task.patch
Thomas Gleixner [Tue, 31 May 2011 14:59:16 +0000 (16:59 +0200)]
sched-delay-put-task.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
posix-timers: Avoid wakeups when no timers are active
Thomas Gleixner [Fri, 3 Jul 2009 13:44:44 +0000 (08:44 -0500)]
posix-timers: Avoid wakeups when no timers are active

Waking the thread even when no timers are scheduled is useless.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
posix-timers: Shorten posix_cpu_timers/<CPU> kernel thread names
Arnaldo Carvalho de Melo [Fri, 3 Jul 2009 13:30:00 +0000 (08:30 -0500)]
posix-timers: Shorten posix_cpu_timers/<CPU> kernel thread names

Shorten the softirq kernel thread names because they always overflow the
limited comm length, appearing as "posix_cpu_timer" CPU# times.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
posix-timers: thread posix-cpu-timers on -rt
John Stultz [Fri, 3 Jul 2009 13:29:58 +0000 (08:29 -0500)]
posix-timers: thread posix-cpu-timers on -rt

The posix-cpu-timer code takes locks which are not -rt safe in hard
irq context. Move it to a thread.

[ 3.0 fixes from Peter Zijlstra <peterz@infradead.org> ]

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
timer-fd: Prevent live lock
Thomas Gleixner [Wed, 25 Jan 2012 10:08:40 +0000 (11:08 +0100)]
timer-fd: Prevent live lock

If hrtimer_try_to_cancel() requires a retry, then depending on the
priority setting the retry loop might prevent timer callback completion
on RT. Prevent that by waiting for the completion on RT; no change for
a non-RT kernel.
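
Roughly, the RT side could look like this (hrtimer_wait_for_timer() is
the RT-patch helper; the exact call site is assumed):

  /* e.g. when timerfd tears down its timer */
  for (;;) {
          if (hrtimer_try_to_cancel(&ctx->tmr) >= 0)
                  break;
          /* RT: sleep until the running callback completes instead
           * of spinning and possibly starving it */
          hrtimer_wait_for_timer(&ctx->tmr);
  }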

Reported-by: Sankara Muthukrishnan <sankara.m@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
hrtimer-fix-reprogram-madness.patch
Thomas Gleixner [Wed, 14 Sep 2011 12:48:43 +0000 (14:48 +0200)]
hrtimer-fix-reprogram-madness.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
hrtimer: Add missing debug_activate() aid [Was: Re: [ANNOUNCE] 3.0.6-rt17]
Yong Zhang [Thu, 13 Oct 2011 07:52:30 +0000 (15:52 +0800)]
hrtimer: Add missing debug_activate() aid [Was: Re: [ANNOUNCE] 3.0.6-rt17]

On Fri, Oct 07, 2011 at 10:25:25AM -0700, Fernando Lopez-Lezcano wrote:
> On 10/06/2011 06:15 PM, Thomas Gleixner wrote:
> >Dear RT Folks,
> >
> >I'm pleased to announce the 3.0.6-rt17 release.
>
> Hi and thanks again. So far this one is not hanging which is very
> good news. But I still see the hrtimer_fixup_activate warnings I
> reported for rt16...

Hi Fernando,

I think the patch below should address your concern.

Thanks,
Yong

hrtimer: Don't call the timer handler from hrtimer_start
Peter Zijlstra [Fri, 12 Aug 2011 15:39:54 +0000 (17:39 +0200)]
hrtimer: Don't call the timer handler from hrtimer_start

 [<ffffffff812de4a9>] __delay+0xf/0x11
 [<ffffffff812e36e9>] do_raw_spin_lock+0xd2/0x13c
 [<ffffffff815028ee>] _raw_spin_lock+0x60/0x73              rt_b->rt_runtime_lock
 [<ffffffff81068f68>] ? sched_rt_period_timer+0xad/0x281
 [<ffffffff81068f68>] sched_rt_period_timer+0xad/0x281
 [<ffffffff8109e5e1>] __run_hrtimer+0x1e4/0x347
 [<ffffffff81068ebb>] ? enqueue_rt_entity+0x36/0x36
 [<ffffffff8109f2b1>] __hrtimer_start_range_ns+0x2b5/0x40a  base->cpu_base->lock  (lock_hrtimer_base)
 [<ffffffff81068b6f>] __enqueue_rt_entity+0x26f/0x2aa       rt_b->rt_runtime_lock (start_rt_bandwidth)
 [<ffffffff81068ead>] enqueue_rt_entity+0x28/0x36
 [<ffffffff81069355>] enqueue_task_rt+0x3d/0xb0
 [<ffffffff810679d6>] enqueue_task+0x5d/0x64
 [<ffffffff810714fc>] task_setprio+0x210/0x29c              rq->lock
 [<ffffffff810b56cb>] __rt_mutex_adjust_prio+0x25/0x2a      p->pi_lock
 [<ffffffff810b5d2c>] task_blocks_on_rt_mutex+0x196/0x20f

Instead make __hrtimer_start_range_ns() return -ETIME when the timer
is in the past. Since nobody actually uses the hrtimer_start*() return
value, it's pretty safe to wreck it.

Also, it will only ever return -ETIME for timer->irqsafe || !wakeup
timers.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
hrtimer: fixup hrtimer callback changes for preempt-rt
Thomas Gleixner [Fri, 3 Jul 2009 13:44:31 +0000 (08:44 -0500)]
hrtimer: fixup hrtimer callback changes for preempt-rt

In preempt-rt we can not call the callbacks which take sleeping locks
from the timer interrupt context.

Bring back the softirq split for now, until we have fixed the signal
delivery problem for real.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
hrtimers: prepare full preemption
Ingo Molnar [Fri, 3 Jul 2009 13:29:34 +0000 (08:29 -0500)]
hrtimers: prepare full preemption

Make cancellation of a running callback in softirq context safe
against preemption.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
printk: Don't call printk_tick in printk_needs_cpu() on RT
Yong Zhang [Sun, 16 Oct 2011 10:56:45 +0000 (18:56 +0800)]
printk: Don't call printk_tick in printk_needs_cpu() on RT

printk_tick() can't be called in atomic context when RT is enabled;
otherwise the warning below will show:

[  117.597095] BUG: sleeping function called from invalid context at kernel/rtmutex.c:645
[  117.597102] in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: kworker/0:0
[  117.597111] Pid: 0, comm: kworker/0:0 Not tainted 3.0.6-rt17-00284-gb76d419-dirty #7
[  117.597116] Call Trace:
[  117.597131]  [<c06e3b61>] ? printk+0x1d/0x24
[  117.597142]  [<c01390b6>] __might_sleep+0xe6/0x110
[  117.597151]  [<c06e634c>] rt_spin_lock+0x1c/0x30
[  117.597158]  [<c0142f26>] __wake_up+0x26/0x60
[  117.597166]  [<c014c78e>] printk_tick+0x3e/0x40
[  117.597173]  [<c014c7b4>] printk_needs_cpu+0x24/0x30
[  117.597181]  [<c017ecc8>] tick_nohz_stop_sched_tick+0x2e8/0x410
[  117.597191]  [<c017305a>] ? sched_clock_idle_wakeup_event+0x1a/0x20
[  117.597201]  [<c010182a>] cpu_idle+0x4a/0xb0
[  117.597209]  [<c06e0b97>] start_secondary+0xd3/0xd7

Now this is a really rare case and it's very unlikely that we starve
a logbuf waiter that way.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Link: http://lkml.kernel.org/r/1318762607-2261-4-git-send-email-yong.zhang0@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
timers: Avoid the switch timers base set to NULL trick on RT
Thomas Gleixner [Thu, 21 Jul 2011 13:23:39 +0000 (15:23 +0200)]
timers: Avoid the switch timers base set to NULL trick on RT

On RT that code is preemptible, so we cannot assign NULL to the timer's
base, as a preempting task would spin forever in lock_timer_base().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
timer: delay waking softirqs from the jiffy tick
Peter Zijlstra [Fri, 21 Aug 2009 09:56:45 +0000 (11:56 +0200)]
timer: delay waking softirqs from the jiffy tick

People were complaining about broken balancing with the recent -rt
series.

A look at /proc/sched_debug yielded:

cpu#0, 2393.874 MHz
  .nr_running                    : 0
  .load                          : 0
  .cpu_load[0]                   : 177522
  .cpu_load[1]                   : 177522
  .cpu_load[2]                   : 177522
  .cpu_load[3]                   : 177522
  .cpu_load[4]                   : 177522
cpu#1, 2393.874 MHz
  .nr_running                    : 4
  .load                          : 4096
  .cpu_load[0]                   : 181618
  .cpu_load[1]                   : 180850
  .cpu_load[2]                   : 180274
  .cpu_load[3]                   : 179938
  .cpu_load[4]                   : 179758

This indicated that the cpu_load computation was hosed; the 177522 value
indicates that there is one RT task runnable. Initially I thought the
old problem of calculating the cpu_load from a softirq had re-surfaced,
however looking at the code shows it's being done from scheduler_tick().

[ we really should fix this RT/cfs interaction some day... ]

A few trace_printk()s later:

    sirq-timer/1-19    [001]   174.289744:     19: 50:S ==> [001]     0:140:R <idle>
          <idle>-0     [001]   174.290724: enqueue_task_rt: adding task: 19/sirq-timer/1 with load: 177522
          <idle>-0     [001]   174.290725:      0:140:R   + [001]    19: 50:S sirq-timer/1
          <idle>-0     [001]   174.290730: scheduler_tick: current load: 177522
          <idle>-0     [001]   174.290732: scheduler_tick: current: 0/swapper
          <idle>-0     [001]   174.290736:      0:140:R ==> [001]    19: 50:R sirq-timer/1
    sirq-timer/1-19    [001]   174.290741: dequeue_task_rt: removing task: 19/sirq-timer/1 with load: 177522
    sirq-timer/1-19    [001]   174.290743:     19: 50:S ==> [001]     0:140:R <idle>

We see that we always raise the timer softirq before doing the load
calculation. Avoid this by re-ordering the scheduler_tick() call in
update_process_times() to occur before we deal with timers.

This lowers the load back to sanity and restores regular load-balancing
behaviour.
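
A sketch of the reorder, based on the shape of update_process_times()
at the time (some calls abbreviated):

  void update_process_times(int user_tick)
  {
          struct task_struct *p = current;
          int cpu = smp_processor_id();

          account_process_tick(p, user_tick);
          scheduler_tick();       /* moved up: sample cpu_load before
                                     the timer softirq is raised */
          run_local_timers();     /* raises TIMER_SOFTIRQ */
          rcu_check_callbacks(cpu, user_tick);
          run_posix_cpu_timers(p);
  }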

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
timers: move printk_tick to soft interrupt
Thomas Gleixner [Fri, 3 Jul 2009 13:44:30 +0000 (08:44 -0500)]
timers: move printk_tick to soft interrupt

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
timers: fix timer hotplug on -rt
Ingo Molnar [Fri, 3 Jul 2009 13:30:32 +0000 (08:30 -0500)]
timers: fix timer hotplug on -rt

Here we are in the CPU_DEAD notifier, and we must not sleep nor
enable interrupts.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
timers: preempt-rt support
Ingo Molnar [Fri, 3 Jul 2009 13:30:20 +0000 (08:30 -0500)]
timers: preempt-rt support

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
timers: prepare for full preemption
Ingo Molnar [Fri, 3 Jul 2009 13:29:34 +0000 (08:29 -0500)]
timers: prepare for full preemption

When softirqs can be preempted we need to make sure that cancelling
the timer from the active thread can not deadlock vs. a running timer
callback. Add a waitqueue to resolve that.
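
A minimal sketch of the waitqueue idea (field names assumed):

  /* on RT, del_timer_sync() sleeps instead of busy-waiting when the
   * callback is running on a preemptible softirq thread */
  static void wait_for_running_timer(struct timer_list *timer)
  {
          struct tvec_base *base = timer->base;

          if (base->running_timer == timer)
                  wait_event(base->wait_for_running_timer,
                             base->running_timer != timer);
  }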

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
workqueue-avoid-the-lock-in-cpu-dying.patch
Thomas Gleixner [Fri, 24 Jun 2011 18:39:24 +0000 (20:39 +0200)]
workqueue-avoid-the-lock-in-cpu-dying.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
net-ipv4-route-use-locks-on-up-rt.patch
Thomas Gleixner [Fri, 15 Jul 2011 14:24:45 +0000 (16:24 +0200)]
net-ipv4-route-use-locks-on-up-rt.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
relay: fix timer madness
Ingo Molnar [Fri, 3 Jul 2009 13:44:07 +0000 (08:44 -0500)]
relay: fix timer madness

Remove timer calls (!!!) from deep within the tracing infrastructure.
This was totally bogus code that can cause lockups and worse. Poll
the buffer every 2 jiffies for now.
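
The polling fallback could look roughly like this (callback and field
names assumed):

  /* arm a plain timer that wakes readers every 2 jiffies */
  setup_timer(&buf->timer, wakeup_readers, (unsigned long)buf);
  mod_timer(&buf->timer, jiffies + 2);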

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ipc/mqueue: Add a critical section to avoid a deadlock
KOBAYASHI Yoshitake [Sat, 23 Jul 2011 02:57:36 +0000 (11:57 +0900)]
ipc/mqueue: Add a critical section to avoid a deadlock

(Repost for v3.0-rt1 with changed destination addresses)
I have tested the following patch on v3.0-rt1 with PREEMPT_RT_FULL.
In a POSIX message queue, if a sender process uses SCHED_FIFO and
has a higher priority than a receiver process, the sender will
be stuck at ipc/mqueue.c:452

  452                 while (ewp->state == STATE_PENDING)
  453                         cpu_relax();

Description of the problem
 (receiver process)
   1. receiver changes sender's state to STATE_PENDING (mqueue.c:846)
   2. wakes up sender process and "switches to sender" (mqueue.c:847)
      Note: This context switch only happens in a PREEMPT_RT_FULL kernel.
 (sender process)
   3. sender checks its own state in the above loop (mqueue.c:452-453)
   *. receiver will never wake up and cannot change sender's state to
      STATE_READY because the sender has higher priority

Signed-off-by: Yoshitake Kobayashi <yoshitake.kobayashi@toshiba.co.jp>
Cc: viro@zeniv.linux.org.uk
Cc: dchinner@redhat.com
Cc: npiggin@kernel.dk
Cc: hch@lst.de
Cc: arnd@arndb.de
Link: http://lkml.kernel.org/r/4E2A38A0.1090601@toshiba.co.jp
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ipc: Make the ipc code -rt aware
Ingo Molnar [Fri, 3 Jul 2009 13:30:12 +0000 (08:30 -0500)]
ipc: Make the ipc code -rt aware

RT serializes the code with the (rt)spinlock but keeps preemption
enabled. Some parts of the code need to be atomic nevertheless.

Protect it with preempt_disable/enable_rt pairs.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
panic-disable-random-on-rt
Thomas Gleixner [Wed, 15 Feb 2012 16:32:43 +0000 (10:32 -0600)]
panic-disable-random-on-rt

radix-tree-rt-aware.patch
Thomas Gleixner [Sun, 17 Jul 2011 19:33:18 +0000 (21:33 +0200)]
radix-tree-rt-aware.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mm: Allow only slab on RT
Ingo Molnar [Fri, 3 Jul 2009 13:44:03 +0000 (08:44 -0500)]
mm: Allow only slab on RT

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ARM: Initialize ptl->lock for vector page
Frank Rowand [Sun, 2 Oct 2011 01:58:13 +0000 (18:58 -0700)]
ARM: Initialize ptl->lock for vector page

Without this patch, ARM can not use SPLIT_PTLOCK_CPUS if
PREEMPT_RT_FULL=y because vectors_user_mapping() creates a
VM_ALWAYSDUMP mapping of the vector page (address 0xffff0000), but no
ptl->lock has been allocated for the page.  An attempt to coredump
that page will result in a kernel NULL pointer dereference when
follow_page() attempts to lock the page.

The call tree to the NULL pointer dereference is:

   do_notify_resume()
      get_signal_to_deliver()
         do_coredump()
            elf_core_dump()
               get_dump_page()
                  __get_user_pages()
                     follow_page()
                        pte_offset_map_lock() <----- a #define
                           ...
                              rt_spin_lock()

The underlying problem is exposed by mm-shrink-the-page-frame-to-rt-size.patch.

Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Frank <Frank_Rowand@sonyusa.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/4E87C535.2030907@am.sony.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mm: shrink the page frame to !-rt size
Peter Zijlstra [Fri, 3 Jul 2009 13:44:54 +0000 (08:44 -0500)]
mm: shrink the page frame to !-rt size

The below is a boot-tested hack to shrink the page frame size back to
normal.

Should be a net win since there should be many fewer PTE pages than
page frames.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mm: make vmstat -rt aware
Ingo Molnar [Fri, 3 Jul 2009 13:30:13 +0000 (08:30 -0500)]
mm: make vmstat -rt aware

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mm-vmstat-fix-the-irq-lock-asymetry.patch
Thomas Gleixner [Wed, 22 Jun 2011 18:47:08 +0000 (20:47 +0200)]
mm-vmstat-fix-the-irq-lock-asymetry.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mm: convert swap to percpu locked
Ingo Molnar [Fri, 3 Jul 2009 13:29:51 +0000 (08:29 -0500)]
mm: convert swap to percpu locked

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mm-page-alloc-fix.patch
Thomas Gleixner [Thu, 21 Jul 2011 14:47:49 +0000 (16:47 +0200)]
mm-page-alloc-fix.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mm: page_alloc reduce lock sections further
Peter Zijlstra [Fri, 3 Jul 2009 13:44:37 +0000 (08:44 -0500)]
mm: page_alloc reduce lock sections further

Split out the pages which are to be freed into a separate list and
call free_pages_bulk() outside of the percpu page allocator locks.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mm: page_alloc: rt-friendly per-cpu pages
Ingo Molnar [Fri, 3 Jul 2009 13:29:37 +0000 (08:29 -0500)]
mm: page_alloc: rt-friendly per-cpu pages

rt-friendly per-cpu pages: convert the irqs-off per-cpu locking
method into a preemptible, explicit-per-cpu-locks method.
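
A sketch of the conversion using the local-lock primitives introduced
earlier in this series (the lock name here is an assumption):

  static DEFINE_LOCAL_IRQ_LOCK(pa_lock);

  /* before: local_irq_save(flags); ... local_irq_restore(flags); */
  local_lock_irqsave(pa_lock, flags);
  /* manipulate this CPU's per-cpu page lists */
  local_unlock_irqrestore(pa_lock, flags);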

Contains fixes from:
 Peter Zijlstra <a.p.zijlstra@chello.nl>
 Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mm: More lock breaks in slab.c
Peter Zijlstra [Fri, 3 Jul 2009 13:44:43 +0000 (08:44 -0500)]
mm: More lock breaks in slab.c

Handle __free_pages outside of the locked regions. This reduces the
lock contention on the percpu slab locks in -rt significantly.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
slab: Fix __do_drain to use the right array cache
Steven Rostedt [Wed, 12 Oct 2011 03:56:23 +0000 (23:56 -0400)]
slab: Fix __do_drain to use the right array cache

The array cache in __do_drain() was using the cpu_cache_get() function
which uses smp_processor_id() to get the proper array. On mainline, this
is fine as __do_drain() is called by for_each_cpu() which runs
__do_drain() on the CPU it is processing. In RT locks are used instead
and __do_drain() is only called from a single CPU. This can cause the
accounting to be off and trigger the following bug:

slab error in kmem_cache_destroy(): cache `nfs_write_data': Can't free all objects
Pid: 2905, comm: rmmod Not tainted 3.0.6-test-rt17+ #78
Call Trace:
 [<ffffffff810fb623>] kmem_cache_destroy+0xa0/0xdf
 [<ffffffffa03aaffb>] nfs_destroy_writepagecache+0x49/0x4e [nfs]
 [<ffffffffa03c0fe0>] exit_nfs_fs+0xe/0x46 [nfs]
 [<ffffffff8107af09>] sys_delete_module+0x1ba/0x22c
 [<ffffffff8109429d>] ? audit_syscall_entry+0x11c/0x148
 [<ffffffff814b6442>] system_call_fastpath+0x16/0x1b

This can be easily triggered by a simple while loop:

# while :; do modprobe nfs; rmmod nfs; done

The proper function to use is cpu_cache_get_on_cpu(). It works for both
RT and non-RT, as the non-RT case passes smp_processor_id() into
__do_drain().
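
The shape of the fix, as the text describes it (the helper's exact
signature is assumed):

  /* in __do_drain(), with the target cpu now passed in explicitly */
  struct array_cache *ac;

  /* was: ac = cpu_cache_get(cachep);  -- wrong CPU under RT */
  ac = cpu_cache_get_on_cpu(cachep, cpu);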

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Clark Williams <clark@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1318391783.13262.11.camel@gandalf.stny.rr.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mm-slab-wrap-functions.patch
Thomas Gleixner [Sat, 18 Jun 2011 17:44:43 +0000 (19:44 +0200)]
mm-slab-wrap-functions.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
cpu-rt-variants.patch
Thomas Gleixner [Fri, 17 Jun 2011 13:42:38 +0000 (15:42 +0200)]
cpu-rt-variants.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
rt-local-irq-lock.patch
Thomas Gleixner [Mon, 20 Jun 2011 07:03:47 +0000 (09:03 +0200)]
rt-local-irq-lock.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
local-var.patch
Thomas Gleixner [Fri, 24 Jun 2011 16:40:37 +0000 (18:40 +0200)]
local-var.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
USB: Fix the mouse problem when copying large amounts of data
Wu Zhangjin [Mon, 4 Jan 2010 03:33:02 +0000 (11:33 +0800)]
USB: Fix the mouse problem when copying large amounts of data

When copying large amounts of data between USB storage devices and
the hard disk, the USB mouse stops working; this patch fixes it.

[NOTE: This problem has been observed on the Loongson family machines;
not sure whether it is reproducible on other platforms]

Signed-off-by: Hu Hongbing <huhb@lemote.com>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
drivers: net: gianfar: Make RT aware
Thomas Gleixner [Thu, 1 Apr 2010 18:20:57 +0000 (20:20 +0200)]
drivers: net: gianfar: Make RT aware

The adjust_link() function disables interrupts before taking the queue
locks. On RT those locks are converted to "sleeping" locks and
therefore the local_irq_save/restore invocations must be converted to
local_irq_save/restore_nort.

Reported-by: Xianghua Xiao <xiaoxianghua@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Xianghua Xiao <xiaoxianghua@gmail.com>
drivers/net: vortex fix locking issues
Steven Rostedt [Fri, 3 Jul 2009 13:30:00 +0000 (08:30 -0500)]
drivers/net: vortex fix locking issues

Argh, cut and paste wasn't enough...

Use this patch instead.  It needs an irq disable.  But, believe it or not,
on SMP this is actually better.  If the irq is shared (as it is in Mark's
case), we don't stop the irq of other devices from being handled on
another CPU (unfortunately for Mark, he pinned all interrupts to one CPU).

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
 drivers/net/ethernet/3com/3c59x.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
drivers/net: fix livelock issues
Thomas Gleixner [Sat, 20 Jun 2009 09:36:54 +0000 (11:36 +0200)]
drivers/net: fix livelock issues

Preempt-RT runs into a livelock issue with the NETDEV_TX_LOCKED micro
optimization. The reason is that the softirq thread reschedules
itself on that return value. Depending on priorities, it starts to
monopolize the CPU and livelocks on UP systems.

Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
genirq-force-threading.patch
Thomas Gleixner [Sun, 3 Apr 2011 09:57:29 +0000 (11:57 +0200)]
genirq-force-threading.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
genirq: disable irqpoll on -rt
Ingo Molnar [Fri, 3 Jul 2009 13:29:57 +0000 (08:29 -0500)]
genirq: disable irqpoll on -rt

Creates long latencies for no value

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
genirq: Disable random call on preempt-rt
Thomas Gleixner [Tue, 21 Jul 2009 14:07:37 +0000 (16:07 +0200)]
genirq: Disable random call on preempt-rt

The random call introduces high latencies and is almost
unused. Disable it for -rt.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
genirq: Disable DEBUG_SHIRQ for rt
Thomas Gleixner [Fri, 18 Mar 2011 09:22:04 +0000 (10:22 +0100)]
genirq: Disable DEBUG_SHIRQ for rt

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
fs: jbd/jbd2: Make state lock and journal head lock rt safe
Thomas Gleixner [Fri, 18 Mar 2011 09:11:25 +0000 (10:11 +0100)]
fs: jbd/jbd2: Make state lock and journal head lock rt safe

bit_spin_locks break under RT.

Based on a previous patch from Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
--

 include/linux/buffer_head.h |   10 ++++++++++
 include/linux/jbd_common.h  |   24 ++++++++++++++++++++++++
 2 files changed, 34 insertions(+)

buffer_head: Replace bh_uptodate_lock for -rt
Thomas Gleixner [Fri, 18 Mar 2011 08:18:52 +0000 (09:18 +0100)]
buffer_head: Replace bh_uptodate_lock for -rt

Wrap the bit_spin_lock calls into a separate inline and add the RT
replacements with a real spinlock.
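
One way the wrapper could look (the RT spinlock field name is an
assumption):

  static inline void bh_uptodate_lock_irqsave(struct buffer_head *bh,
                                              unsigned long *flags)
  {
  #ifndef CONFIG_PREEMPT_RT_FULL
          local_irq_save(*flags);
          bit_spin_lock(BH_Uptodate_Lock, &bh->b_state);
  #else
          /* RT: a real spinlock added to struct buffer_head */
          spin_lock_irqsave(&bh->b_uptodate_lock, *flags);
  #endif
  }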

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mm: Replace cgroup_page bit spinlock
Thomas Gleixner [Wed, 19 Aug 2009 07:56:42 +0000 (09:56 +0200)]
mm: Replace cgroup_page bit spinlock

Bit spinlocks are not working on RT. Replace them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
net-wireless-warn-nort.patch
Thomas Gleixner [Thu, 21 Jul 2011 19:05:33 +0000 (21:05 +0200)]
net-wireless-warn-nort.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
signal-fix-up-rcu-wreckage.patch
Thomas Gleixner [Fri, 22 Jul 2011 06:07:08 +0000 (08:07 +0200)]
signal-fix-up-rcu-wreckage.patch

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mm: scatterlist dont disable irqs on RT
Thomas Gleixner [Fri, 3 Jul 2009 13:44:34 +0000 (08:44 -0500)]
mm: scatterlist dont disable irqs on RT

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
tty: Do not disable interrupts in put_ldisc on -rt
Thomas Gleixner [Mon, 17 Aug 2009 17:49:19 +0000 (19:49 +0200)]
tty: Do not disable interrupts in put_ldisc on -rt

Fixes the following on PREEMPT_RT:

BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
in_atomic(): 0, irqs_disabled(): 1, pid: 9116, name: sshd
Pid: 9116, comm: sshd Not tainted 2.6.31-rc6-rt2 #6
Call Trace:
[<ffffffff81034a4f>] __might_sleep+0xec/0xee
[<ffffffff812fbc6d>] rt_spin_lock+0x34/0x75
[<ffffffff81064a83>] atomic_dec_and_spin_lock+0x36/0x54
[<ffffffff811df7c7>] put_ldisc+0x57/0xa6
[<ffffffff811dfb87>] tty_ldisc_hangup+0xe7/0x19f
[<ffffffff811d9224>] do_tty_hangup+0xff/0x319
[<ffffffff811d9453>] tty_vhangup+0x15/0x17
[<ffffffff811e1263>] pty_close+0x127/0x12b
[<ffffffff811dac41>] tty_release_dev+0x1ad/0x4c0
....

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
usb: Use local_irq_*_nort() variants
Steven Rostedt [Fri, 3 Jul 2009 13:44:26 +0000 (08:44 -0500)]
usb: Use local_irq_*_nort() variants

[ tglx: Now that irqf_disabled is dead we should kill that ]

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
core: Do not disable interrupts on RT in res_counter.c
Ingo Molnar [Fri, 3 Jul 2009 13:44:33 +0000 (08:44 -0500)]
core: Do not disable interrupts on RT in res_counter.c

Frederic Weisbecker reported this warning:

[   45.228562] BUG: sleeping function called from invalid context at kernel/rtmutex.c:683
[   45.228571] in_atomic(): 0, irqs_disabled(): 1, pid: 4290, name: ntpdate
[   45.228576] INFO: lockdep is turned off.
[   45.228580] irq event stamp: 0
[   45.228583] hardirqs last  enabled at (0): [<(null)>] (null)
[   45.228589] hardirqs last disabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
[   45.228602] softirqs last  enabled at (0): [<ffffffff8025449d>] copy_process+0x68d/0x1500
[   45.228609] softirqs last disabled at (0): [<(null)>] (null)
[   45.228617] Pid: 4290, comm: ntpdate Tainted: G        W  2.6.29-rc4-rt1-tip #1
[   45.228622] Call Trace:
[   45.228632]  [<ffffffff8027dfb0>] ? print_irqtrace_events+0xd0/0xe0
[   45.228639]  [<ffffffff8024cd73>] __might_sleep+0x113/0x130
[   45.228646]  [<ffffffff8077c811>] rt_spin_lock+0xa1/0xb0
[   45.228653]  [<ffffffff80296a3d>] res_counter_charge+0x5d/0x130
[   45.228660]  [<ffffffff802fb67f>] __mem_cgroup_try_charge+0x7f/0x180
[   45.228667]  [<ffffffff802fc407>] mem_cgroup_charge_common+0x57/0x90
[   45.228674]  [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
[   45.228680]  [<ffffffff802fc49d>] mem_cgroup_newpage_charge+0x5d/0x60
[   45.228688]  [<ffffffff802d94ce>] __do_fault+0x29e/0x4c0
[   45.228694]  [<ffffffff8077c843>] ? rt_spin_unlock+0x23/0x80
[   45.228700]  [<ffffffff802db8b5>] handle_mm_fault+0x205/0x890
[   45.228707]  [<ffffffff80212096>] ? ftrace_call+0x5/0x2b
[   45.228714]  [<ffffffff8023495e>] do_page_fault+0x11e/0x2a0
[   45.228720]  [<ffffffff8077e5a5>] page_fault+0x25/0x30
[   45.228727]  [<ffffffff8043e1ed>] ? __clear_user+0x3d/0x70
[   45.228733]  [<ffffffff8043e1d1>] ? __clear_user+0x21/0x70

The reason is the raw IRQ-flags use in kernel/res_counter.c.

The irq-flags trick there seems a bit pointless: it cannot protect the
c->parent linkage because local_irq_save() is only per CPU.

So replace it with _nort(). This code needs a second look.

Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
core: Do not disable interrupts on RT in kernel/users.c
Thomas Gleixner [Tue, 21 Jul 2009 21:06:05 +0000 (23:06 +0200)]
core: Do not disable interrupts on RT in kernel/users.c

Use the local_irq_*_nort variants to reduce latencies in RT. The code
is serialized by the locks. No need to disable interrupts.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
acpi: Do not disable interrupts on PREEMPT_RT
Thomas Gleixner [Tue, 21 Jul 2009 20:54:51 +0000 (22:54 +0200)]
acpi: Do not disable interrupts on PREEMPT_RT

Use the local_irq_*_nort() variants.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
input: gameport: Do not disable interrupts on PREEMPT_RT
Ingo Molnar [Fri, 3 Jul 2009 13:30:16 +0000 (08:30 -0500)]
input: gameport: Do not disable interrupts on PREEMPT_RT

Use the _nort() primitives.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
infiniband: Mellanox IB driver patch use _nort() primitives
Sven-Thorsten Dietrich [Fri, 3 Jul 2009 13:30:35 +0000 (08:30 -0500)]
infiniband: Mellanox IB driver patch use _nort() primitives

Fixes an in_atomic stack dump when the Mellanox module is loaded into
the RT kernel.

Michael S. Tsirkin <mst@dev.mellanox.co.il> sayeth:
"Basically, if you just make spin_lock_irqsave (and spin_lock_irq) not disable
interrupts for non-raw spinlocks, I think all of infiniband will be fine without
changes."

Signed-off-by: Sven-Thorsten Dietrich <sven@thebigcorporation.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ide: Do not disable interrupts for PREEMPT-RT
Ingo Molnar [Fri, 3 Jul 2009 13:30:16 +0000 (08:30 -0500)]
ide: Do not disable interrupts for PREEMPT-RT

Use the local_irq_*_nort variants.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ata: Do not disable interrupts in ide code for preempt-rt
Steven Rostedt [Fri, 3 Jul 2009 13:44:29 +0000 (08:44 -0500)]
ata: Do not disable interrupts in ide code for preempt-rt

Use the local_irq_*_nort variants.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
preempt: Provide preempt_*_(no)rt variants
Thomas Gleixner [Fri, 24 Jul 2009 10:38:56 +0000 (12:38 +0200)]
preempt: Provide preempt_*_(no)rt variants

RT needs a few preempt_disable/enable points which are not necessary
otherwise. Implement variants to avoid #ifdeffery.
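
The variants could be laid out like this (a sketch following the naming
in the title):

  #ifdef CONFIG_PREEMPT_RT_FULL
  # define preempt_disable_rt()           preempt_disable()
  # define preempt_enable_rt()            preempt_enable()
  # define preempt_disable_nort()         barrier()
  # define preempt_enable_nort()          barrier()
  #else
  # define preempt_disable_rt()           barrier()
  # define preempt_enable_rt()            barrier()
  # define preempt_disable_nort()         preempt_disable()
  # define preempt_enable_nort()          preempt_enable()
  #endif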

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
rt: local_irq_* variants depending on RT/!RT
Thomas Gleixner [Tue, 21 Jul 2009 20:34:14 +0000 (22:34 +0200)]
rt: local_irq_* variants depending on RT/!RT

Add local_irq_*_(no)rt variants which are mainly used to break
interrupt-disabled sections on PREEMPT_RT or to explicitly disable
interrupts on PREEMPT_RT.
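
On the same pattern, a sketch of the _nort side (the exact RT
definitions are assumptions):

  #ifdef CONFIG_PREEMPT_RT_FULL
  # define local_irq_disable_nort()       do { } while (0)
  # define local_irq_enable_nort()        do { } while (0)
  # define local_irq_save_nort(flags)     local_save_flags(flags)
  # define local_irq_restore_nort(flags)  (void)(flags)
  #else
  # define local_irq_disable_nort()       local_irq_disable()
  # define local_irq_enable_nort()        local_irq_enable()
  # define local_irq_save_nort(flags)     local_irq_save(flags)
  # define local_irq_restore_nort(flags)  local_irq_restore(flags)
  #endif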

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
bug: BUG_ON/WARN_ON variants dependent on RT/!RT
Ingo Molnar [Fri, 3 Jul 2009 13:29:58 +0000 (08:29 -0500)]
bug: BUG_ON/WARN_ON variants dependent on RT/!RT

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>