From: Mike Galbraith
On Fri, 2010-02-05 at 19:33 +0300, malc wrote:
> Following test exhibits somewhat odd behaviour on, at least, 2.6.32.3
> (ppc) and 2.6.29.1 (x86_64), perhaps someone could explain why.

Expected behavior.

SCHED_BATCH tasks do not preempt on wakeup; preemption is tick driven. The
writer therefore has time to fill the pipe and block, so the reader can
then drain the pipe, leading to efficient data transfer.
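
To try the SCHED_BATCH case from user space, a minimal sketch in Python
follows (the language is chosen for illustration and is not that of the
original test). The helper name set_policy is hypothetical;
os.sched_setscheduler, os.sched_param and os.SCHED_BATCH are the Linux-only
standard library wrappers around the corresponding system call. It assumes
a non-realtime policy, for which the static priority must be 0.

```python
import os


def set_policy(policy, pid=0):
    """Try to switch `pid` (0 = calling process) to `policy`.

    Returns True if the switch took effect, False otherwise.
    Assumes a non-realtime policy, so the static priority is 0.
    """
    try:
        os.sched_setscheduler(pid, policy, os.sched_param(0))
    except (OSError, AttributeError):
        # OSError: the kernel refused the request.
        # AttributeError: this platform lacks the sched_* wrappers.
        return False
    return os.sched_getscheduler(pid) == policy


if __name__ == "__main__":
    if set_policy(os.SCHED_BATCH):
        print("now running as SCHED_BATCH")
    else:
        print("could not switch policy")
```

The policy is inherited across fork(), so calling this before forking the
reader puts both ends of the pipe in the same class.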

SCHED_NORMAL tasks do preempt. Every write wakes a reader whose
vruntime (CPU utilization fairness yardstick) lags the writer enough to
warrant preemption, so one write translates to one preemption followed
by a read. The reader can't possibly catch up to the writer (being
synchronous) in either case, but the scheduler doesn't know that or
care, it simply tries to equalize the two. Since the writer's CPU
utilization stems entirely from tiny writes, that time is what goes
toward equalizing the reader. The result is the tiny I/Os the programmer
asked for, extremely low latency, and utterly horrid throughput.
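
The effect can be explored with a small pipe benchmark. What follows is a
rough sketch, not the test from the original report: pipe_transfer and its
parameters are hypothetical names, and the chunk size and byte count are
arbitrary. It forks a reader, pushes data through a pipe in tiny writes,
and reports how long the transfer took, optionally under a different
scheduling policy. It assumes Linux and a non-realtime starting policy.

```python
import os
import time


def pipe_transfer(total_bytes=1 << 20, chunk=16, policy=None):
    """Send total_bytes through a pipe in chunk-sized writes.

    Returns (bytes_received_by_reader, elapsed_seconds).
    If `policy` is given, writer and reader both run under it when the
    kernel allows the switch; otherwise the current policy is kept.
    """
    original = None
    if policy is not None:
        try:
            original = os.sched_getscheduler(0)
            os.sched_setscheduler(0, policy, os.sched_param(0))
        except (OSError, AttributeError):
            original = None  # switch failed; run under the current policy

    data_r, data_w = os.pipe()
    result_r, result_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the reader. Drain the pipe until EOF, report the count.
        try:
            os.close(data_w)
            os.close(result_r)
            received = 0
            while True:
                buf = os.read(data_r, 65536)
                if not buf:
                    break
                received += len(buf)
            os.write(result_w, str(received).encode())
        finally:
            os._exit(0)

    # Parent: the writer. Issue many tiny writes, as in the scenario above.
    os.close(data_r)
    os.close(result_w)
    payload = b"x" * chunk
    sent = 0
    start = time.perf_counter()
    while sent < total_bytes:
        sent += os.write(data_w, payload[: total_bytes - sent])
    os.close(data_w)  # EOF for the reader
    received = int(os.read(result_r, 64).decode())
    elapsed = time.perf_counter() - start
    os.close(result_r)
    os.waitpid(pid, 0)

    if original is not None:
        try:
            os.sched_setscheduler(0, original, os.sched_param(0))
        except OSError:
            pass  # could not restore; leave the policy as it is
    return received, elapsed


if __name__ == "__main__":
    for name in ("SCHED_OTHER", "SCHED_BATCH"):
        got, secs = pipe_transfer(policy=getattr(os, name, None))
        print(f"{name}: {got} bytes in {secs:.3f}s")
```

Timings depend heavily on kernel version, CPU count, and where the two
tasks end up running, so treat any difference between the two runs as
indicative only.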

-Mike
