From: Peter Zijlstra on 1 Feb 2010 04:00

On Sat, 2010-01-30 at 16:47 -0800, Arjan van de Ven wrote:
> On Sat, 30 Jan 2010 18:35:49 -0600
> Shawn Bohrer <shawn.bohrer(a)gmail.com> wrote:
>
> > I agree that we are currently depending on a bug in epoll.  The epoll
> > implementation currently rounds up to the next jiffie, so specifying a
> > timeout of 1 ms really just wakes the process up at the next timer
> > tick.  I have a patch to fix epoll by converting it to use
> > schedule_hrtimeout_range() that I'll gladly send, but I still need a
> > way to achieve the same thing.
>
> it's not going to help you; your expectation is incorrect.
> You CANNOT get 1000 iterations per second if you do
>
>   <wait 1 msec>
>   <do a bunch of work>
>   <wait 1 msec>
>   etc. in a loop
>
> The more accurate (read: not rounding down) the implementation, the
> more not-1000 you will get, because to hit 1000 the two actions
>
>   <wait 1 msec>
>   <do a bunch of work>
>
> combined are not allowed to take more than 1000 microseconds of
> wallclock time.  Assuming "do a bunch of work" takes 100 microseconds,
> for you to hit 1000 there would need to be 900 microseconds in a
> millisecond... and sadly physics doesn't work that way.
>
> (And that's even ignoring various OS, CPU wakeup, and scheduler
> contention overheads.)

Right. Aside from that, CFS will only (potentially) delay your wakeup if
there's someone else on the CPU at the moment of wakeup, and that's
fully by design; you don't want to fix that, it's bad for throughput.

If you want deterministic wakeup latencies, use an RT scheduling class
(and kernel).

Fwiw, your test proglet gives me:

peter(a)laptop:~/tmp$ ./epoll
Iterations Per Sec: 996.767947
Iterations Per Sec: 995.424135
Iterations Per Sec: 993.624936

and that's with full contemporary desktop bloat around.

As it stands, it appears you have at least two bugs in your application:
you rely on broken epoll behaviour, and you have incorrect assumptions
about what the regular scheduler class will guarantee you (which is in
fact nothing other than that your application will at some point in the
future receive some service, per POSIX).

Now, CFS strives to give you more guarantees than that, but they're
soft.  We try to schedule such that your application receives an amount
of service proportional to that of every other runnable task of the same
nice level (and there's a weighted proportion between nice levels as
well); furthermore, we try to service each task at least once per
nr_running * sysctl.kernel.sched_min_granularity_ns.  If you see wakeup
latencies an order of magnitude over that, we clearly messed up, but
until that point we're doing ok-ish.
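A minimal sketch of the loop being discussed (illustrative only, not the
actual test proglet from the thread; the 100 us "work" figure is taken
from Arjan's example): each iteration waits ~1 ms via epoll_wait() on an
otherwise empty epoll fd and then burns roughly 100 us of simulated work,
so every pass costs at least ~1.1 ms of wallclock time and the printed
rate cannot reach 1000 no matter how precise the timeout implementation is.

/* Sketch: why <wait 1 msec> + <work> per iteration can't hit 1000/sec. */
#include <stdio.h>
#include <time.h>
#include <sys/epoll.h>

static double now_sec(void)
{
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
	int epfd = epoll_create1(0);        /* no fds added: epoll_wait just times out */
	struct epoll_event ev;
	const struct timespec work = { 0, 100000 };  /* ~100 us stand-in for real work */

	double start = now_sec();
	int iters = 0;
	while (now_sec() - start < 1.0) {
		epoll_wait(epfd, &ev, 1, 1);  /* <wait 1 msec> */
		nanosleep(&work, NULL);       /* <do a bunch of work> */
		iters++;
	}
	printf("Iterations Per Sec: %d\n", iters);  /* comes out well under 1000 */
	return 0;
}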
From: Shawn Bohrer on 1 Feb 2010 14:50

On Mon, Feb 01, 2010 at 09:51:30AM +0100, Peter Zijlstra wrote:
> Right. Aside from that, CFS will only (potentially) delay your wakeup if
> there's someone else on the CPU at the moment of wakeup, and that's
> fully by design; you don't want to fix that, it's bad for throughput.
>
> If you want deterministic wakeup latencies, use an RT scheduling class
> (and kernel).

I've confirmed that running my processes as SCHED_FIFO fixes the issue
and allows me to achieve ~999.99 iterations per second.

> As it stands, it appears you have at least two bugs in your application:
> you rely on broken epoll behaviour, and you have incorrect assumptions
> about what the regular scheduler class will guarantee you (which is in
> fact nothing other than that your application will at some point in the
> future receive some service, per POSIX).

Interestingly, I can also achieve ~999.99 iterations per second by using
an infinite epoll timeout and adding a 1 msec periodic timerfd handle to
the epoll set, while still using SCHED_OTHER.  So it seems I have two
solutions when using a new kernel, and I'm satisfied.  I'll see if I can
clean up my patch to fix the broken epoll behavior and send it in.

Thanks,
Shawn
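A rough sketch of the timerfd approach described above (illustrative
only, not Shawn's actual code): a CLOCK_MONOTONIC timerfd armed with a
1 ms period is added to the epoll set, and epoll_wait() blocks with an
infinite timeout (-1), so the kernel timer rather than the epoll timeout
rounding paces the loop.

/* Sketch: periodic 1 ms timerfd in an epoll set, infinite epoll timeout. */
#include <stdint.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/timerfd.h>

int main(void)
{
	int epfd = epoll_create1(0);
	int tfd = timerfd_create(CLOCK_MONOTONIC, 0);

	struct itimerspec its = {
		.it_interval = { 0, 1000000 },  /* fire every 1 ms */
		.it_value    = { 0, 1000000 },  /* first expiry after 1 ms */
	};
	timerfd_settime(tfd, 0, &its, NULL);

	struct epoll_event ev = { .events = EPOLLIN, .data.fd = tfd };
	epoll_ctl(epfd, EPOLL_CTL_ADD, tfd, &ev);

	for (;;) {
		struct epoll_event events[1];
		/* -1: block until the timer (or another fd) becomes readable */
		int n = epoll_wait(epfd, events, 1, -1);
		if (n > 0 && events[0].data.fd == tfd) {
			uint64_t expirations;
			read(tfd, &expirations, sizeof(expirations));
			/* do the periodic work here */
		}
	}
	return 0;
}

The periodic timer keeps firing at its programmed rate even when an
iteration runs long (missed expirations are reported in the read()),
which is why this paces the loop better than re-arming a 1 ms epoll
timeout on every pass.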