From: Suresh Rajashekara on 9 Jun 2010 16:00

I have an application (running on 2.6.29-omap1) which puts an OMAP1 system into suspend aggressively. The system wakes up every 4 seconds, stays awake for about 35 milliseconds and then sleeps again for another 4 seconds. This design is meant to save power on a battery-operated device.

This aggressive suspend/resume cycle seems to be causing problems for other applications in the system that are waiting for a timeout to happen. In particular, an application which waits using mq_timedreceive and is supposed to time out every 30 seconds seems to wake up only every 90 seconds. It looks like timekeeping is not happening properly inside the kernel.

If the suspend duration is changed from 4 seconds to 1 second, then things work somewhat better. On reducing it to 0.5 seconds (which was our earlier design on 2.6.16-rc3), the problem seems to disappear.

Is this expected?

Thanks in advance,
Suresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Thomas Gleixner on 9 Jun 2010 16:30

On Wed, 9 Jun 2010, Suresh Rajashekara wrote:
> I have an application (running on 2.6.29-omap1) which puts an OMAP1
> system to suspend aggressively. The system wakes up every 4 seconds
> and stays awake for about 35 milliseconds and sleeps again for another
> 4 seconds. This design is to save power on a battery operated device.
>
> This aggressive suspend resume action seems like creating an issue to
> other applications in the system waiting for some timeout to happen
> (especially an application which is waiting using the mq_timedreceive
> and is supposed to timeout every 30 seconds. It seems to wake up every
> 90 seconds). Seems like the timekeeping is not happening properly
> inside the kernel.
>
> If the suspend duration is changed from 4 second to 1 second, then
> things work somewhat better. On reducing it to 0.5 second (which was
> our earlier design on 2.6.16-rc3), the problem seems to disappear.
>
> Is this expected?

Yes, that's caused by the fact that suspend (via /sys/power/state) freezes the kernel internal timers and the user space visible timers which are based on CLOCK_MONOTONIC or jiffies (like mq_timedreceive on your .29 kernel). Only CLOCK_REALTIME based timers are kept correct, as we have to align to the wall clock time.

The reason for this is that otherwise almost all timers would have expired when we resume, and we would get a thundering herd of apps and kernel facilities due to firing timeouts.

Another problem is that jiffies can wrap around on 32 bit systems during a long suspend, though I don't think that's a real world problem as it takes between 49 and 497 days of suspend depending on the HZ setting. So for your use case it would not matter.

I'm more concerned about code getting surprised by firing timers, as the kernel has had this behaviour for a long time now. Though we could change that conditionally - the default would still be the freeze of jiffies and CLOCK_MONOTONIC for historical compatibility.

There will probably be some accounting issues: uptime, CPU time of the suspend task and some others, but that needs to be found out.

Thanks,

tglx
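The 49 to 497 day range mentioned above follows from a 32-bit jiffies counter wrapping after 2^32 ticks. A minimal sketch of that arithmetic, assuming HZ values of 1000 and 100 as the two ends of the range (the helper name jiffies_wrap_days is made up for illustration and is not a kernel function):

```c
#include <assert.h>

/*
 * Days until a 32-bit tick counter wraps at a given tick rate.
 * Hypothetical helper for illustration only.
 */
static double jiffies_wrap_days(unsigned int hz)
{
    assert(hz > 0);                      /* a zero tick rate makes no sense */

    const double ticks = 4294967296.0;   /* 2^32 ticks before the counter wraps */
    const double secs_per_day = 86400.0;

    return ticks / hz / secs_per_day;
}
```

With these assumptions, jiffies_wrap_days(1000) comes out just under 50 days and jiffies_wrap_days(100) just under 500 days, which matches the range quoted in the reply.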
From: Suresh Rajashekara on 10 Jun 2010 02:40

On Wed, Jun 9, 2010 at 1:22 PM, Thomas Gleixner <tglx(a)linutronix.de> wrote:
> Though we could change that conditionally - the default would still be
> the freeze of jiffies and CLOCK_MONOTONIC for historical compatibility.

If I were to change it only for our implementation, and make all the user space timers use CLOCK_REALTIME, could you please point me to the part of the kernel I should be touching to make that change?

Earlier we faced an issue with the time that the application sees. It wasn't getting updated when we suspended and resumed the system (whereas the time inside the kernel kept updating), and hence it would eventually drift from the actual time.

For example, if I use this loop at the command prompt

    while date
    do
        echo mem > /sys/power/state
    done

then the date command always displayed the same time, but the prints from the kernel (I was using the printk time information) were advancing as expected.

I found a patch at
https://patchwork.kernel.org/patch/50070/

Though this fixed the application time update issue, there are a lot of timers in the application which are still not working right.

Could anyone please point me in some direction to find the solution?

Thanks,
Suresh
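One possible user space shape for "make the timers use CLOCK_REALTIME" is an absolute deadline on a timerfd. This is only a sketch under assumptions: it relies on the timerfd API (present since 2.6.25, so it should be available on a 2.6.29 kernel with a suitable libc), wait_realtime_ms is a hypothetical helper name, and its behaviour across suspend on the OMAP1 tree discussed here has not been verified.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/*
 * Block until delay_ms of wall-clock time has passed, using an absolute
 * CLOCK_REALTIME deadline. Returns the expiration count read from the
 * timer (1 for this one-shot timer), or -1 on error.
 * Hypothetical helper for illustration only.
 */
static int64_t wait_realtime_ms(long delay_ms)
{
    assert(delay_ms >= 0);

    int fd = timerfd_create(CLOCK_REALTIME, 0);
    if (fd < 0)
        return -1;

    /* Deadline = current wall-clock time + delay; no interval, so one-shot. */
    struct itimerspec its = {0};
    if (clock_gettime(CLOCK_REALTIME, &its.it_value) != 0) {
        close(fd);
        return -1;
    }
    its.it_value.tv_sec += delay_ms / 1000;
    its.it_value.tv_nsec += (delay_ms % 1000) * 1000000L;
    if (its.it_value.tv_nsec >= 1000000000L) {
        its.it_value.tv_sec += 1;
        its.it_value.tv_nsec -= 1000000000L;
    }

    /* TFD_TIMER_ABSTIME: it_value is an absolute time on CLOCK_REALTIME. */
    if (timerfd_settime(fd, TFD_TIMER_ABSTIME, &its, NULL) != 0) {
        close(fd);
        return -1;
    }

    /* read() blocks until the deadline has passed, then yields the count. */
    uint64_t expirations = 0;
    ssize_t n = read(fd, &expirations, sizeof expirations);
    close(fd);

    return n == (ssize_t)sizeof expirations ? (int64_t)expirations : -1;
}
```

Because the deadline is absolute wall-clock time, a deadline that passes while the system is suspended should be seen as already expired after resume, rather than being pushed out by the length of the suspend. Note that a CLOCK_REALTIME deadline also moves if the wall clock is set.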
From: john stultz on 10 Jun 2010 16:00

On Wed, 2010-06-09 at 23:34 -0700, Suresh Rajashekara wrote:
> On Wed, Jun 9, 2010 at 1:22 PM, Thomas Gleixner <tglx(a)linutronix.de> wrote:
> > Though we could change that conditionally - the default would still be
> > the freeze of jiffies and CLOCK_MONOTONIC for historical compatibility.
>
> If I were to change it only for our implementation, and make all the
> user space timers use CLOCK_REALTIME, then could you please point me
> in a direction as to what part of the kernel I should be touching to
> make that change?

I think Thomas was suggesting that you consider creating an option where CLOCK_MONOTONIC includes total_sleep_time.

In that case the *hack* (and this is a hack, we'll need some more thoughtful discussion before anything like it could make it upstream) would be in timekeeping_resume(): comment out the lines that update wall_to_monotonic and total_sleep_time.

It would be interesting to hear if that hack works for you, and we can try to come up with a better way to think about how to accommodate both views of how to account time over suspend.

Thomas, might this call for a new posix clock_id, CLOCK_BOOTTIME (i.e. CLOCK_MONOTONIC + total_sleep_time), or something that userland could use to set timers on?

thanks
-john
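A clock with this name and meaning was later merged (Linux 2.6.39), so it is not available on the 2.6.29 kernel discussed in this thread. On kernels that do provide it, the relation CLOCK_BOOTTIME = CLOCK_MONOTONIC + total sleep time can be observed from user space. The following is a minimal sketch under that assumption; ts_to_ns and suspended_ns are hypothetical helper names.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to a single nanosecond count. */
static int64_t ts_to_ns(const struct timespec *ts)
{
    assert(ts != NULL);
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Store an estimate of the time spent suspended since boot in *out_ns,
 * computed as CLOCK_BOOTTIME minus CLOCK_MONOTONIC.
 * Returns 0 on success, -1 if either clock cannot be read.
 * Hypothetical helper for illustration only.
 */
static int suspended_ns(int64_t *out_ns)
{
    assert(out_ns != NULL);

    struct timespec mono, boot;
    if (clock_gettime(CLOCK_MONOTONIC, &mono) != 0)
        return -1;
    if (clock_gettime(CLOCK_BOOTTIME, &boot) != 0)
        return -1;

    /* The two reads are not atomic, so the result is only approximate. */
    *out_ns = ts_to_ns(&boot) - ts_to_ns(&mono);
    return 0;
}
```

On a machine that has never been suspended the result should be close to zero; after a suspend/resume cycle it should grow by roughly the time spent asleep.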
From: Thomas Petazzoni on 11 Jun 2010 03:30
Hello Suresh,

On Wed, 9 Jun 2010 12:50:39 -0700
Suresh Rajashekara <suresh.raj+linuxomap(a)gmail.com> wrote:
> I have an application (running on 2.6.29-omap1) which puts an OMAP1
> system to suspend aggressively. The system wakes up every 4 seconds
> and stays awake for about 35 milliseconds and sleeps again for another
> 4 seconds. This design is to save power on a battery operated device.
>
> This aggressive suspend resume action seems like creating an issue to
> other applications in the system waiting for some timeout to happen
> (especially an application which is waiting using the mq_timedreceive
> and is supposed to timeout every 30 seconds. It seems to wake up every
> 90 seconds). Seems like the timekeeping is not happening properly
> inside the kernel.
>
> If the suspend duration is changed from 4 second to 1 second, then
> things work somewhat better. On reducing it to 0.5 second (which was
> our earlier design on 2.6.16-rc3), the problem seems to disappear.

I've done a relatively similar thing on a different CPU architecture: in the idle loop, when the CPU is going to be idle for a sufficiently long period of time, I power down the CPU completely. Before that, I program an RTC (clocked at 32 kHz) to wake up the CPU a little bit *before* the expiration of the next timer.

When the CPU wakes up, I adjust the clocksource (in this case the CPU cycle counter) to compensate for the time spent while the CPU was off, and I reprogram the clockevents to make sure that the timer will actually expire at the correct time, also by compensating for the time during which the CPU was off (note: when the CPU is off, the cycle counter stops incrementing, and the timer used as clockevents stops decrementing).

This way, the CLOCK_MONOTONIC time continues to go forward even when the CPU is off. The goal was to make the "CPU is off" case just another idle state of the system, which should be just as transparent to the life of the system as other idle states.

So an application that uses a periodic timer of, say, 30 milliseconds will see its timer actually fire every 30 milliseconds even though the CPU goes off between each timer expiration (we've done measurements with a scope, and the timer really expires every 30 milliseconds as expected).

FWIW, we do not use the normal suspend/resume infrastructure for this, because it was way too slow (on the order of ~100ms). On the particular hardware we're using, it takes roughly ~1ms to go OFF, and ~2ms to completely wake up, so we can very aggressively put the CPU in the OFF state.

However, the way we're doing the "time compensation" is quite hackish, and it would be great to hear Thomas Gleixner's ideas on how this should be implemented properly at the clocksource/clock_event_device level.

Sincerely,

Thomas
--
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
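The arithmetic behind the compensation described above can be sketched roughly as follows. This is not the code from that system: the function names are hypothetical, and the RTC rate (e.g. 32768 Hz for the "32 kHz" RTC) and the CPU frequency are assumptions passed in as parameters.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Convert RTC ticks (RTC running at rtc_hz) into nanoseconds of off-time. */
static uint64_t rtc_ticks_to_ns(uint64_t ticks, uint32_t rtc_hz)
{
    assert(rtc_hz > 0);
    /* Split into whole seconds and remainder to avoid 64-bit overflow. */
    return (ticks / rtc_hz) * NSEC_PER_SEC +
           (ticks % rtc_hz) * NSEC_PER_SEC / rtc_hz;
}

/*
 * Cycle-counter value the clocksource should report after wake-up:
 * the value saved when the CPU went off, plus the cycles that would
 * have elapsed during off_ns had the counter kept running at cpu_hz.
 */
static uint64_t compensated_cycles(uint64_t cycles_at_off, uint64_t off_ns,
                                   uint32_t cpu_hz)
{
    return cycles_at_off +
           (off_ns / NSEC_PER_SEC) * cpu_hz +
           (off_ns % NSEC_PER_SEC) * cpu_hz / NSEC_PER_SEC;
}

/*
 * Time left on the next timer after wake-up: what was left when the CPU
 * went off, minus the off-time. Clamped to 0 if the deadline was missed.
 */
static uint64_t remaining_timer_ns(uint64_t remaining_at_off_ns,
                                   uint64_t off_ns)
{
    return off_ns >= remaining_at_off_ns ? 0 : remaining_at_off_ns - off_ns;
}
```

For example, with a 30 ms timer pending and 29 ms spent off, remaining_timer_ns() gives 1 ms to program into the clockevent device, which is the effect of waking up "a little bit before" the expiration.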