From: M. Vefa Bicakci on 18 Jul 2010 10:30
On 17/07/10 10:15 PM, Linus Torvalds wrote:
> On Sat, Jul 17, 2010 at 11:58 AM, M. Vefa Bicakci
> <bicave(a)superonline.com> wrote:
>>
>> The kernel with d8e0902806c0bd2ccc4f6a267ff52565a3ec933b reverted
>> was able to hibernate/thaw at least 40 times in one go, while
>> the one with your fix applied was able to hibernate/thaw at most
>> 17 times (in two separate trials) after which it crashed during
>> the next thaw.
>
> Ok. I do wonder if the bug is possibly something entirely different,
> and the allocation patterns just happen to expose/hide it. Reverting
> the original commit should be pretty darn close to applying my fix.
> Any remaining issues would seem to be more about the actual bug in the
> original code (racing on changing that mapping->gfp_mask without any
> locking) than about anything else.
>
>> Is there anything I can do to find out the correct flags to use
>> in addition to GFP_HIGHUSER ? Can I do something like a bisection
>> for the flags one by one starting from the pre 2.6.32.8 state?
>> If you could outline a procedure to do this, I would be glad to
>> follow it.
>
> You can try adding __GFP_RECLAIMABLE | __GFP_NOMEMALLOC to the set of
> flags in i915_gem_object_get_pages(). That's what the old code had
> (and then it played games with NORETRY|NOWARN). I've attached a patch
> (UNTESTED! Maybe it won't compile).
>
> Now, I don't see why those flags would matter, but NOMEMALLOC in
> particular does make a difference for memory allocation patterns under
> low memory conditions, so maybe it could make a difference.
>
> And if it _does_ make a difference, it would be interesting to know
> which of the two flags matter. So try both flags first, and see if
> that gets you something reliable. And if it does, remove one of them
> and try again - just to see _which_ flag it is that the i915 driver
> would care about. That would hopefully give us a hint.

Dear Linus,

After hours of testing I came up with the following result: We need
to have the __GFP_RECLAIMABLE flag in addition to GFP_HIGHUSER.

First I tested a kernel with both flags added to your fix. I was able
to get more than 60 hibernate/thaw cycles without any errors, so I
thought that was good.

Then I tried a kernel with only __GFP_NOMEMALLOC added, and I found
out that this kernel wasn't very reliable. In the first trial run, I
got a crash in the second thaw. (Magic Sys-Rq did work.) In the second
trial run, I got an Xorg-related kernel Oops in the 12th thaw.
Therefore I concluded that having only __GFP_NOMEMALLOC in addition to
GFP_HIGHUSER was not good enough.

Finally, I tested a kernel with only __GFP_RECLAIMABLE added. For this
one, I did two trial runs, each with 60 hibernate/thaw cycles. I had no
problems during these runs, so I concluded that __GFP_RECLAIMABLE is
the key flag to use in addition to GFP_HIGHUSER and __GFP_COLD.

I think in a previous e-mail you were suggesting that __GFP_RECLAIMABLE
could be optionally needed for a few technical reasons. To be honest,
I have no idea why it looks like it is needed for proper operation.

As always, it is great to report test results. Hopefully I ran enough
tests this time.

Regards,

M. Vefa Bicakci
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
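
The narrowing-down procedure used in the message above (test with both
flags, then drop one flag at a time and retest) can be sketched roughly
as follows. This is only an illustration in Python, not kernel code: the
function name necessary_flags and the REPORTED table are made up for the
example, and the table simply encodes the outcomes reported in this
thread as reliable / not reliable.

```python
# Flags tried on top of GFP_HIGHUSER (flag names as used in the thread).
CANDIDATES = ("__GFP_RECLAIMABLE", "__GFP_NOMEMALLOC")

# Outcomes as reported in the thread, keyed by the set of extra flags.
# True means every hibernate/thaw trial run finished without a crash.
REPORTED = {
    frozenset(): False,                        # fix alone: crashed after at most 17 cycles
    frozenset(CANDIDATES): True,               # both flags: more than 60 cycles, no errors
    frozenset({"__GFP_NOMEMALLOC"}): False,    # crash in 2nd thaw, Oops in 12th thaw
    frozenset({"__GFP_RECLAIMABLE"}): True,    # two runs of 60 cycles, no problems
}


def necessary_flags(candidates, is_reliable):
    """Return the flags whose removal from the full set breaks reliability.

    Hypothetical helper: is_reliable(flags) stands in for building a
    kernel with those extra flags and running the hibernate/thaw trials.
    Returns None if even the full set of flags is not reliable.
    """
    full = frozenset(candidates)
    if not is_reliable(full):
        return None  # nothing to narrow down
    # A flag matters if the kernel stops being reliable without it.
    return {flag for flag in candidates if not is_reliable(full - {flag})}


result = necessary_flags(CANDIDATES, REPORTED.__getitem__)
print(result)
```

With only two candidate flags this takes three kernel builds; with more
flags, dropping one at a time from a known-good set keeps the number of
builds linear in the number of flags, at the cost of missing cases where
two flags can substitute for each other.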
From: Linus Torvalds on 18 Jul 2010 13:10
On Sun, Jul 18, 2010 at 7:27 AM, M. Vefa Bicakci
<bicave(a)superonline.com> wrote:
>
> After hours of testing I came up with the following result: We need
> to have the __GFP_RECLAIMABLE flag in addition to GFP_HIGHUSER.

Thanks for the extensive testing, and I'm committing the one-liner to
add it, and cc'ing it to stable.

I'm pretty certain that there is something overly fragile in the i915
driver that this flag makes so much of a difference, but at the same
time I'm actually happy that it's that reclaimable flag, because at
least that one was always the "conceptually makes sense" one.

So I suspected it would be some low-memory issue and the flag that
would turn out to matter would be the NOMEMALLOC one, but I'm happy to
have been wrong. Adding __GFP_RECLAIMABLE is sane, although I really
would like to understand why the i915 driver apparently cares so
deeply about the allocation/freeing patterns.

But whatever. Thanks again for being such a thorough tester,

Linus