From: Scott Lurndal on 26 Mar 2010 14:00
On Fri, Mar 26, 2010 at 10:23:46AM -0700, Linus Torvalds wrote:
>
> On Fri, 26 Mar 2010, David Howells wrote:
> >
> > fls(N), ffs(N) and fls64(N) can be optimised on x86/x86_64. Currently they
> > perform checks against N being 0 before invoking the BSR/BSF instruction, or
> > use a CMOV instruction afterwards. Either the check involves a conditional
> > jump which we'd like to avoid, or a CMOV, which we'd also quite like to avoid.
> >
> > Instead, we can make use of the fact that BSR/BSF doesn't modify its output
> > register if its input is 0. By preloading the output with -1 and incrementing
> > the result, we achieve the desired result without the need for a conditional
> > check.
>
> This is totally incorrect.
>
> Where did you find that "doesn't modify its output" thing? It's not true.
> The truth is that the destination is undefined. Just read the dang Intel
> documentation, it's very clearly stated right there.

While this is true for the current (253666-031US) Intel documentation,
the AMD documentation (rev 3.14) for the same instruction states that
the destination register is unchanged (as opposed to Intel's undefined).

I wonder if Intel's EM64 stuff makes this more deterministic, perhaps
David's implementation would work for x86_64 only?

scott
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
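The "preload -1 and increment" idea quoted above can be sketched in portable C. This is only a software model, not the proposed patch: bsr_model(), fls_preload() and fls_checked() are hypothetical names, and the "destination unchanged on zero input" rule is written in by hand as an assumption, since Intel documents the destination as undefined in that case.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Software model of BSR under the ASSUMED "destination unchanged when the
 * source is 0" behaviour debated in the thread. Real hardware is only
 * documented to behave this way by AMD; Intel documents it as undefined.
 */
static inline int bsr_model(uint32_t src, int dest)
{
    if (src == 0)
        return dest;                /* modelled: preloaded value survives */

    int idx = 31;
    while (!(src & (UINT32_C(1) << idx)))
        idx--;
    return idx;                     /* index of the most significant set bit */
}

/* fls() as the proposal describes it: preload -1, scan, add 1. */
static inline int fls_preload(uint32_t n)
{
    int r = bsr_model(n, -1);       /* r stays -1 when n == 0 */
    return r + 1;                   /* 0 for n == 0, otherwise 1..32 */
}

/* Reference version with the explicit zero check the proposal wants to drop. */
static inline int fls_checked(uint32_t n)
{
    if (n == 0)
        return 0;
    return bsr_model(n, 0) + 1;
}
```

Both versions agree for every input in this model; the open question in the thread is whether real CPUs can be relied on to behave like bsr_model() when the source is zero.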
From: Linus Torvalds on 26 Mar 2010 14:10
On Fri, 26 Mar 2010, Ralf Baechle wrote:
>
> My trusty old 486 book [1] in the remarks about the BSF instruction:
>
> "The documentation on the 80386 and 80486 states that op1 is undefined if
> op2 is 0. In reality the 80386 will leave the value in op1 unchanged.
> The first versions of the 80486 will change op1 to an undefined value.
> Later version again will leave it unchanged."
>
> [1] Die Intel Familie in German language, by Robert Hummel, 1992

Ok, that explains my memory of us having tried this, at least.

But I do wonder if any of the people working for Intel could ask the CPU
architects whether we could depend on the "don't write" for 64-bit mode.
If AMD already documents the don't-touch semantics, and if Intel were to
be ok with documenting it for their 64-bit capable CPU's, we wouldn't then
need to rely on undefined behavior.

Linus
From: Matthew Wilcox on 26 Mar 2010 14:20
On Fri, Mar 26, 2010 at 11:03:09AM -0700, Linus Torvalds wrote:
> On Fri, 26 Mar 2010, Ralf Baechle wrote:
> >
> > My trusty old 486 book [1] in the remarks about the BSF instruction:
> >
> > "The documentation on the 80386 and 80486 states that op1 is undefined if
> > op2 is 0. In reality the 80386 will leave the value in op1 unchanged.
> > The first versions of the 80486 will change op1 to an undefined value.
> > Later version again will leave it unchanged."
> >
> > [1] Die Intel Familie in German language, by Robert Hummel, 1992
>
> Ok, that explains my memory of us having tried this, at least.
>
> But I do wonder if any of the people working for Intel could ask the CPU
> architects whether we could depend on the "don't write" for 64-bit mode.
> If AMD already documents the don't-touch semantics, and if Intel were to
> be ok with documenting it for their 64-bit capable CPU's, we wouldn't then
> need to rely on undefined behavior.

I'll drop one of them a note.

--
Matthew Wilcox
Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
From: Matthew Wilcox on 6 Apr 2010 09:40
On Fri, Mar 26, 2010 at 11:03:09AM -0700, Linus Torvalds wrote:
> On Fri, 26 Mar 2010, Ralf Baechle wrote:
> > "The documentation on the 80386 and 80486 states that op1 is undefined if
> > op2 is 0. In reality the 80386 will leave the value in op1 unchanged.
> > The first versions of the 80486 will change op1 to an undefined value.
> > Later version again will leave it unchanged."
> >
> > [1] Die Intel Familie in German language, by Robert Hummel, 1992
>
> Ok, that explains my memory of us having tried this, at least.
>
> But I do wonder if any of the people working for Intel could ask the CPU
> architects whether we could depend on the "don't write" for 64-bit mode.
> If AMD already documents the don't-touch semantics, and if Intel were to
> be ok with documenting it for their 64-bit capable CPU's, we wouldn't then
> need to rely on undefined behavior.

I don't know whether we can get it /documented/, but the architect I asked
said "We'll never get away with reverting to the older behavior, so in
essence the architecture is set to not overwrite."

--
Matthew Wilcox
Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
From: Jamie Lokier on 6 Apr 2010 10:00
Linus Torvalds wrote:
> On Fri, 26 Mar 2010, Scott Lurndal wrote:
> >
> > I wonder if Intel's EM64 stuff makes this more deterministic, perhaps
> > David's implementation would work for x86_64 only?
>
> Limiting it to x86-64 would certainly remove all the worries about all the
> historical x86 clones.
>
> I'd still worry about it for future Intel chips, though. I absolutely
> _detest_ relying on undocumented features - it pretty much always ends up
> biting you eventually. And conditional writeback is actually pretty nasty
> from a microarchitectural standpoint.

On the same subject of relying on undocumented features:

    /* If SMP and !X86_PPRO_FENCE. */
    #define smp_rmb() barrier()

I've seen documentation, links posted to lkml ages ago, which implies
this is fine on 64-bit for both Intel and AMD. But it appears to be
relying on undocumented behaviour on 32-bit...

Are you sure it is ok? Has anyone from Intel/AMD ever confirmed it is
ok? Has it been tested? Clones?

-- Jamie
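To make the question above concrete, here is a minimal sketch of what defining smp_rmb() as barrier() amounts to. It assumes a GCC- or Clang-style compiler (the empty asm with a "memory" clobber is a compiler extension). The names payload, ready, publish() and try_consume() are hypothetical and exist only for illustration.

```c
#include <assert.h>

/*
 * Compiler-only barrier: emits no CPU instruction, but stops the compiler
 * from reordering or caching memory accesses across it.
 */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* The definition quoted in the message above (SMP, !X86_PPRO_FENCE). */
#define smp_rmb() barrier()

static int payload;          /* hypothetical shared data */
static volatile int ready;   /* hypothetical "data is valid" flag */

/* Writer side: store the data, then set the flag. */
static inline void publish(int value)
{
    payload = value;
    barrier();               /* keep the compiler from moving the flag store up */
    ready = 1;
}

/* Reader side: check the flag, then load the data. */
static inline int try_consume(int *out)
{
    if (!ready)
        return 0;
    /*
     * Only the compiler is constrained here. Correctness on SMP rests on
     * the ASSUMPTION that the CPU itself does not reorder these two loads,
     * which is exactly the property being questioned for 32-bit parts.
     */
    smp_rmb();
    *out = payload;
    return 1;
}
```

If the hardware were allowed to reorder the two loads in try_consume(), a real fence instruction would be needed in place of the compiler barrier.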