Prev: [PATCH V3] input: Default to only using PNP for i8042 probing on x86
Next: [PATCH 3/3] [watchdog] re-introduce support for nmi_watchdog, nosoftlockup
From: Linus Torvalds on 17 May 2010 17:50 On Mon, 17 May 2010, Ingo Molnar wrote: > > +#ifdef CONFIG_64BIT > +/* popcnt %rdi, %rax */ > +#define POPCNT ".byte 0xf3,0x48,0x0f,0xb8,0xc7" > +#define REG_IN "D" > +#define REG_OUT "a" .... > +/* > + * __sw_hweightXX are called from within the alternatives below > + * and callee-clobbered registers need to be taken care of. See > + * ARCH_HWEIGHT_CFLAGS in <arch/x86/Kconfig> for the respective > + * compiler switches. > + */ > +static inline unsigned int __arch_hweight32(unsigned int w) > +{ > + unsigned int res = 0; > + > + asm (ALTERNATIVE("call __sw_hweight32", POPCNT, X86_FEATURE_POPCNT) > + : "="REG_OUT (res) > + : REG_IN (w)); > + > + return res; > +} I do not believe this is correct. On x86-64, you are using a 64-bit instruction, but REG_IN (w) does _not_ guarantee that the register is zero in the high bits. Yes, yes, in practice it _probably_ is, because the register almost certainly got loaded with some kind of zero-extending mov instruction. But as far as I can tell, the code is buggy. You're telling gcc that you are using a 32-bit register, but you're actually counting bits in the full 64 bits. Am I missing something? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: H. Peter Anvin on 17 May 2010 18:10
On 05/17/2010 02:46 PM, Linus Torvalds wrote: > > > On Mon, 17 May 2010, Ingo Molnar wrote: >> >> +#ifdef CONFIG_64BIT >> +/* popcnt %rdi, %rax */ >> +#define POPCNT ".byte 0xf3,0x48,0x0f,0xb8,0xc7" >> +#define REG_IN "D" >> +#define REG_OUT "a" > ... >> +/* >> + * __sw_hweightXX are called from within the alternatives below >> + * and callee-clobbered registers need to be taken care of. See >> + * ARCH_HWEIGHT_CFLAGS in <arch/x86/Kconfig> for the respective >> + * compiler switches. >> + */ >> +static inline unsigned int __arch_hweight32(unsigned int w) >> +{ >> + unsigned int res = 0; >> + >> + asm (ALTERNATIVE("call __sw_hweight32", POPCNT, X86_FEATURE_POPCNT) >> + : "="REG_OUT (res) >> + : REG_IN (w)); >> + >> + return res; >> +} > > I do not believe this is correct. > > On x86-64, you are using a 64-bit instruction, but > > REG_IN (w) > > does _not_ guarantee that the register is zero in the high bits. > > Yes, yes, in practice it _probably_ is, because the register almost > certainly got loaded with some kind of zero-extending mov instruction. But > as far as I can tell, the code is buggy. You're telling gcc that you are > using a 32-bit register, but you're actually counting bits in the full 64 > bits. > > Am I missing something? > No, you're absolutely right; the input can be any value including sign-extended. The best fix for this is probably to use a 32-bit instruction in this case -- I will check in such a patch and update the pull branch. If that's okay with you we'll send you a new pull request in a bit. The embarrassing part is I think I spotted this problem at one point, but forgot about it in dealing with a conflict between this and another patchset. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |