From: Andreas Herrmann on 5 Nov 2009 12:40

The patches don't properly work here.

(1) For instance I got following log entries when doing
    suspend/resume, doing CPU offline/online test and reloading the
    module:

  microcode: original microcode versions...
  microcode: CPU0-3: patch_level=0x1000065
  platform microcode: firmware: requesting amd-ucode/microcode_amd.bin
  ...
  microcode: CPU0-1,3: patch_level=0x1000083
  microcode: CPU2-3: patch_level=0x1000065
  Microcode Update Driver: v2.00 <tigran(a)aivazian.fsnet.co.uk>, Peter Oruba

The patch levels are:

  # for i in `seq 0 3`; do lsmsr -c $i PATCH_LEVEL; done
  PATCH_LEVEL = 0x0000000001000083
  PATCH_LEVEL = 0x0000000001000083
  PATCH_LEVEL = 0x0000000001000065
  PATCH_LEVEL = 0x0000000001000065

(2) During suspend/resume the ucode is not updated:

  hadburg linux # for i in `seq 0 3`; do lsmsr -c $i PATCH_LEVEL; done
  PATCH_LEVEL = 0x0000000001000083
  PATCH_LEVEL = 0x0000000001000083
  PATCH_LEVEL = 0x0000000001000083
  PATCH_LEVEL = 0x0000000001000083
  hadburg linux # pm-suspend
  hadburg linux # for i in `seq 0 3`; do lsmsr -c $i PATCH_LEVEL; done
  PATCH_LEVEL = 0x0000000001000065
  PATCH_LEVEL = 0x0000000001000065
  PATCH_LEVEL = 0x0000000001000065
  PATCH_LEVEL = 0x0000000001000065

That used to work w/o your patches. Didn't have time to look why this
is now failing. You've changed mc_cpu_callback() -- most likely that
is causing this regression.

Regards,
Andreas

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Andreas Herrmann on 6 Nov 2009 07:40

On Thu, Nov 05, 2009 at 07:40:53PM +0100, Dmitry Adamushko wrote:
> 2009/11/5 Andreas Herrmann <herrmann.der.user(a)googlemail.com>:
> > The patches don't properly work here.
> >
> > (1) For instance I got following log entries when doing
> >     suspend/resume, doing CPU offline/online test and reloading the
> >     module:
>
> To avoid possible misunderstandings, I'd like to clarify the output below.
>
> >  microcode: original microcode versions...
> >  microcode: CPU0-3: patch_level=0x1000065
>
> So this is the 1st time you have loaded a module.
>
> >  platform microcode: firmware: requesting amd-ucode/microcode_amd.bin
> >  ...
> >  microcode: CPU0-1,3: patch_level=0x1000083
>
> before or after loading a module? CPU2 is down, isn't it?

No, no CPU was offline at this moment. They all were brought back
online after some CPU hotplug and/or suspend/resume tests.

> >  microcode: CPU2-3: patch_level=0x1000065

Both messages showed up after the same ucode-update process.

> same question as above.

Same answer as above: all CPUs are online.

> Here, either CPUs 0 and 1 are down or have a
> different version. Both above messages don't make sense taken together

See, and that's the problem.

> (CPU3 belongs to both sets) unless summarize_cpu_info() is utterly
> broken.

I didn't check that yet.

> >  Microcode Update Driver: v2.00 <tigran(a)aivazian.fsnet.co.uk>, Peter Oruba
> >
> > The patch levels are:
> >
> >  # for i in `seq 0 3`; do lsmsr -c $i PATCH_LEVEL; done
> >  PATCH_LEVEL = 0x0000000001000083
> >  PATCH_LEVEL = 0x0000000001000083
> >  PATCH_LEVEL = 0x0000000001000065
> >  PATCH_LEVEL = 0x0000000001000065
>
> this is after your test has been stopped and all the CPUs are up, right?

Yes.

> > (2) During suspend/resume the ucode is not updated:
> >
> >  hadburg linux # for i in `seq 0 3`; do lsmsr -c $i PATCH_LEVEL; done
> >  PATCH_LEVEL = 0x0000000001000083
> >  PATCH_LEVEL = 0x0000000001000083
> >  PATCH_LEVEL = 0x0000000001000083
> >  PATCH_LEVEL = 0x0000000001000083
> >  hadburg linux # pm-suspend
> >  hadburg linux # for i in `seq 0 3`; do lsmsr -c $i PATCH_LEVEL; done
> >  PATCH_LEVEL = 0x0000000001000065
> >  PATCH_LEVEL = 0x0000000001000065
> >  PATCH_LEVEL = 0x0000000001000065
> >  PATCH_LEVEL = 0x0000000001000065
> >
> >
> > That used to work w/o your patches. Didn't have time to look why this
> > is now failing. You've changed mc_cpu_callback() -- most likely that
> > is causing this regression.
>
> Hmm, cpu-event-callbacks seem to be working on my (Intel) setup. I
> have enabled pr_debug messages and also did a little trick to allow
> ucode of the same version to be loaded (my cpu is of the recent ucode
> by itself) and I can see cpu-callback events for both resuming and
> cpu-up cases.
>
> (firstly, upgraded with microcode_ctl as I only have a .dat file)
>
> suspend-resume
> ...
> [ 584.506371] microcode: CPU1 removed
> [ 584.516018] microcode: CPU0 updated to revision 0x57, date = 2007-03-15
> [ 584.597326] microcode: CPU1 updated upon resume
> [ 584.597562] microcode: CPU1 updated to revision 0x57, date = 2007-03-15
> [ 584.597565] microcode: CPU1 added
> ...
>
> and now cpu1 : down -> up
>
> [ 1616.932249] microcode: CPU1 removed
> [ 1633.942502] platform microcode: firmware: requesting intel-ucode/06-0f-02
> [ 1633.954638] microcode: data file intel-ucode/06-0f-02 load failed
> [ 1633.954642] microcode: CPU1 added
>
>
> as I understand, you don't see " platform microcode: firmware:
> requesting intel-ucode" messages upon 'upping' a cpu, do you?

Sure, no intel-ucode messages as I tested with AMD CPUs ;-)
But otherwise no, no messages.

> sure, my test is somewhat limited... anyway, first of all I'd like to
> get a clear understanding of your logs. Thanks for your test btw. :-))

I'll send you full logs asap.

Regards,
Andreas
From: Andreas Herrmann on 6 Nov 2009 14:50

On Fri, Nov 06, 2009 at 01:56:31PM +0100, Dmitry Adamushko wrote:
> 2009/11/6 Andreas Herrmann <herrmann.der.user(a)googlemail.com>:

<snip>

> >> (CPU3 belongs to both sets) unless summarize_cpu_info() is utterly
> >> broken.
> >
> > I didn't check that yet.
>
> Yeah, this behavior is likely due to a missing cpumask_clear() in
> summarize_cpu_info().

Yeah, that fixes the wrong messages.
The other problem of not-updated CPU microcode after suspend/resume
persists.

> should be as follows:
>
>         if (!alloc_cpumask_var(&cpulist, GFP_KERNEL))
>                 return;
>
> +       cpumask_clear(cpulist);

Better use zalloc_cpumask_var() instead of alloc/clear.

> >> sure, my test is somewhat limited... anyway, first of all I'd like to
> >> get a clear understanding of your logs. Thanks for your test btw. :-))
> >
> > I'll send you full logs asap.
>
> Thanks. Maybe it's something about a particular sequence of actions
> that triggers this behavior. Or was it reproducible with the very
> first pm-suspend invocation after "modprobe microcode.ko"?

The sequence is:

  1. loading microcode.ko
  2. setting cpu2 offline
  3. setting cpu2 online
  4. suspend (pm-suspend)
  5. resume

microcode of CPU2 is not updated:

  # for i in `seq 0 3`; do lsmsr -c $i PATCH_LEVEL; done
  PATCH_LEVEL = 0x0000000001000083
  PATCH_LEVEL = 0x0000000001000083
  PATCH_LEVEL = 0x0000000001000065
  PATCH_LEVEL = 0x0000000001000083

dmesg attached.

As I've said, that test used to pass with all CPUs updated to new
ucode in the past (at least I think so ;-( -- but in contrast to
my previous mail this doesn't seem to be related to your patch. I
tested latest mainline and the test fails as well ... seems that I
need to do some debugging.

Regards,
Andreas

PS1: You should remove the needless newline from the patch level
     string:

 static int version_snprintf(char *buf, int len, struct cpu_signature *csig)
 {
-        return snprintf(buf, len, "patch_level=0x%x\n", csig->rev);
+        return snprintf(buf, len, "patch_level=0x%x", csig->rev);
 }

PS2: I plan to remove further needless messages from the amd ucode
     driver asap.
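
The effect of the missing cpumask_clear() discussed in the message above can be reproduced in user space. The following is only a sketch under stated assumptions: `mask_t`, `collect_cpus()` and `report_mask()` are hypothetical stand-ins invented for illustration, not the kernel's cpumask API and not the actual summarize_cpu_info() code. It shows how a stale bit left in an uninitialized mask could make CPU3 appear in the 0x1000083 group even though its real patch level is 0x1000065.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a kernel cpumask: one bit per CPU. */
typedef uint32_t mask_t;

/*
 * Set a bit in *mask for every CPU whose patch level equals `level`.
 * With clear_first == 0, bits already present in *mask survive, which
 * mimics using an allocated mask without clearing it first.
 */
void collect_cpus(mask_t *mask, const uint32_t *levels, int ncpus,
                  uint32_t level, int clear_first)
{
    assert(ncpus >= 0 && ncpus <= 32);  /* mask_t holds at most 32 CPUs */

    if (clear_first)
        *mask = 0;

    for (int cpu = 0; cpu < ncpus; cpu++) {
        if (levels[cpu] == level)
            *mask |= (mask_t)1 << cpu;
    }
}

/*
 * Return the mask that would be reported for `level` when the buffer
 * initially contains `garbage` (simulating uninitialized memory).
 */
mask_t report_mask(const uint32_t *levels, int ncpus, uint32_t level,
                   mask_t garbage, int clear_first)
{
    mask_t mask = garbage;

    collect_cpus(&mask, levels, ncpus, level, clear_first);
    return mask;
}
```

With the patch levels read via lsmsr in the first report (0x1000083 on CPU0-1, 0x1000065 on CPU2-3) and an assumed stale bit for CPU3, the uncleared variant reports CPU0-1,3 for 0x1000083, the same overlap seen in the log. Clearing the mask first gives CPU0-1 and CPU2-3 as expected.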
From: Andreas Herrmann on 11 Nov 2009 14:40

On Wed, Nov 11, 2009 at 05:07:22PM +0100, Dmitry Adamushko wrote:
> Andreas,
>
>
> any progress with this issue?

Yes

> You mentioned that the problem is also reproducible without my
> patches, right?

.... and yes. Fixed with
http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-tip.git;a=commitdiff;h=9f15226e75583547aaf542c6be4bdac1060dd425

Andreas
From: Ingo Molnar on 12 Nov 2009 06:40

-tip testing found the following bug - there's a _long_ boot delay of
58.6 seconds if the CPU family is not supported:

[ 1.421761] calling microcode_init+0x0/0x137 @ 1
[ 1.426532] platform microcode: firmware: requesting amd-ucode/microcode_amd.bin
[ 61.433126] microcode: failed to load file amd-ucode/microcode_amd.bin
[ 61.439682] microcode: CPU0: AMD CPU family 0xf not supported
[ 61.445441] microcode: CPU1: AMD CPU family 0xf not supported
[ 61.451273] Microcode Update Driver: v2.00 <tigran(a)aivazian.fsnet.co.uk>, Peter Oruba
[ 61.459116] initcall microcode_init+0x0/0x137 returned 0 after 58625622 usecs

Where does this delay come from?

Ingo