From: Eric Dumazet on 8 Apr 2010 04:00

On Thursday, 08 April 2010 at 15:54 +0800, Zhang, Yanmin wrote:

> If there are 2 nodes in the machine, processes on node 0 will contact the MCH of
> node 1 to access memory of node 1. I suspect the MCH of node 1 might enter
> a power-saving mode when all the cpus of node 1 are free. So the transactions
> from MCH 1 to MCH 0 have a larger latency.

Hmm, thanks for the hint, I will investigate this.
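A minimal sketch of one way this idle-node hypothesis could be checked, using only the tools already in this thread (numactl and hackbench); the spinner command and the output file names are illustrative assumptions, not something taken from the thread:

# Case 1: node 0 cpus use node 1 memory while node 1 cpus stay idle.
numactl --cpubind=0 --membind=1 hackbench 25 process 5000 >remote_idle

# Case 2: same measurement, but first pin a throwaway busy loop to node 1
# so its package is less likely to drop into a deep power-saving state.
numactl --cpubind=1 sh -c 'while :; do :; done' &
SPIN=$!
numactl --cpubind=0 --membind=1 hackbench 25 process 5000 >remote_busy
kill $SPIN

echo "node1 idle:"; cat remote_idle
echo "node1 busy:"; cat remote_busy

If the slowdown in the individual cross-node runs really comes from the remote node powering down, case 2 should land noticeably closer to the concurrent-run numbers than case 1. A single spinner may or may not be enough to hold the whole package awake, so this is only a first-order check.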
From: Zhang, Yanmin on 8 Apr 2010 04:00

On Thu, 2010-04-08 at 09:00 +0200, Eric Dumazet wrote:
> On Thursday, 08 April 2010 at 07:39 +0200, Eric Dumazet wrote:
> > I suspect NUMA is completely out of order on the current kernel, or my
> > Nehalem machine's NUMA support is a joke.
> >
> > # numactl --hardware
> > available: 2 nodes (0-1)
> > node 0 size: 3071 MB
> > node 0 free: 2637 MB
> > node 1 size: 3062 MB
> > node 1 free: 2909 MB
> >
> > # cat try.sh
> > hackbench 50 process 5000
> > numactl --cpubind=0 --membind=0 hackbench 25 process 5000 >RES0 &
> > numactl --cpubind=1 --membind=1 hackbench 25 process 5000 >RES1 &
> > wait
> > echo node0 results
> > cat RES0
> > echo node1 results
> > cat RES1
> >
> > numactl --cpubind=0 --membind=1 hackbench 25 process 5000 >RES0_1 &
> > numactl --cpubind=1 --membind=0 hackbench 25 process 5000 >RES1_0 &
> > wait
> > echo node0 on mem1 results
> > cat RES0_1
> > echo node1 on mem0 results
> > cat RES1_0
> >
> > # ./try.sh
> > Running with 50*40 (== 2000) tasks.
> > Time: 16.865
> > node0 results
> > Running with 25*40 (== 1000) tasks.
> > Time: 16.767
> > node1 results
> > Running with 25*40 (== 1000) tasks.
> > Time: 16.564
> > node0 on mem1 results
> > Running with 25*40 (== 1000) tasks.
> > Time: 16.814
> > node1 on mem0 results
> > Running with 25*40 (== 1000) tasks.
> > Time: 16.896
>
> If run individually, the test results are more what we would expect
> (slow), but if the machine runs the two sets of processes concurrently, each
> group runs much faster...

If there are 2 nodes in the machine, processes on node 0 will contact the MCH of
node 1 to access memory of node 1. I suspect the MCH of node 1 might enter
a power-saving mode when all the cpus of node 1 are free. So the transactions
from MCH 1 to MCH 0 have a larger latency.

> # numactl --cpubind=0 --membind=1 hackbench 25 process 5000
> Running with 25*40 (== 1000) tasks.
> Time: 21.810
>
> # numactl --cpubind=1 --membind=0 hackbench 25 process 5000
> Running with 25*40 (== 1000) tasks.
> Time: 20.679
>
> # numactl --cpubind=0 --membind=1 hackbench 25 process 5000 >RES0_1 &
> [1] 9177
> # numactl --cpubind=1 --membind=0 hackbench 25 process 5000 >RES1_0 &
> [2] 9196
> # wait
> [1]-  Done    numactl --cpubind=0 --membind=1 hackbench 25 process 5000 >RES0_1
> [2]+  Done    numactl --cpubind=1 --membind=0 hackbench 25 process 5000 >RES1_0
> # echo node0 on mem1 results
> node0 on mem1 results
> # cat RES0_1
> Running with 25*40 (== 1000) tasks.
> Time: 13.818
> # echo node1 on mem0 results
> node1 on mem0 results
> # cat RES1_0
> Running with 25*40 (== 1000) tasks.
> Time: 11.633
>
> Oh well...
From: Eric Dumazet on 8 Apr 2010 04:20

On Thursday, 08 April 2010 at 09:54 +0200, Eric Dumazet wrote:
> On Thursday, 08 April 2010 at 15:54 +0800, Zhang, Yanmin wrote:
>
> > If there are 2 nodes in the machine, processes on node 0 will contact the MCH of
> > node 1 to access memory of node 1. I suspect the MCH of node 1 might enter
> > a power-saving mode when all the cpus of node 1 are free. So the transactions
> > from MCH 1 to MCH 0 have a larger latency.
>
> Hmm, thanks for the hint, I will investigate this.

Oh well,

perf timechart record &

Instant crash:

Call Trace:
 perf_trace_sched_switch+0xd5/0x120
 schedule+0x6b5/0x860
 retint_careful+0xd/0x21

RIP: ffffffff81010955 perf_arch_fetch_caller_regs+0x15/0x40
CR2: 00000000d21f1422
From: Christoph Lameter on 8 Apr 2010 11:40

On Thu, 8 Apr 2010, Eric Dumazet wrote:

> I suspect NUMA is completely out of order on the current kernel, or my
> Nehalem machine's NUMA support is a joke.
>
> # numactl --hardware
> available: 2 nodes (0-1)
> node 0 size: 3071 MB
> node 0 free: 2637 MB
> node 1 size: 3062 MB
> node 1 free: 2909 MB

How do the cpus map to the nodes? Are cpu 0 and 1 both on the same node?

> # ./try.sh
> Running with 50*40 (== 2000) tasks.
> Time: 16.865
> node0 results
> Running with 25*40 (== 1000) tasks.
> Time: 16.767
> node1 results
> Running with 25*40 (== 1000) tasks.
> Time: 16.564
> node0 on mem1 results
> Running with 25*40 (== 1000) tasks.
> Time: 16.814
> node1 on mem0 results
> Running with 25*40 (== 1000) tasks.
> Time: 16.896
From: Eric Dumazet on 8 Apr 2010 12:00
On Thursday, 08 April 2010 at 10:34 -0500, Christoph Lameter wrote:
> On Thu, 8 Apr 2010, Eric Dumazet wrote:
>
> > I suspect NUMA is completely out of order on the current kernel, or my
> > Nehalem machine's NUMA support is a joke.
> >
> > # numactl --hardware
> > available: 2 nodes (0-1)
> > node 0 size: 3071 MB
> > node 0 free: 2637 MB
> > node 1 size: 3062 MB
> > node 1 free: 2909 MB
>
> How do the cpus map to the nodes? Are cpu 0 and 1 both on the same node?

One socket maps to cpus 0 2 4 6 8 10 12 14 (node 0),
the other socket maps to cpus 1 3 5 7 9 11 13 15 (node 1).

# numactl --cpubind=0 --membind=0 numactl --show
policy: bind
preferred node: 0
interleavemask:
interleavenode: 0
nodebind: 0
membind: 0
cpubind: 1 3 5 7 9 11 13 15 1024    (strange 1024 report...)

# numactl --cpubind=1 --membind=1 numactl --show
policy: bind
preferred node: 1
interleavemask:
interleavenode: 0
nodebind:
membind: 1
cpubind: 0 2 4 6 8 10 12 14

[ 0.161170] Booting Node 0, Processors #1
[ 0.248995] CPU 1 MCA banks CMCI:2 CMCI:3 CMCI:5 CMCI:6 SHD:8
[ 0.269177] Ok.
[ 0.269453] Booting Node 1, Processors #2
[ 0.356965] CPU 2 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8
[ 0.377207] Ok.
[ 0.377485] Booting Node 0, Processors #3
[ 0.464935] CPU 3 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8
[ 0.485065] Ok.
[ 0.485217] Booting Node 1, Processors #4
[ 0.572906] CPU 4 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8
[ 0.593044] Ok.
....

# grep "physical id" /proc/cpuinfo
physical id : 1
physical id : 0
physical id : 1
physical id : 0
physical id : 1
physical id : 0
physical id : 1
physical id : 0
physical id : 1
physical id : 0
physical id : 1
physical id : 0
physical id : 1
physical id : 0
physical id : 1
physical id : 0
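Given the confusing cpubind line above (the node 0 binding reported as the odd cpus, plus a stray 1024), a hedged way to cross-check the mapping independently of numactl is to read it straight from sysfs; these paths are standard on NUMA-enabled kernels of this era, but treat the sketch as an assumption rather than output captured from this machine:

# cpus belonging to each node, as the kernel sees them
for n in /sys/devices/system/node/node[0-9]*; do
    echo "$(basename $n): cpus $(cat $n/cpulist)"
done

# physical package (socket) of each cpu
for c in /sys/devices/system/cpu/cpu[0-9]*; do
    echo "$(basename $c): package $(cat $c/topology/physical_package_id)"
done

If these agree with the even/odd split described above while numactl --show keeps reporting the opposite cpu set for --cpubind=0, the suspicion shifts to numactl's reporting (and that stray 1024) rather than to the kernel's idea of the topology.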