From: persres on 26 Jan 2010 18:59

Hi,

I have some low-level questions, all in a user-space scenario.

1) How do I get architecture-specific info:
   a) whether the machine is SMP/ASMP/NUMA?
   b) the number of caches and how they are shared across cores?
   c) the size of a cache line?

2) If I say _declspec(align(64)), will the data be cache aligned? It
seems MSFT specific; are there any platform-independent ways?

3) How do I allocate cache-aligned memory on the heap? I believe it is
_aligned_malloc? Is there any C++ call? Also, it is MSFT specific. Is
there any platform-independent call for aligned malloc?

4) Can we get page-aligned memory in user space? Can we ask for
non-cached memory in the user-space heap?

5) Is the cache line size 64 bytes on all x86 machines?

I hope I can get answers to some of those. Thanks very much.

Cheers
From: Francois PIETTE on 27 Jan 2010 02:01

> I have some low level questions, all discussions in user space
> scenario.
>
> 1) How do I get architecture specific info -
> a) whether machine is SMP/ASMP/NUMA?
> b) Number of Caches and how they are shared across core?
> c) size of cacheline?

The System Information Development Kit is probably something useful for you.
See http://www.cpuid-pro.com/sysinfo.php

-- 
francois.piette(a)overbyte.be
The author of the freeware multi-tier middleware MidWare
The author of the freeware Internet Component Suite (ICS)
http://www.overbyte.be
From: Tim Roberts on 27 Jan 2010 02:24

persres <persres(a)googlemail.com> wrote:
>
> I have some low level questions, all discussions in user space
>scenario.
>
>1) How do I get architecture specific info -
> a) whether machine is SMP/ASMP/NUMA?
> b) Number of Caches and how they are shared across core?
> c) size of cacheline?

You can get some of this information from WMI:
http://msdn.microsoft.com/en-us/library/aa394373.aspx
and you can get some of this information from the CPUID instruction.

>2) If I say _declspec(align(64)) will the data be cache aligned? It
>seems MSFT specific, any platform independent ways?

There is nothing platform-independent about caches. That's a very
implementation-specific detail. Indeed, you can't assume that every
processor even has a cache.

>3) how do I allocate cache-aligned memory in heap?. I believe it is
>_aligned_malloc?. Is there any C++ call?

_aligned_malloc works just fine in C++. You can always use placement "new"
with it:

MyObject *pobj = new (_aligned_malloc(sizeof(MyObject), 64)) MyObject;

Note, however, that cache-aligning an object is not a terribly useful thing
to do.

> Also, it is MSFT specific. Any platform independent call for
>aligned malloc?

No. Again, caching is a very implementation-specific detail.

>4) Can we get page aligned memory in user space?

You can pass any alignment you want to _aligned_malloc. Why would you want
to? You can also call the VirtualAlloc API to allocate pages directly.

>can we ask for non-cached memory in user space heap?

No. That requires a kernel module. Why would you want it?

>5) Is cacheline size 64 bytes on all x86 machines?

Most, but not all. Also remember that current x86 processors include 3
different levels of cache.

-- 
Tim Roberts, timr(a)probo.com
Providenza & Boekelheide, Inc.
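
For readers who want to try the placement-new idea outside MSVC, here is a minimal sketch under stated assumptions. It does not use _aligned_malloc; it swaps in the aligned form of operator new, which needs a C++17 compiler. The names make_aligned, destroy_aligned, is_aligned and kAssumedCacheLine are made up for illustration, and the 64-byte figure is an assumption, not something queried from the hardware. (If you stay with _aligned_malloc, the memory must be released with _aligned_free after calling the destructor by hand.)

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <utility>

// Assumed cache line size; 64 bytes is common on x86 but not guaranteed.
constexpr std::size_t kAssumedCacheLine = 64;

// Hypothetical payload type, used only for illustration.
struct MyObject {
    int value = 0;
};

// Allocate raw storage at the requested alignment (C++17 aligned operator new)
// and construct a T in it with placement new.
template <typename T, typename... Args>
T* make_aligned(std::size_t alignment, Args&&... args) {
    // The alignment must be a non-zero power of two.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    void* raw = ::operator new(sizeof(T), static_cast<std::align_val_t>(alignment));
    try {
        return ::new (raw) T(std::forward<Args>(args)...);
    } catch (...) {
        // Construction failed: give the storage back before rethrowing.
        ::operator delete(raw, static_cast<std::align_val_t>(alignment));
        throw;
    }
}

// Destroy the object and release its storage with the matching aligned delete.
template <typename T>
void destroy_aligned(T* p, std::size_t alignment) {
    if (p == nullptr) return;
    p->~T();
    ::operator delete(p, static_cast<std::align_val_t>(alignment));
}

// True when the address is a multiple of the given alignment.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Typical use would be `MyObject* p = make_aligned<MyObject>(kAssumedCacheLine);` followed later by `destroy_aligned(p, kAssumedCacheLine);`. The same alignment has to be passed to both calls, because the aligned delete must match the aligned new.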
From: Chris M. Thomasson on 27 Jan 2010 04:42

"Tim Roberts" <timr(a)probo.com> wrote in message
news:nuovl5phr0rqo0ts9c5v32if30pg9bhpic(a)4ax.com...
> persres <persres(a)googlemail.com> wrote:
[...]
>>3) how do I allocate cache-aligned memory in heap?. I believe it is
>>_aligned_malloc?. Is there any C++ call?
>
> _aligned_malloc works just fine in C++. You can always use placement
> "new" with it:
>
> MyObject *pobj = new (_aligned_malloc(sizeof(MyObject), 64)) MyObject;
>
> Note, however, that cache-aligning an object is not a terribly useful
> thing to do.

I am curious as to what makes you say that. Padding critical data structures
to L2 cache lines and aligning them in memory on cache line boundaries can
be __essential__ if you are interested in scalability and performance.

[...]
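
To make the padding-and-aligning point concrete, here is a small sketch of the usual pattern for per-thread data, assuming a 64-byte line and a C++11 or later compiler. PaddedCounter, PerThreadCounters, slot_for, kSlots and kAssumedCacheLine are hypothetical names chosen for this example. The idea is that each counter occupies a whole line of its own, so two threads updating neighbouring counters do not keep invalidating each other's line (false sharing).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumed cache line size; check the real value on the target machine.
constexpr std::size_t kAssumedCacheLine = 64;
constexpr std::size_t kSlots = 4;  // hypothetical number of worker threads

// alignas both aligns the struct on a line boundary and rounds its size
// up to a multiple of the line, which supplies the padding.
struct alignas(kAssumedCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

static_assert(sizeof(PaddedCounter) == kAssumedCacheLine,
              "each counter should fill exactly one assumed cache line");
static_assert(alignof(PaddedCounter) == kAssumedCacheLine,
              "each counter should start on an assumed line boundary");

// One counter per thread; adjacent slots sit on different lines.
struct PerThreadCounters {
    PaddedCounter slots[kSlots];
};

// Return the counter reserved for the given thread index.
inline PaddedCounter& slot_for(PerThreadCounters& counters, std::size_t index) {
    assert(index < kSlots);
    return counters.slots[index];
}
```

The cost is memory: a counter of a few bytes now takes a full line, so this is worth doing only for data that is written often by different threads.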
From: persres on 27 Jan 2010 06:19

On 27 Jan, 09:42, "Chris M. Thomasson" <n...(a)spam.invalid> wrote:
> "Tim Roberts" <t...(a)probo.com> wrote in message
> news:nuovl5phr0rqo0ts9c5v32if30pg9bhpic(a)4ax.com...
> > persres <pers...(a)googlemail.com> wrote:
> [...]
> >>3) how do I allocate cache-aligned memory in heap?. I believe it is
> >>_aligned_malloc?. Is there any C++ call?
> >
> > _aligned_malloc works just fine in C++. You can always use placement
> > "new" with it:

Thanks for all the responses.

> > MyObject *pobj = new (_aligned_malloc(sizeof(MyObject), 64)) MyObject;
> >
> > Note, however, that cache-aligning an object is not a terribly useful
> > thing to do.

I have an array of 64 chars (bytes), all under the same lock. I think making
sure they are in the same cache line can speed things up.

> I am curious as to what makes you say that? Padding critical data-structures
> to L2 cache lines and aligning them in memory on cache line boundaries can
> be __essential__ if you are interested in scalability and performance.
>
> [...]
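
For the 64-byte array case, a short sketch of how one might check the "same cache line" property, again assuming a 64-byte line. A 64-byte array fits in one such line only if it starts exactly on a line boundary; any other starting offset makes it straddle two lines. GuardedBytes, fits_in_one_line and kAssumedCacheLine are hypothetical names, and the lock itself is left out to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size; not queried from the hardware.
constexpr std::size_t kAssumedCacheLine = 64;

// True when the byte range [p, p + size) lies within a single line.
inline bool fits_in_one_line(const void* p, std::size_t size,
                             std::size_t line = kAssumedCacheLine) {
    assert(size > 0 && line > 0);
    const auto first = reinterpret_cast<std::uintptr_t>(p);
    const auto last = first + size - 1;  // address of the final byte
    return first / line == last / line;
}

// Hypothetical: the 64 bytes guarded by one lock, forced onto a line boundary.
struct alignas(kAssumedCacheLine) GuardedBytes {
    char bytes[64];
};
```

Calling `fits_in_one_line(g.bytes, sizeof g.bytes)` on a `GuardedBytes g` should hold, while the same 64-byte range shifted by one byte would not. Whether this actually speeds anything up depends on the access pattern and is best measured.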