From: nmm1 on 18 Jul 2010 04:58

In article <b71929a1-57e6-4bbe-a1f1-380fa8579970(a)d8g2000yqf.googlegroups.com>,
sturlamolden <sturlamolden(a)yahoo.no> wrote:
>On 17 Jul, 03:08, gmail-unlp <ftine...(a)gmail.com> wrote:
>
>> 1) Making parallel programs while forgetting (parallel) performance
>> issues is a problem. And OpenMP helps, in some way, to forget
>> important performance details such as pipelining, memory hierarchy,
>> cache coherence, etc. However, if you remember you are parallelizing
>> to improve performance, I think you will not forget performance
>> penalties, and will implicitly or explicitly optimize data traffic,
>> for example.
>
>We should not forget that OpenMP is often used on "multi-core
>processors". These are rather primitive parallel devices; they e.g.
>have a shared cache. Data traffic due to OpenMP can therefore be
>minimal, because a dirty cache line need not be communicated. So if
>the target is a common desktop computer with a quad-core Intel or AMD
>CPU, OpenMP can be perfectly fine. And this is the common desktop
>computer these days. So for small-scale parallelization on modern
>desktop computers, OpenMP can be very good. But on large servers with
>multiple processors, OpenMP can generate excessive data traffic and
>scale very badly.

While that is true, it is very partial and very misleading. The days
of a single-level cache are gone, and modern CPUs are going
multi-level even internally - let alone on a multi-socket desktop!
Once you do that, accessing the same dirty cache line from different
CPUs becomes a problem, and many codes are not scalable even to those
systems for that very reason.

>P.S. It is a common misconception, particularly among computer science
>scholars, that "shared memory" means no data traffic, and that threads
>are better than processes for that reason. I.e. they can see that IPC
>has a cost, and thus conclude that threads must be more efficient and
>scale better. The lack of a native fork() on Windows has also taught
>many of them to think in terms of threads rather than processes. The
>use of MPI seems to be limited to scientists and engineers; the
>majority of computer scientists don't even know what it is.
>Concurrency to them means threads, and particularly C++ classes that
>wrap threads. Most of them expect I/O-bound programs that use threads
>to be faster on multi-core computers, and they wonder why parallel
>programming is so hard.

That is very true.

Regards,
Nick Maclaren.
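A minimal C/OpenMP sketch of the effect under discussion - several
threads repeatedly dirtying the same cache line (false sharing), and
the usual padding fix. This is an illustration, not code from the
thread; the 64-byte line size and the cap of 64 threads are assumptions.

/*
 * A minimal sketch of false sharing with OpenMP, for illustration
 * only.  Compile with e.g.:  gcc -O2 -fopenmp false_sharing.c
 */
#include <stdio.h>
#include <omp.h>

#define ITERS   10000000L
#define MAXTHR  64          /* assumed upper bound on thread count */

/* Packed: per-thread counters sit next to each other, so several of
 * them share one cache line.  Every increment dirties that line, and
 * the coherence protocol bounces it between cores/sockets. */
static volatile double packed[MAXTHR];

/* Padded: each counter gets its own (assumed) 64-byte cache line. */
static struct { volatile double v; char pad[64 - sizeof(double)]; }
    padded[MAXTHR];

int main(void)
{
    double t0, t_packed, t_padded;

    /* volatile keeps the compiler from hoisting the accumulator into
     * a register, which would hide the memory traffic entirely. */
    t0 = omp_get_wtime();
    #pragma omp parallel
    {
        int id = omp_get_thread_num() % MAXTHR;
        for (long i = 0; i < ITERS; i++)
            packed[id] += 1.0;
    }
    t_packed = omp_get_wtime() - t0;

    t0 = omp_get_wtime();
    #pragma omp parallel
    {
        int id = omp_get_thread_num() % MAXTHR;
        for (long i = 0; i < ITERS; i++)
            padded[id].v += 1.0;
    }
    t_padded = omp_get_wtime() - t0;

    printf("threads %d: packed %.3fs, padded %.3fs\n",
           omp_get_max_threads(), t_packed, t_padded);
    return 0;
}

On a quad-core with a shared last-level cache the two timings may be
fairly close, which is sturlamolden's point; across sockets the packed
version typically runs several times slower, because the dirty line
ping-pongs between caches, which is Nick's.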