From: Ulrich Eckhardt on 19 Mar 2010 16:12 Leigh Johnston wrote: > "Joshua Maurice" <joshuamaurice(a)gmail.com> wrote in message > news:900580c6-c55c-46ec-b5bc-1a9a2f0d76f5(a)w9g2000prb.googlegroups.com... >>> Obviously the volatile keyword may not cause a memory barrier >>> instruction to be emitted but this is a side issue. The combination >>> of a memory barrier and volatile makes multi-threaded code work. >> >> No. Memory barriers when properly used (without the volatile keyword) >> are sufficient. >> > > No. Memory barriers are not sufficient if your optimizing compiler is > caching the value in a register: the CPU is not aware that the register is > referring to data being revealed by the memory barrier. Actually, memory barriers in my understanding go both ways. One is to tell the CPU that it must not cache/optimise/reorder memory accesses. The other is to tell the compiler that it must not do so either. You can add the former via libraries to an existing compiler, but you can't do the latter without compiler support. That said, volatile often had the same effect as part 2 of the puzzle in legacy compilers, so smart hackers simply used that. > I never said volatile was a panacea but is something that is probably > required when using an optimizing compiler. If your C++ compiler has > memory barrier intrinsics it might be able to ensure volatile is not > required but this is also non-standard. If your compiler is aware of multithreading, you don't need volatile. If it isn't, even volatile doesn't guarantee you that it will work. At the very best, using volatile for the compiler and some other instructions for the CPU works as a workaround to get a not thread-aware compiler to play nice. Uli -- Sator Laser GmbH Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932 [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: red floyd on 19 Mar 2010 16:13 On Mar 19, 2:06 am, "Leigh Johnston" <le...(a)i42.co.uk> wrote: > That was my point, volatile whilst not a solution in itself is a "part" of a > solution for multi-threaded programming when using a C++ (current standard) > optimizing compiler: > > thread A: > finished = false; > spawn_thread_B(); > while(!finished) > { > /* do work */ > > } > > thread B: > /* do work */ > finished = true; > > If finished is not volatile and compiler optimizations are enabled thread A > may loop forever. Agreed. I've seen this in non-threaded code with memory-mapped I/O. -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Tony Jorgenson on 19 Mar 2010 17:22 > >You are incorrect to claim that volatile as defined by the current C++ > >standard has no use in multi-threaded programming. Whilst volatile does not > >guarantee atomicity nor memory ordering across multiple threads the fact > >that it prevents the compiler from caching values in registers is useful and > >perhaps essential. You seem to be saying that volatile can be useful for multi-threaded code? (See questions below) > Yes, volatile does that. Unfortunately, that is necessary but not > sufficient for inter-thread communication to work correctly. Volatile > is for hardware access; std::atomic<T> is for multithreaded code > synchronized without mutexes. I understand that volatile does not guarantee that memory writes performed by one thread are seen in the same order by another thread doing memory reads of the same locations. I do understand the need for memory barriers (mutexes, atomic variables, etc) to guarantee order, but there are still two questions that have never been completely answered, at least to my satisfaction, in all of the discussion I have read on this group (and the non-moderated group) on these issues. First of all, I believe that volatile is supposed to guarantee the following: Volatile forces the compiler to generate code that performs actual memory reads and writes rather than caching values in processor registers. In other words, I believe that there is a one-to-one correspondence between volatile variable reads and writes in the source code and actual memory read and write instructions executed by the generated code. Is this correct? Question 1: My first question is with regard to using volatile instead of memory barriers in some restricted multi-threaded cases. 
If my above statements are correct, is it possible to use _only_ volatile with no memory barriers to signal between threads in a reliable way if only a single word (perhaps a single byte) is written by one thread and read by another? Question 1a: First of all, please correct me if I am wrong, but I believe volatile _must_always_ work as described above on any single core CPU. One CPU means one cache (or one hierarchy of caches) meaning one view of actual memory through the cache(s) that the CPU sees, regardless of which thread is running. Is this much correct for any CPU in existence? If not please mention a situation where this is not true (for single core). Question 1b: Secondly, the only way I could see this not working on a multi-core CPU, with individual caches for each core, is if a memory write performed by one CPU is allowed to never be updated in the caches of other CPU cores. Is this possible? Are there any multi-core CPUs that allow this? Doesn't the MESI protocol guarantee that eventually memory cached in one CPU core is seen by all others? I know that there may be delays in the propagation from one CPU cache to the others, but doesn't it eventually have to be propagated? Can it be delayed indefinitely due to activity in the cores involved? Question 2: My second question is with regard to whether volatile is necessary for multi-threaded code in addition to memory barriers. I know that it has been stated that volatile is not necessary in this case, and I do believe this, but I don't completely understand why. The issue as I see it is that using memory barriers, perhaps through use of mutex OS calls, does not in itself prevent the compiler from generating code that caches non-volatile variable writes in registers. I have heard it written in this group that posix, for example, supports additional guarantees that make mutex lock/unlock (for example) sufficient for correct inter-thread communication through memory without the use of volatile. 
I believe I read here once (from James Kanze I believe) that "volatile is neither sufficient nor necessary for proper multi-threaded code" (quote from memory). This seems to imply that posix is in cahoots with the compiler to make sure that this works. If you add mutex locks and unlocks (I know RAII, so please don't derail my question) around some variable reads and writes, how do the mutex calls force the compiler to generate actual memory reads and writes in the generated code rather than register reads and writes? I understand that compilation optimization affects these issues, but if I optimize the hell out of my code, how do posix calls (or any other OS threading calls) force the compiler to do the right thing? My only conjecture is that this is just an accident of the fact that the compiler can't really know what the mutex calls do and therefore the compiler must make sure that all globally accessible variables are pushed to memory (if they are in registers) in case _any_ called function might access them. Is this what makes it work? If not, then how do mutex calls guarantee the compiler doesn't cache data in registers, because this would surely make the mutexes worthless without volatile (which I know from experience that they are not). -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Andy Venikov on 19 Mar 2010 17:25 Andrei Alexandrescu wrote: <snip> > But by and large that's not sufficient to make sure things do work, and > they will never work portably. Here's a good article on the topic: > > http://software.intel.com/en-us/blogs/2007/11/30/volatile-almost-useless-for-multi-threaded-programming/ > > > entitled eloquently "Volatile: Almost Useless for Multi-Threaded > Programming". And here's another entitled even stronger 'Why the > "volatile" type class should not be used': > > http://kernel.org/doc/Documentation/volatile-considered-harmful.txt > > The presence of the volatile qualifier in Loki is at best helpful but > never a guarantee of correctness. I recommend Scott and my article on > the topic, which was mentioned earlier in this thread: > > http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf > > Bottom line: using volatile with threads is almost always a red herring. > > > Andrei > Not in my wildest dreams would I think that I'd ever disagree with you, but here goes.... While it's true that there's a widespread misconception that volatile is a panacea for multi-threading issues, and it's true that by itself it won't do anything to make multi-threaded programs safe, it's not correct to say that it's totally useless for threading issues, as the "volatile-considered-harmful.txt" article tries to imply. In short, volatile is never sufficient, but often necessary to solve certain multi-threading problems. Solving these problems (like writing lock-free algorithms) requires preventing statement re-ordering. Re-ordering can happen in two places: in the hardware, which is mitigated with memory fences; and in the compiler, which is mitigated with volatile. It's true that, depending on the memory fence library that you use, the compiler won't move the code residing inside the fences to the outside, but that's not always the case. 
If you use raw asm statements, for example (even if you add "volatile" to the asm keyword), your non-volatile variable is not guaranteed to stay inside the fenced region unless you declare it volatile. The advent of C++0x may well render volatile useless for multi-threading, but up until now it has been necessary. Thanks, Andy. -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]
From: Andy Venikov on 20 Mar 2010 17:09
Joshua Maurice wrote: >Leigh Johnston wrote: >> Obviously the volatile keyword may not cause a memory barrier instruction to >> be emitted but this is a side issue. The combination of a memory barrier >> and volatile makes multi-threaded code work. > > No. Memory barriers when properly used (without the volatile keyword) > are sufficient. Sorry Joshua, but I think that's wrong, or at least incomplete. It all depends on how memory barriers/fences are implemented. In the same way that the C++ standard doesn't talk about threads, it doesn't talk about memory fences. If a memfence call is implemented as a library call, then yes, you will in essence get a compiler-level fence directive, since none of the compilers I know of will move code across a call to an opaque library function. But oftentimes memfences are implemented as macros that expand to inline assembly. If you don't use volatile, then nothing will tell the compiler that it can't optimize the code and move the read/write across the macroized memfence. This is especially true on platforms that don't actually need hardware memfences (like x86), since in those cases calls to macro memfences will expand to nothing at all, and you will have nothing in your code that tells the compiler about a code-motion barrier. So is volatile sufficient - absolutely not. Portable? - hardly. Is it necessary in certain cases - absolutely. Thanks, Andy. -- [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ]