From: James Kanze on 23 Mar 2010 08:08

On Mar 18, 10:32 pm, Joshua Maurice <joshuamaur...(a)gmail.com> wrote:
> On Mar 17, 8:16 pm, "Leigh Johnston" <le...(a)i42.co.uk> wrote:

[...]

> I can't recall for the life of me where I read it, but I seem
> to recall Andrei admitting that he misunderstood volatile, and
> learned of the error of his ways, possibly in conjunction with
> "C++ And The Perils Of Double-Checked Locking".

It was in a discussion in this group, although I don't remember
exactly when. The curious thing is that Andrei's techniques
actually work, not because of any particular semantics of
volatile, but because of the way it works in the type system:
its use caused type errors (much like the one the original
poster saw) if you attempted to circumvent the locking.

The misunderstanding of volatile is apparently widespread, to
the point that Microsoft actually proposed giving it the
required semantics to the standards committee. That didn't go
over very well, since it caused problems with the intended use
of volatile. The Microsoft representative (Herb Sutter, as it
happens) immediately withdrew the proposal, but I think they
intend to implement these semantics in some future compiler, or
perhaps have already implemented them in VC10.

In defense of the Microsoft proposal: the proposed semantics do
make sense if you restrict yourself to the world of application
programs under general purpose OS's, like Windows or Unix. And
the semantics actually implemented by volatile in most other
compilers, like g++ or Sun CC, are totally useless, even in the
contexts for which volatile was designed. At present, it's
probably best to class volatile in the same category as export:
none of the widely used compilers implements it to do anything
useful.

[...]

> B- repeat my (perhaps unfounded) second hand information that
> volatile in fact on most current implementations does not make
> a global ordering of reads and writes.
Independently of what the standard says (and it does imply
certain guarantees, such as would be necessary, for example, to
use it for memory-mapped IO), volatile has no practical
semantics in most current compilers (Sun CC, g++, VC++, at
least up through VC8.0).

--
James Kanze

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
From: Leigh Johnston on 23 Mar 2010 08:04

"Chris Vine" <chris(a)cvine--nospam--.freeserve.co.uk> wrote in message
news:ceum77-go6.ln1(a)cvinex--nospam--x.freeserve.co.uk...
> On Tue, 23 Mar 2010 08:05:28 CST
> "Leigh Johnston" <leigh(a)i42.co.uk> wrote:
> [snip]
>> Sometimes you have to use common sense:
>>
>> thread A:
>> finished = false;
>> spawn_thread_B();
>> while (!finished)
>> {
>>     /* do work */
>> }
>>
>> thread B:
>> /* do work */
>> finished = true;
>>
>> If finished is not volatile and compiler optimizations are
>> enabled, thread A may loop forever.
>>
>> The behaviour of optimizing compilers in the real world can
>> make volatile necessary to get correct behaviour in
>> multi-threaded designs. You don't always have to use memory
>> barriers or mutexes when performing an atomic read of some
>> state shared by more than one thread.
>
> It is never "necessary" to use the volatile keyword "in the
> real world" to get correct behaviour because of "the behaviour
> of optimising compilers". If it is, then the compiler does not
> conform to the particular standard you are writing to. For
> example, all compilers intended for POSIX platforms which
> support pthreads have a configuration flag (usually "-pthread")
> which causes the locking primitives to act also as compiler
> barriers, and the compiler would be non-conforming if it did
> not both provide this facility and honour it.
>
> Of course, there are circumstances when you can get away with
> the volatile keyword, such as the rather contrived example you
> have given, but in that case it is pretty well pointless,
> because making the variable volatile as opposed to using normal
> synchronisation objects will not improve efficiency. In fact,
> it will hinder efficiency if thread A has run its work before
> thread B, because thread A will depend on a random future event
> on multi-processor systems, namely when the caches happen to
> synchronise to achieve memory visibility, in order to proceed.
>
> Chris

It is not a contrived example, I have the following code in my
codebase which is similar:

.....
lock();
while (iSockets.empty() && is_running())
{
    unlock();
    Sleep(100);
    if (!is_running())
        return;
    lock();
}
.....

is_running() is an inline member function which returns the
value of a volatile member variable and shouldn't require a lock
to query, as it is atomic on the platform I target (x86). It
makes sense for this platform and compiler (VC++) that I use
volatile. Admittedly I could use an event/wait primitive
instead, but that doesn't make the above code wrong for the
particular use-case in question. I agree that for other
platforms and compilers this might be different. From what I
understand (and I agree), the advent of C++0x should see such
volatiles disappear in favour of std::atomic<>. Not everyone in
the real world is using C++0x, as the standard has not even been
published yet.

/Leigh
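[Editor's note: the event/wait primitive Leigh concedes would work can be sketched with a condition variable. This is a hypothetical rewrite of the polling loop above, not code from the thread; the class and member names (SocketQueue, wait_pop) are invented for illustration, and std::condition_variable is the C++0x/C++11 equivalent of a Windows event.]

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>

// Sketch: the same wait loop, but blocking on a condition variable
// instead of polling a volatile flag with Sleep(100).
class SocketQueue {
public:
    SocketQueue() : running_(true) {}

    void push(int socket) {
        std::lock_guard<std::mutex> lock(mutex_);
        sockets_.push_back(socket);
        cond_.notify_one();           // wake one waiting thread
    }

    void stop() {
        std::lock_guard<std::mutex> lock(mutex_);
        running_ = false;
        cond_.notify_all();           // wake waiters so they see shutdown
    }

    // Returns true and fills 'out' when a socket is available,
    // false when the queue is shutting down.
    bool wait_pop(int& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return !sockets_.empty() || !running_; });
        if (!running_) return false;
        out = sockets_.front();
        sockets_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::deque<int> sockets_;
    bool running_;   // protected by mutex_, so no volatile is needed
};
```

Because running_ is only read and written under the mutex, the compiler barrier implied by the lock makes the caching-in-a-register concern moot, and the waiter wakes immediately instead of after up to 100 ms.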
From: James Kanze on 23 Mar 2010 11:07

On Mar 20, 7:12 am, Ulrich Eckhardt <eckha...(a)satorlaser.com> wrote:
> Leigh Johnston wrote:
> > "Joshua Maurice" <joshuamaur...(a)gmail.com> wrote in message
> >news:900580c6-c55c-46ec-b5bc-1a9a2f0d76f5(a)w9g2000prb.googlegroups.com...
> >>> Obviously the volatile keyword may not cause a memory
> >>> barrier instruction to be emitted but this is a side
> >>> issue. The combination of a memory barrier and volatile
> >>> makes multi-threaded code work.
> >> No. Memory barriers when properly used (without the
> >> volatile keyword) are sufficient.
> > No. Memory barriers are not sufficient if your optimizing
> > compiler is caching the value in a register: the CPU is not
> > aware that the register is referring to data being revealed
> > by the memory barrier.
> Actually, memory barriers in my understanding go both ways.
> One is to tell the CPU that it must not cache/optimise/reorder
> memory accesses. The other is to tell the compiler that it
> must not do so either.

Actually, as far as standard C++ is concerned, memory barriers
don't exist, so it's difficult to talk about them. In practice,
there are three ways to obtain them:

 -- Inline assembler. See your compiler manual with regard to
    what it guarantees; the standard makes no guarantees here.
    A conforming implementation can presumably do anything it
    wants with the inline assembler, including moving it over an
    access to a volatile variable. From a QoI point of view,
    either 1) the compiler assumes nothing about the assembler,
    considers that it might access any accessible variable, and
    ensures that the actual semantics of the abstract machine
    correspond to those specified in the standard, 2) it reads
    and interprets the inline assembler, and so recognizes a
    fence or a memory barrier and behaves appropriately, or
    3) it provides some means of annotating the inline assembler
    to tell the compiler what it can or cannot do.

 -- Call a function written in assembler.
    This really comes down to exactly the same as inline
    assembler, except that it's a lot more difficult for the
    compiler to implement the alternatives 2 or 3. (All
    compilers I know implement 1.)

 -- Call some predefined system API. In this case, the
    requirements are defined by the system API. (This is the
    solution used by Posix, Windows and C++0x.)

--
James Kanze
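[Editor's note: the third route, in C++0x (now C++11) terms, is a library-defined fence such as std::atomic_thread_fence. The sketch below is illustrative only; the variable names are invented, and 'ready' must itself be atomic for the fences to synchronize under the standard's rules.]

```cpp
#include <atomic>

int payload = 0;                       // ordinary, non-atomic data
std::atomic<bool> ready(false);

void producer() {
    payload = 42;                      // plain write
    // Release fence: the plain write above may not be reordered
    // past the store to 'ready' below.
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

int consumer() {
    while (!ready.load(std::memory_order_relaxed)) {
        // spin until the flag becomes visible
    }
    // Acquire fence: pairs with the release fence above, so the
    // read of 'payload' below is guaranteed to see 42.
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload;
}
```

The fence pair gives the compiler barrier Leigh wants (no caching of payload in a register across it) and tells the CPU not to reorder the accesses, with no volatile in sight.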
From: Joshua Maurice on 23 Mar 2010 11:08

On Mar 23, 7:05 am, "Leigh Johnston" <le...(a)i42.co.uk> wrote:
> Sometimes you have to use common sense:
>
> thread A:
> finished = false;
> spawn_thread_B();
> while (!finished)
> {
>     /* do work */
> }
>
> thread B:
> /* do work */
> finished = true;
>
> If finished is not volatile and compiler optimizations are
> enabled, thread A may loop forever.
>
> The behaviour of optimizing compilers in the real world can
> make volatile necessary to get correct behaviour in
> multi-threaded designs. You don't always have to use memory
> barriers or mutexes when performing an atomic read of some
> state shared by more than one thread.

No. You must use proper synchronization to guarantee a
"happens-before" relationship, and volatile does not do that
portably. Without the proper synchronization, the write to a
variable in one thread, even a volatile write, may never become
visible to another thread, even by a volatile read, on some real
world systems.

"Common sense" would be to listen to the people who wrote the
compilers, such as Intel and gcc; to listen to the writers of
the standard who influence the compiler writers, such as the C++
standards committee and their website; to listen to well
respected experts who have studied these things in far greater
detail than you and I; to read old papers and correspondence to
understand the intention of volatile (which does not include
threading); etc. It is not "common sense" to blithely ignore
all of this and read into an ambiguous definition in an
unrelated standard to get your desired properties (the C++03
standard does not mention threads, so it's not the relevant
standard to look at); it's actually quite unreasonable to do so.

Let me put it like this. Either you're writing on a
thread-aware compiler or you are not. On a thread-aware
compiler, you can use the standardized threading library, which
will probably look a lot like POSIX, WIN32, Java, and C++0x.
It will include mutexes and condition variables (or some rough
equivalent, stupid WIN32), and possibly atomic increments,
atomic test and swap, etc. It will define a memory model
roughly compatible with the rest and include a strong equivalent
of Java's "happens-before" relationship. In which case,
volatile has no use (for threading), because the compiler is
aware of the abstractions and will honor them, including the
optimizer.

In the other case, when you're using threads on a
not-threads-aware compiler, you're FUBAR. There are so many
little things to get right to produce correct assembly for
threads that if the compiler is not aware of them, even the most
innocuous optimization, or even register allocation, may
entirely break your code. volatile may produce the desired
result, and it may not. This is entirely system dependent, as
you are not coding to any standard, and thus not portable by any
reasonable definition of portable.

Also note that your (incorrect) reading of the C and C++
standards includes no guarantee about reorderings between
non-volatile and volatile accesses, so if thread B in your
example changed shared state, those writes might be moved after
the write to "finished". Thread A could then see the write to
"finished" but not see the changes to the shared state, or see
only a random portion of the writes to the shared state, an
inconsistent shared state, which is begging for a crash. So,
you could fully volatile-qualify all of the shared state,
leading to a huge performance hit, or you could just use the
standardized abstractions which are guaranteed to work, which
will actually work, which will run much faster, and which are
portable.

There seems to persist this "romanticized" ideal of "volatile"
as somehow telling the compiler to "shut up" and "just do it", a
sentiment noted by Andrei and Scott in "C++ And The Perils Of
Double-Checked Locking". Please, go read the paper and its
cited sources. They explain it so much better than I could.
I'll link to it again here:
http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf
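[Editor's note: the happens-before guarantee Joshua describes, applied to the quoted "finished" example, can be sketched with C++0x/C++11 std::atomic. This is an illustrative rewrite, not code from the thread; the payload variable is invented.]

```cpp
#include <atomic>

int shared_result = 0;                 // ordinary, non-atomic shared state
std::atomic<bool> finished(false);

void thread_b() {
    shared_result = 123;               // plain write to shared state...
    // ...published by a release store: it cannot be reordered
    // after the store to 'finished'.
    finished.store(true, std::memory_order_release);
}

int thread_a() {
    // The acquire load synchronizes with the release store,
    // establishing happens-before.
    while (!finished.load(std::memory_order_acquire)) {
        // do work, or just wait
    }
    // Guaranteed to see 123 -- precisely the ordering between the
    // flag and the shared state that volatile alone does not promise.
    return shared_result;
}
```

With a volatile bool instead, the compiler may keep the flag in a register (or reorder the plain write around it), and nothing forces thread A to ever observe a consistent shared_result.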
From: James Kanze on 23 Mar 2010 11:16
On Mar 22, 11:22 pm, "Bo Persson" <b...(a)gmb.dk> wrote:
> Leigh Johnston wrote:
> > "Andy Venikov" <swojchelo...(a)gmail.com> wrote in message
> >news:ho5s8u$52u$1(a)news.eternal-september.org...
> >>> I still must ask, really? That would mean that all shared
> >>> state must be volatile qualified, including internal class
> >>> members for shared data. Wouldn't that be a huge
> >>> performance hit when the compiler can't optimize any of
> >>> that? Could you even use prebuilt classes (which usually
> >>> don't have volatile overloads) in the shared data, like,
> >>> say, std::string, std::vector, std::map, etc.?
> >> Not at all!
> >> Most multi-threading issues are solved with mutexes,
> >> semaphores, condition variables and such. All of these
> >> are library calls. That means that using volatile in those
> >> cases is not necessary. It's only when you get into more
> >> esoteric parallel computing problems, where you'd like to
> >> avoid a heavy-handed approach of mutexes, that you enter the
> >> realm of volatile. In normal multi-threading solved with
> >> regular means there is really no reason to use volatile.
> > Esoteric? I would have thought independent correctly
> > aligned (and therefore atomic) x86 variable reads
> > (fundamental types) without the use of a mutex are not
> > uncommon, making volatile not uncommon also on that platform
> > (on VC++) at least. I have exactly one volatile in my
> > entire codebase and that is such a variable. From the MSDN
> > (VC++) docs:
> > "The volatile keyword is a type qualifier used to declare
> > that an object can be modified in the program by something
> > such as the operating system, the hardware, or a
> > concurrently executing thread."
> > That doesn't seem esoteric to me! :)
> The esoteric thing is that this is a compiler specific
> extension, not something guaranteed by the language. Currently
> there are no threads at all in C++.
> Note that the largest part of the MSDN document is clearly
> marked "Microsoft Specific". It is in that part that the
> release and acquire semantics are defined.

Note too that at least through VC8.0, regardless of the
documentation, VC++ didn't implement volatile in a way that
would allow it to be used effectively for synchronization on a
multithreaded Windows platform. On some of the higher
performance machines, you need a fence, or at least some use of
the lock prefix, and VC++ didn't generate these. Microsoft has
expressed its intent to implement these extended semantics for
volatile, however.

--
James Kanze
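[Editor's note: the fence / lock-prefix point can be made concrete with C++11 std::atomic, which does emit those instructions. A minimal sketch, with invented names; on x86 a seq_cst store typically compiles to a locked xchg (or mov plus mfence), and exchange() to lock xchg, neither of which a plain volatile store ever produces.]

```cpp
#include <atomic>

std::atomic<bool> flag(false);

void publish_seq_cst() {
    // Sequentially consistent store: on x86 the compiler emits a
    // locked instruction or an explicit fence, unlike a volatile store.
    flag.store(true);
}

bool take_flag() {
    // Atomic read-modify-write; on x86 this is a 'lock xchg'.
    // Returns the previous value.
    return flag.exchange(false);
}
```

This is exactly the gap James describes: the VC8.0-era volatile generated plain mov instructions, so even with Microsoft's documented semantics the hardware-level ordering was missing.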