From: Leigh Johnston on


"George Neuner" <gneuner2(a)comcast.net> wrote in message
news:rq1nq5tskd51cmnf585h1q2elo28euh2kn(a)4ax.com...
<snip>
>
> Not exactly. 'volatile' is necessary to force the compiler to
> actually emit store instructions, else optimization would elide the
> useless first assignment and simply set n = 6. Beyond that constant
> propagation and/or value tracking might also eliminate the remaining
> assignment and the variable altogether.
>
> As you noted, 'volatile' does not guarantee that an OoO CPU will
> execute the stores in program order ... for that you need to add a
> write fence between them. However, neither 'volatile' nor write fence
> guarantees that any written value will be flushed all the way to
> memory - depending on other factors - cache snooping by another
> CPU/core, cache write back policies and/or delays, the span to the
> next use of the variable, etc. - the value may only reach to some
> level of cache before the variable is referenced again. The value may
> never reach memory at all.
>
> OoO execution and cache behavior are the reasons 'volatile' doesn't
> work as intended for many systems even in single-threaded use with
> memory-mapped peripherals. A shared (atomically writable)
> communication channel in the case of interrupts or concurrent threads
> is actually a safer, more predictable use of 'volatile' because, in
> general, it does not require values to be written all the way to main
> memory.
>
>
>>It's a great article. Among other things, it talks about the
>>non-portability of a solution that relies solely on volatile. How is it
>>different from what I have said in my earlier post? Quoting:
>>
>>"Is volatile sufficient - absolutely not.
>>Portable - hardly.
>>Necessary in certain conditions - absolutely."
>
> I haven't seen the whole thread and I'm not sure of the post to which
> you are referring. I think you might not be giving enough thought to
> the way cache behavior can complicate the standard's simple memory
> model. But it's possible that you have considered this and simply
> have not explained yourself thoroughly enough for [me and others] to
> see it.
>
> 'volatile' is necessary for certain uses but is not sufficient for
> (al)most (all) uses. I would say that for expert uses, some are
> portable and some are not. For non-expert uses ... I would say that
> most uses contemplated by non-experts will be neither portable nor
> sound.
>

Whether or not the store that is guaranteed to be emitted by the compiler
due to the presence of volatile propagates to L1 cache, L2 cache or main
memory is irrelevant as far as volatile and multi-threading are concerned as
long as CPU caches remain coherent. You could argue that because of this
volatile is actually more useful for multi-threading than for its more
traditional use of performing memory mapped I/O with modern CPU
architectures. I will reiterate though that the advent of C++0x should
consign this use of volatile to history.

/Leigh


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Joshua Maurice on
On Mar 24, 11:20 pm, Andy Venikov <swojchelo...(a)gmail.com> wrote:
> Joshua Maurice wrote:
> > On Mar 21, 2:32 pm, Andy Venikov <swojchelo...(a)gmail.com> wrote:
> <snip>
>
> >> The standard places a requirement on conforming implementations that:
>
> >> 1.9.6
> >> The observable behavior of the abstract machine is its sequence of reads
> >> and writes to volatile data and calls to library I/O functions
>
> >> 1.9.7
> >> Accessing an object designated by a volatile lvalue (3.10), modifying an
> >> object, calling a library I/O function, or calling a function that does
> >> any of those operations are all side effects, which are changes in the
> >> state of the execution environment. Evaluation of an expression might
> >> produce side effects. At certain specified points in the execution
> >> sequence called sequence points, all side effects of previous
> >> evaluations shall be complete and no side effects of subsequent
> >> evaluations shall have taken place
>
> >> 1.9.11
> >> The least requirements on a conforming implementation are:
> >> - At sequence points, volatile objects are stable in the sense that
> >> previous evaluations are complete and
> >> subsequent evaluations have not yet occurred.
>
> >> That to me sounds like a complete enough requirement that compilers
> >> don't perform optimizations that produce "surprising" results in so far
> >> as observable behavior in an abstract (single-threaded) machine are
> >> concerned. This requirement happens to be very useful for multi-threaded
> >> programs that can augment volatile with hardware fences to produce
> >> meaningful results.
> > That is one interpretation. Unfortunately / fortunately (?), that
> > interpretation is not the prevailing interpretation. Thus far in this
> > thread, we have members of the C++ standards committee or its
> > affiliates explicitly disagreeing on the committee's website with that
> > interpretation (linked else-thread). The POSIX standard explicitly
> > disagrees with your interpretation (see google). The
> > comp.programming.threads FAQ explicitly disagrees with you several
> > times (linked else-thread). We have gcc docs and implementation
> > disagreeing with your interpretation (see google). We have an official
> > blog from intel, the biggest maker of chips in the world, and a major
> > compiler writer, explicitly disagreeing with your interpretation
> > (linked else-thread). We have experts in the C++ community explicitly
> > disagreeing with your interpretation.
>
> All the sources that you listed were saying that volatile isn't
> sufficient. And some went on as far as to say that it's "mostly"
> useless. That "mostly", however, covers an area that is real and I was
> talking about that area. None of them disagreed with what I said.
>
> Here's a brief example that I hope will put this issue to rest:
>
> volatile int n;
>
> n = 5;
> n = 6;
>
> volatile guarantees (note: no interpretation here, it's just what it
> says) that the compiler will issue two store instructions in the correct
> order (5 then 6). And that is a very useful quality for multi-threaded
> programs that chose not to use synchronization primitives like mutexes
> and such. Of course it doesn't mean that the processor executes them in
> that order, that's why we'd use memory fences. But to stop the
> compiler from messing around with these sequences, the volatile is
> necessary.

No, that is your interpretation, an overreaching interpretation.
Neither the C nor the C++ standard mentions "store instruction". The C++
standard talks about "accesses" of "stored values". It never talks
about a processor, assembly, or "store instructions" in the context of
volatile. In fact, the C standard, which the C++ standard incorporates
in large part (both technically and in spirit), specifically says that
volatile accesses and "visible aspects of the abstract machine" are
inherently implementation specific and implementation defined.

And your argument is still irrelevant. (Nearly) all compiler writers
and (nearly?) all compiler implementations disagree with this
interpretation, and what the compilers actually do is all that matters
at the end of the day. volatile has no place as a synchronization
construct in portable code. None.



From: James Kanze on
On Mar 25, 7:10 pm, George Neuner <gneun...(a)comcast.net> wrote:
> On Thu, 25 Mar 2010 00:20:43 CST, Andy Venikov

[...]
> As you noted, 'volatile' does not guarantee that an OoO CPU will
> execute the stores in program order ...

Arguably, the original intent was that it should. But it
doesn't, and of course, the ordering guarantee only applies to
variables actually declared volatile.

> for that you need to add a write fence between them. However,
> neither 'volatile' nor write fence guarantees that any written
> value will be flushed all the way to memory - depending on
> other factors - cache snooping by another CPU/core, cache
> write back policies and/or delays, the span to the next use of
> the variable, etc. - the value may only reach to some level of
> cache before the variable is referenced again. The value may
> never reach memory at all.

If that's the case, then the fence instruction is seriously
broken. The whole purpose of a fence instruction is to
guarantee that another CPU (with another thread) can see the
changes. (Of course, the other thread also needs a fence.)
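
The pairing described above (one fence on the writing side, one on the
reading side) can be sketched with the C++0x/C++11 <atomic> facilities,
which postdate this thread. This is only an illustrative sketch under
that assumption; the names payload, ready, producer, consumer and
run_handoff are hypothetical and do not come from any post here.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                  // ordinary, non-atomic data (hypothetical name)
std::atomic<bool> ready{false};   // flag used to hand the data over

// Writer side: the release fence keeps the payload store ahead of the
// flag store, for both the compiler and the CPU.
void producer() {
    payload = 42;
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

// Reader side: the acquire fence keeps the payload load behind the
// flag load. Without it, the writer's fence alone guarantees nothing.
int consumer() {
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload;
}

// Runs one hand-off and returns the value the reader observed.
int run_handoff() {
    payload = 0;
    ready.store(false);
    int seen = 0;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    assert(seen == 42);           // guaranteed by the fence pairing
    return seen;
}
```

Note that nothing in the sketch is declared volatile: the ordering comes
entirely from the two fences and the atomic flag.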

> OoO execution and cache behavior are the reasons 'volatile'
> doesn't work as intended for many systems even in
> single-threaded use with memory-mapped peripherals.

The reason volatile doesn't work with memory-mapped peripherals
is because the compilers don't issue the necessary fence or
membar instruction, even if a variable is volatile.

--
James Kanze


From: James Kanze on
On Mar 25, 6:20 am, Andy Venikov <swojchelo...(a)gmail.com> wrote:
> Joshua Maurice wrote:
> > On Mar 21, 2:32 pm, Andy Venikov <swojchelo...(a)gmail.com> wrote:
> <snip>
> Here's a brief example that I hope will put this issue to rest:

It does, but not in the way you seem to think:-).

> volatile int n;

> n = 5;
> n = 6;

> volatile guarantees (note: no interpretation here, it's just
> what it says) that the compiler will issue two store
> instructions in the correct order (5 then 6).

Since we're splitting hairs, technically, that's not what the
standard says. What the standard says is that the accesses
implied by the assignment statements will occur in the given
order. For some implementation defined meaning of "access".
(I'll also add my usual complaint. In the standard,
"implementation defined" means that it must be documented. I've
yet to find such documentation, however, for any of the
compilers I've used.)

> And that is a very useful quality for multi-threaded programs
> that chose not to use synchronization primitives like mutexes
> and such.

No it's not. It's quite frequent, in fact, that in the above
scenario, the 5 never makes it to main memory, or even out of
the write pipeline.

> Of course it doesn't mean that the processor executes them in
> that order, that's why we'd use memory fences. But to stop the
> compiler from messing around with these sequences, the
> volatile is necessary.

Show me how you use the memory fences, and I'll show you why the
compiler can't move the accesses across them. There's no way
of getting a memory fence in standard C++, so you've moved the
problem to additional guarantees by the implementation. An
implementation either understands all of the code between the
two assignments (including inline assembler, calls to functions
written in assembler, etc.), and will see the fence and behave
accordingly, or it doesn't (and most don't try), and so it
cannot move the assignments, since it must assume that the code
it doesn't see or understand accesses n.

[...]

Concerning volatile...
> Necessary in certain conditions - absolutely."

It is certainly necessary in certain conditions. Both the C
standard and Posix, for example, require it when a variable is
accessed from both the main program and a signal handler. And
if it worked as intended (not the case with the compilers I
know), it would be necessary for memory mapped IO (which on the
compilers I know requires assembler in order to guarantee
correct behavior). It's just never necessary for
communications between threads.
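
The signal-handler case mentioned above is the one use where the C
standard itself asks for volatile. A minimal sketch, assuming a hosted
implementation with <csignal>; the names got_signal, on_signal and
raise_and_check are hypothetical, and SIGINT is chosen only because it
is one of the signals standard C defines.

```cpp
#include <cassert>
#include <csignal>

// volatile std::sig_atomic_t is the type the C standard sanctions for
// an object shared between the main program and a signal handler.
volatile std::sig_atomic_t got_signal = 0;

// The handler does nothing except set the flag.
extern "C" void on_signal(int) {
    got_signal = 1;
}

// Installs the handler, raises SIGINT in the calling thread, restores
// the default disposition, and reports the flag.
int raise_and_check() {
    got_signal = 0;
    auto previous = std::signal(SIGINT, on_signal);
    assert(previous != SIG_ERR);  // installation must succeed
    std::raise(SIGINT);           // handler runs before raise() returns
    std::signal(SIGINT, SIG_DFL);
    return got_signal;
}
```

Everything here stays within a single thread of execution; the sketch
says nothing about communication between threads.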

--
James Kanze


From: Herb Sutter on

Please remember this: Standard ISO C/C++ volatile is useless for
multithreaded programming. No argument otherwise holds water; at best
the code may appear to work on some compilers/platforms, including all
attempted counterexamples I've seen on this thread.

On Thu, 25 Mar 2010 00:20:43 CST, Andy Venikov
<swojchelowek(a)gmail.com> wrote:
>All the sources that you listed were saying that volatile isn't
>sufficient. And some went on as far as to say that it's "mostly"
>useless. That "mostly", however, covers an area that is real and I was
>talking about that area. None of them disagreed with what I said.
>
>Here's a brief example that I hope will put this issue to rest:
>
>volatile int n;
>
>n = 5;

Insert:
x = 42; // update some non-volatile variable

>n = 6;
>
>volatile guarantees (note: no interpretation here, it's just what it
>says) that the compiler will issue two store instructions in the correct
>order (5 then 6). And that is a very useful quality for multi-threaded
>programs that chose not to use synchronization primitives like mutexes
>and such.

No. The reason that you can't use volatiles for synchronization is that
they aren't synchronized (QED). There are several issues, and you
immediately go on to state one of them (again, not the only one):

>Of course it doesn't mean that the processor executes them in
>that order, that's why we'd use memory fences.

It's not just processor execution, it's also propagation and
visibility at other threads/cores.

There are other reasons why volatile n is insufficient for
inter-thread communication even in this example. Consider this
question:

- What if another thread does "if( n == 6 ) assert( x == 42 );"?
Will the assertion always be true? (Hint: It must for a volatile write
to be usable to even just publish data from one thread to another.)

- What values of n could another thread see? (Hint: What values of n
could another thread _not_ see?)
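
For contrast, here is a sketch of the same n/x example written with the
C++0x/C++11 std::atomic, which postdates this thread. It is an
illustration under that assumption, not a description of any particular
compiler's volatile; writer, reader and run_publish are hypothetical
names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int x = 0;                // the non-volatile variable from the example
std::atomic<int> n{0};    // takes the place of 'volatile int n'

void writer() {
    n.store(5, std::memory_order_release);
    x = 42;                                  // update some non-volatile variable
    n.store(6, std::memory_order_release);   // publishes x = 42
}

// Waits until it observes n == 6, then checks x as in the question above.
int reader() {
    while (n.load(std::memory_order_acquire) != 6) {
        std::this_thread::yield();
    }
    assert(x == 42);      // holds: the acquire load saw the release store of 6
    return x;
}

// Runs the writer and reader concurrently; returns what the reader saw in x.
int run_publish() {
    x = 0;
    n.store(0);
    int seen = 0;
    std::thread r([&seen] { seen = reader(); });
    std::thread w(writer);
    w.join();
    r.join();
    return seen;
}
```

On the second question: nothing in this sketch obliges the reader ever
to observe n == 5. It may see 0 on one load and 6 on the next.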

Standard volatile is useless for multithreaded programming, and that's
okay because that's not what it's for. It is intended only for things
like hardware access -- and even for those purposes is deliberately
underspecified in the standard(s), and the C and C++ committees are
not going to "fix" volatile even for that use. On some
implementations, volatile may happen to have some nonstandard
semantics that happen to be useful for multithreaded programming
(notably on Visual C++ on x86/x64 where volatiles can be used for most
uses of atomic<> including DCL but _not_ including Dekker's), but
that's not what volatile is for (and it was a mistake to try to add those
guarantees to volatile in VC++).
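
The Dekker's exception comes down to a store followed by a load of a
different variable, which acquire/release-style ordering does not keep
in order. A sketch of that handshake using sequentially consistent
C++0x/C++11 atomics follows; it assumes <atomic>, and flag0, flag1,
both_saw_zero and count_violations are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> flag0{0};
std::atomic<int> flag1{0};

// One round of the store-then-load handshake at the core of Dekker's
// algorithm. Returns true if both threads read 0, which would let both
// enter the critical section at once.
bool both_saw_zero() {
    flag0.store(0);
    flag1.store(0);
    int r0 = -1;
    int r1 = -1;
    std::thread t0([&] {
        flag0.store(1, std::memory_order_seq_cst);
        r0 = flag1.load(std::memory_order_seq_cst);
    });
    std::thread t1([&] {
        flag1.store(1, std::memory_order_seq_cst);
        r1 = flag0.load(std::memory_order_seq_cst);
    });
    t0.join();
    t1.join();
    assert(r0 != -1 && r1 != -1);   // both threads completed their load
    return r0 == 0 && r1 == 0;
}

// Repeats the handshake and counts rounds in which both threads read 0.
int count_violations(int rounds) {
    int violations = 0;
    for (int i = 0; i < rounds; ++i) {
        if (both_saw_zero()) {
            ++violations;
        }
    }
    return violations;
}
```

With memory_order_seq_cst the count is always zero, because all four
operations fall into one total order. Weakening the stores to release
and the loads to acquire would permit both threads to read 0.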

Herb


---
Herb Sutter (herbsutter.wordpress.com) (www.gotw.ca)

Convener, SC22/WG21 (C++) (www.gotw.ca/iso)
Architect, Visual C++ (www.gotw.ca/microsoft)
