From: markspace on
Lew wrote:

> You're mistaken. The assignment in 'methodA()' of 'data'
> /happens-before/ the write to 'set', which latter is synchronized. The
> read in 'methodB()' of 'data' /happens-after/ the read of 'set', which
> latter is synchronized on the same monitor. Therefore the read is
> guaranteed to see the write.


Yup, I'm mistaken.

I thought I'd read that the compiler was allowed to re-order writes and
reads around a synchronized block. "data" for example I thought might
be moved down below the synchronized block and so might not be read as
expected by methodB(). Turns out that is not the case. Synchronization
blocks provide the same memory guarantees as volatiles.
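
For concreteness, the pattern being discussed looks roughly like this
(a reconstruction - the original methodA()/methodB() code isn't quoted
in this part of the thread, so the class and field names here are
assumptions):

```java
public class Holder {
    private int data;        // plain field, deliberately not volatile
    private boolean set;     // only touched while holding the monitor

    public void methodA() {
        data = 42;               // plain write...
        synchronized (this) {
            set = true;          // ...happens-before this synchronized write
        }
    }

    public Integer methodB() {
        synchronized (this) {    // same monitor as methodA()
            if (!set) {
                return null;     // methodA() hasn't published yet
            }
        }
        return data;             // guaranteed to see methodA()'s write
    }
}
```

The plain write to data is ordered before the monitor release in
methodA(), and the monitor acquire in methodB() is ordered before the
plain read, which is exactly the happens-before chain Lew describes.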

Brian Goetz calls this piggy-backing in JCIP. (He uses a volatile
variable as an example, which is why I was confused about synchronized
blocks.) And he says it's a dangerous thing to do because it's hard to
reason out these side effects correctly, and hard for maintenance

programmers to spot them as well. So to the OP and others, don't rely
on these side effects. Put all data in a synchronized block and you'll
be less likely to wonder what the heck is going on six months after you
wrote the code.

The only reason you would piggy-back memory synchronization like this is
for extreme speed optimization. The Future object in Brian Goetz's
example uses a single volatile variable and piggy-backs all its
synchronization on a single write/read of that variable. This is for
speed, because the Future object is heavily used in Executors. Don't do
the same until you are sure that you need to.
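
A minimal sketch of that piggy-backing idiom (a hypothetical class for
illustration only - the real java.util.concurrent.FutureTask is
considerably more involved):

```java
public class MiniResult<V> {
    private volatile boolean done = false;
    private V value;                 // NOT volatile: piggy-backs on 'done'

    public void set(V v) {
        value = v;                   // plain write happens-before...
        done = true;                 // ...the volatile write that publishes it
    }

    public V poll() {
        if (done) {                  // volatile read happens-before...
            return value;            // ...this plain read, so set()'s value is visible
        }
        return null;                 // not finished yet
    }
}
```

The trick is that value itself needs no synchronization: any thread
that observes done == true is guaranteed to also observe the write to
value, because of the volatile write/read pair.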

From: Lew on
markspace wrote:
> Brian Goetz calls this piggy-backing in JCIP.  (He uses a volatile
> variable as an example, which is why I was confused about synchronized
> blocks.)  And he says it's a dangerous thing to do because it's hard to
> reason out these side effects correctly, and hard for maintenance
> programmers to spot them as well.  So to the OP and others, don't rely
> on these side effects.  Put all data in a synchronized block and you'll
> be less likely to wonder what the heck is going on six months after you
> wrote the code.
>

Like everything else in programming, there are tradeoffs that mean
this is sometimes, but not always, good advice. The countervailing
advice is to minimize the extent of critical sections to reduce
contention.

If you use a synchronized block, it's because you need concurrent
execution. That much is painfully obvious, but the question of how
concurrent is less obvious. I've worked on high-volume systems that
ran many threads across stages of 64-core machines. Lock contention
was a major bottleneck - really major - where the original
implementors had not fully grokked the implications of highly
concurrent systems.

The more work resides in a critical section, the higher the likelihood
of thread contention for that section. Lock contention has a
cascading effect - the more threads have to wait, the worse the wait
gets, causing more threads to have to wait, worsening the wait, ...
Compounding that, modern JVMs optimize the handling of uncontended
locks.

IMO the rule to "get in, get out quickly" for critical sections trumps
the questionable notion that we cannot expect Java programmers to
reason effectively about /happens-before/. If a programmer works with
concurrent programming, they absolutely must educate themselves about
the matter or they are negligent and irresponsible. Java 5
incorporated the notion of /happens-before/ and reworked the memory
model precisely in order to reduce the effort to understand the
consequences of concurrency. We should expect and demand that
programmers of concurrent systems understand it.

Programs don't exist for the convenience of programmers, but for our
clients. Although I do believe in making life easier for the
maintenance programmer, I don't believe in making it easier for the
ignorant or incompetent maintenance programmer. Reducing the extent
of critical sections benefits the program, and thus the client, not
the less-educated programmer, but the competent programmer can deal
with that.

As Albert Einstein said, we should make things as simple as possible,
but no simpler. Face it, concurrent programming is hard no matter
what we do. It's good to keep code straightforward, but it's
inadvisable to ruin concurrent performance because we're afraid
someone will be too lazy to think carefully.

> The only reason you would piggy-back memory synchronization like this is
> for extreme speed optimization.  The Future object in Brian Goetz's
>

That's not the only reason. The other reason is for non-extreme speed
optimization, or if you will, to prevent stupid de-optimization.

> example uses a single volatile variable and piggy-backs all its
> synchronization on a single write/read of that variable.  This is for
> speed, because the Future object is heavily used in Executors.  Don't do
> the same until you are sure that you need to.

Do piggy-back on synchronization. That's why /happens-before/
exists. Do not make critical sections larger than they need to be.

The rule of thumb I recommend: Keep critical sections to the minimum
length necessary to correctly coordinate concurrent access. That will
introduce tension against markspace's advice to simplify the
concurrent code by piling more into the critical section. Balancing
the two recommendations is a matter of art.
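
One common shape of "get in, get out quickly" is to copy the shared
state under the lock and do the slow work outside it. A sketch, with
illustrative names (it assumes drainAndProcess() and results() are
called from a single consumer thread):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchProcessor {
    private final Object lock = new Object();
    private final List<String> pending = new ArrayList<>();
    private final List<String> processed = new ArrayList<>();

    public void submit(String item) {
        synchronized (lock) {
            pending.add(item);           // short critical section: one add
        }
    }

    public void drainAndProcess() {
        List<String> batch;
        synchronized (lock) {            // get in...
            batch = new ArrayList<>(pending);
            pending.clear();             // ...get out quickly
        }
        for (String item : batch) {      // the slow work runs unlocked
            processed.add(item.toUpperCase());
        }
    }

    public List<String> results() {
        return processed;
    }
}
```

Producers contend only for the brief copy-and-clear, not for however
long the per-item processing takes.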

--
Lew
From: Peter Duniho on
markspace wrote:
> Lew wrote:
>
>> You're mistaken. The assignment in 'methodA()' of 'data'
>> /happens-before/ the write to 'set', which latter is synchronized.
>> The read in 'methodB()' of 'data' /happens-after/ the read of 'set',
>> which latter is synchronized on the same monitor. Therefore the read
>> is guaranteed to see the write.
>
> Yup, I'm mistaken.
>
> I thought I'd read that the compiler was allowed to re-order writes and
> reads around a synchronized block. "data" for example I thought might
> be moved down below the synchronized block and so might not be read as
> expected by methodB(). Turns out that is not the case. Synchronization
> blocks provide the same memory guarantees as volatiles.

Actually, synchronization blocks provide a stronger guarantee than
volatile, if only because they also give you mutual exclusion. The
ordering guarantees themselves are one-way ("roach motel") in both
cases. No read or write that comes before a volatile write in program
order may be moved after it, and no read or write that comes after a
volatile read may be moved before it; the opposite movements, however,
are allowed.

A synchronized block behaves the same way at its boundaries: accesses
inside the block cannot be moved out of it, although accesses outside
the block may be moved into it.

This stronger guarantee comes at some potential performance cost, and
not just because of the management of the monitor itself, which is why
"volatile" exists. With "volatile", the compiler and hardware can
still perform some optimizations that would be disallowed when a
synchronization block is used.

> Brian Goetz calls this piggy-backing in JCIP. (He uses a volatile
> variable as an example, which is why I was confused about synchronized
> blocks.) And he says it's a dangerous thing to do because it's hard to
> reason out these side effects correctly, and hard for maintenance
> programmers to spot them as well. So to the OP and others, don't rely
> on these side effects. Put all data in a synchronized block and you'll
> be less likely to wonder what the heck is going on six months after you
> wrote the code.

Absolutely good advice. Especially for the new or only-occasional
concurrent programmer, just keep it simple. Use "synchronized" and the
higher-level locking types where appropriate.

If and when there's a performance issue, either hire a consultant who
has LOTS of concurrency experience to introduce fancier synchronization
techniques (e.g. lock-free, finer granularity, etc.), or be prepared to
spend a lot of time learning it yourself and being very careful with the
implementation.

It's easy enough to introduce bugs just doing things the basic way,
with "synchronized". You can imagine how easy it is to introduce bugs
when you are trying to rely on memory-operation ordering rules based
on the barriers created by "volatile".

Pete
From: markspace on
Lew wrote:

> The rule of thumb I recommend: Keep critical sections to the minimum
> length necessary to correctly coordinate concurrent access. That will
> introduce tension against markspace's advice to simplify the
> concurrent code by piling more into the critical section. Balancing
> the two recommendations is a matter of art.


I'd recommend making the code as simple to maintain as possible, until
you are sure you have 64-core machines, and you know which locks need to
be optimized.

Balancing the two recommendations is not a matter of art, it's a matter
of profiling running code. ;) Premature optimization is the root of
all evil. Don't do it until you're sure you have to and have the
profiles in hand to prove it.


From: Peter Duniho on
Lew wrote:
> [...]
>> example uses a single volatile variable and piggy-backs all its
>> synchronization on a single write/read of that variable. This is for
>> speed, because the Future object is heavily used in Executors. Don't do
>> the same until you are sure that you need to.
>
> Do piggy-back on synchronization. That's why /happens-before/
> exists. Do not make critical sections larger than they need to be.

It seems to me that there are two parts to your commentary. One part is
advice any Java programmer can apply: hold locks for as little time as
possible. This includes doing things like not making entire methods
synchronized (unless they are really short) and minimizing the shared
data interface, such that when synchronization is needed the thread can
grab the data it needs and then release the lock before proceeding to
perform a potentially time-consuming operation on the data.

Implicit in this advice is to not call out to "third-party" code (where
in this context, "third-party" is simply _any_ code not under the direct
control of the class that is using synchronization), such as calling
listeners or other interface callbacks, while holding a lock, because
you have no way to control how long that code will take.
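
The usual way to follow that advice is to snapshot the callback list
under the lock and invoke the callbacks after releasing it. A sketch,
with illustrative names:

```java
import java.util.ArrayList;
import java.util.List;

public class EventSource {
    private final Object lock = new Object();
    private final List<Runnable> listeners = new ArrayList<>();

    public void addListener(Runnable listener) {
        synchronized (lock) {
            listeners.add(listener);
        }
    }

    public void fire() {
        List<Runnable> snapshot;
        synchronized (lock) {               // copy the list under the lock...
            snapshot = new ArrayList<>(listeners);
        }
        for (Runnable listener : snapshot) {
            listener.run();                 // ...call out with the lock released
        }
    }
}
```

In the standard library, java.util.concurrent.CopyOnWriteArrayList
packages the same snapshot-on-iterate behavior for exactly this kind
of listener list.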

The other part of your commentary involves going further, refining and
optimizing the code to take advantage of ordering guarantees provided by
the synchronization objects to introduce implicit synchronization into
the code.

That latter approach can in fact really help deal with performance
issues, especially in highly concurrent systems, but IMHO it's
definitely more of an "advanced" topic and I'd disagree it's something
that any Java programmer doing concurrent programming needs to take time
to understand.

I think it's useful to distinguish between the two techniques as
separate ideas.

A lot of the time, the concurrent programming the average Java
programmer is going to run into is simply to get a long-running
operation off the EDT so that the UI can remain responsive. Or at most,
to parallelize a relatively simple algorithm likely to execute on a
computer that has only two or four cores. In those scenarios,
contention isn't going to be a huge problem for throughput, and so
fine-tuning a concurrent algorithm by using lock-free techniques that
depend on the ordering guarantees is overkill (and likely to be done
wrong anyway).

> The rule of thumb I recommend: Keep critical sections to the minimum
> length necessary to correctly coordinate concurrent access. That will
> introduce tension against markspace's advice to simplify the
> concurrent code by piling more into the critical section. Balancing
> the two recommendations is a matter of art.

I didn't read "markspace"'s recommendation as suggesting to "pile more
into the critical section". Rather, he seems to me to simply be saying
that using the simpler "synchronized" construct to coordinate shared
data state changes is more appropriate for most Java programmers than to
try to maintain shared data outside of locks.

Pete