From: Walter Banks on


David Brown wrote:

> But the author's main point is that when targeting gcc or other
> sophisticated compilers, you want to write clear and simple code and let
> the compiler do the work - there are many developers who want to get the
> fastest possible code, but don't understand how to work with the
> compiler to get it.

> "you want to write clear and simple code and let the compiler do the work"

There is another rationale for this. Most compiler developers, including
GCC developers, spend most of their development time working on
simple code fragments. As a side effect, code written in short, clear
statements very often produces some of the nicest generated code.
This doesn't make the statement incorrect.

I ran into an extreme example of this a few years ago when a previously
unknown compiler vendor published some common benchmark results
that were exceptionally good. It turned out the compiler could only process
simple statements, and the benchmarks had been recoded to fit the compiler's
limitations.


Regards,


Walter..
--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com





From: David Brown on
On 27/05/2010 12:22, Walter Banks wrote:
>
>
> David Brown wrote:
>
>> But the author's main point is that when targeting gcc or other
>> sophisticated compilers, you want to write clear and simple code and let
>> the compiler do the work - there are many developers who want to get the
>> fastest possible code, but don't understand how to work with the
>> compiler to get it.
>
>> "you want to write clear and simple code and let the compiler do the work"
>
> There is another rationale for this. Most compiler developers, including
> GCC developers, spend most of their development time working on
> simple code fragments. As a side effect, code written in short, clear
> statements very often produces some of the nicest generated code.
> This doesn't make the statement incorrect.
>

That may be true to some extent - especially when looking at backends.
When you are trying to improve peepholes or register allocation on an
avr port, it's easier to test and study short sequences.

On the other hand, gcc is used for a huge variety of code and target
types. For example, gcc is almost invariably compiled with gcc. The
gcc developers therefore re-compile their compilers on a regular basis,
and will put at least some effort into the quality of object code
generated from the gcc source. And if you've ever looked at the source
code of gcc, there are areas that are about as far from "short clear
statements" as you could possibly get! This is one of the advantages of
gcc being a general and flexible compiler (you've already pointed out
the disadvantages compared to dedicated small processor compilers) - the
front end of gcc is tested with some of the biggest and most complex
source code around.

> I ran into an extreme example of this a few years ago when a previously
> unknown compiler vendor published some common benchmark results
> that were exceptionally good. It turned out the compiler could only process
> simple statements, and the benchmarks had been recoded to fit the compiler's
> limitations.
>
>
> Regards,
>
>
> Walter..
> --
> Walter Banks
> Byte Craft Limited
> http://www.bytecraft.com
>
>
>
>
>

From: George Neuner on
On Thu, 27 May 2010 11:30:47 +0200, David Brown
<david(a)westcontrol.removethisbit.com> wrote:

>On 26/05/2010 21:32, George Neuner wrote:
>> On Wed, 26 May 2010 06:09:00 GMT, bastian42(a)yahoo.com (42Bastian
>> Schick) wrote:
>>
>>> On Tue, 25 May 2010 08:57:17 -0400, Walter Banks
>>> <walter(a)bytecraft.com> wrote:
>>>
>>>
>>>> Code motion and other simple optimizations leave GCC's
>>>> source-level debug information significantly broken, forcing
>>>> many developers to debug applications with much of the
>>>> optimization off, then recompile later with optimization on but
>>>> with the code largely untested.
>>>
>>> I don't see why "broken debug information" is an excuse for not testing
>>> the final version. In an ideal world, there should be no need to debug
>>> the final version ;-)
>>>
>>> And if optimization breaks your code, it is likely your code was
>>> broken before (e.g. missing 'volatile').
>>
>> That isn't true ... optimizations frequently don't play well together
>> and many combinations are impossible to reconcile on a given chip.
>>
>
>I have never seen a situation where correct C code failed because of
>higher optimisation except when exact timing is needed, or in the case
>of compiler errors. I have seen lots of code where the author has said
>it works without optimisations, but fails when they are enabled - in
>every case, it was the code that was at fault.

I agree that *most* problems are in the program. However, I have seen
correct code fail (and not just C code) under optimization and plenty
of genuine compiler bugs as well. The more advanced the optimization,
the more likely there will be bugs in the implementation.

Then too, people who don't work on compilers often don't appreciate
just how difficult it is to predict the effects of mixing
optimizations and the sheer impossibility of testing all the possible
combinations. If you look inside any optimizing compiler, for each
optimization you'll see a myriad of conditions that prevent it from
being applied or which modify how it is applied. All it takes is for
one bad condition to be missed and some poor programmer is guaranteed
a headache.


>> GCC isn't a terribly good compiler and its high optimization modes are
>> notoriously unstable. A lot of perfectly good code is known to break
>> under -O3, and even -O2 is dangerous in certain situations.
>>
>
>A lot of code that is widely used is /not/ perfectly good.

Agreed. But there are also a number of very carefully coded libraries
that are known to be incompatible with certain versions of GCC.


>Level "-O3" is seldom used in practice with gcc, but it is not because
>the compiler is unstable or generates bad code. It is simply that the
>additional optimisations used here often make the code bigger for very
>little, if any, speed gain. Even on large systems, bigger code means
>more cache misses and lower speed. So these optimisations only make
>sense for code that actually benefits from them.
>
>There /are/ optimisations in gcc that are considered unstable or
>experimental - there are vast numbers of flags that you can use if you
>want. But they don't get added to the -Ox sets unless they are known to
>be stable and reliable and to help for a wide range of code. Flags that
>are known to be experimental or problematic are marked as such in the
>documentation.
>
>Of course, there is no doubt that gcc has its bugs. And many of these
>occur with the rarer optimisation flags - that's code that has had less
>time to mature and had less testing than code that is used more often.
>The gcc bug trackers are open to the public, feel free to look through
>them or register any bugs you find.
>
>Now, how is this different from any other compiler?

It isn't. But this thread was about GCC in particular.

Every compiler ... like every program ... has bugs (possibly latent)
and every optimizing compiler has combinations of optimizations that
don't work under all circumstances. The problem is the circumstances
under which they don't work may not all be prohibited and are rarely
documented.

GCC may have a switch for every possible optimization - but how is the
programmer to know which combinations will cause problems given the
idiosyncrasies of a particular piece of code?

There may be a one-liner in the manual saying that A doesn't work with
B, but what happens if you select both? If you're lucky, you'll get a
warning, but sometimes one or both optimizations are silently not
applied, and on rare occasions they may interact to produce bad code.


>I expect at this point somebody is going to say that commercial vendors
>have better testing routines than gcc, and therefore fewer bugs. There
>is no objective way to know about the different testing methodologies,
>or their effectiveness at finding bugs, so any arguments one way or the
>other are futile.

I don't find any particular fault with GCC's testing methodology. What
does concern me is the attempt to be all things to all people ... GCC
uses a unified IR to try to support the differing semantics of several
source languages across dozens of disparate targets.

Others have tried to support multiple languages with a common backend
- IBM, DEC, Sun, Pr1me, to name a few - and historically it has not
worked as well as they would have liked ... even without the
difficulties of supporting many different targets.

Even today, there are compilers for many languages targeting JVM
and/or .NET, but if you look closely, you'll find that many implement
subset or derivative languages because the semantics of the "standard"
language are incompatible with the platform and the developer has
judged the features too costly to emulate. Of the full language
implementations available, about half are interpreters.

George
From: David Brown on
George Neuner wrote:
> On Thu, 27 May 2010 11:30:47 +0200, David Brown
> <david(a)westcontrol.removethisbit.com> wrote:
>
>> On 26/05/2010 21:32, George Neuner wrote:
>>> On Wed, 26 May 2010 06:09:00 GMT, bastian42(a)yahoo.com (42Bastian
>>> Schick) wrote:
>>>
>>>> On Tue, 25 May 2010 08:57:17 -0400, Walter Banks
>>>> <walter(a)bytecraft.com> wrote:
>>>>
>>>>
>>>>> Code motion and other simple optimizations leave GCC's
>>>>> source-level debug information significantly broken, forcing
>>>>> many developers to debug applications with much of the
>>>>> optimization off, then recompile later with optimization on but
>>>>> with the code largely untested.
>>>> I don't see why "broken debug information" is an excuse for not testing
>>>> the final version. In an ideal world, there should be no need to debug
>>>> the final version ;-)
>>>>
>>>> And if optimization breaks your code, it is likely your code was
>>>> broken before (e.g. missing 'volatile').
>>> That isn't true ... optimizations frequently don't play well together
>>> and many combinations are impossible to reconcile on a given chip.
>>>
>> I have never seen a situation where correct C code failed because of
>> higher optimisation except when exact timing is needed, or in the case
>> of compiler errors. I have seen lots of code where the author has said
>> it works without optimisations, but fails when they are enabled - in
>> every case, it was the code that was at fault.
>
> I agree that *most* problems are in the program. However, I have seen
> correct code fail (and not just C code) under optimization and plenty
> of genuine compiler bugs as well. The more advanced the optimization,
> the more likely there will be bugs in the implementation.
>
> Then too, people who don't work on compilers often don't appreciate
> just how difficult it is to predict the effects of mixing
> optimizations and the sheer impossibility of testing all the possible
> combinations. If you look inside any optimizing compiler, for each
> optimization you'll see a myriad of conditions that prevent it from
> being applied or which modify how it is applied. All it takes is for
> one bad condition to be missed and some poor programmer is guaranteed
> a headache.
>

I agree with you here in that advanced optimisations are often a
hot-spot for bugs in the compiler.

But I think (or at least, I /like/ to think) that for more risky
optimisations, compiler developers are more conservative about when they
are applied - either by being careful about deciding when the
optimisation can be applied, or by requiring an explicit flag to enable
it. In the case of gcc, there should be no risky optimisations enabled
by any -Ox flag - you have to give the -fxxx flag explicitly.

>
>>> GCC isn't a terribly good compiler and its high optimization modes are
>>> notoriously unstable. A lot of perfectly good code is known to break
>>> under -O3, and even -O2 is dangerous in certain situations.
>>>
>> A lot of code that is widely used is /not/ perfectly good.
>
> Agreed. But there are also a number of very carefully coded libraries
> that are known to be incompatible with certain versions of GCC.
>

Yes, sometimes that's the case. It's not uncommon to have a minimum
version requirement - if the code takes advantage of newer ISO
standards, or newer gcc extensions, it won't work with older versions.
The other way round is less common but does happen. Sometimes it's for
banal reasons, such as new keywords in later C standards conflicting
with identifiers (though this can be avoided by being more careful with
flag settings in makefiles).
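
As a minimal sketch of the keyword-collision case (the file name and the
exact command lines are only illustrative), something like this builds
under C89/GNU89 but is rejected from C99 onwards, where 'restrict' became
a keyword:

/* keyword_clash.c - builds with "gcc -std=gnu89 -c keyword_clash.c",
   fails with "gcc -std=c99 -c keyword_clash.c", because 'restrict' is
   an ordinary identifier in C89 and a keyword from C99 onwards. */

static int restrict = 0;   /* fine in C89/GNU89, a syntax error in C99 */

int get_value(void)
{
    return restrict;
}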

>
>> Level "-O3" is seldom used in practice with gcc, but it is not because
>> the compiler is unstable or generates bad code. It is simply that the
>> additional optimisations used here often make the code bigger for very
>> little, if any, speed gain. Even on large systems, bigger code means
>> more cache misses and lower speed. So these optimisations only make
>> sense for code that actually benefits from them.
>>
>> There /are/ optimisations in gcc that are considered unstable or
>> experimental - there are vast numbers of flags that you can use if you
>> want. But they don't get added to the -Ox sets unless they are known to
>> be stable and reliable and to help for a wide range of code. Flags that
>> are known to be experimental or problematic are marked as such in the
>> documentation.
>>
>> Of course, there is no doubt that gcc has its bugs. And many of these
>> occur with the rarer optimisation flags - that's code that has had less
>> time to mature and had less testing than code that is used more often.
>> The gcc bug trackers are open to the public, feel free to look through
>> them or register any bugs you find.
>>
>> Now, how is this different from any other compiler?
>
> It isn't. But this thread was about GCC in particular.
>
> Every compiler ... like every program ... has bugs (possibly latent)
> and every optimizing compiler has combinations of optimizations that
> don't work under all circumstances. The problem is the circumstances
> under which they don't work may not all be prohibited and are rarely
> documented.
>
> GCC may have a switch for every possible optimization - but how is the
> programmer to know which combinations will cause problems given the
> idiosyncrasies of a particular piece of code?
>
> There may be a one-liner in the manual saying that A doesn't work with
> B, but what happens if you select both? If you're lucky, you'll get a
> warning, but sometimes one or both optimizations are silently not
> applied, and on rare occasions they may interact to produce bad code.
>

Basically, if you use additional optimisation switches, you are taking a
risk. If it was a well-tested and low risk optimisation that can be
expected to improve the generated code for a lot of code, then it would
be included in one of the -Ox levels. (The exception to this is for
switches that are known to work well, but break standards in some way -
-ffast-math is a good example. It gives you smaller and faster code,
but does not comply with IEEE standards.) So if you think an extra
optimisation or two will help your code, you can try it and see - but
expect to be more careful about testing.
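
As an illustration (a minimal sketch, not taken from any real project),
the classic NaN test is one place where -ffast-math visibly departs from
IEEE behaviour:

/* fastmath_nan.c - built normally, is_nan() detects NaN as IEEE 754
   requires; with -ffast-math (which implies -ffinite-math-only) the
   compiler may assume x == x always holds and fold the test to 0. */
#include <stdio.h>

static int is_nan(double x)
{
    return x != x;              /* true only for a NaN under IEEE 754 */
}

int main(void)
{
    volatile double zero = 0.0; /* volatile keeps the division at run time */
    double x = zero / zero;     /* produces a quiet NaN */
    printf("is_nan(x) = %d\n", is_nan(x));
    return 0;
}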

>
>> I expect at this point somebody is going to say that commercial vendors
>> have better testing routines than gcc, and therefore fewer bugs. There
>> is no objective way to know about the different testing methodologies,
>> or their effectiveness at finding bugs, so any arguments one way or the
>> other are futile.
>
> I don't find any particular fault with GCC's testing methodology. What
> does concern me is the attempt to be all things to all people ... GCC
> uses a unified IR to try to support the differing semantics of several
> source languages across dozens of disparate targets.
>

It's hard to do this well - and it's clear that it causes problems for
gcc for some backends. Many of the targets of gcc are in fact similar
in principle - 32-bit, RISC, lots of registers, load-store architecture.
But if you look at something like the avr port of gcc, you can see
that the code is often somewhat suboptimal - the front-end of gcc pushes
8-bit calculations up to 16-bit ints (as per the C standards), and the
backend has to work out when parts of the 16-bit double-register
operations can be omitted. It does a good job, but not perfect, and the
lost information means the register set is not fully utilised.
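
A trivial function shows the effect (a minimal sketch - there is nothing
AVR-specific in the source itself, the promotion comes straight from the
C standard):

/* promote.c - on an 8-bit target such as the AVR, 'a + b' is performed
   in int (16 bits there) because of the integer promotions; the back
   end must then prove that the high byte of the sum is never needed
   before it can drop the second register operation. */
#include <stdint.h>

uint8_t add8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);    /* semantically a 16-bit add, truncated */
}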

I think aiming for this sort of independence between stages of the
compiler is a good thing overall. You are never going to get the best
compiler for the awkward smaller cpus (like the 8051) using such a
general compiler. But by structuring the compiler in this way you get
to re-use the parts rather than re-inventing them for each new target.
And you get the flexibility of being able to choose the target
processor, the host processor, and the languages all fairly
independently (and even pick a different host on which to build the
compiler!).

The other big project, and gcc's main competitor, is llvm. It is even
stricter than gcc about separating the stages of the compiler, and more
flexible about using different front-ends or back-ends.

> Others have tried to support multiple languages with a common backend
> - IBM, DEC, Sun, Pr1me, to name a few - and historically it has not
> worked as well as they would have liked ... even without the
> difficulties of supporting many different targets.
>
> Even today, there are compilers for many languages targeting JVM
> and/or .NET, but if you look closely, you'll find that many implement
> subset or derivative languages because the semantics of the "standard"
> language are incompatible with the platform and the developer has
> judged the features too costly to emulate. Of the full language
> implementations available, about half are interpreters.
>
> George
From: Hans-Bernhard Bröker on
David Brown wrote:

> I have never seen a situation where correct C code failed because of
> higher optimisation except when exact timing is needed,

In that case, it wasn't correct C code to begin with. Relying on exact
timing of any code sequence written in C is a design error.
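
A crude busy-wait delay is the textbook example (a minimal sketch, with
an arbitrary loop count): without 'volatile' the optimiser is free to
delete the loop altogether, and even with it the actual delay depends on
the compiler, the flags and the target clock.

/* Without 'volatile' the loop below has no observable side effects, so
   an optimising compiler may remove it completely; with 'volatile' the
   loop survives, but how long it takes is still unspecified by C. */
void crude_delay(void)
{
    volatile unsigned long i;

    for (i = 0; i < 100000UL; i++)
        ;                       /* burn cycles; the duration depends on
                                   the compiler and the clock */
}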