From: Walter Banks on


David Brown wrote:

> One point that must be remembered about gcc is that it is targeted for a
> wide range of processors, multiple languages, and huge range of
> software. Walter's compilers are much more dedicated, and are mostly
> for smaller systems. It is perfectly reasonable for Walter's compilers
> to be based on requiring all the source code to be available at the
> point of compilation, and that all the code is compiled anew each time -
> the compiler then has the best possible view of the problem.

To clear up a misconception: we can compile with all the source
code available, but it doesn't need to be. We can also compile
each module to an object file and then link. In all cases the
code generator has a full application view.

We have written 28 compilers, some for small systems and some for
large systems. The processors we have targeted have varied widely,
with natural data sizes of 8, 9, 12, 13, 16, 24 and 32 bits. We
have targeted both heterogeneous and homogeneous multiprocessor
execution platforms.

Our real skill is compiling for processors with unusual
architectures or limited resources.

Regards,

--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com





From: Albert van der Horst on
In article <qvn1l5942u1b2n532ekm8evcoo6dnbhgmk(a)4ax.com>,
Jon Kirwan <jonk(a)infinitefactors.org> wrote:
<SNIP>
>
>But cripes! Some time ago, I engaged in a very interesting
>process exploring various compilers' results upon compiling a
>very simple (we are talking maybe 10-12 lines of code here?)
>and well-known algorithm for calculating the GCD of two
>integers. The creative juices of several were applied, with
>the result of more than one excellent topological reformation
>that, had a compiler observed it, would have resulted in the
>automatic generation of object code very close to the best a
>human can do on their own. (Once the slight reorganization
>was found, merely using traditional optimizations that have
>been implemented, not just known about, for two decades and
>more.)

(12 lines?)
I investigated the object code resulting from
int gcd(int a, int b) { while ((a %= b) && (b %= a)) {} return a + b; }
on x86 with gcc, and it is close to the best that can be done.
The main inefficiency is that on x86 the divide instruction
forces its results into fixed registers (quotient in EAX,
remainder in EDX). A clever human may save a MOV by a smarter
selection of registers.

Next, the DEC Alpha. There is no divide instruction! I'm pretty
sure the optimal gcd algorithm on the Alpha is one based on
subtracting and shifting. There is no reasonable way to arrive
there from the above C code.
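
For concreteness: by subtracting and shifting I mean something
along the lines of the well-known binary gcd. A minimal sketch,
untested:

/* Binary (subtract-and-shift) gcd: needs only shifts,
   subtraction and tests - no divide instruction at all. */
unsigned int gcd_binary(unsigned int a, unsigned int b)
{
    unsigned int shift = 0;

    if (a == 0) return b;
    if (b == 0) return a;

    while (((a | b) & 1u) == 0) {   /* factor out common 2s */
        a >>= 1;
        b >>= 1;
        shift++;
    }
    while ((a & 1u) == 0)           /* make a odd */
        a >>= 1;
    do {
        while ((b & 1u) == 0)       /* make b odd */
            b >>= 1;
        if (a > b) {                /* keep a <= b */
            unsigned int t = a;
            a = b;
            b = t;
        }
        b -= a;                     /* b becomes even */
    } while (b != 0);

    return a << shift;
}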

I agree if your conclusion is that
compiler optimisation is a hard AI problem.

<SNIP>
>
>I suppose so.
>
>I was addressing myself to the idea of swapping in a quick
>sort for a bubble sort and similar thrusts, though. On that
>point, there are some earlier steps to take first.

What might be feasible is a compiler warning about possibly
O(n^2) passes in an algorithm (e.g. a bubble sort).
But then it might generate a lot of false alarms.
A bad qsort can have O(n^2) behaviour, and it would be a smart
compiler indeed that could distinguish between a good and a
bad qsort, especially where that involves speculating about
what data might be used.
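
For concreteness, the pattern such a warning would have to
recognise is the doubly nested loop of a bubble sort. A minimal
sketch:

/* The classic O(n^2) shape: two nested loops whose trip counts
   both grow with n. */
void bubble_sort(int *v, int n)
{
    int i, j;
    for (i = 0; i < n - 1; i++)            /* n-1 passes        */
        for (j = 0; j < n - 1 - i; j++)    /* each pass is O(n) */
            if (v[j] > v[j + 1]) {
                int t = v[j];              /* swap neighbours   */
                v[j] = v[j + 1];
                v[j + 1] = t;
            }
}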

JIT compilers may be more effective, as they can react to
actual data.


Groetjes Albert

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert(a)spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

From: David Brown on
Albert van der Horst wrote:
> In article <qvn1l5942u1b2n532ekm8evcoo6dnbhgmk(a)4ax.com>,
> Jon Kirwan <jonk(a)infinitefactors.org> wrote:
> <SNIP>
>> But cripes! Some time ago, I engaged in a very interesting
>> process exploring various compilers' results upon compiling a
>> very simple (we are talking maybe 10-12 lines of code here?)
>> and well-known algorithm for calculating the GCD of two
>> integers. The creative juices of several were applied, with
>> the result of more than one excellent topological reformation
>> that, had a compiler observed it, would have resulted in the
>> automatic generation of object code very close to the best a
>> human can do on their own. (Once the slight reorganization
>> was found, merely using traditional optimizations that have
>> been implemented, not just known about, for two decades and
>> more.)
>
> (12 lines?)
> I investigated the object code resulting from
> int gcd(int a, int b) { while ((a %= b) && (b %= a)) {} return a + b; }
> on x86 with gcc, and it is close to the best that can be done.
> The main inefficiency is that on x86 the divide instruction
> forces its results into fixed registers (quotient in EAX,
> remainder in EDX). A clever human may save a MOV by a smarter
> selection of registers.
>
> Next, the DEC Alpha. There is no divide instruction! I'm pretty
> sure the optimal gcd algorithm on the Alpha is one based on
> subtracting and shifting. There is no reasonable way to arrive
> there from the above C code.
>
> I agree if your conclusion is that
> compiler optimisation is a hard AI problem.
>
> <SNIP>
>> I suppose so.
>>
>> I was addressing myself to the idea of swapping in a quick
>> sort for a bubble sort and similar thrusts, though. On that
>> point, there are some earlier steps to take first.
>
> What might be feasible is a compiler warning about possibly
> O(n^2) passes in an algorithm (e.g. a bubble sort).
> But then it might generate a lot of false alarms.
> A bad qsort can have O(n^2) behaviour, and it would be a smart

Even a reasonable qsort has O(N^2) worst-case behaviour - qsort is good
on /average/, but poor in the worst case (variants of qsort improve the
worst case, or reduce the chance of hitting it).
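
To illustrate: a textbook quicksort that always takes the first
element as its pivot degrades to O(N^2) on already-sorted input,
which is hardly exotic data. A minimal sketch:

/* Naive quicksort: first element as pivot. Already-sorted input
   drives it to its O(N^2) worst case. */
void qsort_naive(int *v, int lo, int hi)
{
    int pivot, i, j;
    if (lo >= hi)
        return;
    pivot = v[lo];              /* the classic weak pivot choice */
    i = lo;
    j = hi;
    while (i < j) {
        while (i < j && v[j] >= pivot)
            j--;
        v[i] = v[j];            /* fill hole at i; hole moves to j */
        while (i < j && v[i] <= pivot)
            i++;
        v[j] = v[i];            /* fill hole at j; hole moves to i */
    }
    v[i] = pivot;
    qsort_naive(v, lo, i - 1);
    qsort_naive(v, i + 1, hi);
}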

Of course, there are plenty of problems for which O(N^2) is perfectly
acceptable - and there is no way a compiler could guess whether O(N^2)
is good enough, even if it could work out the complexity in the first
place.

> compiler indeed that could distinguish between a good and a
> bad qsort, especially where that involves speculating about
> what data might be used.
>
> JIT compilers may be more effective, as they can react to
> actual data.
>

An alternative is profiling with actual data, and feeding that back to
the compiler.
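
With gcc, for instance, the -fprofile-generate/-fprofile-use pair
does exactly that. Roughly (the file names are just placeholders):

gcc -O2 -fprofile-generate -o prog prog.c   # build an instrumented binary
./prog typical_input.dat                    # run it on representative data
gcc -O2 -fprofile-use -o prog prog.c        # rebuild using the recorded profile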

From: Jon Kirwan on
On 21 Jan 2010 13:39:04 GMT, Albert van der Horst
<albert(a)spenarnc.xs4all.nl> wrote:

>In article <qvn1l5942u1b2n532ekm8evcoo6dnbhgmk(a)4ax.com>,
>Jon Kirwan <jonk(a)infinitefactors.org> wrote:
><SNIP>
>>
>>But cripes! Some time ago, I engaged in a very interesting
>>process exploring various compilers' results upon compiling a
>>very simple (we are talking maybe 10-12 lines of code here?)
>>and well-known algorithm for calculating the GCD of two
>>integers. The creative juices of several were applied, with
>>the result of more than one excellent topological reformation
>>that, had a compiler observed it, would have resulted in the
>>automatic generation of object code very close to the best a
>>human can do on their own. (Once the slight reorganization
>>was found, merely using traditional optimizations that have
>>been implemented, not just known about, for two decades and
>>more.)
>
>(12 lines?)
>I investigated the object code resulting from
>int gcd(int a, int b) { while ((a %= b) && (b %= a)) {} return a + b; }
>on x86 with gcc, and it is close to the best that can be done.
>
>The main inefficiency is that on x86 the divide instruction
>forces its results into fixed registers (quotient in EAX,
>remainder in EDX). A clever human may save a MOV by a smarter
>selection of registers.
>
>Next, the DEC Alpha. There is no divide instruction! I'm pretty
>sure the optimal gcd algorithm on the Alpha is one based on
>subtracting and shifting. There is no reasonable way to arrive
>there from the above C code.
>
>I agree if your conclusion is that
>compiler optimisation is a hard AI problem.

Your own investigation isn't enough, by the way. You need
several creative individuals applying themselves to the task,
not just an examination you make yourself, to know one way or
another how close to "the best that can be done" you really
are. I learned this myself through a wonderful discussion on
gcd() that we had started here some time ago. No one is in a
position to decide this alone; it takes a group effort to know
better.

Your point about the Alpha (shared by the ARM7-TDMI core)
lacking a hardware divide unit is why I chose earlier to use
the subtract-only gcd method:

unsigned int gcd (unsigned int a, unsigned int b)
{
    if (a == 0 && b == 0)
        b = 1;
    else if (b == 0)
        b = a;
    else if (a != 0)
        while (a != b)
            if (a < b)
                b -= a;
            else
                a -= b;
    return b;
}

It's realistic, in the sense that it deals with some special
conditions of its inputs as well as performing the essential
computation.

A group of us used quite a variety of C compilers and targets,
and none of them was able to achieve the performance of
hand-coded assembly. There is a topological change that is
'easy' for a human to see but that no compiler appears to make.
Now, if you take the above code as written and do NOT perform
the modification I have in mind, then _some_ C compilers get
very close to human abilities. That, however, is not the limit
of human creativity by any stretch.
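
To give the flavour of that sort of restructuring, without
giving away the particular change I mean, here is one well-known
reorganization of the subtract-only loop:

/* One well-known human reorganization (illustrative only, and
   not necessarily the change referred to above): split the
   single compare-and-subtract loop into two inner loops, so
   each branch repeats many times in a row and the loop bodies
   stay tight. */
unsigned int gcd2 (unsigned int a, unsigned int b)
{
    if (a == 0 && b == 0)
        return 1;
    if (a == 0 || b == 0)
        return a | b;           /* whichever input is non-zero */
    while (a != b)
    {
        while (a > b)
            a -= b;
        while (b > a)
            b -= a;
    }
    return b;
}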

><SNIP>
>>
>>I suppose so.
>>
>>I was addressing myself to the idea of swapping in a quick
>>sort for a bubble sort and similar thrusts, though. On that
>>point, there are some earlier steps to take first.
>
>What might be feasible is a compiler warning about possibly
>O(n^2) passes in an algorithm (e.g. a bubble sort).
>But then it might generate a lot of false alarms.
>A bad qsort can have O(n^2) behaviour, and it would be a smart
>compiler indeed that could distinguish between a good and a
>bad qsort, especially where that involves speculating about
>what data might be used.
>
>JIT compilers may be more effective, as they can react to
>actual data.

By adding analysis time just to decide which course to take, I
suppose, a cost which may be higher than that of the process it
is meant to improve. I'm not particularly hopeful about the
general case, but I can imagine specific cases where this would
be useful.

Jon




From: Richard Bos on
dj3vande(a)csclub.uwaterloo.ca.invalid wrote:

> Nick Keighley <nick_keighley_nospam(a)hotmail.com> wrote:
> >On 13 Jan, 16:43, dj3va...(a)csclub.uwaterloo.ca.invalid wrote:
> >> Unix people call this a "shell".
> >
> >I'm guessing you're trying to be funny/ironic. But in case you aren't,
> >Unix has dozens of strangely incompatible Command Line Interfaces
> >that Unix people call "shells". None of them are word processors.
>
> Right.
> But all of them have the property that I can get a word processor by
> typing the name of a word processor that's installed on the system.
>
> My point was that the "primitives" provided by a shell (the programs
> installed on the system) give a pretty good approximation to
> [Jongware]'s suggestion of "type 'word processor' and get a word
> processor".

That is an iffy comparison, because you need to install the word
processor separately from the shell before you can call it. In
other words, the programs you install are much more like
third-party libraries in a programming language than like its
primitives. In this respect, a normal programming language like C
is no different from a shell: from C, you can also call the
function process_words() - or, if you wish,
system("wordprocessor");

Richard