From: David Brown on
Phil Carmody wrote:
> Rich Webb <bbew.ar(a)mapson.nozirev.ten> writes:
>> On Tue, 12 Jan 2010 13:34:11 -0800 (PST), David Schwartz
>> <davids(a)webmaster.com> wrote:
>>
>>> On Jan 12, 9:12 am, scattered <still.scatte...(a)gmail.com> wrote:
>>>
>>>> I don't see why it would be impossible. Optimizing compilers exist and
>>>> by using similar techniques it should be possible to write a compiler
>>>> that uses C as both the source and target language. How useful this
>>>> would be is another question. The resulting code would need to be
>>>> humanly readable if the tool would be of any use and the would
>>>> probably place severe restrictions on the sorts of optimizations that
>>>> can be done. I would be surprised if no work along these lines has
>>>> been done.
>>> That would be a useless tool, all it would do would be obfuscate. If
>>> the code contains optimizations that can be made by machine, what is
>>> the point of modifying the source code? Let the machine make the
>>> optimizations when the code is compiled and keep the source intact.
>> What he's looking for is not compiler optimizations but optimization of
>> the underlying algorithm. As somebody else mentioned, seeing a bubble
>> sort and deciding that a quick sort would be more appropriate.
>
> Unless it was only ever a handful of items, in which case an
> insertion sort might be more appropriate. Know your N (most
> important when N might be large, of course, but also when it's
> small). However, anything which brings about the nuking of
> bubblesort in any context is an improvement.
>

There are a large number of sort algorithms, with different pros and
cons in different circumstances. While quicksort is regarded as one of
the fastest general-purpose sorts, and bubblesort is regarded as one of
the poorest, it is easy to think of cases where bubblesort would beat
quicksort hands down (for example, if you are dealing with data that is
normally very close to sorted, and need a small code size). So don't be
too quick to leap to conclusions about sorting algorithms (or any other
sort of algorithms).

Incidentally, that also shows roughly why the OP's idea is uncomputable.

From: Walter Banks on


David Brown wrote:

> There are a large number of sort algorithms, with different pros and
> cons in different circumstances. While quicksort is regarded as one of
> the fastest general-purpose sorts, and bubblesort is regarded as one of
> the poorest, it is easy to think of cases where bubblesort would beat
> quicksort hands down (for example, if you are dealing with data that is
> normally very close to sorted, and need a small code size). So don't be
> too quick to leap to conclusions about sorting algorithms (or any other
> sort of algorithms).
>
> Incidentally, that also shows roughly why the OP's idea is uncomputable.

This is the second part of the problem. After the knowledge is extracted
from the code how do we make a rational decision about appropriate
changes if any.

An intermediate step might be objective programming languages focused
on goals and not implementation.

Regards,

--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com







From: David Brown on
Lorenzo Villari wrote:
> On Tue, 12 Jan 2010 23:44:08 +0100
> David Brown <david.brown(a)hesbynett.removethisbit.no> wrote:
>
>> Thirdly, rather than having a compiler try to guess what algorithm
>> you are trying to implement, it would be better to have more
>> algorithms implemented in the libraries and preferably the language
>> itself in a way that is natural for the programmer. For example, if
>> the language has support for real arrays with functions such as
>> sorting, the programmer can work with an array and let the compiler
>> figure out the best way to implement it.
>>
>
> 1) Coding a custom algorithm is not natural to the programmer?

Coding efficient sort algorithms (since this is the example mentioned)
is not natural to most programmers. "Low-level" algorithms for data
structures, data manipulation and calculations is not something that
many programmers are good at - it requires a more mathematical
background to do well. Application algorithms are a different matter,
of course.

> 2) What's a "real" array?
>

I mean something more than syntactic sugar for a pointer and a block of
memory (which is C's concept of arrays). Some languages have direct
support for higher level data structures such as arrays which know their
size and have useful methods, dictionaries, tress, etc. My point is,
when the language and compiler support these sorts of structures, the
compiler has a better chance to generate good code than when all it has
is low-level pointer access, and it has to guess what you are doing with
the array.

>> With C, the programmer will
>> either write their own bubble sort, or call qsort - with all its
>> overheads.
>
> Which overhead?
>

The overhead of calling the comparison function indirectly through a
pointer, and using run-time widths. Have you ever looked at the source
code for a typical standard C library qsort() routine? As with many
"generic" functions in C, it is far less efficient than if you right a
dedicated quicksort function with known sizes and comparison functions.

>> This is why programs written in interpreted languages
>> such as Python can often be faster than compiled C code for code
>> working with complex data structures, hash tables, sorting, list
>> manipulation, etc., Of course the C code /could/ be faster - but
>> people don't write optimal algorithms in most code. With the
>> libraries and language support in Python, you get very good
>> algorithms for free.
>>
>
> If you can stand the formatting of python... ouch! It's the same case

Python formatting is a matter of taste. At least there you are forced
to work to a reasonably standard formatting - C gives people freedom to
write the most hideous code (though it also lets you write quite nice code).

> of custom code vs standard code, or custom C container vs STL C++
> container. One asks himself\herself why the c++ people agreed to have a
> standard way of doing this and the C counterpart won't never agree...
>

C++ templates let you write something like a generic qsort() function
that will generate good target code.
From: Hans-Bernhard Bröker on
[X-post was rather excessive --- snipped]

Rich Webb wrote:
> What he's looking for is not compiler optimizations but optimization of
> the underlying algorithm.

Those two are not necessarily as different from each other as you make
them out to be. Nothing stops a compiler from completely reorganizing a
given function, as long as it's written in such a way that all effects
to the outside are tightly scoped. I.e. a compiler is already allowed
to detected a piece of code performs a sort and just call qsort() instead.

From: David Brown on
Hans-Bernhard Br�ker wrote:
> [X-post was rather excessive --- snipped]
>
> Rich Webb wrote:
>> What he's looking for is not compiler optimizations but optimization of
>> the underlying algorithm.
>
> Those two are not necessarily as different from each other as you make
> them out to be. Nothing stops a compiler from completely reorganizing a
> given function, as long as it's written in such a way that all effects
> to the outside are tightly scoped. I.e. a compiler is already allowed
> to detected a piece of code performs a sort and just call qsort() instead.
>

Nothing stops the compiler from doing this sort of thing in /theory/.
But /practice/ is a different matter. It is not often that the compiler
can find out what code (of a size large enough to be called an
"algorithm") actually does - and it has to be sure that any replacements
do at least as "good" a job. That means, in the words of the Wine
project, bug-for-bug compatibility. It's no easy job, especially with C
where you often can't (or don't, even if you /can/) properly express
what you want the code to do, but merely how you want it to do it.