From: Brett Davis on 18 Apr 2010 16:27

> > Programming hotshots have done so much damage.
>
> Damage?
> That is clean code that is easy to read and understand.
>
> > And they brag about it.

Only one in a hundred programmers know an optimization like that; for
half of comp.arch to be that good says good things about comp.arch.

> > I watched some doing one-upsmanship while the earth was still being
> > created, and I decided I wanted nothing to do with it. I think I
> > showed good judgment (rare for me).

I thought I was generous giving away top secrets that most everyone
else hoards.

If you remember the story about the programmer told to add a cheat
to the blackjack program so the customer would always win...

The Story of Mel, a Real Programmer
http://www.cs.utah.edu/~elb/folklore/mel.html

I am not him; my code is clear and readable.

Brett ;)
From: nmm1 on 18 Apr 2010 17:27

In article <4BCB4C2A.8080601(a)patten-glew.net>,
Andy "Krazy" Glew <ag-news(a)patten-glew.net> wrote:
>On 4/18/2010 1:36 AM, nmm1(a)cam.ac.uk wrote:
>> As I have posted before, I favour a heterogeneous design on-chip:
>>
>>     Essentially uninterruptible, user-mode only, out-of-order CPUs
>>     for applications etc.
>>     Interruptible, system-mode capable, in-order CPUs for the kernel
>>     and its daemons.
>
>This is almost opposite what I would expect.
>
>Out-of-order tends to benefit OS code more than many user codes.
>In-order coherent threading benefits mainly fairly stupid codes that
>run in user space, like multimedia.
>
>I would guess that you are motivated by something like the following:
>
>System code tends to have unpredictable branches, which hurt many OOO
>machines.
>
>System code you may want to be able to respond to interrupts easily.
>I am guessing that you believe that OOO has worse interrupt latency.
>That is a misconception: OOO tends to have better interrupt latency,
>since they usually redirect to the interrupt handler at retirement.
>However, they lose more work.

No, not at all.  You are thinking performance - I am thinking RAS.

Trying to get asynchronous and parallel code with a lot of subtle
interactions (which is the case with many kernels) to work at all is
hard; doing it with highly out-of-order CPUs is murder.

Most shared-memory parallel codes (kernel and other) have lots of
race conditions that don't show up because the synchronisation time
is short compared with the time between the critical events.
However, when one application hammers the CPU hard, that can cause
large delays for OTHER threads (including kernel ones).  As I said
earlier, I have seen 5 seconds delay in memory consistency.

The result is that you get very low probability, load-dependent,
non-repeatable failures.  Ugh.

Regards,
Nick Maclaren.
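[To make the latent-race point concrete, here is a minimal sketch in
C with pthreads; it is invented for this summary, not code from the
thread.  The 'volatile' keeps the compiler from hoisting the flag
read, but it does NOT order the two stores, which is exactly the kind
of bug that passes almost every test and then fails under load.]

    /* A minimal sketch, not from the thread: a latent race of the
     * kind Maclaren describes.  On a weakly ordered machine the
     * reader can observe ready==1 while payload is still stale; on
     * a quiet machine the window almost never opens. */
    #include <pthread.h>
    #include <stdio.h>

    static int payload;             /* data produced by the writer  */
    static volatile int ready = 0;  /* BUG: flag with no barrier    */

    static void *writer(void *arg)
    {
        (void)arg;
        payload = 42;               /* store 1                      */
        ready = 1;                  /* store 2: may become visible  */
        return NULL;                /* to the reader before store 1 */
    }

    static void *reader(void *arg)
    {
        (void)arg;
        while (!ready)              /* usually exits so fast that   */
            ;                       /* the race never shows in test */
        printf("payload = %d\n", payload);  /* can print 0 on load  */
        return NULL;
    }

    int main(void)
    {
        pthread_t w, r;
        pthread_create(&r, NULL, reader, NULL);
        pthread_create(&w, NULL, writer, NULL);
        pthread_join(w, NULL);
        pthread_join(r, NULL);
        return 0;
    }

[The fix is a release store paired with an acquire load, or a mutex;
the point of the sketch is only that the broken version looks fine
and survives nearly all testing until something delays one thread.]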
From: nmm1 on 18 Apr 2010 17:29

In article
<3782bf12-b3f5-4003-94a9-0299859358ed(a)y17g2000yqd.googlegroups.com>,
MitchAlsup <MitchAlsup(a)aol.com> wrote:
>On Apr 18, 1:15 pm, Andy "Krazy" Glew <ag-n...(a)patten-glew.net>
>wrote:
>
>> System code tends to have unpredictable branches, which hurt many
>> OOO machines.
>
>I think it is easier to think that system codes have so much inherent
>serialization that the effort applied in doing OoO is "for naught",
>and that these great big OoO machines degrade down to just about the
>same performance as their absolutely in-order cousins.
>
>It's a far bigger issue than simple branch mispredictability.
>Pointer chasing into poorly cached data structures is rampant;
>"dangerous" instructions are inherently serialized; and TLB
>translation success rates are poor.  Overall, there just is not that
>much ILP left in many of the paths through system codes.

That was the experience in the days of the System/370.  User code got
a factor of two better ILP than system code.

Regards,
Nick Maclaren.
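[For readers who want the pointer-chasing point spelled out, here is
an illustrative C fragment, invented for this summary rather than
taken from the thread.  The address of each load is the result of the
previous load, so no amount of out-of-order width can overlap the
cache misses:]

    /* Illustrative only: the classic serial pointer chase. */
    struct node {
        struct node *next;
        int          key;
    };

    static int chase(const struct node *p)
    {
        int sum = 0;
        while (p) {
            sum += p->key;    /* cheap ALU work                     */
            p = p->next;      /* serialising load: nothing further  */
        }                     /* down the chain can start until     */
        return sum;           /* this miss resolves                 */
    }

[If each node misses the cache, the loop runs at one miss latency per
node regardless of issue width, which is Alsup's point about there
being little ILP left to extract.]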
From: nmm1 on 18 Apr 2010 17:32

In article <8u3s97-9bt2.ln1(a)ntp.tmsw.no>,
Terje Mathisen <"terje.mathisen at tmsw.no"> wrote:
>nmm1(a)cam.ac.uk wrote:
>> Well, yes, but that's no different from any other choice.  As I
>> have posted before, I favour a heterogeneous design on-chip:
>>
>>     Essentially uninterruptible, user-mode only, out-of-order CPUs
>>     for applications etc.
>>     Interruptible, system-mode capable, in-order CPUs for the kernel
>>     and its daemons.
>
>This forces the OS to effectively become a message-passing system,
>since every single OS call would otherwise require a pair of
>migrations between the two types of CPU.
>
>I'm not saying this would be bad though, since actual data could
>still be passed as pointers...

Yup.  In my view, interrupts are doubleplus ungood - message passing
is good.  And, of course, the memory could be shared at some fairly
close cache level.

Regards,
Nick Maclaren.
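[A rough sketch of what Mathisen's pointer-passing OS call might look
like, assuming the two kinds of core share memory at some cache
level.  Every name below is invented for illustration; nothing here
comes from the thread or from any real kernel.]

    /* Hypothetical message format for an OS call between an
     * application core and a kernel core.  Bulk data never
     * migrates: only this small descriptor crosses over, and
     * 'buf' points into memory both cores can reach. */
    #include <stddef.h>
    #include <stdint.h>

    struct os_msg {
        uint32_t opcode;        /* which kernel service is wanted   */
        uint32_t flags;
        void    *buf;           /* payload stays put; only the      */
        size_t   len;           /* pointer and length are copied    */
        volatile uint32_t done; /* kernel core sets this on finish  */
        int32_t  status;        /* result posted by the kernel core */
    };

    /* The application core would enqueue this descriptor on a
     * shared ring and then either continue (asynchronous) or wait
     * on 'done', instead of taking a trap; the in-order kernel
     * core polls the ring and services requests in order. */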
From: Robert Myers on 18 Apr 2010 20:05
Brett Davis wrote:

> The Story of Mel, a Real Programmer
> http://www.cs.utah.edu/~elb/folklore/mel.html
>
> I am not him, my code is clear and readable.

I've hacked machine code.

You probably know of the proof of the Poincaré Conjecture and the
attempt to claim credit for it by filling in "missing" steps.
Grigory Perelman had a right to be annoyed.  If what was obvious to
him was not obvious to others, that was not his failing.

The same rules, I claim, do not apply to programming.  What you have
done, and why, should be obvious to anyone with enough competence to
read the syntax.  If it's not obvious from the code itself, then it
should be obvious from the comments.

Robert.