From: Nick Maclaren on 15 Aug 2008 12:56

In article <g84783$1ge$1(a)s1.news.oleane.net>, Jan Vorbrüggen
<Jan.Vorbrueggen(a)not-thomson.net> writes:
|>
|> > Oh, it was worse than that! After he had done the initial design
|> > (which was reasonable, if not excellent), he was elbowed out, and
|> > half of his design was thrown out to placate the god Benchmarketing.
|> >
|> > The aspect that I remember was that the GUI was brought back from
|> > where he had exiled it to the 'kernel' - and, as we all know, the
|> > GUIs are the source of all ills on modern systems :-(
|>
|> Yep - I think that was part of the 3.51 to 4.0 transition. As I
|> understand it, the thing was just too resource-hungry for the
|> hardware of the day to be marketable in that state.

As I heard it, that was as much an excuse as a reason. What I heard was
that it did perform like a dog, but that didn't distinguish it from any
of the other major releases. And that problem was temporary.

The other reason I heard was that the GUI (and other components?) were
so repulsive that moving all of their privileged actions to the other
side of an interface (ANY interface) was beyond the programmers. But
they didn't want to admit that, so they mounted an internal propaganda
campaign about the performance.

However, you know what such stories are like. I neither believe nor
disbelieve it.

Regards,
Nick Maclaren.
From: Dirk Bruere at NeoPax on 16 Aug 2008 22:30

Jan Panteltje wrote:
> On a sunny day (Sun, 10 Aug 2008 17:05:31 +0000) it happened ChrisQ
> <blackhole(a)devnull.com> wrote in <g7n75m$vi$1(a)aioe.org>:
>
>> Jan Panteltje wrote:
>>
>>> John Lennon:
>>>
>>> 'You know I am a dreamer' .... ' And I hope you join us someday'
>>>
>>> (well what I remember of it). You should REALLY try to program a Cell
>>> processor some day.
>>>
>>> Dunno what you have against programmers, there are programmaers who
>>> are amazingly clever with hardware resources. I dunno about NT and
>>> MS, but IIRC MS plucked programmers from unis, and sort of
>>> brainwashed them then.. the result we all know.
>>>
>> That's just the problem - programmers have been so good at hiding the
>> limitations of poorly designed hardware that the whole world thinks
>> that hardware must be perfect and needs no attention other than making
>> it go faster.
>>
>> If you look at some modern i/o device architectures, it's obvious the
>> hardware engineers never gave a second thought about how the thing would
>> be programmed efficiently...
>>
>> Chris (with embedded programmer hat on :-(
>
> Interesting.
> For me, I have a hardware background, but also software, the two
> came together with FPGA, when I wanted to implement DES as fast as possible.
> I did wind up with just a bunch of gates and 1 clock cycle, so no program :-)
> No loops (all unfolded in hardware).
> So, you need to define some boundary between hardware resources (that one
> used a lot of gates), and software resources, I think.

Unless you blur the boundary further by using on-the-fly reprogrammable gate arrays.

--
Dirk

http://www.transcendence.me.uk/ - Transcendence UK
http://www.theconsensus.org/ - A UK political party
http://www.onetribe.me.uk/wordpress/?cat=5 - Our podcasts on weird stuff
From: Michel Hack on 18 Aug 2008 15:02

On Aug 13, 5:52 pm, "Wilco Dijkstra" <Wilco.removethisDijks...(a)ntlworld.com> wrote:
> Btw Do you happen to know the reasoning behind signed left shifts being
> undefined while right shifts are implementation defined?

On some machines the high-order bit is shifted out, on others (e.g. S/370)
it remains unchanged: 0x80000001 << 1 can become 0x80000002 and not
0x00000002 in a 32-bit register. The S/370 way parallels the common
sign-propagation method of arithmetic right shifts: the sign does not change.
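The sign-propagating result can be had portably without relying on what '>>'
does to a negative value. A minimal sketch, assuming two's complement and a
shift count smaller than the width of int (the helper name asr is made up
for illustration):

    /* Arithmetic right shift without applying '>>' to a negative left
     * operand, whose result is implementation defined.  For v < 0, ~v is
     * non-negative, so (~v >> s) is fully defined; complementing again
     * restores the sign-filled, round-toward-minus-infinity result.
     */
    static int asr(int v, unsigned s)
    {
        return v < 0 ? ~(~v >> s) : v >> s;
    }

For example, asr(-5, 1) is -3 on any conforming implementation, which is
what a machine arithmetic-shift-right instruction produces.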
From: Wilco Dijkstra on 18 Aug 2008 17:46

"Nick Maclaren" <nmm1(a)cus.cam.ac.uk> wrote in message news:g81arl$6a0$1(a)gemini.csx.cam.ac.uk...
>
> In article <V4Uok.4995$Od3.4795(a)newsfe28.ams2>,
> "Wilco Dijkstra" <Wilco.removethisDijkstra(a)ntlworld.com> writes:
> |>
> |> I'd certainly be interested in the document. My email is above, just make
> |> the obvious edit.
>
> Sent.

Thanks, I've received it; I'll have a look at it soon (it's big...).

> |> > |> I bet that most code will compile and run without too much trouble.
> |> > |> C doesn't allow that much variation in targets. And the variation it
> |> > |> does allow (eg. one-complement) is not something sane CPU
> |> > |> designers would consider nowadays.
> |> >
> |> > The mind boggles. Have you READ the C standard?
> |>
> |> More than that. I've implemented it. Have you?
>
> Some of it, in an extremely hostile environment. However, that is a lot
> LESS than having written programs that get ported to radically different
> systems - especially ones that you haven't heard of when you wrote the
> code. And my code has been so ported, often without any changes needed.

My point is that such weird systems no longer get designed. The world has
standardized on two's complement, 8-bit char, 32-bit int etc., and that is
unlikely to change. Given that, there isn't much variation possible. Putting
in extra effort to allow for a theoretical system with a sign-magnitude
5-bit char or a 31-bit one's complement int is completely insane.

> |> It's only when you implement the standard you realise many of the issues are
> |> irrelevant in practice. Take sequence points for example. They are not even
> |> modelled by most compilers, so whatever ambiguities there are, they simply
> |> cannot become an issue.
>
> They are relied on, heavily, by ALL compilers that do any serious
> optimisation. That is why I have seen many problems caused by them,
> and one reason why HPC people still prefer Fortran.

It's only source-to-source optimizers that might need to consider these
issues, but these are very rare (we bought one of the few still available).
Most compilers, including the highly optimizing ones, do almost all
optimization at a far lower level. This not only avoids most of the issues
you're talking about, but it also ensures badly behaved programs are
correctly optimized, while well behaved programs are still optimized
aggressively.

> |> Similarly various standard pendantics are moaning
> |> about shifts not being portable, but they can never mention a compiler that
> |> fails to implement them as expected...
>
> Shifts are portable if you code them according to the rules, and don't
> rely on unspecified behaviour. I have used compilers that treated
> signed right shifts as unsigned, as well as ones that used only the
> bottom 5/6/8 bits of the shift value, and ones that raised a 'signal'
> on left shift overflow. There are good reasons for all of the
> constraints.
>
> No, I can't remember which, offhand, but they included the ones for
> the System/370 and Hitachi S-3600. But there were also some
> microprocessor ones - PA-RISC? Alpha?

S/370, Alpha and PA-RISC all support arithmetic right shifts. There is no
information available on the S-3600.

> |> Btw Do you happen to know the reasoning behind signed left shifts being
> |> undefined while right shifts are implementation defined.
>
> Signed left shifts are undefined only if they overflow; that is undefined
> because anything can happen (including the CPU stopping).
> Signed right shifts are only implementation defined for negative values;
> that is because they might be implemented as unsigned shifts.

No. The standard is quite explicit that any left shift of a negative value
is undefined, even if there is no overflow. This is an inconsistency, as
compilers change multiplies by a power of 2 into a left shift and vice
versa. There is no similar undefined behaviour for multiplies, however.

> |> It will work as long as the compiler supports a 32-bit type - which it will of
> |> course. But in the infinitesimal chance it doesn't, why couldn't one
> |> emulate a 32-bit type, just like 32-bit systems emulate 64-bit types?
>
> Because then you can't handle the 64-bit objects returned from the
> library or read in from files!

You're missing the point. A theoretical 64-bit CPU that only supports
64-bit operations could emulate support for 8-bit char, 16-bit short and
32-bit int. Without such emulation it would need 64-bit char, 128-bit
short/int, 256-bit int/long in order to support C. Alpha is proof this is
perfectly feasible: the early versions emulated 8/16-bit types in software
without too much overhead.

Once we agree that it is feasible to emulate types, it is reasonable to
mandate that each implementation supports the sized types.

Wilco
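To make the emulation argument concrete, here is a minimal sketch, assuming
a little-endian machine whose only memory operations are aligned 64-bit
loads and stores (memory is modelled as a plain uint64_t array and the
helper names are invented): an 8-bit access becomes a word access plus a
shift and mask, which is roughly what compilers generated on early Alpha
before the BWX byte/word extensions.

    #include <stddef.h>
    #include <stdint.h>

    /* Emulated 8-bit load: fetch the containing 64-bit word and shift the
     * wanted byte down.  'addr' is a byte address into the word array.
     */
    uint8_t load_u8(const uint64_t *mem, size_t addr)
    {
        uint64_t word  = mem[addr / 8];        /* aligned 64-bit load        */
        unsigned shift = (addr % 8) * 8;       /* byte offset, little-endian */
        return (uint8_t)(word >> shift);
    }

    /* Emulated 8-bit store: read-modify-write of the containing word. */
    void store_u8(uint64_t *mem, size_t addr, uint8_t value)
    {
        uint64_t word  = mem[addr / 8];
        unsigned shift = (addr % 8) * 8;
        word &= ~((uint64_t)0xff << shift);    /* clear the target byte */
        word |= (uint64_t)value << shift;      /* insert the new byte   */
        mem[addr / 8] = word;
    }

A compiler hiding these sequences behind ordinary char accesses gives such a
machine the 8-bit char and 16/32-bit types C expects, at the cost of a few
extra instructions per access.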
From: Wilco Dijkstra on 18 Aug 2008 18:17
"Terje Mathisen" <terje.mathisen(a)hda.hydro.com> wrote in message news:ibudnfv81sCstDnVnZ2dnUVZ8tHinZ2d(a)giganews.com... > Wilco Dijkstra wrote: >> "Terje Mathisen" <terje.mathisen(a)hda.hydro.com> wrote in message news:V92dnbsbmsAAST7VRVnyvwA(a)giganews.com... >>> How many ways can you define such a function? >>> >>> The only serious alternatives would be in the handling of negative-or-zero inputs or when rounding the actual fp >>> result to integer: >>> >>> Do you want the Floor(), i.e. truncate, Ceil() or Round_to_nearest_or_even()? >>> >>> Using the latest alternative could make it harder to come up with a perfect implementation, but otherwise it should >>> be trivial. >> >> It was a trivial routine, just floor(log2(x)), so just finding the top bit that is set. >> The mistakes were things like not handling zero, using signed rather than >> unsigned variables, looping forever for some inputs, returning the floor result + 1. >> >> Rather than just shifting the value right until it becomes zero, it created a mask >> and shifted it left until it was *larger* than the input (which is not going to work >> if you use a signed variable for it or if the input has bit 31 set etc). >> >> My version was something like: >> >> int log2_floor(unsigned x) >> { >> int n = -1; >> for ( ; x != 0; x >>= 1) >> n++; >> return n; >> } > > <BG> > > That is _identical_ to the code I originally wrote as part of my post, but then deleted as it didn't really add to my > argument. :-) > > There are of course many possible alternative methods, including inline asm to use a hardware bitscan opcode. > > Here's a possibly faster version: > > int log2_floor(unsigned x) > { > int n = -1; > while (x >= 0x10000) { > n += 16; > x >>= 16; > } > if (x >= 0x100) { > n += 8; > x >>= 8; > } > if (x >= 0x10) { > n += 4; > x >>= 4; > } > /* At this point x has been reduced to the 0-15 range, use a > * register-internal lookup table: > */ > uint32_t lookup_table = 0xffffaa50; > int lookup = (int) (lookup_table >> (x+x)) & 3; > > return n + lookup; > } I like the lookup in a register method. I once did something like this: uint8 table[32] = { ... }; int log2_floor(unsigned x) { if (x == 0) return -1; x |= x >> 1; x |= x >> 2; x |= x >> 4; x |= x >> 8; x |= x >> 16; x *= 0x... // multiply with magic constant return table[x >> 27]; // index into table } The shifted OR's force all bits after the leading one to be set too. This reduces the number of possibilities to just 32. The multiply then shifts the magic constant by N bits. It is chosen so that the top 5 bits end up containing a unique bitpattern for each of the 32 possible values of x. It took 10 instructions plus 32 bytes of table. Placing the table immediately after the return instruction allowed the use of the LDRB r0,[PC,r0,LSR #27] instruction, so it didn't even need an instruction to create the table address... Wilco |