From: Joel Koltner on 19 Aug 2008 00:29

"Andrew Reilly" <andrew-newspost(a)areilly.bpc-users.org> wrote in message
news:6guvm1Fgfmr3U1(a)mid.individual.net...
> Sure, they're all fairly pure 16-bitters. I wouldn't call them new
> architectures, though, which is what I was getting at.

Ah, gotcha.

> For "modern", besides the C6000 series, I'd include the ADI/Intel
> Blackfin and the VLSI ZSP series.

I've looked a little at the Blackfin for an upcoming project, and it looks
pretty nice as far as I can tell.

> Not the 34010? I don't remember hearing about an 020 version.

Yep, there was a 34020 (the board I used paired a 34020 GSP with a 320C40
DSP... it wasn't uncommon that we confused the chip numbers!).

> Apart from the 8-times worse fan-out of the multiplexers, which might
> limit clock speed, I don't really see how this would be much worse than
> unaligned byte-addressed operations.

I don't think it was any worse than that, but still a significant penalty
for people looking for the ultimate in speed. (The thing ran at all of...
40MHz, perhaps? ...seemed fast at the time...)

> I don't think that there are too
> many people interested in single-bit-deep graphics systems any more,
> though.

Yep, at least not by the time you hit QVGA or higher resolutions.

---Joel
From: Terje Mathisen on 19 Aug 2008 03:07

Wilco Dijkstra wrote:
> I like the lookup in a register method. I once did something like this:
>
> uint8 table[32] = { ... };
>
> int log2_floor(unsigned x)
> {
>     if (x == 0)
>         return -1;
>     x |= x >> 1;
>     x |= x >> 2;
>     x |= x >> 4;
>     x |= x >> 8;
>     x |= x >> 16;
>     x *= 0x...;             // multiply with magic constant
>     return table[x >> 27];  // index into table
> }

That's one of the classic HAKMEM's isn't it?

It is indeed nice. :-)

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
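For readers who want to try the quoted trick, here is a minimal, compilable
sketch. The post elides the table contents and the magic constant; the
values below are the widely published de Bruijn-style multiplier and table
from the public "Bit Twiddling Hacks" collection, filled in as an
assumption rather than as whatever Wilco actually used.

    #include <stdint.h>
    #include <stdio.h>

    /* Table and multiplier as published in "Bit Twiddling Hacks";
       they may differ from the originals in the quoted post. */
    static const uint8_t table[32] = {
         0,  9,  1, 10, 13, 21,  2, 29,
        11, 14, 16, 18, 22, 25,  3, 30,
         8, 12, 20, 28, 15, 17, 24,  7,
        19, 27, 23,  6, 26,  5,  4, 31
    };

    int log2_floor(uint32_t x)
    {
        if (x == 0)
            return -1;              /* log2(0) is undefined */
        x |= x >> 1;                /* smear the top set bit downwards... */
        x |= x >> 2;
        x |= x >> 4;
        x |= x >> 8;
        x |= x >> 16;               /* ...x is now 2^(k+1) - 1 */
        x *= UINT32_C(0x07C4ACDD);  /* de Bruijn-style multiply */
        return table[x >> 27];      /* top 5 bits index the table */
    }

    int main(void)
    {
        printf("%d %d %d\n",
               log2_floor(1), log2_floor(1000), log2_floor(0x80000000u));
        /* expected output: 0 9 31 */
        return 0;
    }

The charm of the method, as Terje notes, is that everything stays in
registers: apart from the zero check there are no branches, just the
smearing, one multiply and one table load.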
From: Nick Maclaren on 19 Aug 2008 04:40

I am getting tired of simply pointing out factual errors, and this
will be my last on this sub-thread.

In article <T4mqk.62353$8y1.30053(a)newsfe18.ams2>,
"Wilco Dijkstra" <Wilco.removethisDijkstra(a)ntlworld.com> writes:
|>
|> > |> It's only when you implement the standard you realise many of the issues are
|> > |> irrelevant in practice. Take sequence points for example. They are not even
|> > |> modelled by most compilers, so whatever ambiguities there are, they simply
|> > |> cannot become an issue.
|> >
|> > They are relied on, heavily, by ALL compilers that do any serious
|> > optimisation. That is why I have seen many problems caused by them,
|> > and one reason why HPC people still prefer Fortran.
|>
|> It's only source-to-source optimizers that might need to consider these
|> issues, but these are very rare (we bought one of the few still available).
|>
|> Most compilers, including the highly optimizing ones, do almost all
|> optimization at a far lower level. This not only avoids most of the issues
|> you're talking about, but it also ensures badly behaved programs are
|> correctly optimized, while well behaved programs are still optimized
|> aggressively.

I spent 10 years managing a wide range of HPC machines (and have advised
on such uses for much longer). You are wrong in all respects, as you
can find out if you look. Try Sun's and IBM's compiler documentation,
for a start, and most of the others (though I can't now remember which).

Your claims that it isn't a problem would make anyone with significant
HPC experience laugh hollowly. Few other people use aggressive
optimisation on whole, complicated programs. Even I don't, for most
code.
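The sequence-point dispute above is abstract, so here is a minimal
illustration of my own (not from the thread) of the kind of ambiguity an
optimizer is entitled to exploit.

    #include <stdio.h>

    int main(void)
    {
        int i = 1;
        int a[4] = {0, 0, 0, 0};

        /* a[i] = i++;                                                    */
        /* Undefined in C90/C99: i is modified and also read for another  */
        /* purpose between two sequence points, so an optimizer may       */
        /* evaluate the subexpressions in either order, or assume the     */
        /* situation never arises at all.                                 */

        /* Well-defined rewrite: the statement boundary is a sequence     */
        /* point, so the old value of i is pinned down before the         */
        /* increment.                                                     */
        a[i] = i;
        i++;

        printf("%d %d\n", i, a[1]);   /* prints: 2 1 */
        return 0;
    }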
|> > |> Similarly various standard pedants are moaning
|> > |> about shifts not being portable, but they can never mention a compiler that
|> > |> fails to implement them as expected...
|> >
|> > Shifts are portable if you code them according to the rules, and don't
|> > rely on unspecified behaviour. I have used compilers that treated
|> > signed right shifts as unsigned, as well as ones that used only the
|> > bottom 5/6/8 bits of the shift value, and ones that raised a 'signal'
|> > on left shift overflow. There are good reasons for all of the
|> > constraints.
|> >
|> > No, I can't remember which, offhand, but they included the ones for
|> > the System/370 and Hitachi S-3600. But there were also some
|> > microprocessor ones - PA-RISC? Alpha?
|>
|> S370, Alpha and PA-RISC all support arithmetic right shifts. There
|> is no information available on the S-3600.

All or almost all of those use only the bottom few bits of the shift.
I can't remember the recent systems that had only unsigned shifts, but
they may have been in one of the various SIMD extensions to various
architectures.

|> > Signed left shifts are undefined only if they overflow; that is undefined
|> > because anything can happen (including the CPU stopping). Signed right
|> > shifts are only implementation defined for negative values; that is
|> > because they might be implemented as unsigned shifts.
|>
|> No. The standard is quite explicit that any left shift of a negative value
|> is undefined, even if there is no overflow. This is an inconsistency
|> as compilers change multiplies by a power of 2 into a left shift and vice
|> versa. There is no similar undefined behaviour for multiplies however.

From the standard:

    [#4] The result of E1 << E2 is E1 left-shifted E2 bit
    positions; vacated bits are filled with zeros. If E1 has an
    unsigned type, the value of the result is E1 × 2^E2, reduced
    modulo one more than the maximum value representable in the
    result type. If E1 has a signed type and nonnegative value,
    and E1 × 2^E2 is representable in the result type, then that
    is the resulting value; otherwise, the behavior is undefined.

[ E1 × 2^E2 means E1 times 2 to the power E2 and got mangled in the
text version. ]

|> Once we agree that it is feasible to emulate types, it is reasonable to
|> mandate that each implementation supports the sized types.

That is clearly your opinion. Almost all of those of us with experience
of when that was claimed before for the previous 'universal' standard
disagree.

Regards,
Nick Maclaren.
From: Nick Maclaren on 19 Aug 2008 04:58

In article <222180a4-a9d1-48c5-94ae-e8ae643b1a6a(a)v57g2000hse.googlegroups.com>,
already5chosen(a)yahoo.com writes:
|>
|> Byte addressability is still uncommon in DSP world. And no, C
|> compilers for DSPs do not emulate char in a manner that you suggested
|> below. They simply treat char and short as the same thing, on 32-bit
|> systems char, short and long are all the same. I am pretty sure that
|> what they do is in full compliance with the C standard.

Well, it is and it isn't :-( There was a heated debate on SC22WG14,
both in C89 and C99, where the UK wanted to get the standard made
self-consistent. We failed.

The current situation is that it is in full compliance for a
free-standing compiler, but not really for a hosted one (think EOF).
This was claimed not to matter, as all DSP compilers are free-standing!

|> > Putting in extra effort to allow for a theoretical system with
|> > sign-magnitude 5-bit char or a 31-bit ones'-complement int is
|> > completely insane.
|>
|> Agreed

However, allowing for ones with 16- or 32-bit chars, or signed magnitude
integers is not. The former is already happening, and there are active,
well-supported attempts to introduce the latter (think IEEE 754R).
Will they ever succeed? Dunno.

|> It seems you overlooked the main point of Nick's concern - sized types
|> prevent automagical forward compatibility of the source code with
|> larger problems on bigger machines.

Precisely.

Regards,
Nick Maclaren.
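A sketch of the forward-compatibility point (hypothetical functions of my
own, not from the thread): nailing the width down in the source keeps
yesterday's limits even on a bigger machine, while the size-following
types let the same code scale without edits.

    #include <stddef.h>
    #include <stdint.h>

    /* Capped: no matter how much memory the machine has, element counts
       above 2^31 - 1 overflow, because the width is fixed in the source. */
    int32_t count_matches_fixed(const int32_t *v, int32_t n, int32_t key)
    {
        int32_t i, count = 0;
        for (i = 0; i < n; i++)
            if (v[i] == key)
                count++;
        return count;
    }

    /* Scales: size_t (or ptrdiff_t, long, ...) tracks what the platform
       can actually address, so the same source handles larger problems
       on larger machines. */
    size_t count_matches_scaled(const int32_t *v, size_t n, int32_t key)
    {
        size_t i, count = 0;
        for (i = 0; i < n; i++)
            if (v[i] == key)
                count++;
        return count;
    }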
From: Wilco Dijkstra on 19 Aug 2008 05:52
"Nick Maclaren" <nmm1(a)cus.cam.ac.uk> wrote in message news:g8e0u6$kb$1(a)gemini.csx.cam.ac.uk... > > > I am getting tired of simply pointing out factual errors, and this > will be my last on this sub-thread. Which factual errors? :-) > In article <T4mqk.62353$8y1.30053(a)newsfe18.ams2>, > "Wilco Dijkstra" <Wilco.removethisDijkstra(a)ntlworld.com> writes: > |> Most compilers, including the highly optimizing ones, do almost all > |> optimization at a far lower level. This not only avoids most of the issues > |> you're talking about, but it also ensures badly behaved programs are > |> correctly optimized, while well behaved programs are still optimized > |> aggressively. > > I spent 10 years managing a wide range of HPC machines (and have advised > on such uses for much longer). You are wrong in all respects, as you > can find out if you look. Try Sun's and IBM's compiler documentation, > for a start, and most of the others (though I can't now remember which). > > Your claims that it isn't a problem would make anyone with significant > HPC experience laugh hollowly. Few other people use aggressive > optimisation on whole, complicated programs. Even I don't, for most > code. And I laugh in their face about their claims of creating a "highly optimizing compiler" that generates incorrect code! Any idiot can write a highly optimizing compiler if it doesn't need to be correct... I know that many of the issues are caused by optimizations originally written for other languages (eg. Fortran has pretty loose aliasing rules), but which require more checks to be safe in C. My point is that compilers have to compile existing code correctly - even if it is written badly. It isn't hard to recognise nasty cases, for example it's common to do *(T*)&var to convert between integer and floating point. Various compilers treat this as an idiom and use direct int<->FP moves which are more efficient. So this particular case wouldn't even show up when doing type based alias analysis. > |> S370, Alpha and PA-RISC all support arithmetic right shifts. There > |> is no information available on the S-3600. > > All or almost all of those use only the bottom few bits of the shift. That is typical of all implementations, but it is not a big issue, and the standard is correct in this respect. > I can't remember the recent systems that had only unsigned shifts, but > they may have been in one or of the various SIMD extensions to various > architectures. Even if you only have unsigned shifts, you can still emulate arithmetic ones. My point is there is no excuse for getting them wrong, even if your name is Cray and you can improve cycle time by not supporting them in hardware. > |> > Signed left shifts are undefined only if they overflow; that is undefined > |> > because anything can happen (including the CPU stopping). Signed right > |> > shifts are only implementation defined for negative values; that is > |> > because they might be implemented as unsigned shifts. > |> > |> No. The standard is quite explicit that any left shift of a negative value > |> is undefined, even if they there is no overflow. This is an inconsistency > |> as compilers change multiplies by a power of 2 into a left shift and visa > |> versa. There is no similar undefined behaviour for multiplies however. > > From the standard: > > [#4] The result of E1 << E2 is E1 left-shifted E2 bit > positions; vacated bits are filled with zeros. 
> |> > Signed left shifts are undefined only if they overflow; that is undefined
> |> > because anything can happen (including the CPU stopping). Signed right
> |> > shifts are only implementation defined for negative values; that is
> |> > because they might be implemented as unsigned shifts.
> |>
> |> No. The standard is quite explicit that any left shift of a negative value
> |> is undefined, even if there is no overflow. This is an inconsistency
> |> as compilers change multiplies by a power of 2 into a left shift and vice
> |> versa. There is no similar undefined behaviour for multiplies however.
>
> From the standard:
>
>     [#4] The result of E1 << E2 is E1 left-shifted E2 bit
>     positions; vacated bits are filled with zeros. If E1 has an
>     unsigned type, the value of the result is E1 × 2^E2, reduced
>     modulo one more than the maximum value representable in the
>     result type. If E1 has a signed type and nonnegative value,
>     and E1 × 2^E2 is representable in the result type, then that
>     is the resulting value; otherwise, the behavior is undefined.

Exactly my point. It clearly states that ALL left shifts of negative
values are undefined, EVEN if they would be representable. The "and
nonnegative value" excludes negative values!

The correct wording should be something like: "If E1 has a signed type
and E1 × 2^E2 is representable in the result type, then that is the
resulting value; otherwise, the behavior is implementation defined."

Wilco
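To pin down the asymmetry being argued about, a small illustration of my
own (not from the thread): under the C99 wording quoted above, the
multiply is defined whenever the result fits, the shift of a negative
value is undefined even when the same result would fit, and routing the
shift through unsigned arithmetic yields implementation-defined behaviour
of the kind Wilco suggests the wording should have specified (it wraps on
the usual two's complement targets).

    #include <stdio.h>

    int main(void)
    {
        int x = -3;

        int a = x * 4;                    /* well-defined: -12 fits in int  */

        /* int b = x << 2;                   undefined under the quoted C99
                                             wording: E1 is negative, even
                                             though -12 is representable    */

        int c = (int)((unsigned)x << 2);  /* shift done on the unsigned
                                             representation; converting
                                             back is implementation-defined
                                             and gives -12 on two's
                                             complement targets             */

        printf("%d %d\n", a, c);          /* prints: -12 -12 */
        return 0;
    }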