From: jgd on
In article <hr527d$hed$5(a)usenet01.boi.hp.com>, rick.jones2(a)hp.com (Rick
Jones) wrote:
> Anne & Lynn Wheeler <lynn(a)garlic.com> wrote:
> "(The Itanium chips had an x86 emulator, you will remember, and also
> emulated some PA-RISC instructions that HP-UX needed)"
>
> I have never been a HW guy, but I don't recall there being any sort of
> PA-RISC instruction emulation in Itanium chips.

There isn't. There are a few odd-looking instructions - details long
faded from memory, since I never hit compiler problems that used them -
that seem to have effects that map neatly onto unusual PA-RISC
instructions.

> There is the Aries PA-RISC emulator *SW* available with HP-UX to
> allow customers to run PA-RISC binaries.

That's a binary translator, and the PA-RISC-like Itanium instructions
are apparently provided mainly for its use.

Incidentally, the x86 hardware emulator on early Itanium chips was
largely useless. It ran at about one-third the speed of the same chip
running in native Itanium mode, and was thus slower than the binary
translator that turned x86 code into Itanium code and ran that, which
managed about half the throughput of the same chip running native
Itanium code. This produced a big problem with selling Itania to
corporate power users: it ran their existing software a lot slower than
they were used to.

The poor performance was largely because the x86 hardware emulator was
in-order, and getting good x86 performance already required getting
fancier than that. It was removed from, I think, McKinley onwards,
because the x86-to-Itanium binary translator was faster, and the chip
area had better uses.

--
John Dallman, jgd(a)cix.co.uk, HTML mail is treated as probable spam.
From: Michael S on
On Apr 29, 12:34 am, j...(a)cix.compulink.co.uk wrote:
> In article <hr527d$he...(a)usenet01.boi.hp.com>, rick.jon...(a)hp.com (Rick
> Jones) wrote:
> [...]
> It was removed from, I think, McKinley onwards,

The x86 HW is present in both McKinley and Madison. Montecito is the
first Itanium processor that doesn't have it.
From: Rob Warnock on
<nmm1(a)cam.ac.uk> wrote:
+---------------
| However, it is used as an argument to avoid considering (say) the
| interrupt-free architecture that I have posted on this newsgroup.
| That would be essentially transparent to 99% of applications, and
| need mainly reorganisation of the kernel and device drivers (not
| even a complete rewrite).
|
| That is a classic example of an idea that was thought of 40 years
| ago (probably 50+), but could not have been implemented then,
| because the technology was inappropriate.
+---------------

Actually, it was *quite* appropriate for certain classes of
problems/systems, even way back then!! For example, the "WSCHED"
operating system kernel I wrote for DCA[1] circa 1972 [that's *almost*
40 years ago! ;-} ] for the SmartMux/3xx series networking nodes ran
with interrupts *off*![2] And that included the "applications" [written
by others] which ran on top of that kernel [the various device drivers,
data-link protocols, multiplexors, routers, terminal protocol converters
(ASCII-to-2741, etc), interactive logins, etc.].

Of course, we *did* have to practice a rather stringent discipline when
writing the code: we had to insert an "@SERVICE" macro every 200 cycles
or less, or once every loop iteration (whichever came first). The @SERVICE
macro was a very low-overhead way[2] to test whether the hardware needed
service and to call the scheduler if needed.

But if you had a system with a "trusted" system compiler [such as on
the Burroughs 5500 et seq], you could have the compiler automatically
insert the equivalent of @SERVICE at appropriate "safe" points to poll
for the need to task-switch, much as many Lisp and Java compilers
already do to handle Unix signals safely.
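
To make the discipline concrete, here is a rough C rendering of the
idea (WSCHED itself was PDP-8 assembly, so every identifier below is
illustrative, not the real code):

  /* A sketch of cooperative, interrupt-free scheduling in the style
     of @SERVICE.  hardware_needs_service() and scheduler() stand in
     for whatever the real kernel provided. */
  #include <stdbool.h>

  extern bool hardware_needs_service(void);  /* cheap poll of device status */
  extern void scheduler(void);               /* the micro-task scheduler */

  /* The @SERVICE discipline: call this at least once per loop
     iteration, or every 200 cycles of straight-line code,
     whichever comes first. */
  static inline void SERVICE(void)
  {
      if (hardware_needs_service())
          scheduler();
  }

  void copy_buffer(char *dst, const char *src, int n)
  {
      for (int i = 0; i < n; i++) {
          dst[i] = src[i];
          SERVICE();  /* once per iteration bounds worst-case latency */
      }
  }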

+---------------
| But it will not get reconsidered because it is heretical to the
| great god Compatibility.
+---------------

Not just the Great God Compatibility but also the Great God Feature!
There have been any number of times since the WSCHED days when I have
had to fight management or other programmers to *NOT* use interrupts
in some bit of embedded kit. "But you *have* to use interrupts! The
hardware has them; that's what they're there for!!" In almost all cases,
however, I have managed to show that coding the kernel & applications
*without* interrupts would result in a simpler, faster, more reliable
implementation. This has included applications on AMD 29k, MIPS R4k,
and Intel XScale (ARM).

+---------------
| The fact that it might well deliver a fairly painless factor of two
| in performance, RAS and reduction of design costs is irrelevant.
+---------------

Yup.

+---------------
| Similarly, reintroducing a capability design (rather than the half
| baked hacks that have been put into some Unices) would be far less
| painful than is often made out, and could easily deliver a massive
| improvement in RAS - if done properly, perhaps a factor of 10 in
| the short term, 100 in the medium and thousands in the long.
+---------------


Heh. That's a whole 'nother story... ;-} [See the "Hydra/C.mmp" book.]


-Rob

[1] Digital Communications Associates in Atlanta, not the ".mil" agency.

[2] O.k., I'll 'fess up. In the original version of WSCHED that ran
on a DEC PDP-8/e, the @SERVICE macro *did* turn on interrupts, but
only for one instruction. @SERVICE was defined as "ION; CLA; IOFF",
that is, turn on interrupts, clear the accumulator, and turn interrupts
off again. It was less code and slightly faster than doing the I/O
instruction that tested for any interrupts pending and then doing
a PDP-8 "cross-field" subroutine call to the micro-task scheduler.
That is, the totally pure interrupts-off version *worked*, and was
semantically indistinguishable from the "ION; CLA; IOFF" hack, except
that the latter was smaller/faster. So we used it. [So sue us. ;-} ]

-----
Rob Warnock <rpw3(a)rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607

From: Terje Mathisen on
Anton Ertl wrote:
> Terje Mathisen<"terje.mathisen at tmsw.no"> writes:
>> This means that any app which blindly writes the entire chain when
>> updating would only need to check the start of the chain.
>
> You lost me completely. The ALAT is about loads. If there is a chain
> of writes, none of them would have a corresponding check instruction.
> But I guess I misunderstand what you are trying to say. Could you
> give an example?

I'm bad at explaining this, sorry.

What I'm trying to say is that if you have code that creates a new
structure and links it into some sort of queue/stack/pipe, and another
process which consumes these structures/nodes, the ABA problem says that
you can get into a situation where a node is deleted and another node
is created, and they end up at the same memory location.

In this case, just verifying that the pointer/address is the same as when
you first checked it is not sufficient to verify that it hasn't been
modified in the meantime, right?

Using the ALAT, any such modification/restoration would be detected, just
as if you had used LL-SC on that location.
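
For anyone who hasn't seen ABA spelled out, here is a minimal C sketch
of the failure, using a plain compare-and-swap on a stack head (all of
the names are illustrative, not from any real code):

  #include <stdatomic.h>
  #include <stddef.h>

  struct node {
      struct node *next;
      int payload;
  };

  static _Atomic(struct node *) head;

  /* Pop the top node.  Between loading `old` and the CAS, another
     thread may pop `old`, free it, and push a *new* node that lands
     at the same address.  The CAS still succeeds, because the bit
     pattern of the pointer matches, but `next` may be stale -- that
     is the ABA problem.  LL-SC (or the ALAT) would instead fail on
     *any* intervening store to the location. */
  struct node *pop(void)
  {
      struct node *old, *next;
      do {
          old = atomic_load(&head);
          if (old == NULL)
              return NULL;
          next = old->next;   /* may be stale by the time the CAS runs */
      } while (!atomic_compare_exchange_weak(&head, &old, next));
      return old;
  }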

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
From: MitchAlsup on
On Apr 28, 10:37 pm, Terje Mathisen <"terje.mathisen at tmsw.no">
wrote:
> Anton Ertl wrote:
> > Terje Mathisen<"terje.mathisen at tmsw.no">  writes:
> >> This means that any app which blindly writes the entire chain when
> >> updating would only need to check the start of the chain.
>
> > You lost me completely.  The ALAT is about loads.  If there is a chain
> > of writes, none of them would have a corresponding check instruction.
> > But I guess I misunderstand what you are trying to say.  Could you
> > give an example?
>
> I'm bad at explaining this, sorry.
>
> What I'm trying to say is that if you have code that creates a new
> structure and links it into some sort of queue/stack/pipe, and another
> process which consumes these structures/nodes, the ABA problem says that
> you can get into a situation where a node is deleted and another node
> created, and they end up in the same memory location:

Perhaps the original context would help.

IBM 370 MVS had a long-running program that would migrate seldom-used
files from disk to tape. One day this program was running and decided
that it would migrate a file from disk to tape. The program had
begun the critical region (read the pointers) and was interrupted just
before the DCAS instruction*. The program did not regain control until
a month later. Much had gone on in the system and memory in the
meantime, and just by happenstance, the bit patterns in the concurrent
data structure linkage locations happened to be the same as when the
program was interrupted. The return of control caused the DCAS
instruction to execute, and PRESTO, the data matched! The program then
started to damage the new disk image based on its old view of the data.
The OS died, and a significant portion of the disk image needed to be
recovered.

Just because the bit pattern matches does not mean that the concurrent
data structure is the same as when you last took a look a few
instructions ago! Those few instructions could be executed WEEKS
apart!

DCAS is supposed to be watching for interference of any kind; however,
matching bits is the only check it has--which leads to the need for
cursors, timestamps and the like in order to observe the
interference.
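
The usual form of the cursor/timestamp fix is to pair the pointer with
a generation counter that is bumped on every update, so a recycled
address no longer compares equal. A rough C sketch (all names here are
illustrative):

  #include <stdatomic.h>
  #include <stdint.h>

  /* Pack a pool index and a generation tag into one 64-bit word so a
     single-width CAS covers both. */
  struct tagged {
      uint32_t index;   /* index into a node pool, standing in for a pointer */
      uint32_t tag;     /* bumped on every update; defeats ABA */
  };

  static _Atomic uint64_t head;  /* a struct tagged, packed */

  static uint64_t pack(struct tagged t)
  {
      return ((uint64_t)t.tag << 32) | t.index;
  }

  static struct tagged unpack(uint64_t v)
  {
      struct tagged t = { (uint32_t)v, (uint32_t)(v >> 32) };
      return t;
  }

  /* Swing the head to `new_index`.  The tag increment means a later
     reuse of the same index cannot masquerade as "unchanged". */
  int update_head(uint32_t new_index)
  {
      uint64_t old = atomic_load(&head);
      struct tagged t = unpack(old);
      struct tagged n = { new_index, t.tag + 1 };
      return atomic_compare_exchange_strong(&head, &old, pack(n));
  }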

In ASF we explicitly watch for interference, and all transfers of
control out of the current context are considered interference.

Mitch

(*) I don't remember the mnemonic IBM used that is the equivalent of
DCAS