From: Terje Mathisen on 15 Dec 2006 15:40

Jan Vorbrüggen wrote:
>> It didn't even help to manually turn on the TCP_NODELAY (exact
>> spelling?) flag, because Microsoft in their infinite wisdom had
>> decided they knew better, and simply disregarded that flag if set. :-(
>
> The spelling is correct, and I believe that in this particular case,
> Microsoft have in the mean time seen the light and no longer disregard
> the setting. There are too many cases where without it, the application
> is functionally broken because latency is too high.
>
> Isn't the 200 ms also a really constant parameter that isn't subject to
> negotiation between the partners? I wonder whether the designers ever
> heard of the concept of scaling...

The 200 ms Nagle delay is indeed one of the "constants" of the TCP/IP
universe, dating back to when a fast typist would generate at least 5-10
keystrokes/second, i.e. 100-200 ms between keys: in those days a 200 ms gap
was a reliable sign that the current packet should be sent without waiting
for more terminal-type input data.

What really happens these days is that every single vendor has a big set of
(semi-)proprietary heuristics which they use to tune their runtime
performance, and every once in a while things go really bad because you
happen to hit slightly outside the expected envelope.

I.e. moving a server 150 km away increased performance by an order of
magnitude, because 4 ms ping times were inside the tuning range, while a
Gbit switch latency well below a millisecond was fast enough that we
triggered those 200 ms trainwrecks. :-(

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
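A minimal sketch of the flag under discussion: disabling Nagle's algorithm
on a TCP socket via TCP_NODELAY. Python is used for brevity; the host,
port, and payload are placeholder assumptions, not anything from the
thread.

    import socket

    # Hypothetical request/response client: host, port and payload are
    # placeholders.  TCP_NODELAY disables Nagle's algorithm so small
    # writes are sent immediately instead of being held back until the
    # previous packet has been acknowledged.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    sock.connect(("server.example.com", 9000))

    sock.sendall(b"small request")   # goes out at once, no 200 ms stall
    reply = sock.recv(4096)
    sock.close()

Whether a given vendor's stack actually honors the option is, as the thread
notes, a separate question.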
From: Terje Mathisen on 15 Dec 2006 15:48

Nick Maclaren wrote:
> In article <krq854-t9i.ln1(a)osl016lin.hda.hydro.com>,
> Terje Mathisen <terje.mathisen(a)hda.hydro.com> writes:
> |>
> |> Indeed. I have seen a few burn marks from the friction between
> |> independent implementations:
> |>
> |> It used to be that with AIX on one end and Windows on the other, i.e.
> |> a very common situation, you could (depending upon average request
> |> size and ping time) get Nagle-style 200 ms delays between every pair
> |> of packets in either or both directions.
> |>
> |> It didn't even help to manually turn on the TCP_NODELAY (exact
> |> spelling?) flag, because Microsoft in their infinite wisdom had
> |> decided they knew better, and simply disregarded that flag if set. :-(
> |>
> |> The result did "work", it just took 2-3 orders of magnitude more time
> |> than really needed.
>
> One of the two directions between SunOS and HP-UX was like that: one
> had a low-latency stack and used very small windows; the other relied
> on large windows for performance.

We had 64 to 256 KB windows in both directions, but that really didn't
matter at all: the real culprit was an application that did several really
bad things:

a) It used machine-generated, very much un-optimized, SQL statements to
store several MBs of updates to a DB table.

b) Each UPDATE ... statement ended up a little over 2 KB long (to write
less than 100 bytes of data). This had to be split into two packets, one
full 1514-byte packet and a 500-700 byte tail end.

c) Each UPDATE sent a single record and waited for an SQL*Net ACK before
sending the next! I.e. a 2.5 KB window would have been plenty, since there
was absolutely no possibility of data streaming.

It still more or less worked as long as the network latency was on the
order of a millisecond or more: much longer than that, and the 100K to 200K
updates would take several minutes. Having the application on the same
switch as the DB resulted in 30-45 minutes for the same update, which
should have taken less than a second with slightly less braindamaged
logic. :-(

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
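For illustration, a hedged sketch of the contrast described above, written
against the generic Python DB-API; the connection `conn`, the table
`updates_tbl`, and the column names are hypothetical stand-ins, not the
application from the post.

    # Assumes an already-open DB-API 2.0 connection `conn` and a table
    # updates_tbl(id, payload); both are hypothetical stand-ins.
    rows = [("payload-%06d" % i, i) for i in range(100000)]
    cur = conn.cursor()

    # The pattern described above: one UPDATE, one network round trip,
    # one ACK per record.  Latency-bound: 100K records at even a few ms
    # each is minutes, and at 200 ms per stall it is hopeless.
    #
    #   for payload, key in rows:
    #       cur.execute("UPDATE updates_tbl SET payload = ? WHERE id = ?",
    #                   (payload, key))
    #   conn.commit()

    # Batched alternative: ship the statements in bulk and commit once,
    # so the per-record network latency mostly disappears.
    cur.executemany("UPDATE updates_tbl SET payload = ? WHERE id = ?", rows)
    conn.commit()

The point is the protocol pattern rather than the driver: one statement
plus one acknowledgement per record makes the job latency-bound, while
batching (or at least pipelining) removes most of the per-record round
trips.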
From: sylvek on 15 Dec 2006 18:15

"Jan Vorbrüggen" <jvorbrueggen(a)not-mediasec.de> wrote:
> Blind testing is the only way to exclude perceptual effects due to prior
> knowledge.

This is true, although the real gold standard is "double blind", where
neither the test subject nor the test proctor knows the identity of the
device under test.

The high-end audio industry is full of people trying to game the testing
protocol. Try searching for "Bendini ultraclarifier" and "Tice clock" for
two prominent examples.

Back to Kyle and the HardOCP "reviews": I actually applaud his approach of
injecting entertainment into the rather boring field of technology reviews.
I still laugh remembering his review of one of the Pentium processors where
he unboxed it, had a "private moment" with it, and then had to wipe it off
with a Kleenex ;-). As entertainment this was a good belly laugh, if
somewhat low-brow. A scientific experiment it wasn't.

I have nothing against people pimping hobbyist electronics, especially if
they do it with flair. But the unfortunate side effect is that such
industries tend to attract pompous bores with no sense of humor. And that
is such a shame.

I think that the whole "gaming PC" phenomenon is becoming a copy of the
"high-end audio" niche, with the same business models and the same
marketing strategies.

S
From: Chris Thomasson on 19 Dec 2006 02:26

<rohit.nadig(a)gmail.com> wrote in message
news:1165876041.056672.101830(a)16g2000cwy.googlegroups.com...
> Hello Comrades,
>
> I propose the following Premise:
> It seems to me that much of the features that would benefit the average
> user of a computer could be better implemented in hardware than in
> software.

Hey! Give us software guys a break, will ya? ;^)

Seriously though, for a moment... There are actually "some algorithms" that
can be "implemented in a highly efficient manner, directly in *software*".

IMHO, the hardware guys can move on to better things "--IF--" the software
world actually invents something that performs so well on your existing
hardware that it basically renders a direct hardware-based implementation
meaningless...

For instance, a distributed message-passing algorithm (e.g., 100%
compatible with the Cell BE) that exhibits simply excellent scalability,
throughput and overall performance characteristics can be implemented, in
software, right now.

So, if a software implementation of an algorithm 'X' can exhibit virtually
zero overhead... why should the hardware guys worry about engraving
algorithm 'X' in silicon? They can move on to better things... No?

<thoughts that I should keep to myself>
-- You know, us software guys should be consultants to the hardware guys
whenever anything to do with cache coherency/synchronization issues
arises... Na...
</thoughts that I should keep to myself>

:O
From: rohit.nadig on 19 Dec 2006 04:30
> Hey! Give us software guys a break, will ya?

No kidding. It's the same pie that feeds both of our respective communities
(hardware and software). We (hardware designers and manufacturers) want a
bigger share of the pie. More complex hardware means more expensive chips.

Sometime in the early 200x years (I am guessing 2002), Microsoft had bigger
revenues and profits than Intel. Higher margins are understandable, but in
the business world, anybody that is much bigger than you is a threat. You
want everybody in your ecosystem to be smaller than you, but profitable and
growing. So if you make an iPod, you don't want a product that docks the
iPod into a car to be more profitable than the iPod itself.

It's a subtle message that has pervaded many a corporation's strategy.
Sun's mantra "The network is the computer" has long been a strategy to push
their agenda (selling many expensive servers). I am guessing Intel is
worrying about Google? At Google's pace, they will stream everything on the
internet for FREE (funded by their ad revenues)! You may not need a fast
computer anymore, just a broadband connection and a Gmail account. What
will that do to Intel's growth if the future of computing is 3 or 4 big
corporations with huge server rooms?

At this point, the only way you can grow a microprocessor business is by
adding functionality, simply because we have run out of "big" ideas to
improve single-thread (ST) performance. Improvements in ST performance are
incremental. Let's face it, over 60% of the money that people spend on
semiconductors probably pays for a microprocessor of sorts, and hence my
claim that the only way you can create value in new hardware designs is by
adding functionality.

> Seriously though, for a moment... There are actually "some algorithms"
> that can be "implemented in a highly efficient manner, directly in
> *software*".

I am sure there are many algorithms that work great in software. But I am
going to pitch the same thing to you. One of the most popular software
applications is a web browser. Shouldn't you guys be focusing on the
XML/XSLT/DOM standards more, and less on the video codecs (and leave that
implementation to us hardware guys)?

> IMHO, the hardware guys can move on to better things "--IF--" the
> software world actually invents something that performs so well on your
> existing hardware that it basically renders a direct hardware-based
> implementation meaningless...
>
> For instance, a distributed message-passing algorithm (e.g., 100%
> compatible with the Cell BE) that exhibits simply excellent scalability,
> throughput and overall performance characteristics can be implemented,
> in software, right now.
>
> So, if a software implementation of an algorithm 'X' can exhibit
> virtually zero overhead... why should the hardware guys worry about
> engraving algorithm 'X' in silicon? They can move on to better things...
> No?

I agree that MPI would be a good feature to implement in hardware, but
don't they have those Myrinet switches that kind of do the same thing
(implement really fast crossbar network switching in hardware)?