From: Michael J. Mahon on 7 Jun 2006 14:18

Jorge ChB wrote:
> mdj <mdj.mdj(a)gmail.com> wrote:
>
>> Now that extremely cheap machines can manipulate high definition AV
>> content in better than realtime, we're running out of reasons to make
>> faster machines.
>>
>> This is a good thing, in many ways.
>
> Wow wait !
> What are you saying ?
> A good thing in what way ?
> More MIPS + less Watts + smaller.
> That's what hardware designers are after and will always be after.
> That's been the key to now-possible before-unthinkable everyday things
> like mobiles, ipods, psps, palms, dvb tv, ABS, EFI, portables, google
> earth, WIFI, etc etc. A microprocessor everywhere.
> And that's the key to many unthinkable wonderful new inventions that
> have to come yet and won't be possible unless the hardware keeps
> evolving == (More MIPS + less Watts + smaller)

Absolutely right. But the game is about to get much harder, in the sense
that silicon feature sizes are already approaching the point where silicon
looks like swiss cheese, making further improvements by simple scaling
quite difficult.

The game is getting harder and *much* more expensive, so progress is
slower (note the lack of 2x speed increases for the last few years) and
the number of different players is decreasing.

The big "open door" opportunity is multiprocessor parallelism, but we have
invested so little in learning to apply parallelism that it remains
esoteric. (But AppleCrate makes it easy to experiment with! ;-)

The popular "thread" model, in which all of memory is conceptually shared
by all threads, is a disaster for real multiprocessors, since they will
*always* have latency and bandwidth issues to move data between them, and
a "single, coherent memory image" is both slow and wasteful.

> The only lack "of reasons to make faster machines" I can think of comes
> from the fact that the software is evolving at a so *much* slower pace
> (than hardware)...
> Voice recognition ?
> "Artificial Intelligence" ?
> User (human) interface ?
> etc... :-(

Yes, there is much to be done--and very slow progress. What is needed is
breakthrough *algorithmic* work, not *tools* work.

-michael

Parallel computing for 8-bit Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design tool--and it is seriously
underused."
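[To make the shared-memory-versus-data-movement point a bit more concrete, here is a toy sketch in modern Java. It runs inside a single shared-memory JVM, so it is only an analogy for real multi-machine setups like AppleCrate, and every name and number in it is illustrative rather than taken from any real system: two workers hammering one shared counter force the hardware to keep a single coherent value bouncing between cores, while two workers that count privately and exchange only their final results move almost no data.]

    import java.util.concurrent.*;
    import java.util.concurrent.atomic.AtomicLong;

    // Toy contrast between one coherent shared value and explicit data
    // movement, inside a single JVM. (Illustrative only -- not AppleCrate's
    // actual message-passing protocol.)
    public class SharedVsMessage {
        static final int N = 10_000_000;

        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(2);

            // Shared-memory style: both threads hammer one counter, so the
            // hardware keeps shuttling that value's cache line between cores.
            AtomicLong shared = new AtomicLong();
            Runnable hammer = () -> {
                for (int i = 0; i < N; i++) shared.incrementAndGet();
            };
            time("shared counter ", pool, hammer, hammer);

            // Message-passing style: each worker counts privately (a stand-in
            // for real local work) and only the final result crosses over.
            BlockingQueue<Long> results = new LinkedBlockingQueue<>();
            Runnable local = () -> {
                long count = 0;
                for (int i = 0; i < N; i++) count++;
                results.add(count);
            };
            time("local + message", pool, local, local);
            System.out.println("total = " + (results.take() + results.take()));

            pool.shutdown();
        }

        static void time(String label, ExecutorService pool,
                         Runnable a, Runnable b) throws Exception {
            long t0 = System.nanoTime();
            Future<?> fa = pool.submit(a);
            Future<?> fb = pool.submit(b);
            fa.get();
            fb.get();
            System.out.printf("%s: %d ms%n", label,
                              (System.nanoTime() - t0) / 1_000_000);
        }
    }

[On a typical multi-core machine the second variant runs far faster, for exactly the reason given above: the expensive part is moving data between processors, not the arithmetic itself.]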
From: mdj on 7 Jun 2006 21:37

Jorge ChB wrote:
> Wow wait !
> What are you saying ?
> A good thing in what way ?
> More MIPS + less Watts + smaller.

In general, more MIPS = more watts. In the past we focussed on more MIPS,
more or less regardless of watts, because we needed the MIPS more. Now
MIPS per watt is critical. Mobility demands it, as do server room space
and heat issues, as do embedded systems. The work we want to do is
achievable with current levels of processing power, but not with current
levels of power consumption.

Considering the current state of software development, it's a good thing.
We need to spend some time focussing our efforts on exploiting other means
of performance enhancement, and focus more on software quality. Many of
the foreseeable tasks for faster machines require many orders of magnitude
more processing power than we currently have, so the focus will have to be
on software improvement. I say: about time!

Matt
From: mdj on 7 Jun 2006 22:56

Michael J. Mahon wrote:
> The big "open door" opportunity is multiprocessor parallelism, but
> we have invested so little in learning to apply parallelism that it
> remains esoteric. (But AppleCrate makes it easy to experiment with! ;-)

Parallelism is the big door, but I think the approaches that need to be
explored cover a wider gamut than multiprocess parallelism, which as you
point out has considerable latency issues.

> The popular "thread" model, in which all of memory is conceptually
> shared by all threads, is a disaster for real multiprocessors, since
> they will *always* have latency and bandwidth issues to move data
> between them, and a "single, coherent memory image" is both slow
> and wasteful.

It is, however, an extremely efficient form of multiprocessing for
applications with modest horizontal scaling potential. There are
essentially three basic models for parallelism that must be exploited:

Multithreading - in which one processor core can execute multiple threads
simultaneously.

Uniform memory multiprocessing - in which many processor cores share the
same physical memory subsystem. Note that this is further divided into
multiple cores in the same package, plus other cores in different
packages, which have very different latency properties.

Non-uniform memory multiprocessing - in which the latency can vary wildly
depending on the system configuration.

Modern multiprocessor servers employ all three approaches, both on the
same system board and via high-speed interconnects that join multiple
system boards together. OS's must weigh the 'distance' to another CPU
when considering a potential execution unit for a process. What's slow
and wasteful depends a great deal on the task at hand.

Multithreading used to be just as expensive as multiprocessing. But
consider a current-generation CPU designed for low power and high
concurrency, the UltraSPARC T1. These units have execution cores capable
of running 4 concurrent threads. In the highest-end configuration, there
are 8 of these execution cores per physical processor. The cores have a
3.2GB/s interconnect. Each physical processor has 4 independent memory
controllers, so you have non-uniform memory access on the one die. Peak
power consumption for this part is 79W at 1GHz. Considering you can in
theory run 32 threads simultaneously, that's pretty impressive.

How well you can exploit it depends on your application. An 'old school'
web server, for instance, can only get 8-way parallelism on this chip. A
new-school web server written in Java can get 32-way, assuming at any
given time there are at least 32 concurrent requests for the same dynamic
page, or 32 static requests.

It's getting to the stage where the power consumed by driving I/O over a
pin on an IC package is significant, so expect to see systems like this
grow in popularity. Interestingly, you can download a VHDL description of
this part from Sun and synthesise it on one of the higher-end FPGAs. Oh
how I wish I had access to hardware like that!

A top-of-the-range Sun server uses parts that have 4 execution threads
per core, four cores per board, each with its own memory
controller+memory, and up to 18 boards per system (coupled together by a
9GB/s crossbar switch). Exploiting all the resources in this system and
doing it efficiently is *hard*, as it employs every different style of
parallelism I mentioned before within the same 'machine'. And I haven't
even considered computing clusters!
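[To put the "32-way" figure above into code terms, here is a minimal sketch in modern Java; the class and handler names are made up, and it is not meant to resemble any real web server. The worker pool is sized from whatever the JVM reports as available hardware threads, so the same program runs 1-way on a uniprocessor and 32-way on a fully configured T1 without modification.]

    import java.util.concurrent.*;

    // Sketch of a "new school" threaded request handler (illustrative names
    // only). The pool is sized from the hardware thread count the JVM
    // reports, so the degree of parallelism tracks the machine it runs on.
    public class ThreadedServer {
        public static void main(String[] args) {
            int hwThreads = Runtime.getRuntime().availableProcessors();
            ExecutorService workers = Executors.newFixedThreadPool(hwThreads);

            for (int i = 0; i < 100; i++) {            // pretend 100 requests arrive
                final int request = i;
                workers.submit(() -> handle(request)); // one task per request
            }
            workers.shutdown();
        }

        static void handle(int request) {
            // Render the "dynamic page" for this request (stand-in work).
            System.out.println(Thread.currentThread().getName()
                               + " served request " + request);
        }
    }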
The way it's panning out is that real multiprocessors are a disaster for
parallelism. The problem is that essentially any task that can be
parallelised needs to process the same data that it does in serial form.
Because of this, on a shared-memory system you can utilise the same buses
and I/O subsystems, and take advantage of 'nearness', to allow some pretty
incredible IPC speeds. Multithreading approaches are very important on
these systems.

In fact, multithreading is important even on systems with single execution
units. The gap between I/O throughput and processing throughput means you
get a certain degree of 'parallelism' even though you can only run one
thread at a time: a free performance improvement if you employ parallel
design techniques.

Of course, there are certain heavily compute-bound applications where the
degree of IPC is very low, and massive parallelism is possible regardless
of the interconnect used, as IPC constitutes a relatively small part of
the workload. For the rest of the cases, though, where lots of data is
being consumed, systems that allow low-overhead IPC through multithreading
are the way to go.

Matt
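[The "free parallelism on a single execution unit" point is easy to demonstrate. In the sketch below (modern Java; the log file name and the GET-counting are purely illustrative), one thread blocks on disk reads while the other does the processing, so I/O wait and CPU work overlap even if only one thread can execute at any instant.]

    import java.io.*;
    import java.util.concurrent.*;

    // Producer/consumer sketch: the reader thread blocks on disk I/O while
    // the worker thread keeps the CPU busy processing lines already read.
    public class OverlapIoAndCompute {
        private static final String DONE = "\u0000EOF";   // end-of-stream marker

        public static void main(String[] args) throws Exception {
            BlockingQueue<String> lines = new ArrayBlockingQueue<>(1024);

            Thread reader = new Thread(() -> {
                try (BufferedReader in =
                         new BufferedReader(new FileReader("access.log"))) {
                    String line;
                    while ((line = in.readLine()) != null) lines.put(line); // blocks on I/O
                    lines.put(DONE);
                } catch (IOException | InterruptedException e) {
                    throw new RuntimeException(e);
                }
            });

            Thread worker = new Thread(() -> {
                long hits = 0;
                try {
                    for (String line = lines.take(); line != DONE; line = lines.take()) {
                        if (line.contains("GET")) hits++;  // CPU work overlaps the reads
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                System.out.println("GET requests: " + hits);
            });

            reader.start();
            worker.start();
            reader.join();
            worker.join();
        }
    }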
From: mdj on 7 Jun 2006 23:36

Michael J. Mahon wrote:
> And most organizations give maintenance tasks to new programmers, as
> a kind of hazing, I think!
>
> But not supporting the code that you produced, at least for its first
> year in the field, deprives a team of the *real* learning experience,
> in which you discover which of your grand ideas worked and which didn't.
>
> And it also serves as a test of whether the code is *actually*
> maintainable, as opposed to theoretically maintainable.
>
> I see doing at least "early" maintenance as a kind of accountability.

Fully agree. All of the applicable data that needs to be fed into the
improvement process comes directly from supporting the code. Severing the
connection between the development group and this process is most unwise,
as it's these people who need to come up with ways of improving the
process. That's where you get real efficiency.

Of course the problem is that if your development team is spending its
time doing support, it doesn't have the time to develop new versions of
the code. This is where the process I outlined previously comes in. Once
the support issues around the product stabilise, you bring in short-term
resources to handle that support, free up the development team, and start
the process again.

> As we enter the era of 10 million transistor FPGAs, system compilers,
> and "turnarounds" measured in seconds--in short, as the constraints on
> hardware design are eased--I expect to see many of the same problems
> that have afflicted software shift into the "hardware" realm.
>
> Discipline is hard-won. Discipline can only coexist with ease and
> convenience *after* it has been formed through hard experience, since
> ease puts greater demands on discipline.
>
> Tools can give the appearance of discipline by restricting expression,
> but to a truly disciplined mind, tools are merely secondary.
>
> I think of "strict" tools as "discipline for the undisciplined", but
> so much of system design is outside the realm of any formal tools, that
> there is no substitute for design discipline. A terrible fate awaits
> those who think that there is.

It's also pre-managed complexity, for those who have neither the time nor
the resources to manage it themselves. Tools don't necessarily have to be
strict, just work, and provide access to complex functionality that's
already proven. This is where tool and language evolution is key. This
complexity goes up all the time, while human discipline evolves slowly
and has real, known limits. In order to build more complex systems, more
complex toolsets that encapsulate that complexity must be employed.

> My second software tools phase was strict typing and enforced structure.
> My mantra was, "If you think you need a macro, then something is missing
> from the language." Experienced programmers chafed at the "training
> wheels" the language forced upon them. Some of them filled their code
> with unstructured "workarounds", perhaps a sign of their resentment at
> the strictures of the programming environment. (Unstructured code can
> be written in any language.)

Been there too. Time has proven it doesn't work well, and that test-driven
development techniques provide more safety, and allow more flexible forms
of expression in the process, easing the chafing. (A small JUnit-style
sketch of this appears at the end of this post.)

> My third software tools phase was "the only thing that matters is
> the team". I strove for a small team of 98th percentile people, who
> implicitly understood the need for and benefits of discipline, and
> who had learned this by experience.
> Tools are useful, but secondary. If a tool is really needed, it will
> be written. (Structured code can be written in any language.)
>
> Although I don't consider any of the three approaches ideal, there
> is no doubt that the third worked the best, both in terms of team
> esprit and in terms of product quality (function & reliability).
>
> Don't count too much on tools--it's the people that make the real
> difference.

The problem is such teams are very hard to build, and to keep. And often
the 98th percentile people are already consumed by the very companies
that produce the technology you're trying to leverage. It's really up to,
say, the 90th percentile group to manage the complexity for the rest, and
provide it in more accessible forms, through tools that support higher
abstractions, allowing more to be done. It's certainly not ideal either,
and it shifts a lot of 'waste' onto the machines. But this is the only
place you can feasibly put it, because the machines are cheap and get
bigger all the time. The humans on the other hand....

> > This is the principle reason for evolving languages and tools. Improved
> > langugages allow ideas to be expressed more concisely, support
> > encapsulation mechanisms that allow complex modules to reused, thus
> > allowing complexity to be more effectively managed. Sure it's
> > idealistic to expect new tools solve all the problems, they don't. They
> > do however mitigate some of the old issues and allow some progress to
> > be made.
>
> For balance, I have to point out that they also permit *needless*
> complexity to be more effectively managed. When "Hello, World!"
> executes 8 megabytes of code, you know something has gone sour.
> (And, yes, I do include *all* the code executed, not just the code
> in the "Hello, World!" module.)

Sure. Of course, it bears pointing out that you're referring to 8
megabytes of code that Hello World won't execute, but will carry around
as a payload anyway. Runtime systems are getting larger, that's true, but
they also only have to be loaded once, thanks to copy-on-write memory,
and much of the initialisation work can be cached and shared amongst
running applications. Over time the issues brought about by this approach
are being mitigated, and besides, it's good fun work finding ways to tune
it out. It's not ideal, but what's the alternative? If you don't follow
this road, a cap is placed on the possible solutions you can build.

The overhead introduced by high-level abstraction systems is a very
interesting field of research, and one in which great inroads have been
made. I can see a time when massive parallel computing clusters 'churn'
through algorithms, fitting them to particular machines and problem
domains. Programming done by the human will be not much more than
assembling from vast libraries of prevalidated solutions. It doesn't take
much thinking into the future to imagine a time when software complexity
is so high that this is the only feasible solution to building more
complex systems.

I think we're more or less on the right track, but breaking the ties to
legacy implementations that simply cannot be scaled in this way is one of
the biggest hurdles to moving further towards solving the current issues
in software design.

Matt
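[Referring back to the test-driven development remark earlier in this post: here is a minimal sketch of that style, assuming JUnit 4, with a made-up Range class inlined only to keep the example self-contained. The second test expresses a constraint that strict typing alone cannot.]

    import static org.junit.Assert.*;
    import org.junit.Test;

    // Minimal test-first sketch. Range is a hypothetical class under
    // development; the tests pin down behaviour the type system can't.
    public class RangeTest {

        /** Toy class under test, inlined to keep the sketch self-contained. */
        static class Range {
            final int low, high;
            Range(int low, int high) {
                if (low > high) throw new IllegalArgumentException("low > high");
                this.low = low;
                this.high = high;
            }
            boolean contains(int x) { return x >= low && x <= high; }
        }

        @Test
        public void containsItsEndpoints() {
            Range r = new Range(1, 10);
            assertTrue(r.contains(1));
            assertTrue(r.contains(10));
        }

        @Test(expected = IllegalArgumentException.class)
        public void rejectsInvertedBounds() {
            new Range(10, 1);   // strict typing can't catch this; a test can
        }
    }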
From: Michael on 8 Jun 2006 01:31
mdj wrote:
> Paul Schlyter wrote:
>
> > What "Java portability" ????
> >
> > Java is not a portable language. Java is less portable than both
> > FORTRAN, C or C++, which runs on several platforms. Java runs on one
> > single platform only: the Java platform.
>
> Sorry, this isn't true. The Java language specification is quite
> deliberately void of any language construct that would bind it, or any
> program written in it, to a specific architecture. The key concepts that
> are missing here are pointers,

It has pointers, they're just not accessible to the user -- i.e. the null
reference.

> and more specifically, the ability to
> perform arbitrary arithmetic on pointer types. Additionally, the
> language specification defines EXACTLY the size and precision of each
> data type. C and C++ on the other hand, not only allow arbitrary
> pointer arithmetic, but also only define in the standard the minimum
> size requirements of each data type.

You say that as if it were a bad thing. The problem is one of size vs
speed, and ease of serialization, which is why C99 added int#_t,
int_fast#_t, and int_least#_t. While most code doesn't need to know the
bit size of types, you still need to know the minimum sizes, so you don't
have to worry about underflow/overflow. The size issue comes up when
serializing.

The language mandating features even when the hardware doesn't support
them -- say, doubles on DSPs or the PS2 -- is one of the reasons Java is
so slow. See: "How Java's Floating-Point Hurts Everyone Everywhere"
http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf

Maybe you have a different experience with "portability" you can comment
on, compared to Carmack's (May 2006) one with Java and cell-phones?
http://www.armadilloaerospace.com/n.x/johnc/Recent%20Updates

  It turns out that I'm a lot less fond of Java for resource-constrained
  work. I remember all the little gripes I had with the Java language,
  like no unsigned bytes, and the consequences of strong typing, like no
  memset, and the inability to read resources into anything but a char
  array, but the frustrating issues are details down close to the
  hardware.

  The biggest problem is that Java is really slow. On a pure cpu /
  memory / display / communications level, most modern cell phones
  should be considerably better gaming platforms than a Game Boy
  Advanced. With Java, on most phones you are left with about the CPU
  power of an original 4.77 mhz IBM PC, and lousy control over
  everything.

  I spent a fair amount of time looking at java byte code disassembly
  while optimizing my little rendering engine. This is interesting fun
  like any other optimization problem, but it alternates with a bleak
  knowledge that even the most inspired java code is going to be a
  fraction the performance of pedestrian native C code.

  Even compiled to completely native code, Java semantic requirements
  like range checking on every array access hobble it. One of the phones
  (Motorola i730) has an option that does some load time compiling to
  improve performance, which does help a lot, but you have no idea what
  it is doing, and innocuous code changes can cause the compilable
  heuristic to fail.

  Write-once-run-anywhere. Ha. Hahahahaha. We are only testing on four
  platforms right now, and not a single pair has the exact same quirks.
  All the commercial games are tweaked and compiled individually for
  each (often 100+) platform. Portability is not a justification for the
  awful performance.

Cheers
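[For what it's worth, the two language points discussed above -- exactly specified primitive sizes and the missing unsigned byte -- look like this in plain Java (a small illustrative snippet, nothing phone-specific):]

    // Primitive sizes are fixed by the spec on every platform, and there
    // is no unsigned byte type.
    public class PrimitiveSizes {
        public static void main(String[] args) {
            // The JLS pins these down everywhere: byte 8 bits, short 16,
            // int 32, long 64 -- unlike C/C++, where only minimums are
            // guaranteed and the int#_t typedefs are needed for exact widths.
            System.out.println("int bits:  " + Integer.SIZE);  // always 32
            System.out.println("long bits: " + Long.SIZE);     // always 64

            // No unsigned byte: a byte holds -128..127, so the raw octet
            // 0xFF comes back as -1 unless it is masked up to an int.
            byte raw = (byte) 0xFF;
            int unsigned = raw & 0xFF;   // the usual workaround: 255
            System.out.println(raw + " -> " + unsigned);
        }
    }

[The '& 0xFF' mask is the usual workaround for the "no unsigned bytes" gripe: every octet read from a binary resource has to be widened to an int by hand.]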