From: Mayan Moudgill on 22 Oct 2009 11:45

Andy "Krazy" Glew wrote:

> (3) Recall that I am a fan of skip-ahead, speculative multithreading
> architectures such as Haitham Akkary's DMT.  If you can't predict a
> branch, skip ahead to the next loop iteration or function return, and
> execute code that you know will be executed with high probability.

I was wondering - how much of the DMT performance improvement is because
of all the speculative execution, and how much of it is because it's
acting as an I-cache prefetch engine?  IIRC, the performance numbers for
some of the non-linear I-prefetch schemes seem to track the performance
improvements reported by DMT.
From: Stephen Fuld on 22 Oct 2009 13:11

Anton Ertl wrote:
> Stephen Fuld <SFuld(a)alumni.cmu.edu.invalid> writes:
>> Paul Wallich wrote:
>>> I would bet that a huge chunk of the time isn't in doing the actual
>>> calculations but in verifying that the calculations can be done.
>>> Spreadsheets are pretty much the ultimate in mutably-typed interactive
>>> code, and there's very little to prevent a recalculation from
>>> requiring a near-universal reparse.
>>
>> Wow, I hadn't thought of that.  But if you are, say, running multiple
>> simulation runs, or something else where the only thing changing is
>> the value of some parameters, not the "structure" of the spreadsheet,
>> does Excel understand that it can skip at least most of the reparse?
>
> Probably not, because for most users Excel is fast enough even with
> slow algorithms.  And those where it isn't, they have probably
> invested so much in Excel that most of them would not change to a
> spreadsheet program with better algorithms even if one is available.
> So there is no incentive for other spreadsheet programs to improve
> their algorithms, and therefore also no incentive for Excel.

I understand that.  But isn't, say, the spreadsheet in Open Office
supposed to be compatible with Excel?  If so, the conversion shouldn't
be hard and might be worth it if it sped up a long recalc.  But see
below.

> Concerning the structure of the spreadsheet, this changes only cell by
> cell, so any parsing should only have to deal with a cell at a time.
> Or if you have operations that deal with many cells (say, copying a
> column or loading a spreadsheet), it's reasonable that the time taken
> is proportional to the size of the change; and these operations are
> not the most frequent, so it's not so bad if they take a little time
> on huge spreadsheets.

That all makes sense, but I am bothered by what seem to be two
contradictory ideas here.  Andy says the problem is an algorithm Excel
uses that is O(N**2), compared with one he could write himself that is
perhaps O(N) or O(N log N).  I presume this algorithm isn't in the
parsing but in the calculations themselves.  On the other hand, Paul
said the bulk of the time is probably in the parsing rather than in the
recalculation of the parsed results.

Andy didn't specify which algorithms he found to be suboptimal in
Excel, so it is hard to evaluate how difficult it would be to fix them.
But in the past, vendors have made changes to their products to address
problems that were encountered only as increased usage, faster CPUs,
and more memory made solving large problems more common.  The best
example is when spreadsheets increased the maximum number of rows and
columns allowed in a spreadsheet.  Similarly, they introduced multiple
pages within a single spreadsheet.  So there is some precedent for
vendors to fix things that don't scale well.

--
 - Stephen Fuld (e-mail address disguised to prevent spam)
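[To make the algorithmic point concrete, here is a minimal sketch of the
kind of recalc that stays linear in the size of the spreadsheet: keep a
dependency graph of cells and evaluate them in topological order, so each
cell is computed exactly once per recalc.  This is plain Python; the cell
names, formulas, and the recalc helper are invented for illustration, and
it says nothing about what Excel actually does internally.]

# Illustrative sketch only - not Excel's actual algorithm.  Each cell
# stores the list of cells it reads plus a function of their values.
# A topological sort of the dependency graph lets us evaluate every
# cell exactly once, i.e. O(cells + dependencies) per full recalc.

from graphlib import TopologicalSorter

# formula table: cell -> (cells it depends on, function of their values)
formulas = {
    "A1": ([],           lambda:        2.0),       # input parameter
    "A2": ([],           lambda:        3.0),       # input parameter
    "B1": (["A1", "A2"], lambda a1, a2: a1 + a2),
    "B2": (["B1"],       lambda b1:     b1 * 10.0),
    "C1": (["B1", "B2"], lambda b1, b2: b1 + b2),
}

def recalc(formulas):
    graph = {cell: set(deps) for cell, (deps, _) in formulas.items()}
    values = {}
    for cell in TopologicalSorter(graph).static_order():  # inputs come first
        deps, fn = formulas[cell]
        values[cell] = fn(*(values[d] for d in deps))
    return values

print(recalc(formulas))
# e.g. {'A1': 2.0, 'A2': 3.0, 'B1': 5.0, 'B2': 50.0, 'C1': 55.0}

[A recalc that instead re-evaluates every downstream cell each time one
of its inputs is touched, without such an ordering, can easily degrade
to O(N**2) work on long dependency chains - which is one plausible way
to read Andy's complaint.]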
From: Stephen Fuld on 22 Oct 2009 13:15

Andy "Krazy" Glew wrote:
> Stephen Fuld wrote:
>> Does the way your spreadsheet works force serial calculations?  I.e.
>> are almost all the cells that are to be recalculated dependent upon
>> the previous one, thus forcing a serial chain of calculations?  Or
>> are there "multiple chains of dependent cells" that are only serial
>> due to the way Excel itself is programmed?  If the latter, one could
>> enhance Open Office to use multiple threads for the recalcs, which
>> would take advantage of multiple cores for something useful.
>
> No.  Could be parallel.
>
> But it is not really a case of parallelization.  In this case "the
> Excel way" gives O(N^2) work, whether parallel or not, versus O(N)
> (or maybe O(N log N), depending on assumptions), whether parallel or
> not.

While I obviously agree that algorithm trumps parallelism as the
problem size grows, it appears that you are in a problem-size region
where parallelism may be good enough.  If you could bring a, say,
five-minute recalc down to two minutes by using four cores, that seems
like a very worthwhile effort, though I certainly admit that it is a
short-term fix if the problem sizes continue to grow.

--
 - Stephen Fuld (e-mail address disguised to prevent spam)
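[On the parallelism point, here is a similarly illustrative sketch,
again Python and again unrelated to how Excel or Open Office are
actually written: evaluate cells wavefront by wavefront, so independent
chains of cells can proceed concurrently while the dependency graph
still bounds total work at one evaluation per cell.  The formulas table
and worker count are made up; Python threads will not actually speed up
CPU-bound arithmetic because of the GIL, so the structure rather than
the measured speedup is the point.]

from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Two independent chains (A* and B*) plus a cell that joins them.
formulas = {
    "A1": ([],           lambda:        2.0),
    "A2": (["A1"],       lambda a1:     a1 * 10.0),
    "B1": ([],           lambda:        3.0),
    "B2": (["B1"],       lambda b1:     b1 + 1.0),
    "C1": (["A2", "B2"], lambda a2, b2: a2 + b2),
}

def parallel_recalc(formulas, workers=4):
    ts = TopologicalSorter({c: set(deps) for c, (deps, _) in formulas.items()})
    ts.prepare()
    values = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while ts.is_active():
            ready = ts.get_ready()          # cells whose inputs are all computed
            futures = {c: pool.submit(formulas[c][1],
                                      *(values[d] for d in formulas[c][0]))
                       for c in ready}
            for c, fut in futures.items():
                values[c] = fut.result()
                ts.done(c)                  # unblocks the cells that depend on c
    return values

print(parallel_recalc(formulas))
# A1 and B1 form one wavefront, A2 and B2 the next, C1 the last;
# cells within a wavefront can be evaluated concurrently.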
From: Daniel A. Jimenez on 22 Oct 2009 13:22

In article <4ADFDC40.6060001(a)patten-glew.net>,
> [stuff deleted]
>
> Branch prediction:
>
> (1) Branch predictors *have* gotten a lot better, and will continue to
> get better for quite a few more years.  Seznec's O-GEHL predictor
> opened up a whole new family of predictors, with extremely long
> history.  The multilevel branch predictor techniques - some of which I
> pioneered but did not publish (apart from a thesis proposal for the
> Ph.D. I abandoned when my daughter was born; which Adam Butts also
> pushed; and which Daniel Jimenez published in combination with his
> neural net predictor) - provide a way in which you can get the
> accuracy of a big predictor, but the latency of a small predictor.
>
> Daniel later recanted, publishing a paper saying the microarchitectural
> complexity of handling late-arriving branch predictions was not
> warranted.  I disagree, in part because it uses techniques very
> similar to those of Willamette's replay pipeline, which was not well
> known when Daniel did his Ph.D.
>
> Since I know Daniel reads this newsgroup, perhaps he would care to say
> what he thinks about multilevel branch prediction now?

You're referring to my HPCA 2003 paper "Reconsidering Complex Branch
Predictors," where I show how to ahead-pipeline a simple, stupid
predictor to achieve better performance than an overriding predictor (a
combination of a small and fast predictor with a highly accurate but
slow predictor).

Yeah, I was a little frustrated with my previous work on what you call
multi-level predictors.  At the time, my perceptron (neural net)
predictor was really too slow to be of any use in an overriding
configuration when compared with a pipelined predictor.  If you have to
wait several cycles before deciding that the branch predictor might be
wrong and you want to go the other direction, you might as well have
used a simpler but larger pipelined predictor.  However, it's fine as
long as the second-level predictor takes only a very few cycles.

Later that year, for MICRO 2003, I figured out how to ahead-pipeline
the perceptron predictor in a way that eliminated the bothersome
latency and happened to improve accuracy as well.  Last year, for
MICRO 2008, we presented another neural-net-based predictor using
analog circuits to get around the latency problem.

(By the way, Seznec's O-GEHL is basically a neural-based predictor with
some funny hash functions; we were doing crazy long histories with the
perceptron predictor before that, and Marius Evers was suggesting it
before I came along.  However, Seznec's more recent L-TAGE predictor is
a nice breakthrough resulting from our competition in the mid-2000s.)

--
Daniel Jimenez                          djimenez(a)cs.utexas.edu
"I've so much music in my head" -- Maurice Ravel, shortly before his death.
"                             " -- John Cage
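[For readers who haven't seen the perceptron predictor, here is a toy
model of the basic idea: index a table of small signed weight vectors
by the branch PC, take the dot product of the weights with the global
history, predict taken if the sum is non-negative, and train only on a
misprediction or when the sum's magnitude is below a threshold.  This
is plain Python with arbitrary table sizes, the threshold roughly as
suggested in the original perceptron-predictor paper, and no
ahead-pipelining - so it exhibits exactly the latency problem Daniel
describes rather than the MICRO 2003 fix.  Real hardware does this with
narrow adders, not a software loop.]

NUM_PERCEPTRONS = 1024
HISTORY_LEN     = 32
THRESHOLD       = int(1.93 * HISTORY_LEN + 14)   # training threshold
WEIGHT_MAX      = 127                            # 8-bit signed weights

weights = [[0] * (HISTORY_LEN + 1) for _ in range(NUM_PERCEPTRONS)]  # +1 bias
history = [1] * HISTORY_LEN                      # +1 = taken, -1 = not taken

def clamp(v):
    return max(-WEIGHT_MAX - 1, min(WEIGHT_MAX, v))

def predict(pc):
    w = weights[pc % NUM_PERCEPTRONS]            # real designs hash the PC
    y = w[0] + sum(wi * hi for wi, hi in zip(w[1:], history))
    return y >= 0, y

def update(pc, taken, y):
    w = weights[pc % NUM_PERCEPTRONS]
    t = 1 if taken else -1
    mispredicted = (y >= 0) != taken
    if mispredicted or abs(y) <= THRESHOLD:      # train only when needed
        w[0] = clamp(w[0] + t)
        for i, hi in enumerate(history, start=1):
            w[i] = clamp(w[i] + t * hi)
    history.pop(0)
    history.append(t)                            # shift outcome into history

# drive it with a single branch that alternates taken / not-taken
pc = 0x4004f0
for n in range(100):
    pred, y = predict(pc)
    update(pc, taken=(n % 2 == 0), y=y)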
From: Robert Myers on 22 Oct 2009 14:08
On Oct 22, 7:18 am, Noob <r...(a)127.0.0.1> wrote:
> Robert Myers wrote:
> > Open source depends on gcc, perhaps the cruftiest bit of code
> > on the planet.
>
> What kind of unsubstantiated BS is this?

What kind of a question is that?

Robert.