From: randyhyde@earthlink.net on 5 Feb 2006 13:04

Charles A. Crayne wrote:
> On 3 Feb 2006 18:56:39 -0800
> "randyhyde(a)earthlink.net" <randyhyde(a)earthlink.net> wrote:
>
> :Now let's consider *every* line of RosAsm code ever written. Is that a
> :sufficient amount?
>
> Insufficient data. However, the 250,000 lines that you mentioned
> previously would certainly qualify. What do you like for a productivity
> figure for such work? Even at 500 lines/day, the project would take 500
> days. Now, if a tool could translate 80% of those lines, the savings would
> be 400 days.

If it translated them *perfectly* and you didn't have to review *each line*, that would be the savings. If the remaining 20% was all in one spot, rather than interspersed throughout the code, you might get those savings. If the correctness of that 80% didn't depend on the other 20%, you might get those savings. Shall I go on? Surely, someone with the experience you claim to have doing software conversions realizes the fundamental flaw with your simplistic claim here?

> :Download the demo and give it a shot.
>
> Impractical, in my case, since I don't do Windows, anymore. However, I did
> download and read the manual.

And all your experience over all these years didn't tell you not to buy into the marketing hype?

> :No offense, but when the program breaks on something as simple as this,
> :I think it's fair to forgive me for not having a whole lot of
> :confidence in the operation of the rest of the program.
>
> You didn't think that 'jmp jmpTbl[eax*4-8]' was a simple thing, in your
> previous posts, when you were using a similar construction as the number
> one reason why such a tool is impractical.

Uh, you're confused. The construct I tested the code with had *nothing* to do with the earlier example I gave as a problem. If you don't understand the difference between the two examples, then as Rene might say, you should learn a little more 80x86 assembly language programming.
The example I gave to PortASM86 is a *very* typical switch statement implementation. Indeed, the PPC code it generates for that jmp (as I would expect) is quite correct. It is the fact that it can't handle the *labels* that is a bit of a concern here. Just so you realize this, there is a *big* difference between:

    jmp jmpTbl[eax*4-8]

and

    mov eax, jmpTbl[eax*4]
    sub eax, 8
    jmp eax

If you don't understand the difference, you need to think about this problem for a while. The former is fairly easy to convert (at least across 32-bit architectures). The latter is extremely difficult to convert.

> In addition, you should not
> leap to judgement before trying the jump table hint described in the
> manual. All they ask is that you tell the tool the beginning and end of
> the jump table.

In a typical Win32 assembly program (e.g., RosAsm) there are going to be a *ton* of indirect jumps (e.g., each call to the Win32 API). Does the phrase "cry wolf" have any meaning to you? The fact that there will be a ton of false warnings is going to *tremendously* increase the workload on that 80% of the code that was converted correctly.

And once again, I point out, if you've got to make modifications to the original source code to make it translate correctly, you've got problems. Such conversions are *bound* to introduce new defects. And, again, you're left with the fact that you have to maintain *two* copies of the code when the translation is complete, because you'll never be able to retranslate the original code without repeating all the work.

>
> :IIRC, the documentation claims that the program handles the parity
> :flag.
>
> Nor have you shown any evidence that it doesn't, as the conditional jumps
> seem to have been generated correctly. It would be interesting to see if a
> instruction must set hint would clean up the 'unknown register' issue.

Please tell me what that code does, then? What is all that extra cruft (which I *seriously* doubt is syntactically correct)?
Even still, why do I even have to look at this? Again, if this is part of the 80% of the code that translated correctly, I still have to look at it. So the savings you're claiming just don't follow.

>
> :I know
> :nothing about NCR machines, so I have no idea how well NCR instruction
> :semantics map to IBM 360 instructions.
>
> Very poorly. To begin with, where the basic addressability of the IBM 360
> was four 8-bit bytes within a 32-bit word, NCR used 12-bit slabs, each
> of which contained either two 6-bit characters, or three 4-bit binary
> coded decimal digits. It had no registers, and no floating point
> instructions. We had to write data conversion programs for each
> application system, and transfer the data via 7-track tapes, with no
> tape labels. The list goes on and on.

IOW, it was a lot of reengineering, not simply a "conversion". That's quite a bit different from what Rene (or even MicroAPL) is proposing. Different problems. There is *no question* that an x86 program can be reengineered to the PowerPC (or some other processor). There is also no question that you can write some tool (e.g., PortASM/86) to help with the job. But it's not going to do 80% of the work for you.

> :BTW, I notice that MicroAPL has an assembly->C converter. I wonder how
> :well it works (not having downloaded anything to try it out).
>
> I, too, am very interested in knowing how good it is, as I have always
> considered such a tool to be quite difficult.

Actually, back when I looked at the problem (two years ago), the conversion to C was actually a bit easier than the conversion to PowerPC. But the result was still too big and slow to be of practical use. Effectively, what you wound up doing was calling a bunch of C functions that emulated each of the machine instructions. Even with data flow analysis, the result was going to be impractical. The few examples on MicroAPL's web site seem to suggest that miracles occur.
But, of course, the examples they give are the ones where the tool shines and they don't bother posting the code where the problems arise.

> Of course, as you know,
> merely converting it to C does not, by itself, make it portable.

Certainly there are OS issues and other problems. But I will suggest this: converting it to C is going to make it a *heck* of a lot more portable than conversion to some other assembly language.

The problem with the conversion process, and why the conversions are so ugly, is that the programming paradigm for C, PowerPC assembler, and other languages is quite a bit different from x86 assembly. E.g., you do *not* write PPC assembly the same way you write x86 assembly. Any attempt to convert x86 assembly on a line by line basis to PPC assembly (or any other language) is going to produce bloat like you wouldn't believe. The few simple examples I posted in the last posting should demonstrate this. To someone who knows PPC assembly language, it's obvious that code is "not right." For example, you *don't* access memory left and right when writing RISC code. Yet a straightforward conversion of x86 to PPC does exactly this, even though there are lots of registers available to avoid this problem.

Ditto for C. You don't have things like a stack in C, so you don't program in C the same way you do in assembly language (which is one of the benefits of using assembly in the first place). Any attempt to convert x86 assembly to C (or other HLL) is going to run into this problem. So even though a semantically faithful conversion (automatic) is theoretically possible, the result is not practical. And PortASM/86 doesn't come close to doing the conversion automatically for you. And even *after* you've spent the considerable effort converting the result to the target processor, what you've got is a bunch of highly inefficient code that will be difficult for a PPC programmer to maintain.

Cheers,
Randy Hyde
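The two jump sequences contrasted earlier in this post can be modeled in a few lines. What follows is only a minimal sketch in Python, not anything PortASM/86 is known to do; the label names, addresses, and function names are hypothetical, chosen to show where the 8 is applied in each form.

```python
"""Model of the two x86 jump sequences; all labels and addresses are made up."""

ENTRY_SIZE = 4  # bytes per jump-table entry on a 32-bit target

# Hypothetical byte addresses of four case labels in the x86 code image.
LABELS = {"case2": 0x1000, "case3": 0x1010, "case4": 0x1024, "case5": 0x1030}

# jmpTbl holds one label address per entry, in case order.
JMP_TBL = [LABELS["case2"], LABELS["case3"], LABELS["case4"], LABELS["case5"]]


def jmp_table_with_displacement(eax):
    """jmp jmpTbl[eax*4-8]: the -8 changes which table slot is read."""
    byte_offset = eax * ENTRY_SIZE - 8
    slot = byte_offset // ENTRY_SIZE  # same as eax - 2
    if not 0 <= slot < len(JMP_TBL):
        raise IndexError("selector outside the jump table")
    return JMP_TBL[slot]  # always the address of some label


def jmp_adjusted_target(eax):
    """mov eax, jmpTbl[eax*4] / sub eax, 8 / jmp eax: the -8 changes the target."""
    target = JMP_TBL[eax]  # read slot eax unchanged
    return target - 8      # 8 bytes *before* a label, inside x86 machine code


def lands_on_label(address):
    """True if the address is one a translator could map to a target label."""
    return address in LABELS.values()
```

In the first form every possible destination is a table entry, so a translator that knows the table bounds can retarget each entry. In the second form the destination is a byte offset from a label, which only has meaning under the original x86 instruction encodings; that is one reading of why the post calls it extremely difficult to convert.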
From: Charles A. Crayne on 6 Feb 2006 00:54

On 5 Feb 2006 10:04:18 -0800
"randyhyde(a)earthlink.net" <randyhyde(a)earthlink.net> wrote:

:If it translated them *perfectly* and you didn't have to review *each
:line*, that would be the savings. If the remaining 20% was all in one
:spot, rather than interspersed throughout the code, you might get those
:savings. If the correctness of that 80% didn't depend on the other 20%,
:you might get those savings. Shall I go on?

One can easily imagine the die-hards of yore making the same objections to the world's first compiler. "Why do you want us to add an additional step to the development process? Why should we have to write our programs in some weird, restrictive language, when we have to review *each line* of the compiler output? Now we are going to have to maintain two different source files. It would be faster to just throw away the compiler, and write the program directly in assembler. Anyone who knows assembly programming can see immediately that the compiler generated code looks strange. . . ."

However, as history has shown, with all their warts, compilers are a fact of life. Yes, they do have bugs. Yes, the compiled code is hard to follow. Yes, the use of HLLs makes it easy to write bad code. And yes, a highly skilled assembly programmer can write more efficient code. And yet, we do not routinely review the output of a compiler; we do not maintain separate HLL and assembly source files; it does not take significantly longer to write and debug a program in an HLL, than in assembler; and, for the most part, nobody cares about the relative performance.

The fact of the matter is, that with the exception of a few of us hobbyists, decisions about development languages and tools are made based not upon technical considerations, but rather upon business considerations such as delivery dates and return on investment.

-- Chuck
From: randyhyde@earthlink.net on 6 Feb 2006 14:53

Charles A. Crayne wrote:
> On 5 Feb 2006 10:04:18 -0800
> "randyhyde(a)earthlink.net" <randyhyde(a)earthlink.net> wrote:
>
> :If it translated them *perfectly* and you didn't have to review *each
> :line*, that would be the savings. If the remaining 20% was all in one
> :spot, rather than interspersed throughout the code, you might get those
> :savings. If the correctness of that 80% didn't depend on the other 20%,
> :you might get those savings. Shall I go on?
>
> One can easily imagine the die-hards of yore making the same objections to
> the world's first compiler. "Why do you want us to add an additional
> step to the development process? Why should we have to write our programs
> in some weird, restrictive language, when we have to review *each line* of
> the compiler output? Now we are going to have to maintain two
> different source files. It would be faster to just throw away the
> compiler, and write the program directly in assembler. Anyone who knows
> assembly programming can see immediately that the compiler generated code
> looks strange. . . ."

You're going off the deep end here, Chuck. The complaints against the first compilers were of two varieties: (1) Machines are too expensive to allow any inefficiencies to creep in, and (2) Machines are too expensive to waste valuable time doing clerical things like compilation (or even assembly, in earlier cases). The issue you mention was *never* brought up to my knowledge.

>
> However, as history has shown, with all their warts, compilers are a fact
> of life. Yes, they do have bugs. Yes, the compiled code is hard to follow.

People don't have to maintain compiler output. That's a *big* difference. The output of a translator is going to have to be manually maintained. Surely, someone with all the years of experience you possess would understand this, right?

> Yes, the use of HLLs makes it easy to write bad code.
What does this have to do with maintaining the output of a translator separately from the original x86 code?

> And yes, a highly
> skilled assembly programmer can write more efficient code.

What does this have to do with maintaining the output of a translator separately from the original x86 code?

> And yet, we do
> not routinely review the output of a compiler;

But you *will* have to review the output of the translator. Because it does not produce semantically correct code. And we're not talking about simple "bugs" in the compiler here. We're talking about design decisions on the part of the program's designer not to support certain features.

> we do not maintain
> separate HLL and assembly source files;

But you will have to maintain separate x86 assembly and PPC (or whatever) assembly files. That's the whole point here.

> it does not take
> significantly longer to write and debug a program in an HLL, than in
> assembler; and, for the most part, nobody cares about the relative
> performance.

If you create a program that is *guaranteed* to produce semantically correct code from an x86 assembly language source, I can promise you that you *will* care about the performance. It starts to look pretty bad when you've got to carry around an emulator as part of the translated code. And while a combination emulator/translated package *may* outperform code that is strictly emulated, neither scheme is going to be anywhere close to the performance of the original x86 code (or a reasonable manual port to the PPC, or whatever). This is *exactly* why I gave up on the project. No automatic translation is possible that wouldn't wind up with something like an order of magnitude loss of performance. *That* is a big deal and people *would* care about it.
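To give a feel for what "carrying around an emulator" costs, here is a minimal sketch of a semantically faithful `add eax, ebx` next to what a hand port would write. It is written in Python for brevity rather than C, the function and dictionary names are hypothetical, and the auxiliary carry flag is deliberately left out; it is an illustration of the idea, not the output of any real translator.

```python
MASK32 = 0xFFFFFFFF  # x86 general registers modeled as 32-bit values


def emu_add(regs, flags, dst, src):
    """Emulate a 32-bit x86 ADD, including the flags later code might test."""
    a, b = regs[dst], regs[src]
    full = a + b
    result = full & MASK32
    regs[dst] = result
    flags["CF"] = full > MASK32                                  # unsigned carry out
    flags["ZF"] = result == 0                                    # zero result
    flags["SF"] = bool(result >> 31)                             # sign bit
    flags["OF"] = bool((((a ^ result) & (b ^ result)) >> 31) & 1)  # signed overflow
    flags["PF"] = bin(result & 0xFF).count("1") % 2 == 0         # even parity, low byte


def hand_ported_add(a, b):
    """What a manual port writes when it knows no flag is consumed afterwards."""
    return (a + b) & MASK32
```

Unless the translator can prove that none of the flags is read before the next instruction overwrites them, it has to keep all of this bookkeeping for every arithmetic instruction, which is where the slowdown described above comes from.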
>
> The fact of the matter is, that with the exception of a few of us
> hobbyists, decisions about development languages and tools are made based
> not upon technical considerations, but rather upon business
> considerations such as delivery dates and return on investment.

And the business decision would probably be that it's cheaper and faster to port the assembly code to C and work in C from that point forward. There is a reason assembly language is *far* less popular than it used to be. Portability is one of the main reasons. And spending time cleaning up code that was converted from x86 assembly to some other assembly language is a waste of time that could have been put to more effective use porting the code to some portable HLL. Particularly when you consider the fact that the performance of the result will be so low.

Keep in mind, CPU speeds are not increasing as they used to. We can no longer count on the next generation's CPU speed to cover up an order of magnitude performance drop because of sloppy coding (or translation).

Cheers,
Randy Hyde
From: Charles A. Crayne on 6 Feb 2006 22:15

On 6 Feb 2006 11:53:26 -0800
"randyhyde(a)earthlink.net" <randyhyde(a)earthlink.net> wrote:

:But you will have to maintain separate x86 assembly and PPC (or
:whatever) assembly files. That's the whole point here.

Since this is the whole point, let's see if we can get it settled, once and for all. Given a body of x86 assembly source to convert, either it is the intent to continue to maintain that source, or else to freeze it.

If the intent is to freeze the x86 code, then -- tool or no tool -- there is only one source to maintain.

If the intent is to continue to maintain the x86 source, and the tool is NOT used, then one will have to maintain separate x86 and <whatever> assembly files. However, if the tool IS used, then the worst case is that one has to maintain two sources, but there is a possibility that only the x86 source must be maintained, because the <whatever> source can be regenerated every time the x86 source is updated.

So, it is clear that the use of the tool is NEVER worse (in this regard) than not using the tool, and can sometimes be better.

-- Chuck
From: randyhyde@earthlink.net on 8 Feb 2006 00:14
Charles A. Crayne wrote:
>
> If the intent is to continue to maintain the x86 source, and the tool is
> NOT used, then one will have to maintain separate x86 and <whatever>
> assembly files. However, if the tool IS used, then the worst case is that
> one has to maintain two sources, but there is a possibility that only
> the x86 source must be maintained, because the <whatever> source can be
> regenerated every time the x86 source is updated.
>
> So, it is clear that the use of the tool is NEVER worse (in this regard)
> than not using the tool, and can sometimes be better.

Maybe it's just me. But I would find *manually rewritten* code a *heck* of a lot easier to maintain than the kind of stuff that PortASM/86 is putting out. I suspect you don't know PPC assembly language, else you would recognize that the code it is producing is *very bad* and not at all written in the RISC/PPC paradigm. And therein lies our difference of opinion, I suspect.

If the code that PortASM/86 produced were actually readable and followed standard PPC (or whatever) programming style, I might agree with you for the *occasional* project that requires translation. But, alas, the code is *far* worse than the stuff I've seen *any* compiler produce. So even if your assembly code was written in a "good" manner that made translation easy (and I'd suggest that this is a stretch), and the conversion was almost 100% automatic, you'd still have the problem of dealing with code that has expanded by a factor of three or more.

The problem with *every* PortASM/86 sequence I've seen to date is that it does an instruction by instruction conversion. This means that it winds up converting each x86 instruction to a sequence of PPC instructions that attempt to do the same thing (typically three or more instructions). In fact, a good PPC assembly programmer will not do this.
While a RISC assembly program *is* going to be larger than a CISC assembly program, the difference isn't as great as we're seeing in the PortASM/86 code.

Now the truth is, a semi-automatic conversion *could* be better than what PortASM/86 is doing. If they did a decent data and control flow analysis of the x86 program, rather than trying to simulate the x86 code (e.g., renaming registers RAX, RBX, etc.) and they kept data in registers rather than going to memory every time the x86 references a memory location, they'd generate *much* better code. But it should be clear that the MicroAPL folks have put a *lot* of effort into this product and they've only achieved as much as they have. Again, I would argue that it's *less work* to simply hand port the few apps you need moved to a different CPU rather than go to all the hassle of writing a *decent* translator and then hand massaging the output. Particularly if you want to translate to more than one CPU.

But feel free to use the MicroAPL code to port your adventure game to the PPC (or some other processor). It would be interesting to see how much work is *really* required to pull this off (I'm assuming you wrote it with MASM, back in the DOS days). Nothing like an actual project to pull it off. I'd say you could then run the result on a Macintosh, but I'm not sure if the MicroAPL stuff emits code that is compatible with MacOS' memory layout (certainly the examples I've seen are not, but there may be an option to allow this). In any case, if you can truthfully demonstrate that it would only take you about 20% of the effort to do the conversion, then you will convince me. And as (I assume) you've frozen the x86 code, we don't have to worry about future maintenance, right?

Cheers,
Randy Hyde
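The contrast between instruction-by-instruction conversion and keeping data in registers can be sketched with a toy translator. This is a simplified model in Python under loud assumptions: the input format, the `load`/`store` pseudo-ops, and the scratch register names (`t0`, `t1`, `m0`) are all hypothetical, not real PPC mnemonics or PortASM/86 output, and the register-keeping version ignores aliasing and control flow, which is exactly the analysis a real tool would need.

```python
def is_mem(operand):
    """Memory operands are written as '[name]' in this toy input format."""
    return operand.startswith("[")


def naive_translate(x86):
    """One x86 instruction at a time: every memory operand costs a load/store."""
    out = []
    for op, dst, src in x86:
        if is_mem(src):
            out.append(("load", "t1", src))
            src = "t1"
        if is_mem(dst):
            if op != "mov":                      # read-modify-write needs the old value
                out.append(("load", "t0", dst))
            out.append((op, "t0", src))
            out.append(("store", dst, "t0"))
        else:
            out.append((op, dst, src))
    return out


def cached_translate(x86):
    """Keep each memory location in a register; write back once at the end."""
    out, home, dirty = [], {}, set()

    def reg_for(mem, need_value):
        if mem not in home:
            home[mem] = f"m{len(home)}"          # hypothetical spare register
            if need_value:
                out.append(("load", home[mem], mem))
        return home[mem]

    for op, dst, src in x86:
        if is_mem(src):
            src = reg_for(src, True)
        if is_mem(dst):
            mem = dst
            dst = reg_for(mem, op != "mov")
            dirty.add(mem)
        out.append((op, dst, src))
    for mem, reg in home.items():
        if mem in dirty:
            out.append(("store", mem, reg))
    return out


def memory_ops(code):
    """Count the loads and stores in a translated sequence."""
    return sum(1 for instr in code if instr[0] in ("load", "store"))


# Three x86 instructions that all touch the same memory location.
SAMPLE = [("add", "[count]", "1"), ("add", "[count]", "eax"), ("mov", "ebx", "[count]")]
```

On `SAMPLE`, the naive version turns each read-modify-write instruction into three pseudo-instructions, in line with the "factor of three or more" expansion mentioned above, while the register-keeping version touches memory only twice for the whole sequence.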