From: randyhyde@earthlink.net on 24 Jan 2006 17:54

Alex McDonald wrote:
> >
> > No, it looks like a *compiler* rather than an *interpreter* (emulator).
> > And that means it will run about 2-10x faster than an emulator.
>
> OK, I see your point. However, there are some tasks that even a
> compiler can't handle here. This;
>
> ADD EAX, [EBX+10]
>
> (in whatever x86 notation takes your fancy) would be better emulated on
> a VM on the target processor than make the attempt to translate each
> and every statement into a set of equivalent opcodes on the target. For
> instance, using memory instead of registers, or emulating the stack
> runs the problem of state; what state have I left at the last
> instruction I generated code for? The above code leaves states of
> overflow, zero, carry to name but a few. Now stick a label on it, and
> jump to it from somewhere else. Because we need to retain states at run
> time (we can't know them all at compile time), we're emulating, not
> compiling.

This is where the optimizer comes in. The optimizer does a data flow
analysis of the code to determine which of these flags will actually get
used (or which flags it cannot determine do *not* get used) along some
future control path, and then emits code to maintain *only* the necessary
flags. Yep, it can be expensive on certain processors that have no concept
of a carry or overflow flag. But you're going to have the same problem
with emulation/interpretation.

>
> It's worse with statements like
>
> CALL $+5
> POP EAX
>
> That leaves the address of the POP in EAX. But there are architectures
> where the IP must be on a 4byte boundary and where branch delay slots
> are required to change the IP; like the MIPS. I'd like to see
> line-by-line compilation of the equivalent into MIPS; it's a serious
> challenge.

Sure it's a serious challenge. But you're talking about optimization
issues. That is, let's translate the *semantics* of the original code to
get the best possible MIPS code. Simply translating the x86 code to
something that will execute, albeit slowly, on the MIPS isn't the problem.
The problem is the *slow* code you'll get if you do a trivial translation
of the above to something that will work, though not efficiently, on the
MIPS. Again, the trivial stuff even Betov could do. But the code would run
so slowly on something like the MIPS (a popular Pocket PC processor),
particularly at the sub-GHz speeds of the MIPS processors used in Pocket
PCs, that no one would use the translator. The real work is developing a
decent optimizer. Apple (68K->PPC and Rosetta), Intel (IA64), and
Transmeta have experience with this. Rene doesn't even understand the
problems.

> There is a possible one opcode alternative, as the MIPS has
> a single instruction (jal) that leaves the return address /plus eight/
> (after the branch delay slot) in a register. But that needs lookahead
> to identify that the target of the CALL is a POP; now it's a compiler
> again, rather than a line by line assembler. And what if the POP was
> reached from a CALL [EBX] miles away in the code? This is seriously
> hard stuff, even for a very smart compiler, and territory best handled
> by an emulator.

Well, it's most certainly handled *easiest* (from the programmer's point
of view) by an interpreter. But *good* optimization strategies *do* exist.
Though the MIPS is much cruder than the PPC, the PPC has similar issues:
ugly code because opcodes are limited to 32 bits.
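Just to give a feel for what that flag bookkeeping costs, here is roughly
what a lone "ADD EAX, [EBX+10]" might turn into on the MIPS. The register
mapping ($s0 for EAX, $s1 for EBX) is invented purely for illustration,
and I'm assuming the memory operand happens to be word aligned:

    # when the data flow analysis shows no later code reads the flags:
    lw      $t0, 10($s1)       # load the dword at EBX+10
    addu    $s0, $s0, $t0      # EAX := EAX + [EBX+10], no flag upkeep

    # the same instruction when a later conditional jump needs carry:
    lw      $t0, 10($s1)
    addu    $t1, $s0, $t0      # unsigned sum
    sltu    $t2, $t1, $s0      # $t2 = 1 if the sum wrapped (x86 CF)
    move    $s0, $t1           # commit the result to EAX
    # a translated "jc label" then becomes "bnez $t2, label"

(And in the general case the load itself expands too, since lw faults on
unaligned addresses and you'd need lwl/lwr.)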
But the trick is to simply use more instructions and figure out how to
optimize them so that "more" doesn't turn into "a whole lot more". Keeping
in mind that many modern RISC processors actually run slower, in general,
than an x86, we don't really want to lose a whole lot of performance when
translating from the x86 to something else.

> > >
> > > and to be quite honest, it would be easier
> > > to write one than your mythical "encoder".
> >
> > Certainly writing an interpreter is far easier than writing a compiler.
> > But the interpreter is *much* slower.
>
> And Betov's x86 source/MIPS backend assembler (as an example, and if
> possible) would be as slow, if not slower still.

You're assuming it even works. Given the state of his disassembler, which
he claims is "finished", I'd give it about a 1% chance of working
reasonably well. (At least the problem of cross compiling from one machine
code to another is a possible and tractable, though conceptually
difficult, problem :-) ).

> >
> > In simple terms, he is describing an over-glorified macro processor
> > that translates each x86 assembly instruction into some comparable
> > sequence on a CPU found on a pocket PC or similar system. Of course,
> > the concept of optimization has never occurred to him, so the code he
> > would generate would be absolutely terrible.
>
> I would contend; not possible. There are too many variables to handle
> with states and side effects. That's why compiled languages deliberately
> remove them.

Difficult and inefficient, but certainly possible. I mean, it's quite easy
to come up with a MIPS instruction sequence that faithfully reproduces the
semantics of each x86 machine instruction. The only problem is that those
sequences are quite long and you wouldn't want to emit them for each and
every x86 instruction. Then again, maintaining such state probably
wouldn't even occur to Rene until about three years into the project, at
which point he'd just announce that the project is "complete" and does a
"two click assembly to MIPS (or whatever) code" and "Gee, isn't RosAsm
great because it does this?" The fact that it wouldn't really be working
except for some trivial demo cases wouldn't mean much to him.

>
> >
> > Rene reminds me of this Canadian guy, Roedy Green, from the 1990s
> > (though Roedy was a heck of a lot nicer). He used to be a *big*
> > supporter of assembly language. Everything was to be written in
> > assembly. One of the biggest supporters of assembly at the time. Then,
> > one day, he switched to Forth because of assembly's "limitations".
> > Someday, I expect the same sort of thing from Rene (probably when the
> > ReactOS team calls it quits).
>
> Please, let it not be Forth. I am an amateur Forth programmer, a keen
> advocate and maintainer of a Forth compiler (part public domain, part
> GPL). I couldn't bear the thought of Rene lecturing the "sub sh*ts" on
> the "one true way" over on comp.lang.forth. Or of wannabee following
> his prophet and dribbling incoherent replies to every post there.

:-)

Cheers,
Randy Hyde
From: ¬a\/b on 25 Jan 2006 04:46

On Tue, 24 Jan 2006 08:05:47 GMT, "\\\\\\o///annabee"
<fack(a)szmyggenpv.com> wrote:

>On Tue, 24 Jan 2006 01:58:16 -0500, Frank Kotler
><fbkotler(a)comcast.net> wrote:
>
>> Funny no one has done it. Perhaps they lack Betov's "vision"...
>
>Anyway. Why do you think it could not be done?

it is very easy,
it is possible if in each cpu there is a common "minimal cpu".

example: there are

1) 7 addresses for 7 8-bit registers
2) 7 addresses for 7 16-bit registers
3) 7 addresses for 7 32-bit registers
4) the principal logical operators for the 8-, 16-, and 32-bit
   registers: and, not, or, etc.
5) all jump instructions use the data in the 8-, 16-, and 32-bit
   registers
6) a stack, and register 7 that points to it

*in each cpu*: so it is a "hardware port" :) So it could come in the
future, not now, and everyone who builds CPUs would have to agree on it.

for portable use we have to use the subset of each CPU's language
that uses only 1-6: "the minimal cpu". The whole problem is to define
"the minimal cpu" so that it is easy to program.
From: Herbert Kleebauer on 25 Jan 2006 05:29

"¬a\/b" wrote:
> On Tue, 24 Jan 2006 08:05:47 GMT, "\\\\\\o///annabee"
> <fack(a)szmyggenpv.com> wrote:
>
> >On Tue, 24 Jan 2006 01:58:16 -0500, Frank Kotler
> ><fbkotler(a)comcast.net> wrote:
> >
> >> Funny no one has done it. Perhaps they lack Betov's "vision"...
> >
> >Anyway. Why do you think it could not be done?
>
> it is very easy,
> it is possible if in each cpu there is a common "minimal cpu".
>
> *in each cpu*: so it is a "hardware port" :) So it could come in the
> future, not now, and everyone who builds CPUs would have to agree on it.

We already have this CPU:

http://java.sun.com/docs/books/vmspec/2nd-edition/html/Overview.doc.html#7143

There are assemblers for this CPU and an emulation which runs on nearly
any existing processor. There are also (just-in-time) compilers which
translate the instructions to native code for other CPUs. And there is
also a big library which allows you to write OS-independent code.
From: randyhyde@earthlink.net on 25 Jan 2006 17:37

¬a\/b wrote:
> On Tue, 24 Jan 2006 08:05:47 GMT, "\\\\\\o///annabee"
> <fack(a)szmyggenpv.com> wrote:
>
> >On Tue, 24 Jan 2006 01:58:16 -0500, Frank Kotler
> ><fbkotler(a)comcast.net> wrote:
> >
> >> Funny no one has done it. Perhaps they lack Betov's "vision"...
> >
> >Anyway. Why do you think it could not be done?
>
> it is very easy,

It is very easy if you define a minimal CPU and only do the conversion
from that CPU to all the others. However, Rene is talking about
converting x86 assembly language code to machine code for other
processors. This is a *very* difficult problem. Consider the following
code sequence, for example:

    mov eax, someCodePtr
    add eax, 4
    jmp eax

Without question, you can translate these instructions, one for one, to
many other processors. But will the result behave the same as the
original x86 code? Doubtful. This code sequence skips four bytes of
opcodes at the address specified by someCodePtr. Alas, when you translate
the x86 code at the address specified by someCodePtr to the target
processor, it's unlikely that the code at that address is a four-byte
instruction that can be easily skipped by the code above.

This is but *one* of the serious problems associated with doing the
compilation. As Alex says, emulation/interpretation is *much* easier.
Compilation is *very* difficult; AFAIK it's still an open research
problem (working from source code, at least; working from binary is an
undecidable problem for the same reason disassembly is undecidable).

> it is possible if in each cpu there is a common "minimal cpu".
>
> example: there are
>
> 1) 7 addresses for 7 8-bit registers
> 2) 7 addresses for 7 16-bit registers
> 3) 7 addresses for 7 32-bit registers
> 4) the principal logical operators for the 8-, 16-, and 32-bit
>    registers: and, not, or, etc.
> 5) all jump instructions use the data in the 8-, 16-, and 32-bit
>    registers
> 6) a stack, and register 7 that points to it
>
> *in each cpu*: so it is a "hardware port" :) So it could come in the
> future, not now, and everyone who builds CPUs would have to agree on it.
>
> for portable use we have to use the subset of each CPU's language
> that uses only 1-6: "the minimal cpu". The whole problem is to define
> "the minimal cpu" so that it is easy to program.

Yes, you can build a trivial processor whose semantics can be expressed
in just about every other assembly language. That's not what this thread
is about.

Cheers,
Randy Hyde
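P.S. Here is roughly what a naive one-for-one rendering of that sequence
looks like on the MIPS, and why the "+4" stops meaning anything. The
register mapping ($s0 standing in for EAX) and the relocated label name
are invented purely for illustration:

    la      $s0, someCodePtr_t    # someCodePtr's address in the
                                  # *translated* image
    addiu   $s0, $s0, 4           # "+4" still counts x86 opcode bytes
    jr      $s0                   # jump into the translated code
    nop                           # branch delay slot

Adding 4 here skips exactly one MIPS instruction, but the four x86 bytes
the original code meant to skip may have become two, three, or six MIPS
instructions after translation, so control almost certainly lands in the
wrong place.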
From: Charles A. Crayne on 26 Jan 2006 01:20
On 25 Jan 2006 14:37:39 -0800
"randyhyde(a)earthlink.net" <randyhyde(a)earthlink.net> wrote:

:However, Rene is talking about
:converting x86 assembly language code to machine code for other
:processors.

Perhaps, or perhaps he is talking about converting x86 assembly language
code to source code for other processors. In either case, his approach is
probably the one which requires the least human interaction to accomplish
the above goal.

However, since the difficulty of the task depends upon the similarity, or
lack thereof, between the source and target architectures, and since there
has not yet been any agreement on what the target architecture might be,
it is easy, albeit unproductive, to postulate theoretical difficulties
which may not be a significant consideration in real world implementations.

:mov eax, someCodePtr
:add eax, 4
:jmp eax

Leaving aside the fact that this pseudo-example is bad coding practice,
and may never occur in the source which Betov proposes to migrate, it does
illustrate a more general issue which needs to be considered. For obvious
reasons, labels in the x86 source are highly unlikely to resolve to the
same addresses as the corresponding labels do in the target source.
Therefore, if one is going to even approximate line by line translation,
ALL target addresses must be symbols, so that they can be resolved by the
target assembler.

If, for example, the address size of the target architecture is not four
bytes, then a jump table invocation such as 'jmp [sometable+4*eax]'
requires that both the code statement and the elements of the table be
altered. Some of these special cases can be handled automatically by the
tool, and others will have to be cleaned up by a human. However, I have
yet to see any arguments which reasonably suggest that the proposed tool
would not be a useful one.

-- 
Chuck
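To make the jump-table case Chuck describes concrete, here is one way
'jmp [sometable+4*eax]' might come out for a 32-bit MIPS-style target.
The register mapping ($s0 standing in for EAX) and the assumption that
sometable has already been rebuilt to hold translated addresses are
purely for illustration, not anything actually proposed in the thread:

    sll     $t0, $s0, 2        # index * 4: entries are still 4 bytes
    la      $t1, sometable
    addu    $t1, $t1, $t0
    lw      $t1, 0($t1)        # fetch the translated target address
    jr      $t1
    nop                        # branch delay slot

On a target with 8-byte addresses the scale becomes "sll $t0, $s0, 3",
the load widens, and every entry in sometable must be rewritten, which is
exactly the point that the data, not just the code, has to change.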