From: Paul Dunn on 3 Oct 2005 16:42 Guga wrote: > (Not sure if this is what you consider "halt")...I mean, if "halt" > means hang, break, crash....then the program is not halting. It is > merelly going on a loop. As far as I can tell (and I'm no computer-science graduate) the "halting problem" is something like this: 1. Retrieve input 2. perform calculation or algorithm on that input 3. Do something else This has a problem - will the calculation in (2) ever return? If it does, then the code can be disassembled easily. If not, then you don't stand much chance of predicting if it will return, unless the algorithm/calculation is trivial. If it gets even remotely complex, it becomes a very hard thing to figure out. So, by "emulating" the process, you cannot say that you will ever get to discover if the code in (3) is ever executed. This is, as far as I can tell, the "halting problem". Many people have said before that the best way to discover if (3) is code or data is to follow the code. Of course, you may end up in a loop in (2), and you have to follow that loop until it exits. But how many iterations of that loop must you follow? It could end up in the billions of loops, it may be just one or two. Where do you put the "cut off" point? What if the loop is infinite (halted)? How can you tell? If the loop requires some user-input to exit, then without the precise input given to your "analyser" then you're not going to exit. And neither is your analyser - it will loop round and round forever, and will "halt" (never return) by itself. And if it does that, how can you be certain that the bytes at (4) are code, or data? I'm having the same problem in my BASIC interpreter when a user executes some machine code. How do I determine when that code will return to BASIC? It's a famous problem - yes, you can tell with reasonable certainty that a given section of code will or will not loop. But you can't tell for *all* code. That's the way I see it, any rate.
From: Guga on 3 Oct 2005 17:51 "If the loop requires some user-input to exit, then without the precise input given to your "analyser" then you're not going to exit. And neither is your analyser - it will loop round and round forever, and will "halt" (never return) by itself. And if it does that, how can you be certain that the bytes at (4) are code, or data? " Ok, but.....If the program needs an input to get out the loop, it means that this is not a endless loop´anyway. The specied routine where the loop is happening, is of course happening a loop, but, if the exti depends of some other commdn to get out, then the ending point of that loop can be determined. For example, if you have this: mov eax push Data001 L0: cmp eax 01 | jne @ExitLoop jmp L0< @ExitLoop: xor eax eax mov eax 01 If we have a condition inside the loop or in anyother part of the code that will result in access the Offset at @ExitLoop, then it is not an endless loop that can´t be retrieved. I mean, the data below @ExitLoop will already be referenced somewhere else in the code (IN case of the example, with the cmp eax...but it can be several lines before the loop happens as well. It doesn´t matter, since the Data After @ExitLoop is being referenced in other part of the program So, on this situation, if you have a reference there, then the disassembler must properly identify that routines, regardless the existance of the loop. I was talking about no references at all after @ExitLoop. Also, if you don´t consider an internal conditional jmp or any internal pointer to that offset, and let´s say that offset is being pointed by another app (In case if our example, is an dll), then the @ExitLoop have an external reference defined in the header. Or if the code is accessed on the same way as the classes in DX, MFC etc, it should have a Table on the main app (The dll) displaying the Offsets to be targeted. (At least as far as i can remember...there is an reference to the pointers to that label anywhere in the file). But, if we have an app (Dll for example) with those unreferenced offsets that are accessed only by an external app, then the unreferecend data of that dll (per se) are nothing more then a bunch of DB _untill_ they are accessed by an external file that uses them. Then after being accessed (or during the access) they are interpreted as code or data because they are afterall being used. "it will loop round and round forever, and will "halt" (never return) by itself. And if it does that, how can you be certain that the bytes at (4) are code, or data? " Exactly, but if you can't tell when it will stop because the data is unacessed, neither the processor can tell, untill the data is accessed (externally on your example) One way _maybe_ is you keep track of the external programs that where reponsable to access that specific code externally. I don´t know..it is just a conjecture (Don´t know if this is the proper term in english...i mean, a "idea", a "thought"). Best Regards, Guga
From: Charles A. Crayne on 4 Oct 2005 01:04 On 3 Oct 2005 08:48:29 -0700 "randyhyde(a)earthlink.net" <randyhyde(a)earthlink.net> wrote: :I think that Chuck Crayne has confused a lot of people by attempting to :claim that these theorems only apply to infinite machines. Although I do believe the above claim, I am not the originator of it -- I am merely reporting the claims made by the reference which you provided: "The undecidability of the halting problem relies on the fact that algorithms are assumed to have potentially infinite storage: at any one time they can only store finitely many things, but they can always store more and they never run out of memory. However, computers that actually exist are not equivalent to a Turing machine but instead to a linear bounded automaton, as their memory and external storage of a machine is limited. In this case, the halting problem for programs running on that machine can be solved with a very simple general algorithm" -- Chuck
From: Charles A. Crayne on 4 Oct 2005 01:07 On 3 Oct 2005 09:19:27 -0700 "Guga" <mauroteste(a)hotmail.com> wrote: :I mean, if the processor itself have no problems identifying this, why :an disassembler (Assuming it to reproduce the same steps as what the :processor should do) is not possible ? On any given run of a program, the processor may execute only a few of the possible paths through the code. The disassembler, on the other hand, must analyze all possible paths. In some cases, this is easy, but in a few, extreme cases, it may be very difficult. -- Chuck
From: Paul Dunn on 4 Oct 2005 09:24
pH wrote: > On Mon, 03 Oct 2005 20:42:47 GMT, "Paul Dunn" > <paul.dunn4(a)ntlworld.com> wrote: [Snippage] >> It's a famous problem - yes, you can tell with reasonable certainty >> that a given section of code will or will not loop. But you can't >> tell for *all* code. >> >> That's the way I see it, any rate. > > Congratulations! Perhaps you could offer a hand at explaining to > the... uhh... "slower" people here? <g> You mean I actually got that right? Bloody hell, there's hope for me yet :-p |