Prev: Code Guidelines
Next: JavaScript code mangler / scrambler / ... khm, more than obfuscator... :)
From: Scott Sauyet on 28 Dec 2009 10:48 On Dec 27, 2:00 pm, Andrew <anz...(a)gmail.com> wrote: > On Dec 23, 7:45 pm, Scott Sauyet <scott.sau...(a)gmail.com> wrote: >> [ ... ] > I am beginning to think that the example was not such a good idea... > yes it's flawed. It is an EXAMPLE, not proof of concept. You can do > better if you wish. :) But the point is that this was an attempt at a simple example. If we can't get a simple case to work, how likely is it that the more complicated ones will be workable? Let's take it simpler still. How about if our mangler had a rule to turn this: function f(x) { return x + 1; } into this: function f_prime(x) { var y = x; return y + 1; } Except noting possible boundary issues, it's pretty clear that we've maintained the correspondence between f and f_prime. But now what if we tried it on this: function g(x) { eval("var y = y ? y + x : x") return x + 1 } to get this: function g_prime(x) { var y = x; eval("var y = y ? y + x : x") return y + 1 } Then g(20) = 21, but g_prime(20) = 41. And of course we could rig it so that the "y" is not easily visible inside the eval statement. The point is that this really is difficult. > - of course it is possible to do it, at least to some extent (still > not buying the Rice's theorem having to do anything with this - as > Lasse pointed out, compilers do that all the time) I'm certainly not buying Thomas Lahn's assertion that Rice's theorm implies that | [Y]ou cannot devise a general function that computes another function | that computes the same result for the same input (a trivial property) | as a third function. I think the Y-Combinator [1] demonstrates that this is false. But I also think Lasse Nielsen is overstating when he says that this is so similar to what a compiler does. A compiler is doing something akin to transcription. I think your mangler would have to do something closer to translation. While I have read Hofstadter [2] and understand something about how difficult it might be to draw the line between the two, I think it makes a difference in how easy it would be to create an algorithm for one versus the other. >> In other words, while I think this is an interesting abstract >> discussion, I don't think there are any real chances of building a >> tool that does what you want. > > Well, that might be true. But not because of all the things others > have written in these posts, but because JS is such a difficult > language for this task. I am no way an expert in JS, but from what I > have seen (closures come to mind) there are some concepts that make it > really hard to do this kind of thing. Not impossible, but difficult. Actually, I think closures would be one of the best tools to use to accomplish some obfuscation. function f_prime2 = (function(test) { var y = test ? 0 : 1; return function(x) { return x + y; } })(false); This is already fairly obfuscated from the original function, and I can clearly imagine additional layers. But I don't see that as likely to hide much from even a mildly persistent snoop. I think the worst problem would come with "eval" and its cousins. > Not sure it would be a bad business idea though - seeing how > obfuscators sell for $60-$250 when they are not even remotely > successful. If someone is capable of doing it, of course. I would have > bought a copy right now, but of course, it would have to be so good > that a capable programmer would have difficulties deducting what the > code does. I think the best assessment in this thread was in Lasse Nielsen's first response where he discussed the factors to be considered in approaching this type of tool and concluding: | I personally think any informed evaluation of that kind would | end in rejecting the idea. It's not that something couldn't be built. Clearly someone could build a tool that would accomplish much of what you suggest. But as the inputs got more complicated, it's more and more likely that the tool would actually break working code, and of course it would be by design much more difficult to debug the issues that arose. -- Scott [1] http://en.wikipedia.org/wiki/Fixed_point_combinator [2] http://en.wikipedia.org/wiki/Metamagical_Themas
From: Thomas 'PointedEars' Lahn on 28 Dec 2009 18:49 [Cancel & Supersedes] Scott Sauyet wrote: > Andrew wrote: >> - of course it is possible to do it, at least to some extent (still >> not buying the Rice's theorem having to do anything with this - as >> Lasse pointed out, compilers do that all the time) > > I'm certainly not buying Thomas Lahn's assertion that Rice's theorm > implies that > > | [Y]ou cannot devise a general function that computes another > | function that computes the same result for the same input > | (a trivial property) as a third function. > > I think the Y-Combinator [1] demonstrates that this is false. I do not think the Y-Combinator applies here. Especially, we can read where "Y-combinator" redirects to: ,-<http://en.wikipedia.org/wiki/Fixed_point_combinator> | | A fixed point combinator (or fixed-point operator) is a higher-order | function that computes a fixed point of other functions. A fixed point | of a function f is a value x such that f(x) = x. For example, 0 and 1 | are fixed points of the function f(x) = x2, because 02 = 0 and 12 = 1. | Whereas a fixed-point of a first-order function (a function on "simple" | values such as integers) is a first-order value, a fixed point of a | higher-order function f is another function p such that f(p) = p. | A fixed point combinator, then, is a function g which produces such a | fixed point p for any function f: | | p = g(f), f(p) = p | | or, alternately: | | f(g(f)) = g(f). | | Because fixed-point combinators are higher-order functions, their history | is intimately related to the development of lambda calculus. One well | known fixed point combinator in the untyped lambda calculus is Haskell | Curry's Y = ?f�(?x�f (x x)) (?x�f (x x)). The name of this combinator is ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | incorrectly used sometimes to refer to any fixed point combinator. The ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | untyped lambda calculus however, contains an infinity of fixed point | combinators. Fixed point combinators do not necessarily exist in more ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | restrictive models of computation. For instance, they do not exist in ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | simply typed lambda calculus. As for Rice's theorem: <http://en.wikipedia.org/wiki/Rice%27s_theorem#Proof_by_reduction_from_the_halting_problem> | Proof by reduction from the halting problem | --------------------------------------------------------------------- | | Proof sketch | | Suppose, for concreteness, that we have an algorithm for examining | a program p and determining infallibly whether p is an implementation | of the squaring function, which takes an integer d and returns d^2. | The proof works just as well if we have an algorithm for deciding any | other nontrivial property of programs, and will be given in general | below. | | The claim is that we can convert our algorithm for identifying | squaring programs into one which identifies functions that halt. | We will describe an algorithm which takes inputs a and i and | determines whether program a halts when given input i. | | The algorithm is simple: we construct a new program t which (1) | temporarily ignores its input while it tries to execute program a on | input i, and then, if that halts, (2) returns the square of its input. | Clearly, t is a function for computing squares if and only if step (1) | halts. Since we've assumed that we can infallibly identify program for | computing squares, we can determine whether t is such a program, and | therefore whether program a halts on input i. Note that we needn't | ctually execute t; we need only decide whether it is a squaring program, | and, by hypothesis, we know how to do this. | | t(n) { | a(i) | return n�n | } | | This method doesn't depend specifically on being able to recognize | functions that compute squares; as long as some program can do what | we're trying to recognize, we can add a call to a to obtain our t. | We could have had a method for recognizing programs for computing | square roots, or programs for computing the monthly payroll, or | programs that halt when given the input "Abraxas", or programs that | commit array bounds errors; in each case, we would be able to solve | the halting problem similarly. | | [Following the proof that this cannot be done, because if it were | doable that would mean the halting problem was decidable, which | it is not.] Now, in order to write a program `P' that obfuscates code in a way that statements are replaced by more complex code (e.g., function calls) that computes the same as the original code for any given input, that program `P' must be able to recognize, for example, a set of statements that is an implementation of the squaring function as such a program `p'. Since the proof above shows that this is not possible, there cannot be such a program `P' (even if there can be such programs `p' or `t', which was not disputed before). A corollary of this is what I originally stated: If you cannot programmatically determine if a set of statements is an implementation of a certain function `p', you can _not_ write a program `P' that computes or selects another function `t' that computes the same values as this set of statements for the same inputs. I find it rather curious that nobody appears to have recognized that even though the proof for Rice's theorem in the Wikipedia is based on the very same trivial property that the OP used in their question (the squaring function), and I pointed that out in my answer. PointedEars -- realism: HTML 4.01 Strict evangelism: XHTML 1.0 Strict madness: XHTML 1.1 as application/xhtml+xml -- Bjoern Hoehrmann
From: Scott Sauyet on 29 Dec 2009 10:50 On Dec 28, 6:49 pm, Thomas 'PointedEars' Lahn <PointedE...(a)web.de> wrote: > Scott Sauyet wrote: >> Andrew wrote: >>> - of course it is possible to do it, at least to some extent (still >>> not buying the Rice's theorem having to do anything with this - as >>> Lasse pointed out, compilers do that all the time) > >> I'm certainly not buying Thomas Lahn's assertion that Rice's theorm >> implies that > >> | [Y]ou cannot devise a general function that computes another >> | function that computes the same result for the same input >> | (a trivial property) as a third function. > >> I think the Y-Combinator [1] demonstrates that this is false. > > I do not think the Y-Combinator applies here. Especially, we can read > where "Y-combinator" redirects to: > > ,-<http://en.wikipedia.org/wiki/Fixed_point_combinator> Which, incidentally, is the exact link I supplied. > | [ ... ] The name of this combinator is > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > | incorrectly used sometimes to refer to any fixed point combinator. > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ However, I did not misuse the term. I used the Y-combinator as the best-known example of a fixed-point combinator. Since it was a demonstration of existence, one example is plenty. > | [ ... ] Fixed point combinators do not necessarily exist in more > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > | restrictive models of computation. For instance, they do not exist in > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > | simply typed lambda calculus. But it does exist in Javascript. A quick web search turns up numerous examples. This is my preferred form: var Y = function(f) { return (function(g) { return g(g); })(function(h) { return function() { return f(h(h)).apply(null, arguments); }; }); }; Can you find a function f and a value x such that the following do not yield the same result: f(x) Y(function() {return f;})(x) ? > As for Rice's theorem: > <http://en.wikipedia.org/wiki/Rice%27s_theorem#Proof_by_reduction_from...> > | Proof by reduction from the halting problem Yes, we all know how to read Wikipedia... > Now, in order to write a program `P' that obfuscates code in a way that > statements are replaced by more complex code (e.g., function calls) that > computes the same as the original code for any given input, that program > `P' must be able to recognize, for example, a set of statements that is an > implementation of the squaring function as such a program `p'. No, it doesn't have to do that. I could replace every function in the original program with another that computes the same result, but which is syntactically more complicated. This does not go very far towards the OP's goal, though, as I'm sure it would be straightforward to write code to recognize and reverse this transformation. > Since the > proof above shows that this is not possible, there cannot be such a program > `P' (even if there can be such programs `p' or `t', which was not disputed > before). Sorry, I don't understand the parenthetical remark. But your conclusion is clearly based upon a misreading of Rice's Theorem. > A corollary of this is what I originally stated: If you cannot > programmatically determine if a set of statements is an implementation of a > certain function `p', you can _not_ write a program `P' that computes or > selects another function `t' that computes the same values as this set of > statements for the same inputs. But you don't necessarily have to come at this from that direction. You don't have to demonstrate an understanding of any set of statements. You can do these transformations at the syntactic level, using, for example, a fixed-point combinator like the Y-combinator on the functions presented. I'm not trying to support the OP. I think the idea is not well- conceived, and that even if it were to work reasonably well, it would cause more problems than it could possibly solve, but Rice's theorem is not relevant. You didn't respond to Lasse Nielsen's last post, but, if you're right that Rice's theorem says that you cannot programatically convert a function to another of identical behavior, how do explain the existence of compilers? > I find it rather curious that nobody appears to have recognized that even > though the proof for Rice's theorem in the Wikipedia is based on the very > same trivial property that the OP used in their question (the squaring > function), and I pointed that out in my answer. That means only that we can't algorithmically recognize any and all sets of statements that represent squaring functions. It doesn't imply that we can't transform one squaring function into another. Do you think the following functions are equally transparent?: var f1 = function(s) {return x * x;} var f2 = (function(f) { return (function(g) { return g(g); })(function(h) { return function() { return f(h(h)).apply(null, arguments); }; }); })(function() { return function(x) {return x * x;} }); The second is simply the result of the Y-combinator applied to the first. We're pretty far afield from the OP's goals, though. The thread has given a pretty strong consensus that those goals are at best very difficult and, even if attainable, are not worth pursuing. -- Scott
From: Scott Sauyet on 29 Dec 2009 12:08 On Dec 29, 10:50 am, I <scott.sau...(a)gmail.com> wrote: > Do you think the following functions are equally transparent?: > > var f1 = function(s) {return x * x;} ^^^ Sorry, typo: var f1 = function(x) {return x * x;} -- Scott
From: Thomas 'PointedEars' Lahn on 29 Dec 2009 12:46
Scott Sauyet wrote: > Thomas 'PointedEars' Lahn wrote: >> As for Rice's theorem: >> <http://en.wikipedia.org/wiki/Rice%27s_theorem#Proof_by_reduction_from...> >> | Proof by reduction from the halting problem > > Yes, we all know how to read Wikipedia... Apparently not. > You didn't respond to Lasse Nielsen's last post, but, if you're right > that Rice's theorem says that you cannot programatically convert a > function to another of identical behavior, how do explain the > existence of compilers? A compiler transforms source code into machine-dependent code; IOW, it translates one higher-level language into another, lower-level language. That is a concept fundamentally different from that which Rice's theorem addresses. >> I find it rather curious that nobody appears to have recognized that >> even though the proof for Rice's theorem in the Wikipedia is based on >> the very same trivial property that the OP used in their question (the >> squaring function), and I pointed that out in my answer. > > That means only that we can't algorithmically recognize any and all > sets of statements that represent squaring functions. It means that no *program* can do that. > It doesn't imply that we can't transform one squaring function into > another. It does imply that a *program* cannot do that. > Do you think the following functions are equally transparent?: > > var f1 = function(s) {return x * x;} > > var f2 = (function(f) { > return (function(g) { > return g(g); > })(function(h) { > return function() { > return f(h(h)).apply(null, arguments); > }; > }); > })(function() { > return function(x) {return x * x;} > }); > > The second is simply the result of the Y-combinator applied to the > first. What you are still missing is that the "Y-combinator" was applied by *you* here, an (arguably) intelligent being applying informed reasoning; _not_ a program implementing an algorithm. For a simple implementation of the example used in the proof for Rice's theorem, a program implementing an algorithm would need to recognize whether f() in function a(j) { while (--j) b(j); } function f(x) { a(i); return x * x; } was an implementation of the squaring function or not in order to compute or select a suitable replacement for obfuscation. It cannot do that because it depends on the pre-call value of `i' (and the behavior of b(j)), whether a(i) halts, and if it halts, if execution even reaches the `return' statement in f(), *regardless* that it is there. And the program cannot recognize whether a(i), i.e. the while-loop in it, halts, let alone whether, if it halts, the `return' statement in f() is ever reached. > We're pretty far afield from the OP's goals, though. Obfuscation was part of the goal, and that has been attained to a certain degree. However, it has necessarily _not_ been attained *through a program*, of course, and so missing the other part, of course. > The thread has given a pretty strong consensus that those goals are at > best very difficult and, even if attainable, are not worth pursuing. ACK PointedEars -- var bugRiddenCrashPronePieceOfJunk = ( navigator.userAgent.indexOf('MSIE 5') != -1 && navigator.userAgent.indexOf('Mac') != -1 ) // Plone, register_function.js:16 |