From: Joshua Maurice on 9 May 2010 20:05 On May 9, 9:30 am, "Mike Schilling" <mscottschill...(a)hotmail.com> wrote: > Joshua Maurice wrote: > > It is technically impossible to do a true minimal recompile > > algorithmically. Let's define it as "Let a build be a set of file > > compilations. Let the minimum recompile be the minimum such set for > > which the output class file are equivalent to the class files of a > > full clean build." First, we'd have to prove such a minimum exists. > > That's relatively straightforward. With that out of the way, I think I > > could then prove that the problem is equivalent to the Halting > > problem. If you define "equivalent" generously, I'm pretty sure this > > is the case. If you define it as "same binary file content", then > > perhaps not, though still possibly yes. > > Why would you think that? The ways in which a change to A can affect B is > finite and well-defined. Let's suppose the source changed from "2" to "(1+1)". Using the strict interpretation, the output class files would probably be equivalent (ignoring debug information like original source code). Thus a rebuild is technically not required. Or, a slightly less trivial case: Suppose we have class A which has 2 public static functions A1 and A2 which have independent implementations. Class C uses A1. Suppose someone comes along and changes A2 in some meaningful way. Class C does not need to be recompiled, but any sort of file level dependency analysis which is correct would recompile Class C. If you define it as binary file contents as equivalent output files, then it's not equivalent to the Halting problem, I think. However, if you define it as "The output class files display the same visible behavior across all valid input", then I think the general case \is\ the Halting problem. You would have to prove for all allowed inputs to the built program that the behavior is the same with the "minimal" rebuild and with a full clean build, and that there is no smaller rebuild which would display the same behavior. If you instead simplify it to "If a file is touched, then all direct dependencies and ghost dependencies should be recompiled, and repeat until the leaves of the cascade are unchanged", then I think this is quite doable. However, someone did mention "But what if you change just a comment?". I can extend this too "but what if you just changed a '2' to '1+1'?", which can probably be extended further. Hopefully you see my point.
From: Joshua Maurice on 9 May 2010 20:06 On May 9, 4:42 pm, EJP <esmond.not.p...(a)not.bigpond.com> wrote: > On 10/05/2010 1:26 AM, Tom Anderson wrote: > > Doesn't/didn't 'jikes' do all this? As far as I can tell, Jikes is no longer in development or supported. Also, I have doubts as to its correctness. Also, my company would probably feel better if the compiler in use was Sun' javac and not Jikes.
From: Joshua Maurice on 9 May 2010 20:11 On May 9, 8:26 am, Tom Anderson <t...(a)urchin.earth.li> wrote: > On Sun, 9 May 2010, Joshua Maurice wrote: > > I am still in the process of implementing, so I haven't really been able > > to test, > > You're doing it wrong. Test first, test incrementally, build things in a > way you can test as you go. Before you start work for real, do a > higher-level test, a 'spike solution', to make sure that all your > assumptions (like this one) are valid. I am. It's just that implementing this class file parser, ghost dependency, build system, etc., from scratch will take some time. I almost have it working on a toy example, at which point I can throw what little tests I have against it.
From: Mike Schilling on 9 May 2010 20:21 EJP wrote: > Doesn't/didn't 'jikes' do all this? It did something of the sort (I don't know how thoroughly), but it's very obsolete, never having supported the language features that were new in JDK 1.5. But having thought about this a bit, I think that the onlt feasible way to support anything like minimal recompilation is by having the compiler calculate and persist dependency information.
From: Mike Schilling on 9 May 2010 20:41
Joshua Maurice wrote: > On May 9, 9:30 am, "Mike Schilling" <mscottschill...(a)hotmail.com> > wrote: >> Joshua Maurice wrote: >>> It is technically impossible to do a true minimal recompile >>> algorithmically. Let's define it as "Let a build be a set of file >>> compilations. Let the minimum recompile be the minimum such set for >>> which the output class file are equivalent to the class files of a >>> full clean build." First, we'd have to prove such a minimum exists. >>> That's relatively straightforward. With that out of the way, I >>> think I could then prove that the problem is equivalent to the >>> Halting problem. If you define "equivalent" generously, I'm pretty >>> sure this is the case. If you define it as "same binary file >>> content", then perhaps not, though still possibly yes. >> >> Why would you think that? The ways in which a change to A can affect >> B is finite and well-defined. > > Let's suppose the source changed from "2" to "(1+1)". Using the strict > interpretation, the output class files would probably be equivalent > (ignoring debug information like original source code). Thus a rebuild > is technically not required. If the change is from public static final int I = 2; to public static final int I = 1 + 1; I'd accept a system that recompiles all users of I as still "minimal". (Not that it's a difficult optimization to make, since the rules for what's a compile-time constant are straightforward.) If this change isn't to a compile-time constant, it would have no effect on anything defined in a different source file. > > Or, a slightly less trivial case: Suppose we have class A which has 2 > public static functions A1 and A2 which have independent > implementations. Class C uses A1. Suppose someone comes along and > changes A2 in some meaningful way. That is, chaged its signature. Merely chaning its implementation would not rquire C to be recompiled. .. > Class C does not need to be > recompiled, but any sort of file level dependency analysis which is > correct would recompile Class C. Right, strictly minimal recompilation wouild need to know not only that C uses A but what parts of A it uses. Quite strightforward, though perhaps not a good idea. (It's possbile that gathering too much dependency information, and evaluationing it at too granular a level, slows the whole process down, compared to allowing some recompilations that are strictly speaking unnecessary.) > > If you define it as binary file contents as equivalent output files, > then it's not equivalent to the Halting problem, I think. However, if > you define it as "The output class files display the same visible > behavior across all valid input", then I think the general case \is\ > the Halting problem. If I understand you, you're distinguishing between: 1. A recompilation would result in the same class file that already exists, and 2. A recompilation would result in a change to the class file, but the result of running the two will always be the same I agree that a system that tries to determine 2 is infeasible. And my point about adding the comment isn't that the file itself would be recompiled. That's acceptable, and might even be necessary to get the line numbers seen by a debugger to be correct. My point is that all files which refer to it are recompiled, even though nothing they use (the class definitions, the field and method definitions, and the values of the compile-time constants) has changed. The same is true when e.g. method implementations change or an init block is added; I was using "add a comment" as the limiting case of a change no other file would care about. |