From: Joshua Maurice on 9 May 2010 07:33 On May 9, 1:31 am, "Mike Schilling" <mscottschill...(a)hotmail.com> wrote: > Joshua Maurice wrote: > > On May 7, 9:08 pm, "Mike Schilling" <mscottschill...(a)hotmail.com> > > wrote: > >> Joshua Maurice wrote: > >>> If anyone else cares, I managed to inadvertently stumble across a > >>> solution. On impulse, I asked a co-worker at lunch. It seems that > >>> class files do not contain sufficient information with default javac > >>> options. However, when compiled with -g, it contains a listing of > >>> all types used in the compile. When combined with Ghost > >>> Dependencies, I think this can result in a correct incremental > >>> build at the file level which will not cascade endlessly > >>> downstream. I'm working on the finishing touches to my prototype > >>> now. > > >> You realize that you're now going to recompile a class when it > >> refers to another class to which a comment was added. > > > Yes. I'm pretty sure that it would be better than doing a full clean > > build or a cascading jar-dir-unit incremental build. > > No doubt, but the result isn't the minimal amount of recompilation we were > discussing earlier. I'm not sure what this minimal recompile which we were discussing is. It is technically impossible to do a true minimal recompile algorithmically. Let's define it as "Let a build be a set of file compilations. Let the minimum recompile be the minimum such set for which the output class file are equivalent to the class files of a full clean build." First, we'd have to prove such a minimum exists. That's relatively straightforward. With that out of the way, I think I could then prove that the problem is equivalent to the Halting problem. If you define "equivalent" generously, I'm pretty sure this is the case. If you define it as "same binary file content", then perhaps not, though still possibly yes. Either way, this is not my goal. If someone modifies comments to a Java source file, I'm not going to try and catch that. What I will do is recompile all files which depend directly on that changed-source Java file, any files affected by Ghost Dependencies, and continue cascading this change down until all of the "leaves" of the cascading recompile are binary equivalent class files to the class files before the recompile. Perhaps too conservative, but I think that's easy enough to show that it's correct. Perhaps I'll make it a "tighter fit" later, though honestly I'm still fumbling around in the dark at the moment, still learning.
From: Tom Anderson on 9 May 2010 10:49 On Sun, 9 May 2010, Joshua Maurice wrote: > On May 9, 1:31�am, "Mike Schilling" <mscottschill...(a)hotmail.com> > wrote: > >> No doubt, but the result isn't the minimal amount of recompilation we were >> discussing earlier. > > I'm not sure what this minimal recompile which we were discussing is. It > is technically impossible to do a true minimal recompile > algorithmically. Let's define it as "Let a build be a set of file > compilations. Let the minimum recompile be the minimum such set for > which the output class file are equivalent to the class files of a full > clean build." First, we'd have to prove such a minimum exists. That's > relatively straightforward. Extremely so. > With that out of the way, I think I could then prove that the problem is > equivalent to the Halting problem. Certainly not. > If you define "equivalent" generously, I'm pretty sure this is the case. > If you define it as "same binary file content", then perhaps not, though > still possibly yes. I'm not sure what you mean by 'generously'. Is there a kind of equivalence less strict than binary equivalence which would actually work? Anyway, here's a straightforward but slow algorithm to find the minimal recompile: 1. Copy all your source code somewhere and do a clean build on it; call the output the reference output 2. Count your source files, and call the total number N 3. Number all your source files, starting at 0 and going up to N 4. Let M be the set of all source files 5. For each integer i between 0 and 2**N - 1: 6a. Let S be the set of source files for whose number j, the jth bit in i is set 6b. Do a recompilation of just the files in S 6c. Compare the output to the reference output, and if it is identical, and the size of S is smaller than the size of M, let M be S 6d. Restore the class files to how they were before recompilation S now contains the set of source files needed for a minimal recompile. It doesn't follow from this algorithm that it's the only minimal set, although i suspect that in practice it will be. I wouldn't suggest you do this in practice, but it shows that the minimal set exists, can be found algorithmically, and can be found in O(2**N) time, with a rather large constant. Your task is thus merely to improve the speed! > Either way, this is not my goal. If someone modifies comments to a Java > source file, I'm not going to try and catch that. What I will do is > recompile all files which depend directly on that changed-source Java > file, any files affected by Ghost Dependencies, and continue cascading > this change down until all of the "leaves" of the cascading recompile > are binary equivalent class files to the class files before the > recompile. Perhaps too conservative, but I think that's easy enough to > show that it's correct. Agreed. If you were a bit more aggressive about the consequentiality of changes, you could prune off a lot of the leaves of the tree, but it wouldn't be an asymptotic speedup. That said, i don't think it would be that hard to work out consequentiality. The output of javap is almost exactly what you need - i think the only thing it's missing is those bloody constant values. Adding them doesn't look hard. This is the relevant bit of javap's source: https://openjdk.dev.java.net/source/browse/openjdk/jdk/trunk/langtools/src/share/classes/sun/tools/javap/JavapPrinter.java?rev=257&view=markup You need to change the line that says: out.println(fields[f].getType()+" " +fields[f].getName()+";"); To say: int cpx = fields[f].getConstantValueIndex(); if (cpx == 0) out.println(fields[f].getType()+" " +fields[f].getName()+";"); else out.println(fields[f].getType()+" " +fields[f].getName()+" = "+cls.getCpoolEntryobj(cpx)+";"); That adds the value of any compile-time constants to the output. I'm not sure if it will also add values to instance fields which have initial values; i think those are handled in the constructors, rather than as ConstantValue attributes. You could then compare the output of javap, or hashes of that output, to determine if the interface of the class had changed. If it hasn't, then any changes to the class file are inconsequential in terms of recompilation. Unless i've missed something. Notably absent from javap output is annotations - can the annotations on a class affect compilation of other classes which refer to it? @Override only affects the declaring class. @Deprecated could affect another class, but could only cause a warning to be generated. > Perhaps I'll make it a "tighter fit" later, though honestly I'm still > fumbling around in the dark at the moment, still learning. PROTIP: that phase never actually ends. tom -- coincidences, body modification, hungarian voice sebestyen marta, **
From: Tom Anderson on 9 May 2010 11:26 On Sun, 9 May 2010, Joshua Maurice wrote: > On May 9, 3:23�am, Tom Anderson <t...(a)urchin.earth.li> wrote: >> On Sat, 8 May 2010, Joshua Maurice wrote: >>> On May 7, 9:08�pm, "Mike Schilling" <mscottschill...(a)hotmail.com> >>> wrote: >>>> Joshua Maurice wrote: >>>>> If anyone else cares, I managed to inadvertently stumble across a >>>>> solution. On impulse, I asked a co-worker at lunch. It seems that >>>>> class files do not contain sufficient information with default javac >>>>> options. However, when compiled with -g, it contains a listing of all >>>>> types used in the compile. >> >> Are you sure? >> >> $ javac -version >> javac 1.6.0_16 >> $ echo "class Foo {public static final int X=23;}" >Foo.java >> $ echo "class Bar {public static final int Y=Foo.X;}" >Bar.java >> $ javac -g Foo.java Bar.java >> $ grep Foo Bar.class >> $ >> >> I can see no sign of Bar.class containing any mention of Foo. > > Apparently I am mistaken. I would suggest looking for "Bar" and not > "Bar.class", but the result is the same. static finals might be an > exception to the debug information, which is sad. I'm wondering how > I'll work around this now. I wonder how hard it would be to modify javac to add an attribute to classes to record the origins of any inlined constants. That would let you pull the information out from the class file later on, just as you can already do with all the other types of dependency. A bit of a poke around in the javac source code suggests it already has code for tracking dependencies, which is there to support JWS in some way. There are some flags that should switch it on (-xdepend, -Xdepend, -Xjws), but they don't work on the version i have installled. There's also some flag you can set on the compilation environment object that will make it print dependencies, so if you're driving compilation from code, you should be able to set that. You'd have to parse the compiler output, but that's not that bad. > I am still in the process of implementing, so I haven't really been able > to test, You're doing it wrong. Test first, test incrementally, build things in a way you can test as you go. Before you start work for real, do a higher-level test, a 'spike solution', to make sure that all your assumptions (like this one) are valid. > Hopefully this is the only such corner case. I need more tests. Always true! Since you have this vast and terrifying codebase, you can probably generate some pretty thorough functional tests from it. Take pairs of adjacent revisions from source control, do full builds on each, find the differences in output, then apply your tool and see if it comes up with the right answers. You should be able to automate the process of turning a pair of revisions into a test suite, and then you can just leave it to crank away generating them for a few days. tom -- coincidences, body modification, hungarian voice sebestyen marta, **
From: Mike Schilling on 9 May 2010 12:30 Joshua Maurice wrote: > On May 9, 1:31 am, "Mike Schilling" <mscottschill...(a)hotmail.com> > wrote: >> Joshua Maurice wrote: >>> On May 7, 9:08 pm, "Mike Schilling" <mscottschill...(a)hotmail.com> >>> wrote: >>>> Joshua Maurice wrote: >>>>> If anyone else cares, I managed to inadvertently stumble across a >>>>> solution. On impulse, I asked a co-worker at lunch. It seems that >>>>> class files do not contain sufficient information with default >>>>> javac options. However, when compiled with -g, it contains a >>>>> listing of all types used in the compile. When combined with Ghost >>>>> Dependencies, I think this can result in a correct incremental >>>>> build at the file level which will not cascade endlessly >>>>> downstream. I'm working on the finishing touches to my prototype >>>>> now. >> >>>> You realize that you're now going to recompile a class when it >>>> refers to another class to which a comment was added. >> >>> Yes. I'm pretty sure that it would be better than doing a full clean >>> build or a cascading jar-dir-unit incremental build. >> >> No doubt, but the result isn't the minimal amount of recompilation >> we were discussing earlier. > > I'm not sure what this minimal recompile which we were discussing is. > It is technically impossible to do a true minimal recompile > algorithmically. Let's define it as "Let a build be a set of file > compilations. Let the minimum recompile be the minimum such set for > which the output class file are equivalent to the class files of a > full clean build." First, we'd have to prove such a minimum exists. > That's relatively straightforward. With that out of the way, I think I > could then prove that the problem is equivalent to the Halting > problem. If you define "equivalent" generously, I'm pretty sure this > is the case. If you define it as "same binary file content", then > perhaps not, though still possibly yes. Why would you think that? The ways in which a change to A can affect B is finite and well-defined.
From: EJP on 9 May 2010 19:42
On 10/05/2010 1:26 AM, Tom Anderson wrote: Doesn't/didn't 'jikes' do all this? |