From: Joshua Cranmer on 18 May 2010 15:19

On 05/14/2010 05:25 PM, Tom Anderson wrote:
> Also, what the hell is that getClass call all about? I see that in code
> i compile too (javac 1.6.0_16). A bit of googling reveals it's the code
> generated to force a null check of a variable, and this is used in
> compiling certain contortions involving inner classes. But there's no
> inner class here, and there is no way in a month of sundays that the top
> of stack can be null at instruction 12 - it's produced by applying dup
> to the result of new, and new can never produce a null (right?). So
> what's it doing?

I'm guessing that what happens is this:
new BB().x gets converted into an AST which is roughly a FieldAccess
where the object is (new BB()) (an opaque expression) and the field is
"x". The code generation sees an opaque expression--and expressions may
be null, so it does the check. The key is that it doesn't know that the
expression is of a type which cannot return null--Java does not do that
much static analysis at compile time, to my knowledge.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
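For what it's worth, here is a small sketch of why the check cannot be dropped in general. As far as I understand the JLS, reading an instance field through a null reference has to throw NullPointerException even when the field's value is a compile-time constant that javac inlines. The names NullCheckDemo and readThrowsNpe are made up for this illustration; AA mirrors the class from the thread.

```java
public class NullCheckDemo {
    // Same shape as AA in the thread: a final instance field with a constant initializer.
    static class AA {
        public final int x = 1;
    }

    // Reads x through the given reference and reports whether the read threw
    // NullPointerException. (Hypothetical helper, not from the thread.)
    static boolean readThrowsNpe(AA ref) {
        try {
            // The value is a constant, but the reference still has to be checked.
            int ignored = ref.x;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("non-null reference throws: " + readThrowsNpe(new AA()));
        System.out.println("null reference throws:     " + readThrowsNpe(null));
    }
}
```

In the new AA().x case the reference obviously cannot be null, so the check is redundant there; the sketch only shows the general case that the code generator appears to be covering.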
From: Joshua Maurice on 20 May 2010 03:45

Update! Initial estimates put my approach at ~7x the build time (that's
roughly 600% more) when calling javac once per java file after a javac
invocation on all of the java files in a directory. This is for my new
Ant-like tool, profiled with the YourKit profiler. It's currently doing
a full clean rebuild of about 2000 java files in about 3 minutes, 37
seconds (which itself is a near meaningless measure without
understanding the content of the Java files, I know). Only about 15% of
the time is spent doing "useful" work. The rest is spent redoing javac
invocations to get the required dependency information. I could
probably speed that up by further parallelizing the separate javac
invocations on single files of each java source dir.

I posted a comment to an already existing bug report / enhancement
request over at Sun, I mean Oracle. I wonder if anyone will look or
care.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4639384

In the meantime, I fancy in my head modifying javac myself, though I'm
not sure how practical that would be to maintain privately going
forward.
From: Joshua Maurice on 25 May 2010 20:18

On May 14, 3:44 pm, Joshua Maurice <joshuamaur...(a)gmail.com> wrote:
> On May 14, 2:25 pm, Tom Anderson <t...(a)urchin.earth.li> wrote:
> > On Wed, 12 May 2010, Joshua Maurice wrote:
> > > //AA.java
> > > public class AA { public final int x = 1; }
> > >
> > > //BB.java
> > > public class BB { public int x = new AA().x; }
> > >
> > > //javap -verbose -classpath . BB
> > > public BB();
> > >   Code:
> > >    Stack=3, Locals=1, Args_size=1
> > >    0: aload_0
> > >    1: invokespecial #1; //Method java/lang/Object."<init>":()V
> > >    4: aload_0
> > >    5: new #2; //class AA
> > >    8: dup
> > >    9: invokespecial #3; //Method AA."<init>":()V
> > >    12: invokevirtual #4; //Method java/lang/Object.getClass:()Ljava/lang/Class;
> > >    15: pop
> > >    16: iconst_1
> > >    17: putfield #5; //Field x:I
> > >    20: return
> > >
> > > Specifically note that the instructions to initialize BB.x involve
> > > "iconst_1", which, as I understand it, puts the constant 1 on the
> > > stack. javac, even with -g, inlined the value of a final not-static
> > > int field.
> >
> > Yeah, this is a bit weird.
> >
> > Also, what the hell is that getClass call all about? I see that in code i
> > compile too (javac 1.6.0_16). A bit of googling reveals it's the code
> > generated to force a null check of a variable, and this is used in
> > compiling certain contortions involving inner classes. But there's no
> > inner class here, and there is no way in a month of sundays that the top
> > of stack can be null at instruction 12 - it's produced by applying dup to
> > the result of new, and new can never produce a null (right?). So what's it
> > doing?
> >
> > Anyway, turning back to the initialisation of x. if you look at the
> > bytecode of AA, that's also weird. It has a constructor which does
> > iconst_1 + putfield to initialise x - but x *also* has a ConstantValue
> > attribute, giving it the value 1. Why both?
> > If you write a version of AA
> > where x is static, then there's only a ConstantValue, and no synthetic
> > clinit or anything touching it. Or instead make it non-final, and of
> > course it keeps the constructor but loses the ConstantValue.
> >
> > The good news is that it looks like you can detect 'silently inlinable'
> > variables by the presence of a ConstantValue attribute. The bad news is
> > that javac does seem to be violating the VM spec (AIUI) here.
> >
> > And on the gripping hand, you still have no way to discover the relevance
> > of AA from CC (the class you mention in a later post).
> >
> > When i looked into this a while ago, my planned approach was:
> >
> > 1. Keep a table of explicit dependencies between classes (ie CC -> BB, but
> > not CC -> AA)
> >
> > 2. Keep a tree of direct inheritance relationships, probably including
> > interface implementation (ie BB -> AA)
> >
> > 3. Define the 'signature' of a class to be the aggregation of its
> > kind (class or interface), name, list of direct supertypes, the names
> > and types of its non-private fields, the values of its constant fields,
> > and the names, parameter types, return types, and exception lists of
> > its methods. Anything else?
> >
> > 4. When a source file changes, recompile, and compare the signature of the
> > new class to that of the old class
> >
> > 5. If the signature has changed, walk the inheritance tree, and build
> > the set of all classes which descend from the class - call this,
> > including the original class, the family.
> >
> > 6. Use the dependency table to find every class which depends on a member
> > of the family. Call these the friends.
> >
> > 7. Recompile the family and friends.
> >
> > 8. Repeat the analysis on the newly recompiled files; this is necessary
> > because changes to constant values can propagate.
> >
> > If you extend javap to report constant field values, then you can use the
> > hash of the output of javap as a practical stand-in for a complete
> > signature.
> > It's a bit oversensitive, because it will change if you add or
> > remove a static block, or cause the set of secret inner-class backdoor
> > methods to change, neither of which really change the signature.
> >
> > I didn't know about ghost dependencies, so i didn't deal with those at
> > all. But on that subject - am i right in thinking that to build the set of
> > ghost dependencies, you need to know every name used by the class? If so,
> > doesn't that already cover this situation? CC uses the name BB.x, and
> > presumably you have to have an inheritance rule like the above that means
> > that a change to AA.x means a change to BB.x if there is no actual BB.x.
>
> See http://www.jot.fm/issues/issue_2004_12/article4.pdf
> for "ghost dependencies".
>
> I don't think as presented in the paper that ghost dependencies will
> catch this. Again, take the example
> //AA.java
> public class AA { public final int x = 1; }
> //BB.java
> public class BB extends AA {}
> //CC.java
> public class CC { public final int x = new BB().x; }
>
> CC.java has ghost dependencies "CC", "BB", "x", aka all names in the
> class file (using the Java technical definition of "name" as a single
> identifier, or a list of identifiers separated by dots '.'), then get
> all possible interpretations under all imports (including the implicit
> import <this-package>.*;), then close over all such prefixes. (Or
> something like that. The details are somewhat involved. See the
> paper.)
>
> AA.class exports the name "AA", aka the full name of the class.
> BB.class exports the name "BB", aka the full name of the class.
>
> I'm not sure offhand if there is a good way to extend ghost
> dependencies to catch this case without introducing a lot of false
> positives.
>
> --
> I've also given some thought, as you had, to maintaining this list
> keeping track of super classes.
> I'm not sure how it would interact with this
> example:
>
> //AAA.java
> public class AAA { public static int aaa = 1; }
> //BBB.java
> public class BBB { public static AAA bbb = null; }
> //CCC.java
> public class CCC { public static BBB ccc = null; }
> //DDD.java
> public class DDD { public final int ddd = CCC.ccc.bbb.aaa; }
>
> If we change AAA.aaa to "public static double aaa = 2", then BBB.class
> would be a noop recompile, CCC.class would be a noop recompile, but
> DDD.class would need a recompile. Again, I think I would need the same
> information to make this work without endless cascading; I would need
> to know that DDD (directly) uses AAA. I thus think that your / my
> scheme of keeping track of super classes would not be terribly
> effective / productive.

I might have to backtrack and/or apologize. I've actually come back to
this idea here, and I'm thinking it could work decently well.
Specifically, the rules would be:

1- A java file's compilation is out of date when its source file has
been modified since the last compilation.

2- A java file's compilation is out of date when it has a newer Ghost
Dependency, see paper: www.jot.fm/issues/issue_2004_12/article4.pdf

3- A java file's compilation is out of date when one of its output
class files has a reference to a type
3a- whose class file has a last "interface changed" time which is
newer than the java file's last compilation,
3b- or which is in an output class file of an "out of date" java file
which is part of this javac task,
3c- or which has a super type (direct or transitive) whose class has a
last "interface changed" time which is newer than the java file's last
compilation,
3d- or which has a super type (direct or transitive) which is in an
output class file of an "out of date" java file which is part of this
javac task.

4a- A java file's compilation is out of date when
- it has a potentially used constant variable field simple name X
(which is basically any simple name of any name in the source),
- and there is a class file on the compile classpath which "exports" a
constant variable field which has simple name X,
- and the "exported" constant variable field has a "last changed" time
which is newer than the java file's last compilation.
4b- A java file's compilation is out of date when
- it has a potentially used constant variable field simple name X
(which is basically any simple name of any name in the source),
- and there is an "out of date" java file in this javac task which has
a class file which "exports" a constant variable field which has
simple name X.

I just thought this up today from a small discussion on an OpenJDK
mailing list, and due to a couple of realizations about how javac
internally works, specifically that I think closing dependencies over
all super types (direct and transitive) of the dependency would be
equivalent to using javac's -verbose output.

What remains to be seen is if there's any other corner case which I'm
missing.
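To make rules 3a and 3c concrete, here is a rough sketch of the check, closing each referenced type over its direct and transitive super types. Everything in it is an assumption for illustration: the class name OutOfDateRule3, the two maps (type name to direct super types, type name to last "interface changed" time) and the plain long timestamps are hypothetical stand-ins for whatever the build tool actually records. Rules 3b, 3d, 4a and 4b are left out.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OutOfDateRule3 {
    // Returns the type itself plus all of its direct and transitive super types.
    static Set<String> withAllSuperTypes(String type, Map<String, List<String>> directSupers) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(type);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            if (seen.add(current)) {
                for (String superType : directSupers.getOrDefault(current, List.of())) {
                    todo.push(superType);
                }
            }
        }
        return seen;
    }

    // Rules 3a and 3c only: out of date if any referenced type, or any of its
    // super types, had its interface change after the last compilation.
    static boolean outOfDate(Collection<String> referencedTypes,
                             long lastCompiled,
                             Map<String, List<String>> directSupers,
                             Map<String, Long> interfaceChanged) {
        for (String referenced : referencedTypes) {
            for (String type : withAllSuperTypes(referenced, directSupers)) {
                if (interfaceChanged.getOrDefault(type, 0L) > lastCompiled) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> supers = Map.of("T1", List.of("T2"), "T2", List.of("T3"));
        Map<String, Long> changed = Map.of("T3", 200L);
        // A file that references T1 and was last compiled at time 100.
        System.out.println(outOfDate(List.of("T1"), 100L, supers, changed));
    }
}
```

Note that this only helps when the class file still carries a reference to the type in question, which is exactly what inlined constants take away.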
From: Joshua Maurice on 27 May 2010 02:53

On May 25, 5:18 pm, Joshua Maurice <joshuamaur...(a)gmail.com> wrote:
> I've actually come back to
> this idea here, and I'm thinking it could work decently well.
> Specifically, the rules would be:
>
> 1- A java file's compilation is out of date when its source file has
> been modified since the last compilation.
>
> 2- A java file's compilation is out of date when it has a newer Ghost
> Dependency, see paper: www.jot.fm/issues/issue_2004_12/article4.pdf
>
> 3- A java file's compilation is out of date when one of its output
> class files has a reference to a type
> 3a- whose class file has a last "interface changed" time which is
> newer than the java file's last compilation,
> 3b- or which is in an output class file of an "out of date" java file
> which is part of this javac task,
> 3c- or which has a super type (direct or transitive) whose class has a
> last "interface changed" time which is newer than the java file's last
> compilation,
> 3d- or which has a super type (direct or transitive) which is in an
> output class file of an "out of date" java file which is part of this
> javac task.
>
> 4a- A java file's compilation is out of date when
> - it has a potentially used constant variable field simple name X
> (which is basically any simple name of any name in the source),
> - and there is a class file on the compile classpath which "exports" a
> constant variable field which has simple name X,
> - and the "exported" constant variable field has a "last changed" time
> which is newer than the java file's last compilation.
> 4b- A java file's compilation is out of date when
> - it has a potentially used constant variable field simple name X
> (which is basically any simple name of any name in the source),
> - and there is an "out of date" java file in this javac task which has
> a class file which "exports" a constant variable field which has
> simple name X.
>
> I just thought this up today from a small discussion on an OpenJDK
> mailing list, and due to a couple of realizations about how javac
> internally works, specifically that I think closing dependencies over
> all super types (direct and transitive) of the dependency would be
> equivalent to using javac's -verbose output.
>
> What remains to be seen is if there's any other corner case which I'm
> missing.

Further update. I implemented it, and my tests failed. The above rules
do not catch the following:

<root of test>/aa/src/main/java/T3.java
... contents
public class T3 { public static final int C = 1; }

<root of test>/bb/src/main/java/T2.java
... contents
public class T2 extends T3 {}

<root of test>/cc/src/main/java/T1.java
... contents
public class T1 extends T2 {}

<root of test>/dd/src/main/java/Test.java
... contents
public class Test { public static final int D = T1.C; }

Each separate directory under <root of test> is a different javac
task. The first build goes like:
- Enter <root>/aa. T3.java has not been built before. Build it now
with a single javac invocation.
- etc. for bb, cc, and dd.

Now, a developer comes in and modifies
<root of test>/bb/src/main/java/T2.java
to
public class T2 extends T3 { int x; }

A second build will come along and not find a rule which declares
Test.java to be "out of date". The problem is again constant variable
fields. The constant variable field T1.C is expanded inline in
Test.class, so I have no record of the dependency of Test.java on T1.

I thought I might be able to handle constant variable fields with
special rules, but I don't think it will work. The fields can be
"hidden", and catching that with any more hacks is not worth the
trouble to me at the moment.

On other fronts, the OpenJDK mailing list discussion gave me a useful
piece of insight. Apparently one can use the JavacTask API in
tools.jar to get the types of "nodes" in the parse tree.
Specifically, call parse, save the CompilationUnitTrees, then call
analyze, then use ?? to get the types of the "nodes" in the saved
CompilationUnitTrees. I have taken brief looks over the type APIs
several times, type mirrors and such, but I'm mostly at a loss. I'll
have to spend some significant time googling and playing around with
those to figure out how to do it, but I was hoping perhaps someone
here has had more experience with this library and can point me to an
example somewhere, or provide an example. Please?
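One possible shape for this, offered as a sketch rather than a known-good recipe: parse, keep the CompilationUnitTrees, call analyze, then walk each tree with a TreePathScanner and ask Trees for the element and type mirror at each member select. The sketch assumes Trees.getElement and Trees.getTypeMirror are the right calls for the "??" step, and it assumes a JDK recent enough to have List.of. The class name FieldOwnerScan, its helper methods, and the in-memory sources (non-public copies of the T1/T2/T3/Test classes) are made up for illustration.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MemberSelectTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePath;
import com.sun.source.util.TreePathScanner;
import com.sun.source.util.Trees;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.lang.model.element.Element;
import javax.lang.model.element.ElementKind;
import javax.lang.model.type.TypeMirror;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class FieldOwnerScan {
    // In-memory source file, so the sketch needs no files on disk.
    static JavaFileObject source(String className, String code) {
        return new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return code;
            }
        };
    }

    // The test case from the post, with the classes made package-private.
    static List<JavaFileObject> exampleSources() {
        return List.of(
                source("T3", "class T3 { public static final int C = 1; }"),
                source("T2", "class T2 extends T3 {}"),
                source("T1", "class T1 extends T2 {}"),
                source("Test", "class Test { public static final int D = T1.C; }"));
    }

    // For every field access written as "expr.name", records which class
    // declares the field and the type it was accessed through.
    static List<String> describeFieldAccesses(List<JavaFileObject> sources) {
        List<String> found = new ArrayList<>();
        try {
            JavacTask task = (JavacTask) ToolProvider.getSystemJavaCompiler()
                    .getTask(null, null, null, null, null, sources);
            Iterable<? extends CompilationUnitTree> units = task.parse();
            task.analyze(); // attributes the saved trees; no class files are written
            Trees trees = Trees.instance(task);
            for (CompilationUnitTree unit : units) {
                new TreePathScanner<Void, Void>() {
                    @Override
                    public Void visitMemberSelect(MemberSelectTree node, Void unused) {
                        Element target = trees.getElement(getCurrentPath());
                        if (target != null && target.getKind() == ElementKind.FIELD) {
                            TypeMirror via = trees.getTypeMirror(
                                    new TreePath(getCurrentPath(), node.getExpression()));
                            found.add(node + " is declared in " + target.getEnclosingElement()
                                    + ", accessed through " + via);
                        }
                        return super.visitMemberSelect(node, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        describeFieldAccesses(exampleSources()).forEach(System.out::println);
    }
}
```

If this behaves as I expect, the scan of Test should report that the field in T1.C is declared in T3, which is the dependency that the inlined constant hides from the class file.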
From: Joshua Maurice on 27 May 2010 03:12

On May 26, 11:53 pm, Joshua Maurice <joshuamaur...(a)gmail.com> wrote:
> [...]
> Now, a developer comes in and modifies
> <root of test>/bb/src/main/java/T2.java
> to
> public class T2 extends T3 { int x; }

Err, that should read
public class T2 extends T3 { int C = 3; }
so that the new field "C" "hides" the super class's field "C".
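A small sketch of the hiding effect, using nested classes so it fits in one file. It is a variant of the edit above, not the same edit: here the hiding field is declared public static final so that T1.C still compiles from a static initializer. With the instance field "int C = 3" from the post, the expression T1.C in Test would, as far as I can tell, stop compiling altogether, which is just as much a reason to rebuild Test.java. The class name HidingDemo is made up.

```java
public class HidingDemo {
    static class T3 {
        public static final int C = 1;
    }

    // Variant of the edited T2: the new field C hides T3.C for T2 and its subclasses.
    static class T2 extends T3 {
        public static final int C = 3;
    }

    static class T1 extends T2 {
    }

    // Same expression as in Test.java; it now resolves to T2.C, not T3.C.
    static final int D = T1.C;

    public static void main(String[] args) {
        System.out.println("T3.C = " + T3.C + ", T1.C = " + T1.C + ", D = " + D);
    }
}
```

This prints "T3.C = 1, T1.C = 3, D = 3". A Test.class compiled before the edit would still carry the inlined 1, and nothing in it points back at T1 or T2.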