From: Joshua Cranmer on
On 05/14/2010 05:25 PM, Tom Anderson wrote:
> Also, what the hell is that getClass call all about? I see that in code
> i compile too (javac 1.6.0_16). A bit of googling reveals it's the code
> generated to force a null check of a variable, and this is used in
> compiling certain contortions involving inner classes. But there's no
> inner class here, and there is no way in a month of sundays that the top
> of stack can be null at instruction 12 - it's produced by applying dup
> to the result of new, and new can never produce a null (right?). So
> what's it doing?

I'm guessing that what happens is this:
new BB().x gets converted into an AST which is roughly a FieldAccess
where the object is (new BB()) (an opaque expression) and the field is
"x". The code generation sees an opaque expression--and expressions may
be null, so it does the check. The key is that it doesn't know that the
expression is of a type which cannot return null--Java does not do that
much static analysis at compile time, to my knowledge.
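
A minimal sketch of why a null check is needed in the general case, assuming (as I understand the language rules) that reading an instance field through a null reference must throw even when the field's value is a compile-time constant that gets inlined. The class and method names here are made up for illustration; AA just mirrors the class from the thread.

```java
final class NullCheckDemo {
    // Mirrors AA from the thread: a final instance field with a constant initializer.
    static class AA { public final int x = 1; }

    /** Reads the field through the given reference and reports what happened. */
    static String read(AA ref) {
        try {
            // javac may inline the constant 1 here, but it still has to
            // fail if ref is null, hence some form of null check.
            return "value " + ref.x;
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(read(new AA()));
        System.out.println(read(null));
    }
}
```

For new AA().x the receiver can obviously never be null, so the check is redundant there; the sketch only shows the general case the code generator seems to be guarding against.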

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
From: Joshua Maurice on
Update! Initial estimates put my approach at ~7x the build time (that is,
700% of the baseline) when calling javac once per java file after a javac
invocation on all of the java files in a directory. This is for my new
Ant-like tool, profiled with the YourKit profiler. It's currently doing a
full clean rebuild of about 2000 java files in about 3 minutes, 37
seconds (which itself is a near meaningless measure without
understanding the content of the Java files, I know). Only about 15%
of the time is doing "useful" work. The rest is spent redoing javac
invocations to get the required dependency information.

I could probably speed that up by further parallelizing the separate
javac invocations on single files of each java source dir.

I posted a comment to an already existing bug report / enhancement
request over at Sun, I mean Oracle. I wonder if anyone will look or
care.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4639384

In the meantime, I fancy in my head modifying javac myself, though I'm
not sure how practical that would be to maintain privately going
forward.
From: Joshua Maurice on
On May 14, 3:44 pm, Joshua Maurice <joshuamaur...(a)gmail.com> wrote:
> On May 14, 2:25 pm, Tom Anderson <t...(a)urchin.earth.li> wrote:
> > On Wed, 12 May 2010, Joshua Maurice wrote:
> > > //AA.java
> > > public class AA { public final int x = 1; }
>
> > > //BB.java
> > > public class BB { public int x = new AA().x; }
>
> > > //javap -verbose -classpath . BB
> > > public BB();
> > >  Code:
> > >   Stack=3, Locals=1, Args_size=1
> > >   0:   aload_0
> > >   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
> > >   4:   aload_0
> > >   5:   new     #2; //class AA
> > >   8:   dup
> > >   9:   invokespecial   #3; //Method AA."<init>":()V
> > >   12:  invokevirtual   #4; //Method java/lang/Object.getClass:()Ljava/lang/Class;
> > >   15:  pop
> > >   16:  iconst_1
> > >   17:  putfield        #5; //Field x:I
> > >   20:  return
>
> > > Specifically note that the instructions to initialize BB.x involve
> > > "iconst_1", which, as I understand it, puts the constant 1 on the
> > > stack. javac, even with -g, inlined the value of a final not-static
> > > int field.
>
> > Yeah, this is a bit weird.
>
> > Also, what the hell is that getClass call all about? I see that in code i
> > compile too (javac 1.6.0_16). A bit of googling reveals it's the code
> > generated to force a null check of a variable, and this is used in
> > compiling certain contortions involving inner classes. But there's no
> > inner class here, and there is no way in a month of sundays that the top
> > of stack can be null at instruction 12 - it's produced by applying dup to
> > the result of new, and new can never produce a null (right?). So what's it
> > doing?
>
> > Anyway, turning back to the initialisation of x. if you look at the
> > bytecode of AA, that's also weird. It has a constructor which does
> > iconst_1 + putfield to initialise x - but x *also* has a ConstantValue
> > attribute, giving it the value 1. Why both? If you write a version of AA
> > where x is static, then there's only a ConstantValue, and no synthetic
> > clinit or anything touching it. Or instead make it non-final, and of
> > course it keeps the constructor but loses the ConstantValue.
>
> > The good news is that it looks like you can detect 'silently inlinable'
> > variables by the presence of a ConstantValue attribute. The bad news is
> > that javac does seem to be violating the VM spec (AIUI) here.
>
> > And on the gripping hand, you still have no way to discover the relevance
> > of AA from CC (the class you mention in a later post).
>
> > When i looked into this a while ago, my planned approach was:
>
> > 1. Keep a table of explicit dependencies between classes (ie CC -> BB, but
> >     not CC -> AA)
>
> > 2. Keep a tree of direct inheritance relationships, probably including
> >     interface implementation (ie BB -> AA)
>
> > 3. Define the 'signature' of a class to be the aggregation of its
> >     kind (class or interface), name, list of direct supertypes, the names
> >     and types of its non-private fields, the values of its constant fields,
> >     and the names, parameter types, return types, and exception lists of
> >     its methods. Anything else?
>
> > 4. When a source file changes, recompile, and compare the signature of the
> >     new class to that of the old class
>
> > 5. If the signature has changed, walk the inheritance tree, and build
> >     the set of all classes which descend from the class - call this,
> >     including the original class, the family.
>
> > 6. Use the dependency table to find every class which depends on a member
> >     of the family. Call these the friends.
>
> > 7. Recompile the family and friends.
>
> > 8. Repeat the analysis on the newly recompiled files; this is necessary
> >     because changes to constant values can propagate.
>
> > If you extend javap to report constant field values, then you can use the
> > hash of the output of javap as a practical stand-in for a complete
> > signature. It's a bit oversensitive, because it will change if you add or
> > remove a static block, or cause the set of secret inner-class backdoor
> > methods to change, neither of which really change the signature.
>
> > I didn't know about ghost dependencies, so i didn't deal with those at
> > all. But on that subject - am i right in thinking that to build the set of
> > ghost dependencies, you need to know every name used by the class? If so,
> > doesn't that already cover this situation? CC uses the name BB.x, and
> > presumably you have to have an inheritance rule like the above that means
> > that a change to AA.x means a change to BB.x if there is no actual BB.x.
>
> See http://www.jot.fm/issues/issue_2004_12/article4.pdf
> for "ghost dependencies".
>
> I don't think as presented in the paper that ghost dependencies will
> catch this. Again, take the example
> //AA.java
> public class AA { public final int x = 1; }
> //BB.java
> public class BB extends AA {}
> //CC.java
> public class CC { public final int x = new BB().x; }
>
> CC.java has ghost dependencies "CC", "BB", "x". To get them, take all
> names in the class file (using the Java technical definition of "name"
> as a single identifier, or a list of identifiers separated by dots
> '.'), then get all possible interpretations under all imports
> (including the implicit import <this-package>.*;), then close over all
> such prefixes. (Or something like that. The details are somewhat
> involved. See the paper.)
>
> AA.class exports the name "AA", aka the full name of the class.
> BB.class exports the name "BB", aka the full name of the class.
>
> I'm not sure offhand if there is a good way to extend ghost
> dependencies to catch this case without introducing a lot of false
> positives.
>
> --
> I've also given some thought, as you had, to maintaining this list that
> keeps track of super classes. I'm not sure how it would interact with
> this example:
>
> //AAA.java
> public class AAA { public static int aaa = 1; }
> //BBB.java
> public class BBB { public static AAA bbb = null; }
> //CCC.java
> public class CCC { public static BBB ccc = null; }
> //DDD.java
> public class DDD { public final int ddd = CCC.ccc.bbb.aaa; }
>
> If we change AAA.aaa to "public static double aaa = 2", then BBB.class
> would be a noop recompile, CCC.class would be a noop recompile, but
> DDD.class would need a recompile. Again, I think I would need the same
> information to make this work without endless cascading; I would need
> to know that DDD (directly) uses AAA. I thus think that your / my
> scheme of keeping track of super classes would not be terribly
> effective / productive.

I might have to backtrack and/or apologize. I've actually come back to
this idea here, and I'm thinking it could work decently well.
Specifically, the rules would be:

1- A java file's compilation is out of date when its source file has
been modified since the last compilation.

2- A java file's compilation is out of date when it has a newer Ghost
Dependency, see paper: www.jot.fm/issues/issue_2004_12/article4.pdf

3- A java file's compilation is out of date when one of its output
class files has a reference to a type
3a- whose class file has a last "interface changed" time which is
newer than the java file's last compilation,
3b- or which is in an output class file of an "out of date" java file
which is part of this javac task,
3c- or which has a super type (direct or transitive) whose class has a
last "interface changed" time which is newer than the java file's last
compilation,
3d- or which has a super type (direct or transitive) which is in an
output class file of an "out of date" java file which is part of this
javac task.

4a- A java file's compilation is out of date when
- it has a potentially used constant variable field simple name X
(which is basically any simple name of any name in the source),
- and there is a class file on the compile classpath which "exports" a
constant variable field which has simple name X,
- and the "exported" constant variable field has a "last changed" time
which is newer than the java file's last compilation.
4b- A java file's compilation is out of date when
- it has a potentially used constant variable field simple name X
(which is basically any simple name of any name in the source),
- and there is an "out of date" java file in this javac task which has
a class file which "exports" a constant variable field which has
simple name X.
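
A rough sketch of how rules 3 and 4a above might be checked, not the actual tool. Everything here is hypothetical: the ClassInfo record, the timestamp maps, and the method names are assumptions about what such a tool would store per class file. Rule 3 closes each referenced type over its direct and transitive super types; rule 4a only compares simple names.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class OutOfDateRules {
    /** Hypothetical per-class record kept by the build tool. */
    static final class ClassInfo {
        final long interfaceChanged;          // last "interface changed" time
        final List<String> directSupertypes;
        ClassInfo(long interfaceChanged, String... directSupertypes) {
            this.interfaceChanged = interfaceChanged;
            this.directSupertypes = Arrays.asList(directSupertypes);
        }
    }

    /** The type itself plus all of its direct and transitive super types. */
    static Set<String> withSupertypes(String type, Map<String, ClassInfo> classes) {
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(type);
        while (!todo.isEmpty()) {
            String t = todo.pop();
            ClassInfo info = classes.get(t);
            if (seen.add(t) && info != null) {
                for (String s : info.directSupertypes) todo.push(s);
            }
        }
        return seen;
    }

    /** Rule 3: a referenced type, or one of its super types, changed or is being rebuilt. */
    static boolean staleByRule3(Collection<String> referencedTypes, long lastCompiled,
                                Map<String, ClassInfo> classes, Set<String> typesBeingRebuilt) {
        for (String ref : referencedTypes) {
            for (String t : withSupertypes(ref, classes)) {
                if (typesBeingRebuilt.contains(t)) return true;                         // 3b, 3d
                ClassInfo info = classes.get(t);
                if (info != null && info.interfaceChanged > lastCompiled) return true;  // 3a, 3c
            }
        }
        return false;
    }

    /** Rule 4a: some simple name in the source matches an exported constant that changed. */
    static boolean staleByRule4a(Set<String> simpleNamesInSource, long lastCompiled,
                                 Map<String, Long> exportedConstantLastChanged) {
        for (String name : simpleNamesInSource) {
            Long changed = exportedConstantLastChanged.get(name);
            if (changed != null && changed > lastCompiled) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, ClassInfo> classes = new HashMap<>();
        classes.put("Base", new ClassInfo(20));
        classes.put("Mid", new ClassInfo(5, "Base"));
        classes.put("Leaf", new ClassInfo(5, "Mid"));
        Set<String> none = Collections.emptySet();
        // A file referencing Leaf, compiled at time 10: Base changed at 20 (rule 3c).
        System.out.println(staleByRule3(Arrays.asList("Leaf"), 10, classes, none));
        // Same file compiled at time 30, after every change.
        System.out.println(staleByRule3(Arrays.asList("Leaf"), 30, classes, none));
    }
}
```

Rule 4b would be the same name match against the constants exported by the "out of date" files of the current javac task; it is left out to keep the sketch short.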

I just thought this up today from a small discussion on an OpenJDK
mailing list, and due to a couple of realizations about how javac
internally works, specifically that I think closing dependencies over
all super types (direct and transitive) of the dependency would be
equivalent to using javac's -verbose output.

What remains to be seen is if there's any other corner case which I'm
missing.
From: Joshua Maurice on
On May 25, 5:18 pm, Joshua Maurice <joshuamaur...(a)gmail.com> wrote:
> I've actually come back to
> this idea here, and I'm thinking it could work decently well.
> Specifically, the rules would be:
>
> 1- A java file's compilation is out of date when its source file has
> been modified since the last compilation.
>
> 2- A java file's compilation is out of date when it has a newer Ghost
> Dependency, see paper: www.jot.fm/issues/issue_2004_12/article4.pdf
>
> 3- A java file's compilation is out of date when one of its output
> class files has a reference to a type
> 3a- whose class file has a last "interface changed" time which is
> newer than the java file's last compilation,
> 3b- or which is in an output class file of an "out of date" java file
> which is part of this javac task,
> 3c- or which has a super type (direct or transitive) whose class has a
> last "interface changed" time which is newer than the java file's last
> compilation,
> 3d- or which has a super type (direct or transitive) which is in an
> output class file of an "out of date" java file which is part of this
> javac task.
>
> 4a- A java file's compilation is out of date when
> - it has a potentially used constant variable field simple name X
> (which is basically any simple name of any name in the source),
> - and there is a class file on the compile classpath which "exports" a
> constant variable field which has simple name X,
> - and the "exported" constant variable field has a "last changed" time
> which is newer than the java file's last compilation.
> 4b- A java file's compilation is out of date when
> - it has a potentially used constant variable field simple name X
> (which is basically any simple name of any name in the source),
> - and there is an "out of date" java file in this javac task which has
> a class file which "exports" a constant variable field which has
> simple name X.
>
> I just thought this up today from a small discussion on an OpenJDK
> mailing list, and due to a couple of realizations about how javac
> internally works, specifically that I think closing dependencies over
> all super types (direct and transitive) of the dependency would be
> equivalent to using javac's -verbose output.
>
> What remains to be seen is if there's any other corner case which I'm
> missing.

Further update. I implemented it, and my tests failed. The above rules
do not catch the following:

<root of test>/aa/src/main/java/T3.java ... contents
public class T3 { public static final int C = 1; }
<root of test>/bb/src/main/java/T2.java ... contents
public class T2 extends T3 {}
<root of test>/cc/src/main/java/T1.java ... contents
public class T1 extends T2 {}
<root of test>/dd/src/main/java/Test.java ... contents
public class Test { public static final int D = T1.C; }

Each separate directory under <root of test> is a different javac
task. The first build goes like:
- Enter <root>/aa. T3.java has not been built before. Build it now
with a single javac invocation.
- etc. for bb, cc, and dd.

Now, a developer comes in and modifies
<root of test>/bb/src/main/java/T2.java
to
public class T2 extends T3 { int x; }

A second build will come along and not find a rule which declares
Test.java to be "out of date". The problem is again constant variable
fields. The constant variable field T1.C is expanded inline in
Test.class, so the class file records no dependency of Test.java on
T1. I thought I might be able to handle constant variable fields with
special rules, but I don't think it will work. The fields can be
"hidden", and catching that with any more hacks is not worth the
trouble to me at the moment.

On other fronts, the OpenJDK mailing list discussion gave me a useful
piece of insight. Apparently one can use the JavacTask API in
tools.jar to get the types of "nodes" in the parse tree. Specifically,
call parse, save the CompilationUnitTree's, then call analyze, then
use ?? to get the types of the "nodes" in the saved
CompilationUnitTree. I have taken brief looks over the type APIs
several times, type mirrors and such, but I'm mostly at a loss. I'll
have to spend some significant time googling and playing around with
those to figure out how to do it, but I was hoping perhaps someone
here has had more experience with this library and can point me to an
example somewhere, or provide an example. Please?
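
One possible shape for such an example, offered as a sketch rather than a vetted recipe. It assumes the missing step can be filled by com.sun.source.util.Trees (Trees.instance on the task, then getElement on a TreePath; getTypeMirror works the same way for types), and that the JDK compiler classes are available (tools.jar on older JDKs). The class name FieldOwnerScan, its helper methods, and the in-memory "string:///" source are made up for illustration. It parses, analyzes, then walks the saved trees and reports which class actually declares each field reached through a qualified name.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MemberSelectTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePathScanner;
import com.sun.source.util.Trees;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import javax.lang.model.element.Element;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

final class FieldOwnerScan {
    /** In-memory source file, so the sketch needs nothing on disk. */
    static JavaFileObject source(String className, String code) {
        URI uri = URI.create("string:///" + className + ".java");
        return new SimpleJavaFileObject(uri, JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return code;
            }
        };
    }

    /** Returns entries like "expr.name -> DeclaringClass.name" for field accesses. */
    static Set<String> fieldOwners(List<JavaFileObject> sources) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null on a bare JRE
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, sources);
        Trees trees = Trees.instance(task);
        Set<String> found = new TreeSet<>();
        try {
            // Parse first and keep the trees, then let javac attribute them.
            List<CompilationUnitTree> units = new ArrayList<>();
            for (CompilationUnitTree unit : task.parse()) units.add(unit);
            task.analyze();
            for (CompilationUnitTree unit : units) {
                new TreePathScanner<Void, Void>() {
                    @Override
                    public Void visitMemberSelect(MemberSelectTree node, Void p) {
                        Element e = trees.getElement(getCurrentPath());
                        if (e != null && e.getKind().isField()) {
                            found.add(node.getExpression() + "." + node.getIdentifier()
                                    + " -> " + e.getEnclosingElement() + "." + e.getSimpleName());
                        }
                        return super.visitMemberSelect(node, p);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        String code = "class T3 { public static final int C = 1; }\n"
                + "class T2 extends T3 {}\n"
                + "class T1 extends T2 {}\n"
                + "public class Test { public static final int D = T1.C; }\n";
        System.out.println(fieldOwners(Collections.singletonList(source("Test", code))));
    }
}
```

If this works as I expect, the T1.C access in Test should be attributed to the field declared in T3, which is exactly the dependency that the inlined constant hides from the class file.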
From: Joshua Maurice on
On May 26, 11:53 pm, Joshua Maurice <joshuamaur...(a)gmail.com> wrote:
> [...]
> Now, a developer comes in and modifies
>   <root of test>/bb/src/main/java/T2.java
> to
>   public class T2 extends T3 { int x; }

Err, that should read
public class T2 extends T3 { int C = 3; }
so that the new field "C" "hides" the super class's field "C".
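
To make the hiding concrete, here is a small sketch using reflection on already compiled classes. One deliberate difference from the test above, and an assumption on my part: the hiding field in T2 is declared public static final here so that T1.C still compiles and Class.getField can see it. The class HidingDemo and its helper are hypothetical.

```java
import java.lang.reflect.Field;

final class HidingDemo {
    static class T3 { public static final int C = 1; }
    static class T2 extends T3 { public static final int C = 3; } // hides T3.C
    static class T1 extends T2 { }

    /** Simple name of the class whose public field is found when starting at 'start'. */
    static String declaringClassOf(Class<?> start, String fieldName) {
        try {
            Field f = start.getField(fieldName); // searches super types as well
            return f.getDeclaringClass().getSimpleName();
        } catch (NoSuchFieldException e) {
            return "none";
        }
    }

    public static void main(String[] args) {
        // With the hiding field present, the name T1.C now means T2.C.
        System.out.println(declaringClassOf(T1.class, "C") + " " + T1.C);
    }
}
```

The point for the build tool is that a change to T2 alone can change what T1.C means, while a class file compiled earlier against T1.C only contains the old inlined value.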