From: Joshua Maurice on
On May 9, 1:31 am, "Mike Schilling" <mscottschill...(a)hotmail.com>
wrote:
> Joshua Maurice wrote:
> > On May 7, 9:08 pm, "Mike Schilling" <mscottschill...(a)hotmail.com>
> > wrote:
> >> Joshua Maurice wrote:
> >>> If anyone else cares, I managed to inadvertently stumble across a
> >>> solution. On impulse, I asked a co-worker at lunch. It seems that
> >>> class files do not contain sufficient information with default javac
> >>> options. However, when compiled with -g, it contains a listing of
> >>> all types used in the compile. When combined with Ghost
> >>> Dependencies, I think this can result in a correct incremental
> >>> build at the file level which will not cascade endlessly
> >>> downstream. I'm working on the finishing touches to my prototype
> >>> now.
>
> >> You realize that you're now going to recompile a class when it
> >> refers to another class to which a comment was added.
>
> > Yes. I'm pretty sure that it would be better than doing a full clean
> > build or a cascading jar-dir-unit incremental build.
>
> No doubt, but the result isn't the minimal amount of recompilation we were
> discussing earlier.

I'm not sure what this minimal recompile which we were discussing is.
It is technically impossible to do a true minimal recompile
algorithmically. Let's define it as "Let a build be a set of file
compilations. Let the minimum recompile be the minimum such set for
which the output class file are equivalent to the class files of a
full clean build." First, we'd have to prove such a minimum exists.
That's relatively straightforward. With that out of the way, I think I
could then prove that the problem is equivalent to the Halting
problem. If you define "equivalent" generously, I'm pretty sure this
is the case. If you define it as "same binary file content", then
perhaps not, though still possibly yes.

Either way, this is not my goal. If someone modifies comments to a
Java source file, I'm not going to try and catch that. What I will do
is recompile all files which depend directly on that changed-source
Java file, any files affected by Ghost Dependencies, and continue
cascading this change down until all of the "leaves" of the cascading
recompile are binary equivalent class files to the class files before
the recompile. Perhaps too conservative, but I think that's easy
enough to show that it's correct. Perhaps I'll make it a "tighter fit"
later, though honestly I'm still fumbling around in the dark at the
moment, still learning.
From: Tom Anderson on
On Sun, 9 May 2010, Joshua Maurice wrote:

> On May 9, 1:31�am, "Mike Schilling" <mscottschill...(a)hotmail.com>
> wrote:
>
>> No doubt, but the result isn't the minimal amount of recompilation we were
>> discussing earlier.
>
> I'm not sure what this minimal recompile which we were discussing is. It
> is technically impossible to do a true minimal recompile
> algorithmically. Let's define it as "Let a build be a set of file
> compilations. Let the minimum recompile be the minimum such set for
> which the output class file are equivalent to the class files of a full
> clean build." First, we'd have to prove such a minimum exists. That's
> relatively straightforward.

Extremely so.

> With that out of the way, I think I could then prove that the problem is
> equivalent to the Halting problem.

Certainly not.

> If you define "equivalent" generously, I'm pretty sure this is the case.
> If you define it as "same binary file content", then perhaps not, though
> still possibly yes.

I'm not sure what you mean by 'generously'. Is there a kind of equivalence
less strict than binary equivalence which would actually work?

Anyway, here's a straightforward but slow algorithm to find the minimal
recompile:

1. Copy all your source code somewhere and do a clean build on it; call
the output the reference output
2. Count your source files, and call the total number N
3. Number all your source files, starting at 0 and going up to N
4. Let M be the set of all source files
5. For each integer i between 0 and 2**N - 1:
6a. Let S be the set of source files for whose number j, the jth bit in i
is set
6b. Do a recompilation of just the files in S
6c. Compare the output to the reference output, and if it is identical,
and the size of S is smaller than the size of M, let M be S
6d. Restore the class files to how they were before recompilation

S now contains the set of source files needed for a minimal recompile. It
doesn't follow from this algorithm that it's the only minimal set,
although i suspect that in practice it will be.

I wouldn't suggest you do this in practice, but it shows that the minimal
set exists, can be found algorithmically, and can be found in O(2**N)
time, with a rather large constant. Your task is thus merely to improve
the speed!

> Either way, this is not my goal. If someone modifies comments to a Java
> source file, I'm not going to try and catch that. What I will do is
> recompile all files which depend directly on that changed-source Java
> file, any files affected by Ghost Dependencies, and continue cascading
> this change down until all of the "leaves" of the cascading recompile
> are binary equivalent class files to the class files before the
> recompile. Perhaps too conservative, but I think that's easy enough to
> show that it's correct.

Agreed. If you were a bit more aggressive about the consequentiality of
changes, you could prune off a lot of the leaves of the tree, but it
wouldn't be an asymptotic speedup.

That said, i don't think it would be that hard to work out
consequentiality. The output of javap is almost exactly what you need - i
think the only thing it's missing is those bloody constant values. Adding
them doesn't look hard. This is the relevant bit of javap's source:

https://openjdk.dev.java.net/source/browse/openjdk/jdk/trunk/langtools/src/share/classes/sun/tools/javap/JavapPrinter.java?rev=257&view=markup

You need to change the line that says:

out.println(fields[f].getType()+" " +fields[f].getName()+";");

To say:

int cpx = fields[f].getConstantValueIndex();
if (cpx == 0) out.println(fields[f].getType()+" " +fields[f].getName()+";");
else out.println(fields[f].getType()+" " +fields[f].getName()+" = "+cls.getCpoolEntryobj(cpx)+";");

That adds the value of any compile-time constants to the output. I'm not
sure if it will also add values to instance fields which have initial
values; i think those are handled in the constructors, rather than as
ConstantValue attributes.

You could then compare the output of javap, or hashes of that output, to
determine if the interface of the class had changed. If it hasn't, then
any changes to the class file are inconsequential in terms of
recompilation.

Unless i've missed something. Notably absent from javap output is
annotations - can the annotations on a class affect compilation of other
classes which refer to it? @Override only affects the declaring class.
@Deprecated could affect another class, but could only cause a warning to
be generated.

> Perhaps I'll make it a "tighter fit" later, though honestly I'm still
> fumbling around in the dark at the moment, still learning.

PROTIP: that phase never actually ends.

tom

--
coincidences, body modification, hungarian voice sebestyen marta, **
From: Tom Anderson on
On Sun, 9 May 2010, Joshua Maurice wrote:

> On May 9, 3:23�am, Tom Anderson <t...(a)urchin.earth.li> wrote:
>> On Sat, 8 May 2010, Joshua Maurice wrote:
>>> On May 7, 9:08�pm, "Mike Schilling" <mscottschill...(a)hotmail.com>
>>> wrote:
>>>> Joshua Maurice wrote:
>>>>> If anyone else cares, I managed to inadvertently stumble across a
>>>>> solution. On impulse, I asked a co-worker at lunch. It seems that
>>>>> class files do not contain sufficient information with default javac
>>>>> options. However, when compiled with -g, it contains a listing of all
>>>>> types used in the compile.
>>
>> Are you sure?
>>
>> $ javac -version
>> javac 1.6.0_16
>> $ echo "class Foo {public static final int X=23;}" >Foo.java
>> $ echo "class Bar {public static final int Y=Foo.X;}" >Bar.java
>> $ javac -g Foo.java Bar.java
>> $ grep Foo Bar.class
>> $
>>
>> I can see no sign of Bar.class containing any mention of Foo.
>
> Apparently I am mistaken. I would suggest looking for "Bar" and not
> "Bar.class", but the result is the same. static finals might be an
> exception to the debug information, which is sad. I'm wondering how
> I'll work around this now.

I wonder how hard it would be to modify javac to add an attribute to
classes to record the origins of any inlined constants. That would let you
pull the information out from the class file later on, just as you can
already do with all the other types of dependency.

A bit of a poke around in the javac source code suggests it already has
code for tracking dependencies, which is there to support JWS in some way.
There are some flags that should switch it on (-xdepend, -Xdepend, -Xjws),
but they don't work on the version i have installled. There's also some
flag you can set on the compilation environment object that will make it
print dependencies, so if you're driving compilation from code, you should
be able to set that. You'd have to parse the compiler output, but that's
not that bad.

> I am still in the process of implementing, so I haven't really been able
> to test,

You're doing it wrong. Test first, test incrementally, build things in a
way you can test as you go. Before you start work for real, do a
higher-level test, a 'spike solution', to make sure that all your
assumptions (like this one) are valid.

> Hopefully this is the only such corner case. I need more tests.

Always true!

Since you have this vast and terrifying codebase, you can probably
generate some pretty thorough functional tests from it. Take pairs of
adjacent revisions from source control, do full builds on each, find the
differences in output, then apply your tool and see if it comes up with
the right answers. You should be able to automate the process of turning a
pair of revisions into a test suite, and then you can just leave it to
crank away generating them for a few days.

tom

--
coincidences, body modification, hungarian voice sebestyen marta, **
From: Mike Schilling on
Joshua Maurice wrote:
> On May 9, 1:31 am, "Mike Schilling" <mscottschill...(a)hotmail.com>
> wrote:
>> Joshua Maurice wrote:
>>> On May 7, 9:08 pm, "Mike Schilling" <mscottschill...(a)hotmail.com>
>>> wrote:
>>>> Joshua Maurice wrote:
>>>>> If anyone else cares, I managed to inadvertently stumble across a
>>>>> solution. On impulse, I asked a co-worker at lunch. It seems that
>>>>> class files do not contain sufficient information with default
>>>>> javac options. However, when compiled with -g, it contains a
>>>>> listing of all types used in the compile. When combined with Ghost
>>>>> Dependencies, I think this can result in a correct incremental
>>>>> build at the file level which will not cascade endlessly
>>>>> downstream. I'm working on the finishing touches to my prototype
>>>>> now.
>>
>>>> You realize that you're now going to recompile a class when it
>>>> refers to another class to which a comment was added.
>>
>>> Yes. I'm pretty sure that it would be better than doing a full clean
>>> build or a cascading jar-dir-unit incremental build.
>>
>> No doubt, but the result isn't the minimal amount of recompilation
>> we were discussing earlier.
>
> I'm not sure what this minimal recompile which we were discussing is.
> It is technically impossible to do a true minimal recompile
> algorithmically. Let's define it as "Let a build be a set of file
> compilations. Let the minimum recompile be the minimum such set for
> which the output class file are equivalent to the class files of a
> full clean build." First, we'd have to prove such a minimum exists.
> That's relatively straightforward. With that out of the way, I think I
> could then prove that the problem is equivalent to the Halting
> problem. If you define "equivalent" generously, I'm pretty sure this
> is the case. If you define it as "same binary file content", then
> perhaps not, though still possibly yes.

Why would you think that? The ways in which a change to A can affect B is
finite and well-defined.


From: EJP on
On 10/05/2010 1:26 AM, Tom Anderson wrote:

Doesn't/didn't 'jikes' do all this?