From: Arne Vajhøj on 4 Jul 2010 19:36

On 04-07-2010 19:19, Joshua Cranmer wrote:
> On 07/04/2010 05:16 PM, BGB / cr88192 wrote:
>> "Arne Vajhøj" <arne(a)vajhoej.dk> wrote in message
>> news:4c30c519$0$281$14726298(a)news.sunsite.dk...
>>> 10 years ago people considered it very cool that you could
>>> decompile.
>>>
>>> Today it is old news and only those that for some unusual reason
>>> really need the functionality are interested.
>>
>> I suspect part of the issue may be that most practical uses of
>> decompilation are either questionable (decompiling code for which one
>> doesn't legally have ownership) or nefarious (the previous, but with
>> intent to either steal said code, or attempt to circumvent or
>> exploit).
>
> Researchers have pretty much established that decompilation has
> substantial valid uses (supposedly, 20% of all source code just simply
> doesn't exist anymore); I myself had to decompile my own code due to
> an undiscovered feature in my version control system.

Some source code is certainly lost. But my guess is that the percentage
for Java is lower, because there is not that much 40-year-old Java
code.

And if the source code is lost, there is a decent chance that it is
because there was no need to modify the code for a long time, which
makes it less likely that it will ever need to be modified.

On the other hand, oopses do happen: for the OP, for your VCS feature,
and for many other reasons. (I guess most people have lost source code
at some point.) But I hope those circumstances still qualify as an
"unusual reason".

Arne
From: Arne Vajhøj on 4 Jul 2010 19:38

On 04-07-2010 19:12, Tom Anderson wrote:
> On Sun, 4 Jul 2010, Joshua Cranmer wrote:
>> In any case, interest in decompiling has significantly waned over the
>> past decade or so. A project or two on sourceforge claim to support
>> Java 5 decompilation, but I haven't tested them in depth.
>
> I wonder if the driver of the fall of decompilation is the rise of
> open source, and perhaps also open standards. If your landscape
> consists of, say, the JDK, JBoss, Spring, and Hibernate, then there
> are easier and more reliable ways to get hold of source code than
> decompilation.

If 50% of all Java code is open source, then that should reduce the
need by 50%.

Unless one does as the OP: modifies the open source code and loses the
modifications.

Arne
From: Joshua Cranmer on 4 Jul 2010 19:55

On 07/04/2010 07:12 PM, Tom Anderson wrote:
> On Sun, 4 Jul 2010, Joshua Cranmer wrote:
>> In any case, interest in decompiling has significantly waned over the
>> past decade or so. A project or two on sourceforge claim to support
>> Java 5 decompilation, but I haven't tested them in depth.
>
> I wonder if the driver of the fall of decompilation is the rise of
> open source, and perhaps also open standards. If your landscape
> consists of, say, the JDK, JBoss, Spring, and Hibernate, then there
> are easier and more reliable ways to get hold of source code than
> decompilation.

I think a better explanation is that it was never really a widespread
avenue of research to begin with. Academically, decompilation consists
of disassembly [1], control structure identification, and type and
variable analysis. The middle part is pretty much a solved problem, and
I'm reasonably sure that type/variable analysis is also pretty well
solved. Disassembly has, by and large, remained difficult for native
code, though great strides have been made in the last 20 years or so.

Since Java bytecode doesn't mash data and code together in the same
space, and given how much structural information is left in the
bytecode, Java induced a massive spurt in decompilers simply because it
was easy to decompile. I'm guessing this spurt was more of a proof of
concept than a full-blown branching out. Since fully automated
disassembly is the most unsolved portion of decompiling, Java is
academically uninteresting to decompile; furthermore, you don't need to
go the full decompiler route to showcase improvements in disassemblers.

On top of all this, one of the major problem classes for reverse
engineering in general is dealing with malware, which mostly exists as
native code, not in bytecode languages.
You can see that there are a handful of decompilers, defunct or
otherwise, for other bytecodes (I know of two or three each for Python
and .NET); the only two languages which have a large number of
decompilers are Java (because it was easier) and C (because it was
harder).

In short, academically, Java decompilation is effectively a solved
problem, but maintaining an up-to-date decompiler for Java (or any
other bytecode language) is not something many people wish to do. This
has probably been true since before Java was created: the current lack
of modern decompilers is probably more attributable to the fading of
the abnormal interest generated by Java being the first major bytecode
language in existence.

For an open source project to survive, it needs a critical threshold of
developers. The Java decompiler market is already crowded with several
"good enough" solutions, C decompilers are effectively beyond the state
of the art [2], and the interest in other markets is generally
insufficient to sustain even a small operation. Perhaps a tool which
could become the "gcc" of decompilers (able to go from many source
architectures to many destination languages) might achieve this
threshold. But unless a tool achieves substantially better results, it
is probably not going to be successful as a project.

[1] I'm glossing over a lot of stuff here which is actually quite
difficult for native code, but many of the problems don't exist for
Java.

[2] In the sense of fully automated decompilation. x86 disassembly is a
royal pain in the butt; while there exist tools that can do this well
(IDA!), I'm not aware of anything that could be used in open-source
software [3].

[3] On reflection, I suppose LLVM is utilizing its x86 assembly
architecture for disassembly (for debugging purposes).

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
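The structural transparency Joshua describes is easy to demonstrate. As
a minimal sketch (the class name is arbitrary, not from the thread),
the program below reads the fixed header of its own .class file; the
magic number and class-file version are the first things any Java
disassembler parses:

```java
import java.io.DataInputStream;
import java.io.InputStream;

// Reads the fixed header of its own .class file. Every class file
// begins with the magic number 0xCAFEBABE, followed by the minor and
// major class-file version numbers.
public class ClassHeader {
    public static void main(String[] args) throws Exception {
        try (InputStream raw =
                 ClassHeader.class.getResourceAsStream("ClassHeader.class");
             DataInputStream in = new DataInputStream(raw)) {
            int magic = in.readInt();            // always 0xCAFEBABE
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();  // e.g. 52 for Java 8
            System.out.printf("magic=%08X major=%d minor=%d%n",
                              magic, major, minor);
        }
    }
}
```

Unlike a stripped native binary, every class file carries this
self-describing header, a typed constant pool, and per-method bytecode
with explicit boundaries, which is why writing a Java disassembler is
comparatively easy.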
From: Lew on 4 Jul 2010 20:36

Tom Anderson wrote:
>> I wonder if the driver of the fall of decompilation is the rise of
>> open source, and perhaps also open standards. If your landscape
>> consists of, say, the JDK, JBoss, Spring, and Hibernate, then there
>> are easier and more reliable ways to get hold of source code than
>> decompilation.

Arne Vajhøj wrote:
> If 50% of all Java code is open source then it should reduce
> the need by 50%.
>
> Unless one does as the OP and modify the open source code
> and lose the modifications.

I wonder what those modifications to the OP's "large JAR files such as
Hibernate" were. It's hard to imagine that local variations which were
carelessly not maintained in a source-control system should be so
extensive or so necessary that one could not abandon them altogether,
or do as I've had to do on occasion in my career and recapitulate them
from the spec. If the modifications were so all-fired important, then
the modifiers were criminally negligent not to preserve their source.

I advise the OP to abandon his dependency on them and go to the
canonical versions of those "large JAR files" (really, libraries of
classes - confusing classes with files is all too common a mistake).

--
Lew
From: Arved Sandstrom on 5 Jul 2010 05:47
Lew wrote:
[ SNIP ]
> As for justification to rewrite parts of Hibernate, I am at best
> skeptical. Hibernate is a robust and rather complete set of
> libraries. I have to wonder what changes it required that would not
> have been better served by writing libraries or client code extrinsic
> to the Hibernate libraries themselves. OP? What raises my suspicions
> even further is that the rewrites were performed by people who didn't
> have the wisdom to protect their code against loss.

In my experience, permitted customizations (through inheritance or
interface implementation), along with the occasional application of
known patches (which may be officially available only for versions
later than what you have deployed), sometimes end up bundled into the
original JARs. Throw in a reluctance to put third-party source under
version control (*), and possibly a hot-fix system with inadequate
tracking, and you can easily end up with what I call "mystery" or
"magic" JARs: they are labelled as a well-known JAR but aren't in fact
quite the same thing. It's more bad configuration management than it is
strictly bad version control.

What happens with mystery JARs is that as soon as the original
developers are no longer actively involved, the prime directive becomes
to carefully preserve them. Knowledge of how to _make_ them is lost.
Experienced developers of later generations know they have the mystery
JARs by looking at the file sizes, the secret of using these JARs is
passed down carefully, and nobody is willing to rip them out because
the knowledge of what they do differently has vanished into the mists
of time.

I agree with you that the best thing to do in this scenario is to go
back to the stock JARs and simply deal with what breaks. Frequently
_nothing_ breaks, because all of your other stuff has moved on and the
custom stuff is no longer being used.
AHS

* I've seen this time and time again: a strong aversion to making
in-house mods to open source, and hence no willingness to have
third-party open source in one's own version control - though there is
absolutely no hesitation about using third-party OSS. And yet there are
no qualms about letting your developers hack your *own* code beyond
recognition (and beyond effective documentation and design). It's like
saying that the developers of the third-party code knew what they were
doing but your own developers don't.

--
Without requirements or design, programming is the art of adding bugs
to an empty text file. -- Louis Srygley
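One practical antidote to the mystery JARs Arved describes is to record
a content hash for every third-party JAR at build or deploy time, so a
locally patched copy can no longer pass itself off as the stock
artifact. A hedged sketch (the class name and the command-line usage
are illustrative, not from the thread; any file path works):

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;

// Computes a SHA-256 fingerprint of a file, e.g. a deployed JAR.
// Comparing the result against the checksum published for the stock
// release reveals "mystery" copies that were locally modified.
public class JarFingerprint {
    public static String sha256(Path file) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) != -1; ) {
                md.update(buf, 0, n);
            }
        }
        // Render the digest as lowercase hex, the usual checksum format.
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sha256(Paths.get(args[0])));
    }
}
```

File sizes, as Arved notes, are the folklore version of this check; a
cryptographic hash stored next to the dependency list makes the same
comparison reliable and scriptable.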