From: Lew on
Peter Duniho wrote:
> The .NET community has been very well-served by Red Gate's Reflector
> utility, which does a wonderful job of disassembling .NET programs. No
> one has tried to sue or prosecute them, nor do I think it likely anyone
> would.
>
> I doubt that a Java disassembler would be at any greater risk for same.

Especially given that "javap" comes standard with the JDK as part of the free
tools suite.

--
Lew
From: BGB / cr88192 on

"Joshua Cranmer" <Pidgeot18(a)verizon.invalid> wrote in message
news:i0tfnc$le7$1(a)news-int.gatech.edu...
> On 07/05/2010 03:43 PM, BGB / cr88192 wrote:
>> native disassembly is not *that* difficult, as it is mostly a matter of
>> having:
>
> In the context of disassembly as a prerequisite for decompiling, it can be
> difficult. I will agree that disassembling a small fragment is no
> challenge, but the issue is mostly program-wide decompiling and
> disassembling. Tasks like determining function boundaries and call frames
> I am including in disassembly, and this is not exactly an easy task,
> especially if you compile with -OMG.
>

yeah.

for languages like C and C++, there are some useful hints:
the target of a 'call' is typically the entry point of a function (even more
so if one sees the usual "push ebp; mov ebp, esp" sequence...);
the "ret" is often the end, and is almost assuredly the end so long as there
were not previously conditional jumps past this point...
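
as a rough illustration of the call-target hint, here is a minimal Python sketch, assuming 32-bit x86 and a flat code buffer. it looks for the near-call opcode (E8 followed by a 32-bit relative offset) and checks whether the target starts with "push ebp; mov ebp, esp" (55 89 E5 or 55 8B EC). the function name, the sample bytes, and the "base" parameter are made up for the example. it scans every byte offset, so an E8 byte inside another instruction or inside data will give false positives; it is a heuristic, not a disassembler.

```python
import struct

# Two common encodings of "push ebp; mov ebp, esp" on 32-bit x86.
PROLOGUES = (b"\x55\x89\xe5", b"\x55\x8b\xec")

def candidate_functions(code: bytes, base: int = 0):
    """Map each in-range CALL rel32 target to True if a prologue starts there.

    Hypothetical helper: 'base' is the address the buffer is assumed to be
    loaded at. Every byte offset is tried, so false positives are expected.
    """
    found = {}
    for i in range(len(code) - 4):
        if code[i] != 0xE8:          # E8 = near call with 32-bit relative offset
            continue
        (rel,) = struct.unpack_from("<i", code, i + 1)
        target = i + 5 + rel         # offset is relative to the next instruction
        if 0 <= target < len(code):
            has_prologue = code[target:target + 3] in PROLOGUES
            addr = base + target
            found[addr] = found.get(addr, False) or has_prologue
    return found

# Made-up sample: call +3; ret; nop; nop; push ebp; mov ebp, esp; pop ebp; ret
SAMPLE = bytes([0xE8, 0x03, 0x00, 0x00, 0x00, 0xC3, 0x90, 0x90,
                0x55, 0x89, 0xE5, 0x5D, 0xC3])

if __name__ == "__main__":
    for addr, ok in sorted(candidate_functions(SAMPLE, base=0x1000).items()):
        print(hex(addr), "prologue" if ok else "no prologue")
```

a target that is both called and starts with the prologue is a stronger candidate than one with only one of the two signals, which is the "even more so" part of the hint above.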


granted, this is where recursive tracing becomes fairly important...

a simple pass of just trying to scan over the code and disassemble any seen
opcodes, is not likely to produce that much more than garbage (unless the
compiler has been nice, giving only valid instructions and padding with
NOPs).
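
to make the difference concrete, here is a small Python sketch on a made-up toy instruction set (the opcodes, lengths, and sample program are all hypothetical, and far simpler than x86). the linear sweep decodes whatever it sees and gets desynchronized by data bytes sitting between instructions; the recursive traversal starts from a known entry point and only follows control flow, so it never touches the data.

```python
# Hypothetical toy ISA: opcode -> (mnemonic, total length in bytes).
OPCODES = {
    0x01: ("nop", 1),
    0x02: ("jmp", 2),   # unconditional, operand = absolute target
    0x03: ("jz", 2),    # conditional, operand = absolute target
    0x04: ("call", 2),  # operand = absolute target
    0x05: ("ret", 1),
    0x06: ("load", 2),  # operand = immediate value
}

def decode(code, pc):
    """Decode one instruction at pc; return (mnemonic, operand, length) or None."""
    entry = OPCODES.get(code[pc])
    if entry is None:
        return None
    name, length = entry
    if pc + length > len(code):
        return None
    operand = code[pc + 1] if length == 2 else None
    return name, operand, length

def linear_sweep(code):
    """Decode from offset 0 upward, skipping one byte on anything invalid."""
    out, pc = {}, 0
    while pc < len(code):
        insn = decode(code, pc)
        if insn is None:
            pc += 1
            continue
        out[pc] = insn[:2]
        pc += insn[2]
    return out

def recursive_traversal(code, entry=0):
    """Decode only what is reachable from 'entry' by following control flow."""
    out, work = {}, [entry]
    while work:
        pc = work.pop()
        while 0 <= pc < len(code) and pc not in out:
            insn = decode(code, pc)
            if insn is None:
                break
            name, operand, length = insn
            out[pc] = (name, operand)
            if name in ("jmp", "jz", "call"):
                work.append(operand)      # trace the branch target later
            if name in ("jmp", "ret"):
                break                     # no fall-through after these
            pc += length
    return out

# 0: jmp 5 | 2..4: data | 5: call 9 | 7: ret | 8: data | 9: load 42 | 11: ret
PROGRAM = bytes([0x02, 0x05, 0xFF, 0xFF, 0x06, 0x04, 0x09, 0x05,
                 0xFF, 0x06, 0x2A, 0x05])

if __name__ == "__main__":
    print("linear:   ", linear_sweep(PROGRAM))
    print("recursive:", recursive_traversal(PROGRAM))
```

on this sample the linear sweep reports a bogus "load" at offset 4 (a data byte) which swallows the real "call" at offset 5, while the traversal recovers the instructions at 0, 5, 7, 9 and 11. the sketch ignores indirect jumps and calls, which is exactly where real recursive tracing has to start guessing.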

in some cases, one "might" have to resort to multiple instruction graphs and
using probabilities to weight which graph is more likely to be correct...


>> now, granted, SMC could foul this up, but given SMC is both rare and
>> problematic in modern systems, this is not too much of an issue.
>
> Self-modifying code probably makes up the vast majority of "interesting"
> cases for disassembly: malware.
>

yeah...

but, this is rare in most typical EXEs...

granted, my codebase does make some use of SMC internally...

>>> [1] I'm glossing over a lot of stuff here which is actually quite
>>> difficult for native code, but many of the problems don't exist in Java.
>>
>> large complicated ISA and awkwardness of recursive jump-tracing?...
>
> Not needing to worry about the pain of code and data sharing the same code
> space (separation of code and data is equivalent to the halting problem)
> is a major factor. Determining function arguments (in light of things like
> fastcall or -fomit-frame-pointer) and even function boundaries is another
> annoying issue. It also helps that Java bytecode is typically unoptimized,
> so you get very sane CFGs.
>

yes, but these are problems for a decompiler, not necessarily the
disassembler phase.
if the disassembler can produce a sane representation of all the
instructions, its work is done...

most of this should work well enough with graph-tracing, and much of the
rest should work acceptably with graph-weighting. it will also generally be
able to skip over most garbage, because unlike a naive linear scan, it will
not get caught up trying to disassemble these regions (or, if it does, it
will likely mark them as a low-probability / speculative part of the graph).


> I suppose Java bytecode is roughly comparable to having a binary compiled
> with -g with full debug symbols and no optimization whatsoever, with the
> header files probably also included.
>

yeah, but again, this is not a problem for a plain disassembler, although it
is plenty relevant for decompiling...


>> yeah, probably seems like I am wasting time, but:
>> LLVM is mostly aiming for being a high-performance codegen and code
>> analysis;
>> my main goal is mostly for making high-level features available from C
>> (such as reflection and eval, as well as ability to load scripts, and
>> cleanly integrate between C and high-level scripting languages, ...),
>> which all in all deal with a somewhat different set of problem domains...
>
> Reflection and C++ don't mix very well. I could go on for hours about
> this, but by then we'd have long since gone well off-topic.
>

yes, well, it is not ideal...
but, I have made it work well enough (for C at least, but I don't bother
with full C++ support), granted mostly via information mining and writing my
own side-compiler...

technically, one can also use debugger info, but this is not standardized
and not always available, whereas headers are more often available, and
processing the headers can provide much of the info needed to perform
reflection on them...


>> Java also presents its share of interfacing issues...
>
> At least there exists a single Java ABI. C++ on the other hand...
>

fair enough...

x86 gives several common ABIs, and x86-64 several more.
as noted, I have not dealt with full C++, as C++ opens up far more issues
than I am going to deal with for the time being...

luckily, full C++ support isn't needed anyways, as one can require use of
either C or 'extern "C"' for any shared API code, which is usually already
common practice in my case...



From: BGB / cr88192 on

"Peter Duniho" <NpOeStPeAdM(a)NnOwSlPiAnMk.com> wrote in message
news:YtydnaN87tKIo6_RnZ2dnUVZ_rOdnZ2d(a)posted.palinacquisition...
> BGB / cr88192 wrote:
>> [...]
>>> The .NET community has been very well-served by Red Gate's Reflector
>>> utility, which does a wonderful job of disassembling .NET programs. No
>>> one has tried to sue or prosecute them, nor do I think it likely anyone
>>> would.
>>>
>>> I doubt that a Java disassembler would be at any greater risk for same.
>>
>> disassembly is common and fairly non-problematic, mostly due to its fair
>> number of non-infringing uses...
>>
>> now, the matter is decompilers, which themselves present a few more
>> problems on the legal front...
>
> I don't think so.
>
> I was imprecise, as .NET Reflector is both a disassembler (inasmuch as
> Java or .NET byte code are "assembly" languages) and a decompiler
> (Reflector will reconstruct to the best of its impressive abilities any
> managed language version of the MSIL it's analyzing). It has had no real
> legal challenges to its existence or use.
>

fair enough...
then again, it is possibly because this hasn't really come up in court...

the question is not what has happened thus far, but what would a judge
and jury conclude, and which jurisdictions the plaintiff and defendants are
located within...

for example, if it was a US company vs other people in the US (especially if
within the same city or state), they may have a problem.

now, the issue then is if anyone has tried to use it in a means violating
the DMCA.
if they have not (at which point a company can't summon up the police), and
not significantly offended any big company, then no-one is likely to care,
and the matter will not come up (legal or otherwise).

even if a company is annoyed about something, they usually won't act unless
either they are trying to make an example of someone, or they stand to make
money from it (unlikely from individuals, since an individual may not even
be able to pay off the company's lawyer fees, much less make them any
profit). a small company usually will not sue as it would not be worth the
time, effort, and costs involved.


> The fact that the tool displays the low-level byte code as some
> reconstructed higher-level language versus simply a textual representation
> of the byte code itself is essentially irrelevant. In neither case is
> copyright being violated. Simply rearranging, reformatting, redisplaying,
> etc. some copyrighted material that you already have legal access to does
> not in and of itself violate the copyright.
>

yes...


however, the issue is if the tool can be used for committing a crime, and
has any substantial non-infringing use.
this would be the type of argument likely to come up WRT a tool.

realize that it is fairly similar reasoning why gun-control, ..., is so
common in the US as well...
it does not itself commit a crime, but is a tool which can be used in
committing a crime.
(then one can argue on matters of reasonable uses, such as self-defense and
hunting, ...).


however, often one can get around this issue by offering a disclaimer
against whatever type of infringement.

"we, as company A, do not endorse the use of this tool for violating
copyright or any other laws, and take no responsibility for such acts, ...".
often, making such a disclaimer is sufficient, so long as a non-infringing
use could also be demonstrated (such as the OP's original source-recovery
problem...).

no disclaimer or demonstration means one may be liable if by some chance it
comes up in court.


for example, these sorts of disclaimers allow people running "head shops" in
the US.
it is well known what is the purpose of a head-shop, but they can run
legally due to such a disclaimer (and, yes, one can so totally smoke tobacco
in a bong...).

or such...


From: Arne Vajhøj on
On 05-07-2010 18:00, Mike Schilling wrote:
> "Peter Duniho" <NpOeStPeAdM(a)NnOwSlPiAnMk.com> wrote in message
> news:Ua-dnaBDH6Wsr6_RnZ2dnUVZ_oWdnZ2d(a)posted.palinacquisition...
>> The .NET community has been very well-served by Red Gate's Reflector
>> utility, which does a wonderful job of disassembling .NET programs. No
>> one has tried to sue or prosecute them, nor do I think it likely
>> anyone would.
>
> I've always suspected that Microsoft pays for Reflector and provides
> helpful hints to its developers. That's a lot cheaper than documenting
> all the hidden behavior you currently learn about only from using
> Reflector on the system assemblies.

I don't think so.

MS makes the real source code available under a readonly
type of license for the same purpose.

Arne
From: Peter Duniho on
BGB / cr88192 wrote:
> [...]
>> I was imprecise, as .NET Reflector is both a disassembler (inasmuch as
>> Java or .NET byte code are "assembly" languages) and a decompiler
>> (Reflector will reconstruct to the best of its impressive abilities any
>> managed language version of the MSIL it's analyzing). It has had no real
>> legal challenges to its existence or use.
>>
>
> fair enough...
> then again, it is possibly because this hasn't really come up in court...
>
> the question is not what has happened thus far, but what would a judge
> and jury conclude, and which jurisdictions the plaintiff and defendants are
> located within...

That is always true with any legal question. However, it's fairly
implausible that a tool that doesn't do any decryption applied to data
that hasn't been encrypted would somehow be found in violation of the
DMCA with respect to the "no circumventing encryption" clause.

That said, there are some people who can tolerate no risk whatsoever.
Those people are probably not well-suited to engaging in any creative
business at all, since creative enterprises inherently involve doing
things that no one has done before, and thus for which there is poor or
no legal precedent to look to for guidance.

Fortunately, there are still people who are willing to try something new
and not worry about remote possibilities of legal action or other
unlikely events. And the vast majority of the time, no harm comes to them.

> [...]
>> The fact that the tool displays the low-level byte code as some
>> reconstructed higher-level language versus simply a textual representation
>> of the byte code itself is essentially irrelevant. In neither case is
>> copyright being violated. Simply rearranging, reformatting, redisplaying,
>> etc. some copyrighted material that you already have legal access to does
>> not in and of itself violate the copyright.
>
> yes...
>
>
> however, the issue is if the tool can be used for committing a crime, and
> has any substantial non-infringing use.
> this would be the type of argument likely to come up WRT a tool.

I think it's a pretty far stretch to argue that a decompiler in any
significant way helps someone violate copyright, never mind in a context
that involves encryption. And it _clearly_ has substantial
non-infringing use.

Pete