From: Oliver Betz on
Hello Don,

[...]

>>> I am leary of encryption as, historically, it has always managed
>>
>> You mean "leery"? So you are biased due to lack of knowledge?
>
>On the contrary, I've used encryption successfully in past
>projects. I've also used obfuscation. The above statement
>merely addresses the reality of encryption techniques
>failing as technology improves. In the 1970's, crypt(1)
>was strong enough to deter deliberate password attacks.
>Nowadays, you can break an even more secure password
>in minutes using hardware you buy at your local department
>store.

I didn't expect that the watermarking you are looking for would have
to resist attack for decades. As far as I understand, the goal is to
identify preproduction samples.

[...]

> "Anyone with FIRST HAND experience deploying watermarking
> technologies? Comments re: static vs dynamic techniques?"
>
>But, as is so often the case, instead of answering the
>question *asked*, folks reply with "Why do you want to
>do *that*? Why don't you do THIS instead?"

I gave a solution for _exactly_ your request, and I have first hand
experience (though I haven't yet embedded any individual IDs, that
part is trivial).

>As I've said in the past, "assume I am competent". Do you
>require *your* employers to justify each decision *they*
>make when it is handed down to you?

Why should they have to "justify" anything? Still, I dare to tell our
customers when what they want differs from what they need, or when
their decision might be based on wrong assumptions.

Oliver
--
Oliver Betz, Munich
despammed.com might be broken, use Reply-To:
From: D Yuniskis on
Hi Oliver,

Oliver Betz wrote:
>>>>> yes, I don't use any "watermarking".
>>>> OK. So, if someone copied one of your devices, you would have
>>>> no way of knowing which particular device was the source of
>>>> the copy (?)
>>> no, I don't care. Nobody did so till now and there is also some other
>>> specific knowledge he needed to be successful.
>> I'm not claiming that you *do* care. Rather, I am trying
>> to illustrate the different "problem" you are trying to
>> address. *I* am trying to identify *which* of N "copies"
>> of a device served as the genesis for a counterfeit product.
>
> no need to repeat, you already explained that.
>
> [...]
>
>> I'm not interested in discussing counterfeiting techniques.
>
> [...]
>
>> For a bit of perspective, try a read of Huang's
>> _Hacking the Xbox: An Introduction to Reverse Engineering_
>
> not really applicable, as I wrote.

My point was to show what *can* be done "informally" by a
"casual hacker" (not a well-funded adversary).

>> I am leary of encryption as, historically, it has always managed
>
> You mean "leery"? So you are biased due to lack of knowledge?

On the contrary, I've used encryption successfully in past
projects. I've also used obfuscation. The above statement
merely addresses the reality of encryption techniques
failing as technology improves. In the 1970's, crypt(1)
was strong enough to deter deliberate password attacks.
Nowadays, you can break an even more secure password
in minutes using hardware you buy at your local department
store.

> In the end, it's a cost/benefit question.

Sure.

> But this discussion is developing like other ones you started in the
> recent past. It seems you have already decided what to do and you
> don't consider other methods impartially.

Um, I *do* have contractual obligations. You know, "bosses". :>
If I am hired to do X, I can't very well say, "No, I am going
to do Y instead."

I clearly stated:

"Anyone with FIRST HAND experience deploying watermarking
technologies? Comments re: static vs dynamic techniques?"

But, as is so often the case, instead of answering the
question *asked*, folks reply with "Why do you want to
do *that*? Why don't you do THIS instead?"

As I've said in the past, "assume I am competent". Do you
require *your* employers to justify each decision *they*
make when it is handed down to you?

It would be just as easy for me to assume *respondents* were
NOT competent, since they were unable to answer straightforward
questions, right?

I think I have been patient in trying to explain some of the
other concerns that may have motivated these "decisions"
(*before* they were "handed down" to me) so folks can
accept that others *may* have a different outlook on how
and why things are done. As I also stated in my OP, I've
already had first-hand experiences with watermarking and
was acknowledging that technologies FOR WATERMARKING will
have changed in the years since. So, I'm not "fishing"
for information regarding something I've no experience with.
From: Nobody on
On Thu, 15 Apr 2010 19:30:36 -0400, George Neuner wrote:

> I've read about "functional marking", changing the program's runtime
> behavior based on some key ... for example, strobing indicator lights
> in a pattern that's hard to discern by eye but could be clearly seen
> using a movie camera. But I don't see an easy way to do something
> that would remain non-obvious if the program were disassembled.

This gets around an issue with equivalent code sequences, namely that
decompiling then recompiling with optimisation will tend to eliminate any
watermarks.

If the watermark affects the code's observable behaviour in any way,
then it would have to be preserved by any "equivalent" code. Removing the
watermark would require understanding the code to the extent that it could
be modified such that the modified behaviour was merely "close enough"
rather than identical.
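
To make the strobing idea concrete, here's a minimal sketch (every
name below is invented for illustration; on real hardware, led_set()
and delay_ms() would be board support routines rather than stubs):

#include <stdio.h>

#define UNIT_KEY 0xB3A5u                 /* per-unit mark (example) */

/* desktop stand-ins for board support routines */
static void led_set(int on)       { printf("%d", on); }
static void delay_ms(unsigned ms) { (void)ms; }

/* Emit the 16-bit key, LSB first, one bit per 20 ms: hard to read
 * by eye, but recoverable frame-by-frame from a video recording. */
static void watermark_strobe(void)
{
    for (int bit = 0; bit < 16; bit++) {
        led_set((UNIT_KEY >> bit) & 1);
        delay_ms(20);
    }
    led_set(1);                          /* resume steady "on" */
}

int main(void)
{
    watermark_strobe();
    printf("\n");
    return 0;
}

Recompiled "equivalent" code still has to reproduce this strobe;
removing it changes the observable behaviour.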

From: whygee on
D Yuniskis wrote:
> Ideally, the modified code has no observable differences -- other
> than the actual memory image (at *run* time).
I've just been thinking about another trick:
chaffing and flaring in the padding/alignment portions of
the code... Generate stupid, randomized sequences of instructions
that call and jump like mad in the "dead" portions of the binary...
It's of no use if the attacker traces/probes the actual
execution, but static analysis will go crazy AND it provides
quite some room for pseudo-random watermarks
(which is the original intent anyway).

With GCC, the padding/alignment is managed
at the ASM level, so one just has to modify/patch the assembler;
it can't be done within the compiler itself.
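
A toy generator for such chaff might look like the following (a pure
sketch: the mnemonics, section name and seed are invented, and the
real thing would be built into the patched assembler):

#include <stdio.h>
#include <stdlib.h>

/* Print an AT&T-syntax x86 "tangle" of never-executed jumps and
 * calls, derived from a pseudo-random seed - the seed IS the mark. */
static void emit_chaff(unsigned seed, int n)
{
    srand(seed);
    printf(".section .text.chaff\n");
    for (int i = 0; i < n; i++) {
        printf("chaff_%d:\n", i);
        switch (rand() % 3) {
        case 0:
            printf("\tjmp chaff_%d\n", rand() % n);
            break;
        case 1:
            printf("\txorl $0x%x, %%eax\n", rand() & 0xff);
            break;
        default:
            printf("\tcall chaff_%d\n", rand() % n);
            break;
        }
    }
    printf("\tret\n");
}

int main(void)
{
    emit_chaff(0x5AC3u, 16);    /* 0x5AC3: example watermark value */
    return 0;
}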

regards,
yg
--
http://ygdes.com / http://yasep.org
From: George Neuner on
On Sun, 18 Apr 2010 14:28:59 -0700, D Yuniskis
<not.going.to.be(a)seen.com> wrote:

>Hi George,
>
>George Neuner wrote:
>> On Fri, 16 Apr 2010 11:08:33 -0700, D Yuniskis
>> <not.going.to.be(a)seen.com> wrote:
>>
>>> What you want is something that an observer with two (or
>>> more) instances (avoiding the term "copy") of an executable
>>> will recognize as "different" -- but, won't be able to easily
>>> figure out how to convert either of them into "yet another"
>>> instance that retains all of the original functionality.
>>
>> That is a *very* different thing than watermarking ... watermarking is
>> simply a scheme to identify the source of an item that may be
>> counterfeited.
>
>Watermarking is of little use if the watermark can be
>easily identified and removed/altered. Indeed, it would
>be trivial to just embed "This is copy #027 of the product"
>in each particular instance.

I understand, but I still say what you are looking for is not really
"watermarking" ... it sounds more like you're looking for some kind of
obfuscation scheme that will rearrange code in ways that are hard to
compare.



>> if you start with a well distributed customer id
>> (say a crypto hash of the customer info) which is 160..256 bits long
>> and only patch locations corresponding to '1' bits in the hash,
>> a counterfeiter would need many samples of patched executables to
>
>But, the counterfeiter can identify these types of
>transforms from as few as two copies and, changing just *one*,
>has now succeeded in changing the watermark! Depending on the
>Hamming distance between "successive (adjacent?)" watermarks,
>this can be enough to argue (in a court of law) that the
>device bearing the altered watermark was not, in fact,
>derived from the device having the unaltered watermark
>visible in the "defendant's" instance of the product.
>
>I.e. the problem is similar to that of authentication
>with digital signatures. You don't want a party to be
>able to forge someone else's signature; NOR BE ABLE TO
>DISAVOW THEIR OWN!

I think you're being too literal ... for most ISAs there are a lot of
things that can be done to an executable that will change the binary
but won't change the function. If you identify a bunch of possible
substitution points then you mix up what you do.

Remember that you said previously that you aren't trying to defend
against disassembly. Just diff'ing executables doesn't tell you what
the differences mean but only that they exist.



>Revisiting an example I posed previously, consider:
>
>int
>foo(...) {
> int A;
> int B;
> int C;
> <body>
>}
>
>vs.
>
>int
>foo(...) {
> int B;
> int C;
> int A;
> <body>
>}
>
>I.e., the two functions will behave the same regardless
>of the contents of <body>, correct? (barring code that
>is self-examining or self modifying)
>
>One could create M4 macros to wrap around each of these
>declarations such that you could "externally" massage
>the sources to effectively reorder the declarations. Right?
>(left as an exercise for the reader)
>
>A compiler *could* rearrange the stack frame to essentially
>rewrite the first instance of foo to be *identical* to
>the second (I believe this is allowed under a strictly
>conforming C compiler).
>
>But, what if the declarations were:
>
>int
>foo(...) {
> int array[3];
> <body>
>}
>
>#define A (array[0])
>#define B (array[1])
>#define C (array[2])
>
>permuted (for a different watermark) to:
>
>#define A (array[2])
>#define B (array[0])
>#define C (array[1])
>
>I suppose a compiler could notice the invariant nature
>of the individual references and, IF THE array IS ENTIRELY
>"LOCAL", rearrange them (though it is hard to see why the
>compiler would *want* to do so... what is *gained* in such
>an optimization?)
>
>The resulting binaries would run in roughly identical time.
>Their results would be identical. Yet, they would be different
>binaries. "Watermarked" uniquely.
>
>This sort of source level translation would be easy to
>test (without having to develop a tool that automagically
>rewrote each function definition "conditionally").
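
For what it's worth, that idea drops straight into a compilable test
like the following. (WATERMARK_PERMUTATION is a flag I just made up,
and the #if is only there to keep both permutations in one listing -
the selection could equally be done by external source massaging, per
your M4 suggestion. Note, too, that a heavily optimizing compiler may
keep the values in registers and erase the stack-offset differences.)

#include <stdio.h>

#if WATERMARK_PERMUTATION == 1
#define A (array[0])
#define B (array[1])
#define C (array[2])
#else
#define A (array[2])
#define B (array[0])
#define C (array[1])
#endif

int foo(int x)
{
    int array[3];
    A = x;                     /* same source, different stack slots */
    B = A * 2;
    C = A + B;
    return C;
}

int main(void)
{
    printf("%d\n", foo(7));    /* prints 21 under either permutation */
    return 0;
}

Build it twice - e.g. cc -DWATERMARK_PERMUTATION=1 vs. =2 - and diff
the binaries.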

That actually ties in with something I thought of. I was thinking
about whygee's idea of abusing data and it occurred to me that
indirect function calls are perfect for abuse in that way.

Imagine something like the following:


#include <stdio.h>

char index2[] =
{
    1,
    0,
};
char index1[] =
{
    1,
    0,
};

/* example worker functions */
int func_1( int x ) { return x + 1; }
void func_2( int x ) { printf( "%d\n", x ); }

void* func_addr( int index );

/* Indirection pointers - assigned at run time (in main), so they
   appear identical in a diff of two customized binaries. */
int (*F1)(int);
void (*F2)(int);

void* func_addr( int index )
{
    /* the real function addresses stay hidden in here */
    void *ptr[] =
    {
        (void *)func_1,   /* function-pointer-to-void* isn't strictly
                             portable, but is common practice */
        (void *)func_2,
    };
    return ptr[index];
}

int main( void )
{
    /* resolve the targets through both index tables:
       index1[0] == 1 -> index2[1] == 0 -> func_1, etc. */
    F1 = (int (*)(int))func_addr( index2[ index1[0] ] );
    F2 = (void (*)(int))func_addr( index2[ index1[1] ] );

    (*F2)( (*F1)(42) );
    return 0;
}


The index tables can be rearranged at will - the actual function
pointers don't change and are hidden in func_addr(). Using 2 stages
is obfuscation - more things changed in a diff of two binaries - with
the added benefit that the two tables must be synchronized for the
program to work.

The indexes are declared as characters so that differences between
"marked" binaries appear to be some kind of ID string - how the values
are really used is not obvious unless the program is disassembled. The
actual indirection pointers, F1, F2, etc. are not initialized until
the program runs - so they should appear identical in a diff of 2
customized binaries.

Obviously this scheme is viable only if you can add enough functions
so that the indexing tables are fairly long. And if this is the only
"marking", you are obviously limited to the number of combinations
made possible by the length of one table (because the 2nd table is
linked). And, also obviously, in an RT situation, you may not want to
call functions indirectly.

This scheme has the secondary benefit that the index tables could be
modified after compilation using a patch tool. I still think it is a
bad idea to use conditional compilation.
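
Such a patch tool could be as simple as the following sketch (purely
hypothetical: it assumes the build embeds a recognizable marker string
immediately ahead of the two tables, and "IDXTBLS" is invented):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_LEN 2
static const char MARKER[8] = "IDXTBLS"; /* assumed to precede tables */

/* overwrite index1/index2 inside a compiled image */
static int patch_tables(const char *path,
                        const char idx1[], const char idx2[])
{
    FILE *f = fopen(path, "r+b");
    if (!f) return -1;

    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    rewind(f);

    char *buf = malloc((size_t)size);
    if (!buf || fread(buf, 1, (size_t)size, f) != (size_t)size) {
        free(buf); fclose(f); return -1;
    }

    /* find the marker; the two tables follow it back to back */
    for (long i = 0; i + 8 + 2 * TABLE_LEN <= size; i++) {
        if (memcmp(buf + i, MARKER, 8) == 0) {
            fseek(f, i + 8, SEEK_SET);
            fwrite(idx1, 1, TABLE_LEN, f);   /* new index1 */
            fwrite(idx2, 1, TABLE_LEN, f);   /* new index2 */
            free(buf); fclose(f);
            return 0;
        }
    }
    free(buf); fclose(f);
    return -1;                               /* marker not found */
}

int main(int argc, char **argv)
{
    const char i1[TABLE_LEN] = { 0, 1 };     /* example permutation */
    const char i2[TABLE_LEN] = { 0, 1 };
    return (argc == 2) ? patch_tables(argv[1], i1, i2) : 1;
}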

In any event, I thought this might give you some ideas.

George