From: D Yuniskis on
Hi George,

George Neuner wrote:
> On Sun, 18 Apr 2010 14:28:59 -0700, D Yuniskis
> <not.going.to.be(a)seen.com> wrote:
>
>> George Neuner wrote:
>>> On Fri, 16 Apr 2010 11:08:33 -0700, D Yuniskis
>>> <not.going.to.be(a)seen.com> wrote:
>>>
>>>> What you want is something that an observer with two (or
>>>> more) instances (avoiding the term "copy") of an executable
>>>> will recognize as "different" -- but, won't be able to easily
>>>> figure out how to convert either of them into "yet another"
>>>> instance that retains all of the original functionality.
>>> That is a *very* different thing than watermarking ... watermarking is
>>> simply a scheme to identify the source of an item that may be
>>> counterfeited.
>> Watermarking is of little use if the watermark can be
>> easily identified and removed/altered. Indeed, it would
>> be trivial to just embed "This is copy #027 of the product"
>> in each particular instance.
>
> I understand, but I still say what you are looking for is not really
> "watermarking" ... it sounds more like you're looking for some kind of
> obfuscation scheme that will rearrange code in ways that are hard to
> compare.

Well, I don't care if it's "code", per se. I just want to
be able to uniquely identify which "instance" of a "device"
was used in producing a (counterfeit) copy. You can rearrange
data, rearrange modules (in the code), change representations
of data *or* code, etc. You just want to get "nearly identical"
performance/operation from "(ideally) WILDLY DISSIMILAR" images.

But, I am acknowledging the fact that a thief would be able
to identify and isolate "trivial" changes to devices. E.g.,
the "This is copy #027 of the image" approach would be
easily recognized (compare two images, look for differences).
Once identified, it would be easy to modify the "watermark" to:
- something totally bogus (untraceable),
  as in "This is copy #ABC of the image"
- something that (intentionally or unintentionally)
  makes it look as if the image was from some *other*
  image, as in "This is copy #029 of the image"

I.e., in the former case, the thief chuckles and you're
left wondering where the source came from. And, in the
latter case, the source of the original image is not only
"lost", but, some other party is implicated as the thief
(intentionally or otherwise).

If this were possible, accusing *any* party of being the
source of the theft could easily be defended against
(in a court of law) by simply demonstrating how easily
(the Defense) could alter the watermark to indicate
yet *another* party -- "This is copy #030 of the image"

I liken it to the problem presented by digital signatures
in that a signature is difficult to forge, which both *protects*
the original party's signature ("You can't hold me
accountable for something that *I* didn't sign") as well
as preventing the original party from *denying* their
signature ("That's not *my* signature!" "Oh, really?
We've already proven it's not possible to *forge* one...")

Note that you can add explicit *or* surreptitious tests
to your code to detect if the image has been tampered
with. But, this is extra effort that doesn't contribute
to the functionality of the device. It also tends to
be easy to spot when this sort of thing is going on in
a device -- *especially* if you do something stupid
with the information like: "Image corrupted. Halting."

(There are clever ways of doing this sort of thing that
make detection harder; but, they still add effort to
the design process that has no direct contribution
to the device being produced)

>>> if you start with a well distributed customer id
>>> (say a crypto hash of the customer info) which is 160..256 bits long
>>> and only patch locations corresponding to '1' bits in the hash,
>>> a counterfeiter would need many samples of patched executables to
>> But, the counterfeiter can identify these types of
>> transforms from as few as two copies and, changing just *one*,
>> has now succeeded in changing the watermark! Depending on the
>> Hamming distance between "successive (adjacent?)" watermarks,
>> this can be enough to argue (in a court of law) that the
>> device bearing the altered watermark was not, in fact,
>> derived from the device having the unaltered watermark
>> visible in the "defendant's" instance of the product.
>>
>> I.e. the problem is similar to that of authentication
>> with digital signatures. You don't want a party to be
>> able to forge someone else's signature; NOR BE ABLE TO
>> DISAVOW THEIR OWN!
>
> I think you're being too literal ... for most ISAs there are a lot of
> things that can be done to an executable that will change the binary
> but won't change the function. If you identify a bunch of possible
> substitution points then you mix up what you do.
>
> Remember that you said previously that you aren't trying to defend
> against disassembly. Just diff'ing executables doesn't tell you what
> the differences mean but only that they exist.

Exactly. With two or more "images" (I am trying to be careful
to avoid using "copy" as copy, in this case, implies a literal
copy!), you can tell that *something* has been done to
"make them unique". That could either be two different
versions of the software (i.e., the two devices are not
intended to perform identically) *or* that they are trying
to "be unique" with the same functionality.

If the differences are nontrivial, then a thief is forced to
understand what is actually going on "under the hood" in
order to be able to disguise the original source of the
image *or* masquerade as some *other* original image source.
I.e., it makes it harder for the identity of the original source
to be obfuscated. It also removes the *accused* source from
claiming that "Heck, anybody could have PLANTED my identity
in that device!"

For example, you can sort a list from top to bottom
in ascending order; *or*, from bottom to top in
*descending* order. The results are the same. And,
if there is no bias to the original ordering in the
(unsorted) list, the operation will take the same
amount of time and consume the same amount of
resources.

But, this *looks* like it would force the developer to
write two different versions of a "sort()" -- which
would burden his development effort just to support this
"mechanism".

*However*, you could simply create different versions of
standard library routines and use Version 1 or Version 2
in a particular "linked image". E.g., memset() that
works from the start to the end vs. the end to the start.
Furthermore, you could include both versions of the routine
and use one in some places and the other in other places
(i.e., exploit things that the developer technically doesn't
have any control over to encode information in the binary)

>> Revisiting an example I posed previously, consider:
>>
>> int foo(...) { int A, B, C; <body> }
>>
>> vs.
>>
>> int foo(...) { int B, C, A; <body> }
>>
>> I.e., the two functions will behave the same regardless
>> of the contents of <body>, correct? (barring code that
>> is self-examining or self modifying)
>>
>> One could create M4 macros to wrap around each of these
>> declarations such that you could "externally" massage
>> the sources to effectively reorder the declarations. Right?
>> (left as an exercise for the reader)
>>
>> A compiler *could* rearrange the stack frame to essentially
>> rewrite the first instance of foo to be *identical* to
>> the second (I believe this is allowed under a strictly
>> conforming C compiler).
>>
>> But, what if the declarations were:
>>
>> int foo(...) { int array[3]; <body> }
>>
>> #define A (array[0])
>> #define B (array[1])
>> #define C (array[2])
>>
>> permuted (for a different watermark) to:
>>
>> #define A (array[2])
>> #define B (array[0])
>> #define C (array[1])
>>
>> I suppose a compiler could notice the invariant nature
>> of the individual references and, IF THE array IS ENTIRELY
>> "LOCAL", rearrange them (though it is hard to see why the
>> compiler would *want* to do so... what is *gained* in such
>> an optimization?)
>>
>> The resulting binaries would run in roughly identical time.
>> Their results would be identical. Yet, they would be different
>> binaries. "Watermarked" uniquely.
>>
>> This sort of source level translation would be easy to
>> test (without having to develop a tool that automagically
>> rewrote each function definition "conditionally").
>
> That actually ties in with something I thought of. I was thinking
> about whygee's idea of abusing data and it occurred to me that
> indirect function calls are perfect for abuse in that way.
>
> Imagine something like the following:
>
> char index2[] = {1, 0, };
> char index1[] = {1, 0, };
>
> int (*F1)(int) = (int (*)(int))(func_addr(index2[index1[0]]));
> void (*F2)(int) = (void (*)(int))(func_addr(index2[index1[1]]));
>
> void* func_addr( int index )
> {
>     void *ptr[] =
>     {
>         func_1,
>         func_2,
>     };
>     return ptr[index];
> }
>
> int main( void )
> {
>     (*F2)( (*F1)(42) );
>     return 0;
> }

Ouch! Yes, that sort of thing would work! (I've not nitpicked
your syntax -- I'll assume it's what I think it to be :> )
The problem (that I see currently) is that index1[] and index2[]
(likewise ptr[]) are very visible (though I acknowledge that
this is undoubtedly just to make the *idea* more obvious)

> The index tables can be rearranged at will - the actual function
> pointers don't change and are hidden in func_addr(). Using 2 stages
> is obfuscation - more things changed in a diff of two binaries - with
> the added benefit that the two tables must be synchronized for the
> program to work.
>
> The indexes are declared as characters so that differences between
> "marked" binaries appear to be some kind of ID string - how the values
> are really used is not obvious unless the program is disassembled. The
> actual indirection pointers, F1, F2, etc. are not initialized until
> the program runs - so they should appear identical in a diff of 2
> customized binaries.
>
> Obviously this scheme is viable only if you can add enough functions
> so that the indexing tables are fairly long. And if this is the only
> "marking", you are obviously limited to the number of combinations
> made possible by the length of one table (because the 2nd table is
> linked). And, also obviously, in an RT situation, you may not want to
> call functions indirectly.
>
> This scheme has the secondary benefit that the index tables could be
> modified after compilation using a patch tool. I still think it is a
> bad idea to use conditional compilation.

I was only showing how I can use macros to experiment with
various approaches. I.e., use the macros to let me
effectively rewrite the sources without actually rewriting
them (lazy? error prone?? :> ). So, I could quickly
create N different variants of an "identical" program and
then examine the stripped executables to verify that the
results were actually "different" and the behavior *identical*.

This saves the effort of writing a tool to do the work
(only to discover that it might have obvious failings?)

> In any event, I thought this might give you some ideas.

One downside is that (the example) forces a coding style on the
developer. I.e., you would have to come up with a way to
extract this sort of "information" from his/her "natural"
way of doing things.

E.g., in C++, you could probably play games with the definitions
of the vtables without the developer ever needing to know
that is happening (or, *will* happen after code "release").
I'll have to think if there are other cases (e.g., in C)
where you can do this "transparently". There may be other
aspects of the design that are more readily suited to this
sort of exploit (though that makes it less "universal")
From: D Yuniskis on
D Yuniskis wrote:
> Anyone with FIRST HAND experience deploying watermarking
> technologies? Comments re: static vs dynamic techniques?
>
> Also, any FIRST HAND (not hearsay) experiences with its use
> in litigation -- that you can *discuss* openly? Or, pointers
> to cases of interest? I only know (first hand) of two and
> technology has changed considerably in the years since...

Perhaps a timely coincidence? ;-)

http://latimesblogs.latimes.com/technology/2010/04/gizmodo-iphone-4g.html
From: D Yuniskis on
whygee wrote:
> D Yuniskis wrote:
>> Ideally, the modified code has no observable differences -- other
>> than the actual memory image (at *run* time).

> I've just been thinking about another trick :
> chaffing and flaring in the padding/alignment portions of
> the code... Generate stupid and randomized sequences of instructions
> that call and jump like mad in the "dead" portion of the binary...

<frown> I see your point. But, I think it would be easy
to identify that portion as being "never accessed" (?)
I.e., it would be comparable to embedding a text string
in the binary -- that is never *referenced* (e.g., RCS
info)

> There is no use if the attacker traces/probes the actual
> execution, but static analysis will get crazy AND it provides
> quite some room for pseudo-random watermarks
> (which is the original intent anyway)

The Unisite's code image (on disk) *looks* like this.
Until you realize it goes through a custom disk controller
before it gets to the processor (and memory). So, you just
look at the image *after* it is "in memory".

I.e., unless you have a secure processor, this is all you need
to analyze.

> With GCC, the padding/alignment is managed
> at the ASM level so one has just to modify/patch the assembler.
> It does not work against the compiler.
From: George Neuner on
On Tue, 20 Apr 2010 15:42:13 -0700, D Yuniskis
<not.going.to.be(a)seen.com> wrote:

>George Neuner wrote:
>>
>> Imagine something like the following:
>>
>> char index2[] = {1, 0, };
>> char index1[] = {1, 0, };
>>
>> int (*F1)(int) = (int (*)(int))(func_addr(index2[index1[0]]));
>> void (*F2)(int) = (void (*)(int))(func_addr(index2[index1[1]]));
>>
>> void* func_addr( int index )
>> {
>>     void *ptr[] =
>>     {
>>         func_1,
>>         func_2,
>>     };
>>     return ptr[index];
>> }
>>
>> int main( void )
>> {
>>     (*F2)( (*F1)(42) );
>>     return 0;
>> }
>
>Ouch! Yes, that sort of thing would work! (I've not nitpicked
>your syntax -- I'll assume it's what I think it to be :> )

It was copied out of a working program, but I did forget to copy the
forward declarations.


>The problem (that I see currently) is that index1[] and index2[]
>(likewise ptr[]) are very visible (though I acknowledge that
>this is undoubtedly just to make the *idea* more obvious)

Yes, the whole point is that index1[] and index2[] will light up in
diffs of branded executables. The idea is to make a counterfeiter
think those index arrays are some kind of customer ID string ...
that's why they are defined as chars. But when the counterfeiter
starts messing with the string, the program (hopefully) stops working.

Other structures, like the ptr[] table, will be identical in all
instances - they might be identifiable in a dump from a structured
executable (coff, elf, etc.), but if you use an a.out structure, all
the data will be anonymous.

Given enough working samples, someone might notice the pair
relationship between the indexes. However, they can be combined into
a single 2-D array that will interleave their elements in a diff
listing and help hide the relationships.

char index2[] = {1, 0, };
char index1[] = {1, 0, };
int (*F1)(int) = (int (*)(int)) func_addr(index2[index1[0]]);

becomes (I think ...)

char index[][2] = { {1,1,},
                    {0,0,} };
int (*F1)(int) = (int (*)(int)) func_addr(index[index[0][1]][0]);


index2[] becomes the leftmost column and index1[] the rightmost. You
could do it either way, but because index1[] has its elements in a
fixed order, by putting it last, most of its elements seem to
reference unrelated data in a sequential dump of the structure, so
even if someone guesses that pairs of characters are somehow related,
the actual relationships are shrouded.


>One downside is that (the example) forces a coding style on the
>developer. I.e., you would have to come up with a way to
>extract this sort of "information" from his/her "natural"
>way of doing things.

I haven't tried to clean it up but I believe macros can sufficiently
hide the nasty call syntax and the complicated look-ups.

I initially defined F1, F2, etc. as macros that did the function
look-up on demand, but I settled on using explicit function pointer
variables because the look-up was so heavy (3 array indexes).

The call to func_addr() is completely unnecessary; you can eliminate
it by making the address table a global. I initially chose to use a
function because doing so would place the address table in a different
load segment than the globals. But in any event, the point is to
obfuscate as much as possible and really to force disassembly to
figure out how to bypass the branding.

George
From: George Neuner on
On Tue, 20 Apr 2010 15:42:13 -0700, D Yuniskis
<not.going.to.be(a)seen.com> wrote:

>... in C++, you could probably play games with the definitions
>of the vtable's without the developer ever needing to know
>that is happening (or, *will* happen after code "release").

AFAIK, you can't easily mess with a vtable except in a leaf class. The
order of functions is determined partly by standard (new is 1st,
delete is 2nd, etc.) and partly by the total ordering of definitions
as seen over the entire class hierarchy.

Also, there is no standard vtable implementation - some compilers use
arrays, others use hash tables. And in an array implementation, it's
possible for complicated classes to have more than 1 vtable.

George