From: D Yuniskis on 16 Apr 2010 13:42

Hi Oliver,

Oliver Betz wrote:
>>>>>> Anyone with FIRST HAND experience deploying watermarking
>>>>>> technologies? Comments re: static vs dynamic techniques?
>>>>> has your hardware external program memory or do they run from internal
>>>>> flash? If the latter, distribute encrypted binaries.
>>>> Encrypting tries to *prevent* counterfeiting. That's not the
>>>> goal
>>> as far as I understand, that's exactly the goal. You don't want
>>> someone to change the "watermark".
>> Let me rephrase: encryption prevents copying (which would
>
> It makes it harder to copy the whole device or disassemble the

Yes. In my case, I am acknowledging that copying *will*
(probably) take place. And, just trying to track where the
copies originated. (client's goal)

> executables, but we make accessible (encrypted) firmware files freely
> to our customers. Therefore...
>
>> be an added bonus!) The goal here is not to prevent but,
>> rather, to *track* where a copy originated. Preventing
>> copying is often harder to accomplish; and, is very obvious
>> to the potential copier that you have taken measures to
>> try to thwart that. OTOH, watermarking need not "announce"
>> itself to the thief. He perceives a copyable product.
>> He makes his copy and his copy *works*, etc.
>
> ....that's also true for our devices (unless you mean by "copyable"
> that the customer is also allowed to copy the hardware).

Think of a company that blatantly *steals* your design.
(this is far more commonplace than you would think!)
*Copying* hardware and software is easy if you've a mind to
do so. OTOH, *understanding* a design in enough detail to
be able to *change* it to something *equivalent* JUST to
disguise the fact that you copied some other product requires
considerably more effort. (First, you have to *know* this
to be the case -- "Why do these N devices all have slightly
different firmware images? Are they different versions?
If so, which is the "most advanced"? etc.)

> [...]
>
>> OK, so you are watermarking *then* encrypting. But, you still
>> need a robust means of watermarking the executable (that can't
>> be easily altered, thwarted, etc.) *That's* the issue I am trying
>> to address.
>
> Since the executable is not directly accessible to the customer, I
> simply could put a notice e.g. in the boot message. The customer only
> sees the encrypted file and the bootloader will not accept it if it's
> changed.

Presumably (though not a guaranteed fact), all of your images
are identical (i.e., the *decrypted* version). I.e., if I
picked up 2 of your devices and reverse engineered them, I
would see two things identical "under the hood", right?

So, if I *copy* the device-as-ready-to-accept-an-encrypted-image,
I can use any of your future released images. And, you can't
tell *which* particular device I used as the template for my
original "copy".

The goal here is to acknowledge that copying *is* possible
(and affordable). But, to try to track where a "leak"
may have occurred during the development/alpha/beta
program. E.g., this puts a lot more pressure on those
testers as you can now harass "the leak" *personally*
when/if a copy turns up.

(I suspect trying to track down sources of copies after
formal product release is not a concern. Rather, you
would be quite annoyed to find that -- not only has your
product been *copied* but the copy is commercially available
*before* your "original" is!)

[I am making educated guesses, here, as to the actual
reasons behind these design criteria :< ]
From: D Yuniskis on 16 Apr 2010 14:08

Hi George,

George Neuner wrote:
> On Sun, 11 Apr 2010 17:17:15 -0700, D Yuniskis
> <not.going.to.be(a)seen.com> wrote:
>
>> Anyone with FIRST HAND experience deploying watermarking
>> technologies? Comments re: static vs dynamic techniques?
>
> I've been following your conversation with whygee. No watermarking
> scheme is foolproof or unforgeable, and marking an executable is
> harder because it isn't possible to fuzz code like data.

Yes and no. To some extent, data manipulations are *easier*
to observe and manipulate. But, code can also be "marked";
after all, it's just "data" interpreted by a state machine
(known as the CPU).

> I've read about "functional marking", changing the program's runtime
> behavior based on some key ... for example, strobing indicator lights
> in a pattern that's hard to discern by eye but could be clearly seen
> using a movie camera. But I don't see an easy way to do something
> that would remain non-obvious if the program were disassembled.

No, I am not talking about anything that has to be obvious
to a "special observer" -- other than an observer that can
disassemble (meaning "decompose into small physical pieces")
the device in question and compare it to a "template" suspected
of being the original for the copy.

What you want is something that an observer with two (or more)
instances (avoiding the term "copy") of an executable will
recognize as "different" -- but, won't be able to easily figure
out how to convert either of them into "yet another" instance
that retains all of the original functionality.

> I agree with whygee that the best way to mark an executable is to
> abuse equivalent code sequences. However, I think it should be done
> *after* compilation, using a patching tool and starting from the

But those are trivial to identify. I.e., they have to
occupy the same space in all "incantations". They will
tend to be confined to very small pieces of memory
(huge code sequences get harder to manipulate while still
satisfying any control transfers out/in). And, they will
tend to be simple -- analysis will make it readily apparent
that the changes are "meaningless": "Here he adds 5; there
he subtracts -5. <shrug>"

> unaltered binary. I also agree that the patches should be based on
> some (well distributed) customer id - perhaps a crypt hash on the
> customer information.
>
> You want to substitute code sequences of equal length (unless PIC code
> makes it unnecessary). As an stupid example, it won't do any good to
> replace an addition with a subtraction if you then also need to
> set/clear a carry bit for the subtract instruction to work. You need
> to find a set of substitutions that are safe to make and which won't
> affect other code, but it helps that the substitution sequences can be
> specific to the binary being marked.

But, you see, that is exactly the sort of thing that makes this
approach trivial to circumvent.

Imagine, for example, compiling each instance with a different
compiler and linkage editor. Or, with a different level of
optimization. etc. (this won't work, either, because it can
make *big* changes to the performance/requirements of the
device). I.e., in each case, it's the same "product" but the
code images look very different.

> In many ISAs it does not affect the length of the code sequence to
> negate a comparison by reversing the operand order and then to branch
> on the negative result. Obviously you don't want to do this just
> anywhere because it potentially could mess up a multi-way branch,
> however, you can safely negate/reverse any 2-way branch. During
> coding, you could manually tag good candidates with a label and then
> pick the labels out of a map file or assembler listing of your release
> compile.

<frown> I don't see this as a realistic way forward. It puts
too much burden on the developer. And, doing it as a post
process means the tools *developed* to do it would be complex --
would they introduce bugs, etc.

I think, for a given, known toolchain, you could get the results
I need just by clever manipulations of the sources -- *without*
the participation or consent of the developer (i.e., by fudging
things that he technically "can't control" as inherent in the
language specification).

I think I'll try wrapping some preprocessor directives around
select code sequences and building some M4 macros to massage
the sources as I think they could be. Then, look at the
results.
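[Editorial sketch] George's suggestion of keying the substitutions on a hash of the customer information could look roughly like the following. This is only an illustration under stated assumptions: the function names `fnv1a32` and `select_variants` are hypothetical, and FNV-1a is used purely as a compact stand-in. It is not a cryptographic hash, and a real scheme would presumably want a keyed one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, 32-bit: a simple non-cryptographic hash, used here only as a
   stand-in for the "crypt hash" mentioned in the thread. */
static uint32_t fnv1a32(const char *s)
{
    uint32_t h = 2166136261u;          /* FNV offset basis */
    while (*s) {
        h ^= (uint8_t)*s++;
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* For each of n_sites candidate substitution sites, pick variant 0 or 1
   from one bit of the customer hash. Limited to 32 sites because only
   a single 32-bit hash word is used in this sketch. */
static void select_variants(const char *customer_id,
                            uint8_t *choice, size_t n_sites)
{
    assert(n_sites <= 32);
    uint32_t h = fnv1a32(customer_id);
    for (size_t i = 0; i < n_sites; i++)
        choice[i] = (uint8_t)((h >> i) & 1u);
}
```

The resulting `choice[]` array would then drive whichever mechanism actually applies the substitutions, whether a post-link patcher or generated preprocessor defines. The thread leaves that mechanism open, and so does this sketch.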
From: Oliver Betz on 16 Apr 2010 15:32

Hello Don,

[...]

>>> OK, so you are watermarking *then* encrypting. But, you still
>>> need a robust means of watermarking the executable (that can't
>>> be easily altered, thwarted, etc.) *That's* the issue I am trying
>>> to address.
>>
>> Since the executable is not directly accessible to the customer, I
>> simply could put a notice e.g. in the boot message. The customer only
>> sees the encrypted file and the bootloader will not accept it if it's
>> changed.
>
>Presumably (though not a guaranteed fact), all of your images
>are identical (i.e., the *decrypted* version). I.e., if I

yes, I don't use any "watermarking".

>picked up 2 of your devices and reverse engineered them, I
>would see two things identical "under the hood", right?

yes (if they have the same firmware revision).

>So, if I *copy* the device-as-ready-to-accept-an-encrypted-image,
>I can use any of your future released images. And, you can't
>tell *which* particular device I used as the template for my
>original "copy".
>
>The goal here is to acknowledge that copying *is* possible
>(and affordable). But, to try to track where a "leak"

but I expect copying to be as expensive as reverse engineering any
watermarking you are thinking about.

Do you have any numbers on the cost of extracting the contents of a
flash microcontroller when its "copy protection" is used?

For example, we are using Freescale 9S08, S12, Coldfire V2 and I could
also imagine using an STM32.

Oliver
--
Oliver Betz, Munich
despammed.com might be broken, use Reply-To:
From: George Neuner on 17 Apr 2010 01:37

On Fri, 16 Apr 2010 02:10:18 +0200, whygee <yg(a)yg.yg> wrote:

>Hello,
>
>George Neuner wrote:
>> I agree with whygee that the best way to mark an executable is to
>> abuse equivalent code sequences. However, I think it should be done
>> *after* compilation, using a patching tool and starting from the
>> unaltered binary.
>
>In practice, post-patching is more difficult than source-code #defines.

That's true, but it avoids the problem of trying to defeat the
compiler.

>My previous idea does not make any assumption about the target architecture
>and C does not use carry bits so there is no risk of borking the binary.
>Sure, it is a bit more cumbersome for the source code but I have done worse...
>A "macro" could help there :
>
>#define ADD_IMM(src, imm, dest, define) \
>#ifdef define
>  {dest = src + imm;}
>#else
>  {dest = src - (-imm);}
>#fi
>(a m4 script could be better)

The problem is: both could be compiled into the very same instruction.
You're trying to guess what the compiler will do and defeat it at the
source level. That's highly tool chain dependent ... the next
compiler may break everything.

>To help "calibrate" the detection routines, it sounds interesting,
>after the source code is ready, to compile with all #ifdef set,
>and make another binary without any #ifdef.
>The "XOR" of the two results will show where all the ADDs are,
>with the added bonus that the -imm xor imm will show up
>as plain 0xFF(FF(FFFF)) :-) plus a bit before for the opcode.
>
>That would be an interesting hack to try...
>
>Now, looking at code I wrote today,
>I see very few constant adds. But I see a fair
>amount of constants, which opens another door...
>For example, imagine a system call :
>
>syscall(42)
>
>42 can be decomposed in myriads of ways,

And any way that statically leads to 42 can be optimized away by the
compiler. Do you seriously think that if you write

    syscall( 40 + 2 )
or
    syscall( (~0xD6) + 1 )

that the addition(s) or complement will be done at runtime? Even in
debug mode few compilers are that dumb. Any integer operation
involving constant operands can be done at compile time and the
constant result substituted back into the code. The same is true for
some floating point operations (but not all).

>OK, I stop here, I feel too tired to think right,
>but you see the idea (I hope)

Yup. You're definitely tired. There may be things that can be done
at the source level, but I can't think of any offhand and the ones
you've presented are brittle and tool dependent at best.

George
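[Editorial sketch] As quoted, the `ADD_IMM` macro would not preprocess: a `#define` body cannot contain `#ifdef`, and `#fi` is not a C directive (`#endif` is). A minimal form that does compile selects the variant where the macro is *defined*, as below. This is a simplified two-argument version, and `WM_VARIANT_B` is a hypothetical build flag, not something from the thread. George's objection still applies in full: a compiler is free to emit identical code for both forms.

```c
/* Build with -DWM_VARIANT_B (hypothetical flag name) to select the
   alternate spelling of the same addition. */
#ifdef WM_VARIANT_B
#define ADD_IMM(src, imm) ((src) - (-(imm)))   /* subtract the negation */
#else
#define ADD_IMM(src, imm) ((src) + (imm))      /* plain add */
#endif

/* Example use site: the result is the same whichever variant is built,
   and the generated instructions may well be the same too. */
int bump(int x)
{
    return ADD_IMM(x, 5);
}
```

Whether the two builds differ at all in the binary can only be settled by inspecting the output of the specific toolchain and optimization level in use.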
From: whygee on 17 Apr 2010 06:38

George Neuner wrote:
> On Fri, 16 Apr 2010 02:10:18 +0200, whygee <yg(a)yg.yg> wrote:
>> 42 can be decomposed in myriads of ways,
>
> And any way that statically leads to 42 can be optimized away by the
> compiler. Do you seriously think that if you write
>
>     syscall( 40 + 2 )
> or
>     syscall( (~0xD6) + 1 )
>
> that the addition(s) or complement will be done at runtime?

no, I intended to put the 2 constants in "volatile int"s.

    volatile int const42_1=40, const42_2=2;
    ....
    syscall (const42_1 + const42_2);
    ....

I know it's quite harsh but given a sufficiently large program,
only a fraction of the constants would be converted
so the speed is not affected by the cache misses.

>> OK, I stop here, I feel too tired to think right,
>> but you see the idea (I hope)
> Yup. You're definitely tired. There may be things that can be done
> at the source level, but I can't think of any offhand and the ones
> you've presented are brittle and tool dependent at best.

sure, some compilers will want to be smart.
yet they have to adhere to what the programmer wants.
Even at -O3, GCC has to respect a "volatile" qualifier.

Now, I'm probably reinventing an old wheel,
virus/trojan writers are wayyy beyond such methods.

regards,

> George
yg
--
http://ygdes.com / http://yasep.org
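[Editorial sketch] whygee's `volatile` idea, written out as a compilable fragment. `fake_syscall` is a hypothetical stand-in for the real call so that the example is self-contained. Because the two operands are `volatile` objects, a conforming compiler has to perform both reads at run time and cannot replace the sum with the literal 42.

```c
/* Hypothetical stand-in for the real system call: returns its argument. */
static int fake_syscall(int n)
{
    return n;
}

/* The constant 42 split into two volatile objects, as in the post. */
static volatile int const42_1 = 40, const42_2 = 2;

int call_with_split_constant(void)
{
    /* Read each volatile in its own statement so the order of the two
       accesses is well defined. */
    int a = const42_1;
    int b = const42_2;
    return fake_syscall(a + b);
}
```

The split values still sit in the image's initialized data, so someone comparing two instances would presumably still see where they differ. This only stops the compiler from folding the constant, which is the narrower point being argued here.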