From: D Yuniskis on 18 Apr 2010 17:08

Hi Oliver,

Oliver Betz wrote:

>>> yes, I don't use any "watermarking".
>>
>> OK.  So, if someone copied one of your devices, you would have
>> no way of knowing which particular device was the source of
>> the copy (?)
>
> no, I don't care. Nobody did so till now and there is also some other
> specific knowledge he needed to be successful.

I'm not claiming that you *do* care.  Rather, I am trying to illustrate
the different "problem" you are trying to address.  *I* am trying to
identify *which* of N "copies" of a device served as the genesis for a
counterfeit product.  This allows assets to be brought to bear on the
"leak" (who probably doesn't have the deep pockets that the actual
counterfeiter has and, as such, has much MORE, relatively speaking, to
lose by his actions!)

> "Watermarking" would be much more effort for us than advantage since
> we sell to many customers and also via dealers.
>
> [...]
>
>>> Have you any numbers about the cost to get the content of a flash
>>> microcontroller if its "copy protection" is used?  For example, we
>>> are using Freescale 9S08, S12, Coldfire V2 and I could also imagine
>>> using a STM32.
>>
>> No idea.  In the 80's, you could get a "generic" design copied
>
> Not a valid comparison.  Let's talk about the current flash devices I
> cited.  If you tell me what it costs to get a Flash ROM image from one
> of these, we can continue the effort / benefit discussion.

I'm not interested in discussing counterfeiting techniques.  There is a
wealth of information in "public" channels on this subject -- as well
as "clandestine" channels.

The point I was making was that $50K was a paltry sum in the 80's.
Less than a man-year of paid time.  Sure, technologies have advanced
since then.  But, so have other technologies for "countering" those.
E.g., there are firms that can produce a "schematic" of a full-custom
IC "automatically" using various imaging techniques (SEM, optical,
X-ray, etc.) and apply "simple" pattern matching on the imaged 3D
layers to identify the various "components" implemented therein.
Something unheard of in the 80's (it had to be done manually).

And, there are now many more "players" in the market -- each looking
for an "in".  The folks you typically protected against 30 years ago
are no longer the ones to worry about now.

For a bit of perspective, try a read of Huang's _Hacking the Xbox: An
Introduction to Reverse Engineering_ and remember that this recounts a
"lone gunman" tackling a product made by a company with *very* deep
pockets (how does your firm compare to MS's resources?)

>> Look at how "secure" the various "security technologies"
>> have proven to be.  Look at the tools available to students
>> in labs nowadays (how long did the XBox's "security" stand
>> up to "not for profit" attack?).
>
> Also not a valid comparison, the attack is not applicable to flash
> based devices without external memory.

A device can be de-encapsulated and microprobed while executing.  This
sort of technology is no longer out of reach -- many universities have
these capabilities.  I.e., "individuals" can gain access to tools like
this; imagine what *businesses* have available to them.

>> Relying on schemes to *prevent* copying is a losing battle.
>
> As I wrote earlier, I just want to make copying so expensive that it's
> no more interesting compared to "develop from scratch".

Of course!  The problem, as always, is finding that balance.  One
*sure* way to do it is to design a product that no one *wants* to
copy!  :>
I.e., the more successful (in the marketplace) you are, the more likely
you are to inspire folks to want to copy your design.

I am leary of encryption as, historically, it has always managed to
show weaknesses over time.  New techniques for breaking codes,
unforeseen vulnerabilities, etc.  E.g., cracking passwords on PC's, WEP
keys, etc. -- in *alarmingly* short times.  (I think there is even a
firm that will crack your WEP key for less than $100 and within 24-48
hours.)

"Businesses" can be protected against, to some extent, with fear of
litigation, seizing counterfeit imports at ports, etc. (depends on how
big *you* are and how big the market is).  The trickier defense is
protecting against well-motivated individuals, as they now have the
computing power, access to tools, *communities*, etc. to throw at
anything they feel is "worth copying".  Then, trying to "silence" them
from sharing what they have learned...
From: D Yuniskis on 18 Apr 2010 17:28

Hi George,

George Neuner wrote:
> On Fri, 16 Apr 2010 11:08:33 -0700, D Yuniskis
> <not.going.to.be(a)seen.com> wrote:
>
>> George Neuner wrote:
>>
>>> I've read about "functional marking", changing the program's runtime
>>> behavior based on some key ... for example, strobing indicator lights
>>> in a pattern that's hard to discern by eye but could be clearly seen
>>> using a movie camera.  But I don't see an easy way to do something
>>> that would remain non-obvious if the program were disassembled.
>>
>> No, I am not talking about anything that has to be obvious
>> to a "special observer" -- other than an observer that
>> can disassemble (meaning "decompose into small physical
>> pieces") the device in question and compare it to a
>> "template" suspected of being the original for the copy.
>>
>> What you want is something that an observer with two (or
>> more) instances (avoiding the term "copy") of an executable
>> will recognize as "different" -- but, won't be able to easily
>> figure out how to convert either of them into "yet another"
>> instance that retains all of the original functionality.
>
> That is a *very* different thing than watermarking ... watermarking is
> simply a scheme to identify the source of an item that may be
> counterfeited.

Watermarking is of little use if the watermark can be easily identified
and removed/altered.  Indeed, it would be trivial to just embed "This
is copy #027 of the product" in each particular instance.

From
http://www.cs.arizona.edu/~collberg/Research/Publications/CollbergThomborson99a/index.html
(apologies if the URL folds):

  "The Software Watermarking problem can be described as follows.
  Embed a structure W into a program P such that:  W can be reliably
  located and extracted from P even after P has been subjected to code
  transformations such as translation, optimization and obfuscation;
  W is stealthy; W has a high data rate; embedding W into P does not
  adversely affect the performance of P; and W has a mathematical
  property that allows us to argue that its presence in P is the
  result of deliberate actions."

(complain to the author re: the academic-speak therein  :> )

>>> I agree with whygee that the best way to mark an executable is to
>>> abuse equivalent code sequences.  However, I think it should be done
>>> *after* compilation, using a patching tool and starting from the
>>
>> But those are trivial to identify.  I.e., they have to
>> occupy the same space in all "incantations".  They will
>> tend to be confined to very small pieces of memory
>> (huge code sequences get harder to manipulate while
>> still satisfying any control transfers out/in).  And,
>> they will tend to be simple -- analysis will make it
>> readily apparent that the changes are "meaningless":
>> "Here he adds 5; there he subtracts -5.  <shrug>"
>
> Making it variable is possible but creates a problem identifying the
> watermark.

Again, I think you're after something different.  See above.

>>> unaltered binary.  I also agree that the patches should be based on
>>> some (well distributed) customer id - perhaps a crypto hash of the
>>> customer information.
>>>
>>> You want to substitute code sequences of equal length (unless PIC code
>>> makes it unnecessary).  As a stupid example, it won't do any good to
>>> replace an addition with a subtraction if you then also need to
>>> set/clear a carry bit for the subtract instruction to work.  You need
>>> to find a set of substitutions that are safe to make and which won't
>>> affect other code, but it helps that the substitution sequences can
>>> be specific to the binary being marked.
>>
>> But, you see, that is exactly the sort of thing that makes
>> this approach trivial to circumvent.
>
> It isn't trivial ... if you start with a well distributed customer id
> (say a crypto hash of the customer info) which is 160..256 bits long
> and only patch locations corresponding to '1' bits in the hash,
> a counterfeiter would need many samples of patched executables to

But, the counterfeiter can identify these types of transforms from as
few as two copies and, by changing just *one*, has now succeeded in
changing the watermark!  Depending on the Hamming distance between
"successive (adjacent?)" watermarks, this can be enough to argue (in a
court of law) that the device bearing the altered watermark was not, in
fact, derived from the device having the unaltered watermark visible in
the "defendant's" instance of the product.

I.e., the problem is similar to that of authentication with digital
signatures.  You don't want a party to be able to forge someone else's
signature; NOR BE ABLE TO DISAVOW THEIR OWN!

> unravel the scheme.  Even having the unaltered binary would not help
> much.  And the scheme can be made more complicated by using bits of
> the customer id in groups and skipping over potential patch points
> based on them.
>
>> Imagine, for example, compiling each instance with a different
>> compiler and linkage editor.  Or, with a different level
>> of optimization, etc.  (this won't work, either, because it
>> can make *big* changes to the performance/requirements of
>> the device).  I.e., in each case, it's the same "product"
>> but the code images look very different.
>>
>>> In many ISAs it does not affect the length of the code sequence to
>>> negate a comparison by reversing the operand order and then to branch
>>> on the negative result.  Obviously you don't want to do this just
>>> anywhere because it potentially could mess up a multi-way branch;
>>> however, you can safely negate/reverse any 2-way branch.  During
>>> coding, you could manually tag good candidates with a label and then
>>> pick the labels out of a map file or assembler listing of your
>>> release compile.
>>
>> <frown>  I don't see this as a realistic way forward.  It
>> puts too much burden on the developer.  And, doing it
>> as a post process means the tools *developed* to do it
>> would be complex -- would they introduce bugs, etc.?
>>
>> I think, for a given, known toolchain, you could get
>> the results I need just by clever manipulations of the
>> sources -- *without* the participation or consent of
>> the developer (i.e., by fudging things that he
>> technically "can't control" as inherent in the
>> language specification).
>>
>> I think I'll try wrapping some preprocessor directives
>> around select code sequences and building some M4 macros
>> to massage the sources as I would think they could be.
>> Then, look at the results.
>
> I think that is a non-starter ... unless you use inline assembler
> you'll be working against the compiler.  See my post to whygee.

Revisiting an example I posed previously, consider:

  int foo(...) {
      int A;
      int B;
      int C;
      <body>
  }

vs.

  int foo(...) {
      int B;
      int C;
      int A;
      <body>
  }

I.e., the two functions will behave the same regardless of the contents
of <body>, correct?
(barring code that is self-examining or self-modifying)

One could create M4 macros to wrap around each of these declarations
such that you could "externally" massage the sources to effectively
reorder the declarations.  Right?  (left as an exercise for the reader)

A compiler *could* rearrange the stack frame to essentially rewrite the
first instance of foo to be *identical* to the second (I believe this
is allowed under a strictly conforming C compiler).  But, what if the
declarations were:

  int foo(...) {
      int array[3];
      <body>
  }

  #define A  (array[0])
  #define B  (array[1])
  #define C  (array[2])

permuted (for a different watermark) to:

  #define A  (array[2])
  #define B  (array[0])
  #define C  (array[1])

I suppose a compiler could notice the invariant nature of the
individual references and, IF THE array IS ENTIRELY "LOCAL", rearrange
them (though it is hard to see why the compiler would *want* to do
so... what is *gained* in such an optimization?)

The resulting binaries would run in roughly identical time.  Their
results would be identical.  Yet, they would be different binaries,
"watermarked" uniquely.

This sort of source-level translation would be easy to test (without
having to develop a tool that automagically rewrote each function
definition "conditionally").
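Something along these lines, for instance -- a rough, untested sketch
of that permutation idea.  WATERMARK_ID, the toy arithmetic in the
body, and the hard-coded mapping are placeholders for illustration; a
real build would generate the per-instance mapping (e.g., via M4)
rather than writing it by hand:

  /* Minimal sketch: the "watermark" is which array slot each named
     variable maps to.  Behavior is identical for every mapping. */
  #include <stdio.h>

  #ifndef WATERMARK_ID
  #define WATERMARK_ID 0               /* default / "copy #000"        */
  #endif

  #if WATERMARK_ID == 27               /* "copy #027" gets this mapping */
    #define A  (array[2])
    #define B  (array[0])
    #define C  (array[1])
  #else
    #define A  (array[0])
    #define B  (array[1])
    #define C  (array[2])
  #endif

  int foo(int x)
  {
      int array[3];

      A = x + 1;                       /* <body> is identical in every  */
      B = A * 2;                       /* instance; only the mapping of */
      C = A + B;                       /* A, B, C onto slots differs    */
      return C;
  }

  int main(int argc, char **argv)
  {
      (void)argv;
      printf("%d\n", foo(argc));       /* same output for any mapping   */
      return 0;
  }

Build one image with -DWATERMARK_ID=27 and one without: the observable
behavior is identical, but at low optimization the order of accesses to
the three slots (and hence the images) should differ.  As noted above,
an optimizer that keeps the whole array in registers may well erase the
difference -- that's exactly the "entirely local" caveat.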
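For comparison, the selection step in George's post-compilation scheme
(patch only the equivalence sites picked out by the '1' bits of a
customer-id hash) might look roughly like this.  All the names, the
16-site table and apply_equivalent_swap() are made-up placeholders; a
real tool would take the offsets from the map file or assembler
listing, as he suggests, and perform an actual equal-length
substitution:

  /* Rough sketch of hash-keyed patch-site selection: only candidate
     sites whose corresponding hash bit is '1' get rewritten. */
  #include <stdint.h>

  #define NSITES 16                    /* candidate equivalence sites   */

  /* offsets into the binary image, harvested from the map file */
  static const uint32_t site_offset[NSITES] = { 0 /* ... */ };

  /* placeholder: replace the instruction at 'offset' with its
     equal-length equivalent (e.g., reversed compare, negated branch) */
  static void apply_equivalent_swap(uint8_t *image, uint32_t offset)
  {
      (void)image;
      (void)offset;
  }

  void watermark_image(uint8_t *image, const uint8_t *id_hash)
  {
      for (unsigned i = 0; i < NSITES; i++) {
          if ((id_hash[i / 8] >> (i % 8)) & 1u)   /* '1' bit => patch   */
              apply_equivalent_swap(image, site_offset[i]);
      }
  }

  int main(void)
  {
      static uint8_t image[4096];      /* stand-in for the real binary  */
      static const uint8_t id_hash[NSITES / 8] = { 0xA5, 0x3C };
      watermark_image(image, id_hash);
      return 0;
  }

Note that this selection step is exactly what the objection above
targets: with two differently marked copies in hand, the sites that
differ reveal themselves.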
From: D Yuniskis on 18 Apr 2010 17:30

Oliver Betz wrote:
> George Neuner wrote:
>
> [...]
>
>> YMMV.  I see it as a horrible maintenance problem generating different
>> code for each client - it's bad enough dealing with platform
>
> ack, IMO that's the main problem.  Such a method is only viable for
> very small production volumes and expensive devices.

Yeah -- like PRE-PRODUCTION BETA RELEASES!!  ;-)
From: Oliver Betz on 19 Apr 2010 02:58

Hello Don,

>>>> yes, I don't use any "watermarking".
>>>
>>> OK.  So, if someone copied one of your devices, you would have
>>> no way of knowing which particular device was the source of
>>> the copy (?)
>>
>> no, I don't care. Nobody did so till now and there is also some other
>> specific knowledge he needed to be successful.
>
> I'm not claiming that you *do* care.  Rather, I am trying
> to illustrate the different "problem" you are trying to
> address.  *I* am trying to identify *which* of N "copies"
> of a device served as the genesis for a counterfeit product.

no need to repeat, you already explained that.

[...]

> I'm not interested in discussing counterfeiting techniques.

[...]

> For a bit of perspective, try a read of Huang's
> _Hacking the Xbox: An Introduction to Reverse Engineering_

not really applicable, as I wrote.

[...]

> I am leary of encryption as, historically, it has always managed

You mean "leery"?  So you are biased due to lack of knowledge?

In the end, it's a cost/benefit question.  But this discussion develops
similarly to other ones you started in the recent past: it seems you
have already decided what to do and you don't consider other methods
impartially.

Oliver
--
Oliver Betz, Munich
despammed.com might be broken, use Reply-To:
From: D Yuniskis on 20 Apr 2010 03:00
Nobody wrote:
> On Thu, 15 Apr 2010 19:30:36 -0400, George Neuner wrote:
>
>> I've read about "functional marking", changing the program's runtime
>> behavior based on some key ... for example, strobing indicator lights
>> in a pattern that's hard to discern by eye but could be clearly seen
>> using a movie camera.  But I don't see an easy way to do something
>> that would remain non-obvious if the program were disassembled.
>
> This gets around an issue with equivalent code sequences, namely that
> decompiling then recompiling with optimisation will tend to eliminate
> any watermarks.

Decompiling large pieces of code is expensive and requires careful
analysis.  E.g., I rely heavily on the use of pointers in much of my
code (I think this is characteristic of folks coming from hardware/ASM
backgrounds).  Since those pointers can be algorithmically generated,
it's hard to imagine an automatic tool sorting out the possible targets
for a particular pointer (variable).

I've never tried to tackle a "hand decompiled" project bigger than
100KB.  And, "machine assisted" beyond 1MB is just too trying for me.
It's hard to imagine folks -- even *teams* -- trying to tackle a
multimegabyte project like that.  :<

When I looked at the code in my Unisite, I found it easier to just plug
in an ICE and turn on the trace buffer.  :-/  But, getting from *there*
to something that you could *modify* was an entirely different scope.
(N.B.  Data I/O products tend to be fun to study when interested in
protecting designs, as they are REALLY paranoid about their IP!)

> If the watermark affects the code's observable behaviour in any way,
> then it would have to be preserved by any "equivalent" code.  Removing
> the watermark would require understanding the code to the extent that
> it could be modified such that the modified behaviour was merely
> "close enough" rather than identical.

Ideally, the modified code has no observable differences -- other than
the actual memory image (at *run* time).  In practice, this is often
hard to achieve, as just moving things around in memory can make slight
changes to execution timing, etc., as cache lines are purged
differently, default states of jumps change, etc.  This can work to
your advantage -- *or* against you, depending on the magnitude of the
changes.
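As a small illustration of the "algorithmically generated pointer"
point above (names entirely hypothetical), the call target below is
computed at run time, so a static decompiler sees only an indirect call
rather than a destination it can read out of the instruction stream:

  /* Sketch of a computed dispatch: the handler index is derived
     arithmetically from run-time data. */
  #include <stdint.h>
  #include <stdio.h>

  static void task_idle(void)   { puts("idle");   }
  static void task_sample(void) { puts("sample"); }
  static void task_report(void) { puts("report"); }

  typedef void (*handler_t)(void);

  static const handler_t handlers[3] = { task_idle, task_sample, task_report };

  static void dispatch(uint8_t event)
  {
      /* index computed from the event code, not taken from a literal */
      handler_t h = handlers[(uint8_t)(event * 7u + 3u) % 3u];
      h();                      /* indirect call through computed pointer */
  }

  int main(void)
  {
      for (uint8_t e = 0; e < 3; e++)
          dispatch(e);
      return 0;
  }

Scale that sort of thing across a few hundred KB of image and the
decompiler's job gets ugly quickly.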