From: Michelle on 15 Sep 2009 05:26

Peter,

>[...]
> As far as reading bytes from a specific position in the file, you have to
> set the Position property to the position where you want to read, and
> then you simply read. It's just that simple.

Clear.

> Note that the Decimal numeric type takes up 16 bytes in a file. So if you
> are reading only 4 bytes, then obviously either you are reading the data
> as the wrong format, or not reading enough bytes. Either way, you won't
> get valid results.

Bytes read: 00-C0-22-4F
Decimal value: 00C0224F -> 12591695

As far as we know now, it always starts with 0x00, so I actually only need 3 bytes. That's the way it's used in my file.

> Do you get the values you expect when you do that? If so, then your
> number must not be Decimal in the first place.

Yes, as expected.

> Basically, you've provided no information here that would allow anyone to
> know for sure what the format of the numbers in your file are. [...]

We're reverse engineering a proprietary file. As mentioned above, I need to read 3 or 4 bytes and convert them to a Decimal value.

In my earlier postings I assumed that I only wanted to search for one pattern. But during the reverse engineering we discovered a second one.
I'm now looking for some 'Rabin-Karp algorithm' C# examples.
I think that's a better solution than running the brute-force search twice.

As you can see, the truth is closer today than yesterday ;-))

Best regards,

Michelle
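
A minimal sketch of the read discussed above: set Position, read a few bytes, and combine them most-significant-byte first. That byte order is inferred from the 00-C0-22-4F -> 12591695 example; the class and method names (FieldReader, ReadAt, ToUInt32BigEndian) are made up for illustration and are not part of any library.

```csharp
using System;
using System.IO;

public static class FieldReader
{
    // Reads exactly `count` bytes starting at byte `offset` of the stream.
    public static byte[] ReadAt(Stream stream, long offset, int count)
    {
        stream.Position = offset;
        var buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            // Stream.Read may return fewer bytes than requested, so loop.
            int n = stream.Read(buffer, total, count - total);
            if (n == 0) throw new EndOfStreamException();
            total += n;
        }
        return buffer;
    }

    // Combines 1 to 4 bytes into an unsigned integer, most significant byte first.
    public static uint ToUInt32BigEndian(byte[] bytes)
    {
        if (bytes.Length < 1 || bytes.Length > 4)
            throw new ArgumentException("Expected 1 to 4 bytes.", "bytes");
        uint value = 0;
        foreach (byte b in bytes)
            value = (value << 8) | b;
        return value;
    }
}
```

With this, the four bytes 00-C0-22-4F and the three bytes C0-22-4F both convert to 12591695, which is why dropping the leading 0x00 makes no difference to the result.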
From: Michelle on 15 Sep 2009 10:37

UPDATE

[ . . . ]
> But during the reverse engineering we discovered a second one.
> I'm now looking for some 'Rabin-Karp algorithm' C# examples.
> I think that's a better solution than running the brute-force search twice.

The challenge is even greater. The record header contains two variables.
So the search must take place at two locations with wildcards.

0x07 0x00 0x?? 0x00 0x00 0x00 0x07 0x00 0x?? 0x00 0x00 0x00 0x08 0x00

(We know the possible byte values)

So my only solution is to use Regex?

Regards,

Michelle
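
For reference, a hedged sketch of what the Regex idea could look like. It assumes the whole file fits in memory and decodes the bytes as ISO-8859-1, which maps each byte to the character with the same code, so match indexes are byte offsets. HeaderRegexSearch and FindOffsets are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class HeaderRegexSearch
{
    // ISO-8859-1 maps bytes 0x00-0xFF one-to-one onto chars U+0000-U+00FF.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    // '.' stands for each unknown byte. Singleline makes '.' match 0x0A as well.
    // The verbatim string leaves the \xNN escapes for the regex engine to parse.
    private static readonly Regex Header = new Regex(
        @"\x07\x00.\x00\x00\x00\x07\x00.\x00\x00\x00\x08\x00",
        RegexOptions.Singleline);

    // Returns the byte offset of every (non-overlapping) header match.
    public static List<int> FindOffsets(byte[] data)
    {
        string text = Latin1.GetString(data);
        var offsets = new List<int>();
        for (Match m = Header.Match(text); m.Success; m = m.NextMatch())
            offsets.Add(m.Index);
        return offsets;
    }
}
```

Since the possible values of the two variable bytes are known, each '.' could be narrowed to a character class such as [\x01\x02] (those two values are placeholders, not taken from the file), which would cut down on accidental matches.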
From: Tom Spink on 15 Sep 2009 14:10

Michelle wrote:
> UPDATE
>
> [ . . . ]
>> But during the reverse engineering we discovered a second one.
>> I'm now looking for some 'Rabin-Karp algorithm' C# examples.
>> I think that's a better solution than running the brute-force search twice.
>
> The challenge is even greater. The record header contains two variables.
> So the search must take place at two locations with wildcards.
>
> 0x07 0x00 0x?? 0x00 0x00 0x00 0x07 0x00 0x?? 0x00 0x00 0x00 0x08 0x00
>
> (We know the possible byte values)
>
> So my only solution is to use Regex?
>
> Regards,
>
> Michelle
>

This sounds like a highly structured file - *surely* there is some sort of descriptor at the start of it that contains a pointer to these records.

Presumably there is some software out there to read and write these files - I doubt very much they do any binary searching. There must be some kind of allocation table, or header structure that defines the rest of the file, even a pointer to the first record, which contains a pointer to the next (in a linked-list style).

When programming becomes this complex, it's usually best to step back and take another look at the problem.

--
Tom
From: Peter Duniho on 15 Sep 2009 14:10

On Tue, 15 Sep 2009 07:37:15 -0700, Michelle <michelle(a)notvalid.nomail> wrote:

> UPDATE
>
> [ . . . ]
>> But during the reverse engineering we discovered a second one.
>> I'm now looking for some 'Rabin-Karp algorithm' C# examples.
>> I think that's a better solution than running the brute-force search twice.
>
> The challenge is even greater. The record header contains two variables.
> So the search must take place at two locations with wildcards.
>
> 0x07 0x00 0x?? 0x00 0x00 0x00 0x07 0x00 0x?? 0x00 0x00 0x00 0x08 0x00
>
> (We know the possible byte values)
>
> So my only solution is to use Regex?

No, not necessarily. You could search for the sub-components individually. Look for one, then look for the other in the specific place it should be if you find the first. Though, if Regex makes the code simpler, it might well be worth it anyway, even if it doesn't perform as well.

That said, you have a broader problem in that the more variability in the data that's allowed, the greater the chance that you'll find the pattern you're looking for, but not in the context you intend (i.e. a false positive search result). You have that chance even with a regular search pattern, but as the pattern gets shorter with more variation allowed, the odds increase.

And I note that the above string of bytes is quite a bit different from, and quite a bit simpler than, the search pattern you showed in earlier posts. I would guess there's a much higher chance of seeing that pattern in the wrong context than the other.

Frankly, the more you explain about the basic problem, the less I feel that a simple search-and-replace is really the right way to go about things. Files have structures; I can guarantee you whatever this kind of file is, the intended user code doesn't need to search for things. It simply parses the data and knows the precise location of particular kinds of data within the file.

IMHO, your efforts would be better spent trying to reverse engineer the file to the point where you can accomplish the same, rather than investing effort on speeding up string searches on the data. Even better, just get the documentation for the file format and code from that, rather than all this investment in reverse-engineering.

I obviously don't have all the details with respect to the "why"s, "what"s, etc. related to your problem. But it seems like you've taken a time-consuming, difficult path that is practically guaranteed to be the one that provides the least reliable results. I know that wouldn't be _my_ first choice approaching a problem like this. :)

Pete
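
The "look for one, then look for the other in the specific place it should be" approach might be sketched as below. It treats the 14-byte header as three fixed parts at known offsets plus two variable bytes, and checks those two bytes against a caller-supplied list of allowed values to reduce false positives. HeaderScanner, FindHeaders and the allowed-value list are hypothetical; the thread does not give the actual values.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderScanner
{
    // Fixed parts of the header and where they sit relative to its start.
    private static readonly byte[] PartA = { 0x07, 0x00 };                   // offset 0
    private static readonly byte[] PartB = { 0x00, 0x00, 0x00, 0x07, 0x00 }; // offset 3
    private static readonly byte[] PartC = { 0x00, 0x00, 0x00, 0x08, 0x00 }; // offset 9
    private const int HeaderLength = 14;

    // True if `part` occurs in `data` starting exactly at `pos`.
    private static bool MatchesAt(byte[] data, int pos, byte[] part)
    {
        if (pos + part.Length > data.Length) return false;
        for (int i = 0; i < part.Length; i++)
            if (data[pos + i] != part[i]) return false;
        return true;
    }

    // Returns the offsets of all headers whose two variable bytes
    // (at offsets 2 and 8) are both contained in `allowed`.
    public static List<int> FindHeaders(byte[] data, ICollection<byte> allowed)
    {
        var hits = new List<int>();
        for (int i = 0; i + HeaderLength <= data.Length; i++)
        {
            if (!MatchesAt(data, i, PartA)) continue;      // first sub-component
            if (!MatchesAt(data, i + 3, PartB)) continue;  // second, at its fixed place
            if (!MatchesAt(data, i + 9, PartC)) continue;  // third, at its fixed place
            if (!allowed.Contains(data[i + 2])) continue;  // validate variable byte 1
            if (!allowed.Contains(data[i + 8])) continue;  // validate variable byte 2
            hits.Add(i);
        }
        return hits;
    }
}
```

A call such as HeaderScanner.FindHeaders(data, new byte[] { 0x01, 0x02 }) returns candidate offsets only; as noted above, a hit still needs to be confirmed against the surrounding record structure.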
From: Tom Spink on 15 Sep 2009 15:18

Peter Duniho wrote:
> I know that wouldn't be
> _my_ first choice approaching a problem like this. :)

My first choice is the Microsoft "documentation" for the PE format. ;)

> Pete

--
Tom