From: Shawn on

Hello MathWorks Community,

I'm curious how to decode a binary file that has an unknown structure.
The file appears to be written in ieee-le machine format.

I can open the file with fopen and read it with fread. fread works well, but it doesn't reveal the nature of the delimiters. It seems that the lines of the file have different lengths. Alternatively, fgetl reads the file line by line, but only provides the char representation of the data.

Ideally, I would like to scan the file for a known string that indicates where a list of fields is stored within the file. I would then like to read that list to reveal the order in which the N fields occur. The strings appear in char format amidst an undetermined number of binary bytes.
From: Jan Simon on
Dear Shawn,

> I'm curious how to decode a binary file that is in some unknown structure..

Because there is an infinite number of possible unknown structures, there is only one efficient and secure method:
Ask the person who created the file.
Even if that person is an employee of the CIA, the chance of success is higher than with guessing a layout that might match some, but not all, of the files.

Kind regards, Jan
From: us on
a hint:
- use one of the many(!) dumpers floating around on the web...

us
From: Shawn on
Thanks for the suggestions.
1) Unfortunately, the company that created the file format has disappeared, so I can't ask anyone to define the format.
2) Dumpers: maybe, I'm not sure about that stuff. I would like to use only MATLAB to get in there and finesse out the appropriate information.


To elaborate on the problem: the information I would like is somewhere after the 1400th line of the binary file. I am unable to identify newlines without invoking fgetl, which seems strange (there must be an easy way to identify the end of a line). There does not seem to be a block structure to the binary file, i.e. each line has some unknown number of bytes. The file is written in ieee-le format, with windows-1252 character encoding.
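One possible way to see where the "lines" end without calling fgetl is to read the whole file as uint8 and look for line-feed bytes (value 10). This is only a minimal sketch, assuming the file fits in memory. The file name example.bin and its contents are hypothetical and are created here only so the snippet runs on its own.

```matlab
% Hypothetical demo file so the sketch is self-contained.
fname = fullfile(tempdir, 'example.bin');
fid = fopen(fname, 'w');
fwrite(fid, uint8([1 2 10 3 4 5 13 10 6]), 'uint8');
fclose(fid);

% Read every byte of the file into one row vector of uint8 values.
fid = fopen(fname, 'r');
bytes = fread(fid, Inf, 'uint8=>uint8')';
fclose(fid);

% A line feed is the byte value 10.
lfPos   = find(bytes == 10);     % positions of all line-feed bytes
lineLen = diff([0, lfPos]);      % bytes per "line", terminator included
```

In a binary file a byte of value 10 may just as well be part of a number, so these positions are candidates for line ends, not proof of a line structure.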

I would like to scan the file for a particular string that follows each field variable string. The field variable strings themselves can change between otherwise similar binary files, so I cannot search for them directly. The goal is to determine the order of the field variables encoded in another data file (which this file defines).

I have identified a list of uint8 values that make up the marker that follows every field variable (e.g. 10 0 70 105 101 108 100) and would like to search for it in each line, keeping track of the order of the field variable listed just before each occurrence. However, each variable string is of unknown length and is also bordered by some other uninterpretable characters (e.g. 0 64 2 0). What would be a good way in MATLAB to search for the first identifier and then rewind along that line to the second identifier, to reveal the string of characters between them?
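The search described above could be sketched roughly as follows. The marker bytes are the ones quoted in the post (the last five, 70 105 101 108 100, are the characters 'Field'). Everything else is an assumption made for illustration: that the bytes 0 64 2 0 precede every field name unchanged, and the demo byte stream with the made-up names 'Speed' and 'Temp'.

```matlab
% Marker that follows every field name (from the post): LF, NUL, 'Field'.
marker = uint8([10 0 70 105 101 108 100]);
% Bytes reported in front of a field name (assumed constant here).
prefix = uint8([0 64 2 0]);

% Hypothetical stand-in for the file content; for a real file use
% bytes = fread(fid, Inf, 'uint8=>uint8')' after fopen.
bytes = [uint8([7 7]), prefix, uint8('Speed'), marker, ...
         uint8([9 1]), prefix, uint8('Temp'),  marker, uint8(3)];

% strfind is documented for text, so search the char view of the bytes.
txt  = char(bytes);
mPos = strfind(txt, char(marker));   % start index of every marker
pPos = strfind(txt, char(prefix));   % start index of every prefix

names = cell(1, numel(mPos));
for k = 1:numel(mPos)
    % "Rewind": take the last prefix that ends before this marker starts.
    before = pPos(pPos + numel(prefix) <= mPos(k));
    if isempty(before)
        continue                     % no prefix ahead of this marker
    end
    first = before(end) + numel(prefix);
    names{k} = txt(first:mPos(k) - 1);
end
```

Working on the whole byte vector avoids the question of lines entirely, and names keeps the field names in the order in which their markers occur. If the bytes in front of the names vary, the prefix search would have to be replaced, for example by stepping back from each marker while the bytes are printable characters.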
From: Jan Simon on
Dear Shawn,

> > I'm curious how to decode a binary file that is in some unknown structure..
> > The file appears to be written in ieee-le machine format.

How did you identify that?

> 1) Unfortunately the company that created the file format has disappeared, so can't ask anyone to define the format.

Perhaps you could take the chance and post the name of the company and the extension and purpose of the files. CSSM has a lot of readers; perhaps the grandma of a reader invented the format...

> To elaborate on the problem - the information I would like is somewhere after the 1400th line of the binary file.

I have to stop at this sentence already: binary files usually do not use "lines".
The data could also be compressed...

Jan