From: Shawn on
"Jan Simon" <matlab.THIS_YEAR(a)nMINUSsimon.de> wrote in message <i157og$c2b$1(a)fred.mathworks.com>...

Hi Jan,

> Dear Shawn,
>
> > > I'm curious how to decode a binary file that is in some unknown structure..
> > > The file appears to be written in ieee-le machine format.
>
> How did you identify that?

I called fopen with the file identifier to have MATLAB report the machine format and encoding, i.e.
[filename, permission, machineformat, encoding] = fopen(fid)

>
> > 1) Unfortunately the company that created the file format has disappeared, so can't ask anyone to define the format.
>
> Perhaps you take the chance and post the name of the company and the extension and purpose of the files. The CSSM has a lot of readers, perhaps the grandma of a reader has invented the format...

OK, the file that I want to read has the extension '.BPF', which stands for 'bubble publishing form'. The company Bubble Publishing created Optical Mark Reading software. This particular file describes the layout of the optical marks on a piece of paper in relation to the question number and their spatial position.
Bubble Publishing was purchased by Scantron and is no longer supported.

>
> > To elaborate on the problem - the information I would like is somewhere after the 1400th line of the binary file.
>
> I stop after this sentence already: Binary files usually do not use "lines".
> The data could be compressed also...

This could be the case; however, 'fgetl' reads and reports something. How it knows where to stop, I have no idea. What it reports are lines of mixed readable and unreadable characters and strings. Each 'line' has a different length.
I am able to search each line with regular expressions for a combination of characters. I ended up keeping a count of how many lines into the file I am while searching each line for a particular string that borders each variable field name; then I record the field name and its line number in the file to define the ordering of the variables. Unfortunately, this doesn't work for every '.bpf' file that I've tried. It would still be great to hear from someone who knows about the file structure.
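
A minimal sketch of the same search done on byte offsets instead of fgetl "lines", so that stray bytes 10 or 13 cannot split a record. The marker string 'FIELD' and the demo bytes are hypothetical; the real delimiter in a '.bpf' file is not known here.

```matlab
% Hypothetical marker; replace it with the string that borders the field names.
marker = 'FIELD';

% Build a small demo file so the sketch runs without a real .bpf file.
tmp = [tempname, '.bin'];
fid = fopen(tmp, 'w');
demo = [uint8([1 2 10 3]), uint8('FIELDq1'), uint8([0 13 255]), uint8('FIELDq2')];
fwrite(fid, demo, 'uint8');
fclose(fid);

% Read every byte unchanged as a row vector; no line splitting takes place.
fid = fopen(tmp, 'r');
raw = fread(fid, Inf, 'uint8=>uint8')';
fclose(fid);
delete(tmp);

% 1-based byte offsets at which the marker starts.
offsets = strfind(char(raw), marker);
disp(offsets)
```

Byte offsets stay stable from file to file in a way that fgetl line numbers may not, because a byte with the value 10 or 13 inside numeric data no longer shifts the count.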

>
> Jan

Thanks Jan,
Shawn
From: Jan Simon on
Dear Shawn,

> > > > The file appears to be written in ieee-le machine format.
> I called fopen with the file identifier to have MATLAB report the machine format and encoding, i.e.
> [filename, permission, machineformat, encoding] = fopen(fid)

Obtained this way, "permission", "machineformat" and "encoding" simply equal the settings that were used when the file was opened with FOPEN(FileName); they are not read from the file's contents.
Try it:
file = which('plot.m'); % Arbitrary file!
fid = fopen(file, 'rb', 'b'); % 'b' is synonym for ieee-be
[name, perm, fmt] = fopen(fid)
fclose(fid);
fid = fopen(file, 'rb', 'l'); % synonym for ieee-le
[name, perm, fmt] = fopen(fid)
fclose(fid);
The last is the default on PCs. Therefore you do not know whether the file is big-endian or little-endian.
See: "help fopen"
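
A small sketch of why the byte order matters, assuming four made-up bytes that are not taken from any real file: the same bytes decode to two different numbers, and FOPEN reports only the format that was requested.

```matlab
% Write four made-up bytes to a temporary file.
tmp = [tempname, '.bin'];
fid = fopen(tmp, 'w');
fwrite(fid, uint8([210 4 0 0]), 'uint8');
fclose(fid);

% Read the same bytes as one uint32, once per byte order.
fid = fopen(tmp, 'r', 'ieee-le');
asLE = fread(fid, 1, 'uint32');   % 210 + 4*256 = 1234
[~, ~, fmtLE] = fopen(fid);       % the format requested above
fclose(fid);

fid = fopen(tmp, 'r', 'ieee-be');
asBE = fread(fid, 1, 'uint32');   % 210*2^24 + 4*2^16 = 3523477504
[~, ~, fmtBE] = fopen(fid);
fclose(fid);
delete(tmp);

fprintf('%s: %d\n%s: %d\n', fmtLE, asLE, fmtBE, asBE);
```

Only knowledge of which values are plausible for the data can decide between the two readings.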

> This could be the case, however 'fgetl' reads and reports something - how it knows where to stop, i have no idea.

FGETL reads until a CHAR(10), CHAR(13) or CHAR([13, 10]); this can differ depending on whether the file is opened in binary or text mode (fopen(FileName, 'rb') or fopen(FileName, 'rt')). In binary files, bytes with the value 10 or 13 appear from time to time, e.g. in the DOUBLE 6990767227, which is written to the file as "0, 0, 176, 71, 234, 10, 250, 65" (at least this is the result of TYPECAST to UINT8; the le/be setting could change the order). There you find a CHAR(10), but this is *not* a line break!
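
The effect can be sketched with a temporary file that holds only this one DOUBLE. The file name is generated by TEMPNAME, and the characters FGETL returns depend on the encoding, so only the FREAD result is checked here.

```matlab
x = 6990767227;

% Write the value as raw little-endian binary to a temporary file.
tmp = [tempname, '.bin'];
fid = fopen(tmp, 'w', 'ieee-le');
fwrite(fid, x, 'double');
fclose(fid);

% Look at the 8 raw bytes: the sixth one has the value 10.
fid = fopen(tmp, 'r');
bytes = fread(fid, Inf, 'uint8')';
fclose(fid);

% FGETL treats that byte as a line break and returns only what precedes it.
fid = fopen(tmp, 'r');
firstLine = fgetl(fid);
fclose(fid);

% FREAD with the matching type and byte order recovers the number.
fid = fopen(tmp, 'r', 'ieee-le');
y = fread(fid, 1, 'double');
fclose(fid);
delete(tmp);

disp(bytes)
fprintf('fgetl returned %d characters, fread returned %.0f\n', numel(firstLine), y);
```

Reading with FREAD needs the type and position of every field, which is exactly the information that is missing for an undocumented format.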

My impression: as long as nobody reveals more details about the format, you have no chance of reading the file. Spend your time on something else.

Kind regards, Jan
From: Steven_Lord on


"Shawn " <shantry(a)geemail.com> wrote in message
news:i252u8$28h$1(a)fred.mathworks.com...
> "Jan Simon" <matlab.THIS_YEAR(a)nMINUSsimon.de> wrote in message
> <i157og$c2b$1(a)fred.mathworks.com>...

*snip*

>> Perhaps you take the chance and post the name of the company and the
>> extension and purpose of the files. The CSSM has a lot of readers,
>> perhaps the grandma of a reader has invented the format...
>
> Ok, the file that I am wanting to read has the extension '.BPF' which
> stands for 'bubble publishing form'.. The company Bubble Publishing
> created Optical Mark Reading software. This particular file describes the
> format of optical marks on a piece of paper in relation to the question
> number and their spatial position. Bubble Publishing was purchased by
> Scantron and is no longer supported.

Then I think your best bet is probably going to be to contact Scantron and
see if they're willing to provide you with the specification for the format
or a conversion tool to convert your data set into something that either can
be read by a supported application or whose format is documented.

--
Steve Lord
slord(a)mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ
To contact Technical Support use the Contact Us link on
http://www.mathworks.com