From: jihane on
Hello all,

I have a 4 GB txt file and it is not possible to work with it because it is so large. Does anyone know how to convert an ASCII file to a binary file using Mathematica? I have a colleague who does it with another system, but I can't find out how to do it. I searched a lot without success.

Your help would be extremely appreciated.

Thank you!

From: Bill Rowe on
On 4/2/10 at 5:22 AM, jihane.ajaja(a)mail.mcgill.ca (jihane) wrote:

>I have a 4 GB txt file and it is not possible to work with it
>because it is so large. Does anyone know how to convert an ASCII
>file to a binary file using Mathematica? I have a colleague who does
>it with another system, but I can't find out how to do it. I
>searched a lot without success.

It is not difficult to convert an ASCII file to binary. There
are two ways that come readily to mind.

First, if the file contains only numeric data, the following
approach would work:

Open the file to be converted and open an output file to save
the binary representation. Once you have the data streams for
each of the above files, then doing

data=ReadList[inputStream, Number, n];
BinaryWrite[outputStream, data, format]

in a loop will do the trick. Here n is the number of items to
read at any one time.
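
A minimal sketch of that loop, assuming the text file holds only
whitespace-separated numbers and that "Real64" is an acceptable
output format. The file names and the chunk size are hypothetical
placeholders, not anything taken from the original question:

```mathematica
(* Hypothetical file names; adjust to the actual data. *)
inputStream = OpenRead["accel.txt"];
outputStream = OpenWrite["accel.bin", BinaryFormat -> True];

n = 300000;  (* numbers read per pass; an assumed value, tune to memory *)

(* ReadList returns {} once the end of the file has been reached. *)
While[(data = ReadList[inputStream, Number, n]) =!= {},
  (* N converts any integers to machine reals before writing. *)
  BinaryWrite[outputStream, N[data], "Real64"]
];

Close[inputStream];
Close[outputStream];
```

With "Real64" every value occupies 8 bytes in the output, however
many digits it had in the text file.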

If the contents of the file are not all numeric, then the loop
body would be something like

data = ReadList[inputStream, Byte, 4 n];
BinaryWrite[outputStream,
  FromDigits[#, 256] & /@ Partition[data, 4], "Integer32"]

But while the above will convert the ASCII file into a binary
representation, I strongly suspect this isn't going to be that
useful to you. The limits on how large a data set you
can work with in Mathematica are determined by the amount of RAM
you have and the way Mathematica represents the data internally.
They are independent of how the data is stored on disk.
Converting a large ASCII file to binary will reduce the file
size on disk and likely will reduce the time needed to read the
data back from disk. But it will not do anything to increase the
amount of data you can work with at one time in Mathematica.
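
One possible way to stay within memory is to read a binary file a
block at a time and keep only running totals. The sketch below
assumes a file of "Real64" values stored row by row; the file name,
the number of columns, and the block size are all hypothetical:

```mathematica
(* Hypothetical binary file: Real64 values, cols values per row. *)
stream = OpenRead["accel.bin", BinaryFormat -> True];
cols = 3;               (* assumed number of columns *)
rowsPerBlock = 100000;  (* assumed block size; tune to memory *)

sums = ConstantArray[0., cols];
count = 0;

(* BinaryReadList returns {} once the end of the file has been reached. *)
While[(block = BinaryReadList[stream, "Real64", cols rowsPerBlock]) =!= {},
  rows = Partition[block, cols];  (* drops an incomplete trailing row *)
  sums += Total[rows];            (* column-wise sums for this block *)
  count += Length[rows]
];
Close[stream];

means = sums/count  (* per-column means over the whole file *)
```

Only one block is held in memory at any time, so this works for
quantities that can be accumulated incrementally. It does not help
with operations that need the whole data set at once.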


From: Albert Retey on
Hi,

> I have a 4 GB txt file and it is not possible to work with it
> because it is so large. Does anyone know how to convert an ASCII
> file to a binary file using Mathematica? I have a colleague who does
> it with another system, but I can't find out how to do it. I
> searched a lot without success.

Whether this is possible, and how much effort it takes, depends a lot on
what the data in that file is. If it is mostly numbers, you might be able
to read it in chunks and save the data in a binary format. Do you really
need to import all of the data, or just a part of it?
If your colleague has successfully converted it to a binary file, why not
see whether you can import from that format? Mathematica can import .mat
files, for example. How large is the binary file he has created?

> Your help would be extremely appreciated.

I think you will need to supply many more details to get any useful help
on this...

albert

From: David Bailey on
jihane wrote:
> Hello all,
>
> I have a 4 GB txt file and it is not possible to work with it because it is so large. Does anyone know how to convert an ASCII file to a binary file using Mathematica? I have a colleague who does it with another system, but I can't find out how to do it. I searched a lot without success.
>
> Your help would be extremely appreciated.
>
> Thank you!
>
I think it would help if you told us a lot more about the format of that
txt file, including perhaps a sample - but not the whole 4 GB :)

Manipulating that much data in one go may still be tricky, unless you
move to a 64-bit system with plenty of memory.

David Bailey
http://www.dbaileyconsultancy.co.uk

From: jihane on
Thank you for all your replies.
To give more details about my file: it contains numerical data in 3 columns, for the x, y, and z axes (they are acceleration measurements). My computer is a 64-bit machine with 8 GB of RAM. Why is my file that huge? Well, the measurements are taken 1000 times per second. I can't ignore parts of the data; I need to analyze all of it. I can't think of any other useful detail I could provide. What I want to do with this data is some basic calculations and a couple of plots.

Thank you again for all the great help!