Replacing ',' with '.' [Matlab]

From: Jake the Snake on 6 Oct 2006 05:20

Rune Allnor wrote:
>
> If you have one file and the decimal separators are the only commas
> in the file, load the file into some ASCII text editor (notepad,
> emacs, maybe even matlab's editor) and use the "find..replace"
> function.
>
> It will take you 2 seconds to do and the computer may be running
> for a couple of minutes. Don't spend more of YOUR time (the computer
> run-time is irrelevant if this truly is a one-off incident) with
> this than necessary, unless there are other complicating factors or
> you expect or fear that this number format will occur again.
>
> Rune
>
Well, that's just the thing. I have a constant stream of those files coming in and each has something like half a million commas. I've tried Notepad and Word. Notepad crashes and Word does it, but it takes 15 minutes. OK, I CAN do this, but I was just wondering if I can do it more efficiently with some self-made code.

From: Michael Wild on 6 Oct 2006 10:58

if the number of files is significant and you rely on matlab being able to
read them in very fast (not reading it first as text, converting the ",",
and then parsing it) you might consider writing some program to do it
(e.g. in c or c++) or applying a specific tool such as awk. in fact, there
is a windows version of gawk
(http://gnuwin32.sourceforge.net/packages/gawk.htm).

michael
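
A stand-alone filter of the kind suggested above could be sketched in C++ roughly as follows. This is only a minimal sketch, not code from the thread: the names `replace_commas` and `replace_commas_demo` are hypothetical, and it assumes that every comma in the input is a decimal separator.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>

// Copy every character from `in` to `out`, writing '.' wherever the
// input has ','. Returns the number of commas that were replaced.
// (Hypothetical helper: assumes ALL commas are decimal separators.)
std::size_t replace_commas(std::istream& in, std::ostream& out) {
    std::size_t replaced = 0;
    std::istreambuf_iterator<char> it(in), end;
    std::ostreambuf_iterator<char> dst(out);
    for (; it != end; ++it) {
        char c = *it;
        if (c == ',') {
            c = '.';
            ++replaced;
        }
        *dst++ = c;
    }
    return replaced;
}

// Small self-check on in-memory streams with made-up sample data.
void replace_commas_demo() {
    std::istringstream in("1,5\t2,25\n");
    std::ostringstream out;
    const std::size_t n = replace_commas(in, out);
    assert(n == 2);
    assert(out.str() == "1.5\t2.25\n");
}
```

Because the function only sees generic streams, the same code could be pointed at a pair of file streams opened in binary mode, or at standard input and output to get a small command-line filter.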

From: Rune Allnor on 6 Oct 2006 14:33

I have some C++ code that does the conversion in some 20 MB of
ASCII numbers in just less than a second. I have tried to get it to
compile as a MEX file, but there are some memory issues with matlab.

I could post a stand-alone version if you have access to a C++
compiler.

Rune

From: Rune Allnor on 7 Oct 2006 00:21

Rune Allnor wrote:
> I have some C++ code that does the conversion in some 20 MB of
> ASCII numbers in just less than a second. I have tried to get it to
> compile as a MEX file, but there are some memory issues with matlab.

The memory issues are solved, so below is the MEXable C++ version.
I always thought that the matlab LCC C compiler handled C++ code.
Unfortunately, it seems it doesn't, so you need an external C++
compiler to get this to run... but then, it shouldn't be too hard to
convert this to C.

Oh well.

Save the code to some cpp file, say, convert.cpp,
mex it, and use the routine as

convert('input.txt','output.txt');

Note that the routine converts ALL commas in a file to dots.

On my computer (1.4 GHz) 20+ MB worth of text is scanned
and all commas converted in 1 s to 2 s.

Rune

/*************************************************************/
#include <fstream>
#include "mex.h"

using namespace std;

/* Usage from MATLAB: convert('input.txt','output.txt');
   Copies the input file to the output file, replacing every
   ',' with '.'. */
void mexFunction(
    int nlhs,
    mxArray *plhs[],
    int nrhs,
    const mxArray *prhs[]
    )
{
    if (nrhs != 2) {
        mexErrMsgTxt("Exactly two input arguments required.");
    }

    if ((!mxIsChar(prhs[0])) || (!mxIsChar(prhs[1]))) {
        mexErrMsgTxt("Inputs must be of type char.");
    }

    if ((mxGetM(prhs[0]) != 1) || (mxGetM(prhs[1]) != 1)) {
        mexErrMsgTxt("Inputs must have exactly one row.");
    }

    /* mxArrayToString allocates the strings itself; a separate
       mxMalloc before it would only leak memory. */
    char* fname0 = mxArrayToString(prhs[0]);
    char* fname1 = mxArrayToString(prhs[1]);
    if ((!fname0) || (!fname1)) {
        mexErrMsgTxt("Could not read the file names.");
    }

    /* bad() does not report a failed open; is_open() does. */
    ifstream fin(fname0, ios_base::binary);
    if (!fin.is_open()) {
        mexErrMsgTxt("Could not open source file.");
    }
    fin.seekg(0, ios::end);
    long int buffersize = (long int) fin.tellg();
    fin.seekg(0, ios::beg);
    if (buffersize < 0) {
        fin.close();
        mexErrMsgTxt("Could not determine source file size.");
    }

    ofstream fout(fname1, ios_base::binary);
    if (!fout.is_open()) {
        fin.close();
        mexErrMsgTxt("Could not open destination file.");
    }

    /* The whole file is read into memory, so it must fit in RAM. */
    char* buffer = (char*) mxMalloc((size_t) buffersize);

    fin.read(buffer, buffersize);
    for (long int n = 0; n < buffersize; n++) {
        if (buffer[n] == ',') {
            buffer[n] = '.';
        }
    }

    fout.write(buffer, buffersize);
    fout.close();
    fin.close();

    mxFree(fname0);
    mxFree(fname1);
    mxFree(buffer);
}
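
As noted in the post, the routine converts ALL commas. If a file also contains commas that are not decimal separators (say, in a text header), one possible variation is to replace only the commas that sit directly between two digits. The following is a rough sketch under that assumption; `decimal_commas_to_dots` is a hypothetical name and is not part of the posted MEX code.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Replace a comma only when the characters immediately before and
// after it are both digits, e.g. "3,14" becomes "3.14" while
// "a, b" is left alone. (Hypothetical helper, not from the thread.)
std::string decimal_commas_to_dots(std::string text) {
    for (std::size_t i = 1; i + 1 < text.size(); ++i) {
        if (text[i] != ',') {
            continue;
        }
        const bool digit_before =
            std::isdigit(static_cast<unsigned char>(text[i - 1])) != 0;
        const bool digit_after =
            std::isdigit(static_cast<unsigned char>(text[i + 1])) != 0;
        if (digit_before && digit_after) {
            text[i] = '.';
        }
    }
    return text;
}
```

This rule cannot tell a decimal comma from a comma that separates integers written without spaces, so a list such as 1,2,3 would still be altered. It only helps when the non-decimal commas are surrounded by letters or whitespace.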

From: per isakson on 7 Oct 2006 01:24

Francis Burton wrote:
>
>
> In article <4524edd1$1(a)news1.ethz.ch>,
> Michael Wild <themiwi.REMOVE.THIS(a)student.ethz.ch> wrote:
>> or if you are on linux/unix/mac and this is not a single incident,
>> then you might want to read up on a tool called awk.
>
> Or, even simpler, tr(1).
>
> If the poster really wants to do it in Matlab, he could try
> something like:
>
> tname = tempname;
> fidi=fopen(fname, 'r');
> fido=fopen(tname, 'w');
> while feof(fidi) == 0
>     line = fgetl(fidi);
>     line = strrep(line, ',', '.');
>     fprintf(fido, '%s\n', line);
> end
> fclose(fidi);
> fclose(fido);
> m = dlmread(tname); % or other 'import' of choice
> delete(tname);
>
> Not the most efficient method, but easy to understand.
>
> Francis
>

If the size of the text file is less than half of the available
memory, here is a function that creates a file with decimal points.
All commas are replaced by points.

>> tic, comma2point( 'd:\slask\comma.txt', 'd:\slask\point.txt' ), toc
Elapsed time is 0.050921 seconds.

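function comma2point( strInFileSpec, strOutFileSpec )
% Copy a text file, replacing every comma with a decimal point.
% The whole file is held in memory as one string.

    fid = fopen( strInFileSpec, 'r' );
    if fid == -1, error( 'Could not open %s', strInFileSpec ); end
    str = fread( fid, inf, '*char' );    % column vector of characters
    fclose( fid );

    str = strrep( str', ',', '.' );      % transpose to a row for strrep

    fid = fopen( strOutFileSpec, 'w' );
    if fid == -1, error( 'Could not open %s', strOutFileSpec ); end
    fwrite( fid, str, 'char' );
    fclose( fid );

end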

/ per
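
For files too large to hold in memory at once, the same replacement can be done one block at a time. The following C++ sketch illustrates that idea under stated assumptions; it is not code from the thread, and `convert_in_chunks` and its default 1 MB `chunk_size` are hypothetical choices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Copy in_path to out_path, replacing every ',' with '.', reading at
// most chunk_size bytes at a time so memory use stays bounded.
// Returns false if a file cannot be opened or a write fails.
bool convert_in_chunks(const std::string& in_path,
                       const std::string& out_path,
                       std::size_t chunk_size = 1 << 20) {
    assert(chunk_size > 0);
    std::ifstream fin(in_path.c_str(), std::ios::binary);
    std::ofstream fout(out_path.c_str(), std::ios::binary);
    if (!fin.is_open() || !fout.is_open()) {
        return false;
    }

    std::vector<char> buffer(chunk_size);
    while (fin) {
        fin.read(buffer.data(),
                 static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = fin.gcount();  // bytes actually read
        if (got <= 0) {
            break;
        }
        std::replace(buffer.begin(), buffer.begin() + got, ',', '.');
        fout.write(buffer.data(), got);
    }
    fout.flush();
    return fout.good();
}
```

A call such as convert_in_chunks("input.txt", "output.txt") would mirror the two-argument interface used elsewhere in the thread. Since each byte is replaced independently, splitting the file at arbitrary chunk boundaries does not change the result.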
