From: Mark on
Hi,

I am using a BufferedReader to read character data in from a file. It
works but it's incredibly slow. (The file consists of a number of
separate messages, each separated by a special character. Each
message must be read into a separate string.)

I use the following code (exception handling removed for brevity):

String text = new String("");
BufferedReader in = null;
in = new BufferedReader(new InputStreamReader(new
FileInputStream(_msgFile)));
int c;
while ((c = in.read()) != -1) {
if (c == '@') {
_msgList.add(text);
text = "";
} else {
text += (char)c;
}
}
if (text.length() > 0) {
_msgList.add(text);
}


From: bugbear on
Mark wrote:
> Hi,
>
> I am using a BufferedReader to read character data in from a file. It
> works but it's incredibly slow. (The file consists of a number of
> separate messages, each separated by a special character. Each
> message must be read into a separate string.)
>
> I use the following code (exception handling removed for brevity):
>
> String text = new String("");
> BufferedReader in = null;
> in = new BufferedReader(new InputStreamReader(new
> FileInputStream(_msgFile)));
> int c;
> while ((c = in.read()) != -1) {
> if (c == '@') {
> _msgList.add(text);
> text = "";
> } else {
> text += (char)c;
> }
> }
> if (text.length() > 0) {
> _msgList.add(text);
> }
>

Try working out (as near as you can) what the line

text += (char)c;

does.
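
For anyone following along, here is a rough sketch of what that line costs. It assumes nothing beyond the JDK; the class and method names (ConcatCost, appendWithPlus, appendWithBuilder, charsRecopied) are made up for illustration. Strings are immutable, so each += has to build a new String containing everything accumulated so far plus one more character.

```java
public class ConcatCost {
    // What the loop in the question effectively does: every += allocates a
    // new String and copies all characters accumulated so far into it.
    static String appendWithPlus(char[] input) {
        String text = "";
        for (char c : input) {
            // Roughly equivalent to:
            // text = new StringBuilder().append(text).append(c).toString();
            text += c;
        }
        return text;
    }

    // Same result, but the characters go into one growing buffer.
    static String appendWithBuilder(char[] input) {
        StringBuilder sb = new StringBuilder(input.length);
        for (char c : input) {
            sb.append(c);
        }
        return sb.toString();
    }

    // Existing characters recopied by the += version for an n-char message:
    // 0 + 1 + 2 + ... + (n - 1) = n * (n - 1) / 2.
    static long charsRecopied(long n) {
        return n * (n - 1) / 2;
    }
}
```

So the work per message grows with the square of its length, which is probably why long messages make the original loop crawl.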

BugBear
From: Roedy Green on
On Thu, 21 Jan 2010 10:31:04 +0000, Mark
<i(a)dontgetlotsofspamanymore.invalid> wrote, quoted or indirectly
quoted someone who said :

>I am using a BufferedReader to read character data in from a file.

You can read the file in one giant unbuffered I/O operation, which is
about as fast as you can get. Keep in mind that your file is in an 8-bit
encoding of some sort, and it is being translated byte by byte to
16-bit Unicode.

For code see http://mindprod.com/products1.html#HUNKIO

You can also bump up the buffer size. See
http://mindprod.com/applet/fileio.html

In general, don't pester the OS for I/O on a char-by-char basis. Read
in whacking huge hunks and do your scanning with indexOf etc.
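
A minimal sketch of the "one big hunk, then indexOf" idea follows. It is not the HUNKIO code linked above; it assumes Java 7 or later for Files.readAllBytes, a file small enough to fit in memory, and a caller who knows the file's encoding. The names HunkSplit, splitMessages and readMessages are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class HunkSplit {
    // Cut one big string into messages at each separator using indexOf.
    // Like the original loop, a trailing piece is kept only if non-empty.
    static List<String> splitMessages(String all, char separator) {
        List<String> messages = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = all.indexOf(separator, start)) != -1) {
            messages.add(all.substring(start, end));
            start = end + 1;
        }
        if (start < all.length()) {
            messages.add(all.substring(start));
        }
        return messages;
    }

    // Read the whole file in one call, decode it once, then scan in memory.
    static List<String> readMessages(Path file, Charset charset) throws IOException {
        String all = new String(Files.readAllBytes(file), charset);
        return splitMessages(all, '@');
    }
}
```

Decoding the whole byte array at once also avoids any trouble with multi-byte characters falling across a buffer boundary.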


--
Roedy Green Canadian Mind Products
http://mindprod.com
Responsible Development is the style of development I aspire to now. It can be summarized by answering the question, "How would I develop if it were my money?" I'm amazed how many theoretical arguments evaporate when faced with this question.
~ Kent Beck (born: 1961 age: 49), evangelist for extreme programming.
From: Roedy Green on
On Thu, 21 Jan 2010 10:31:04 +0000, Mark
<i(a)dontgetlotsofspamanymore.invalid> wrote, quoted or indirectly
quoted someone who said :

> text += (char)c;

Use a StringBuilder to accumulate the text. Guess a generous starting
size.

See http://mindprod.com/jgloss/stringbuilder.html
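
As a sketch of that suggestion, the original loop can be kept almost unchanged with the String swapped for a StringBuilder. The class name BuilderSplit, the method readMessages and the initialCapacity parameter are invented for this example; pass whatever Reader you already have (for instance the BufferedReader from the question).

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

public class BuilderSplit {
    // Same logic as the loop in the question, but characters are appended
    // to a single reusable StringBuilder instead of a new String each time.
    static List<String> readMessages(Reader in, int initialCapacity) throws IOException {
        List<String> messages = new ArrayList<>();
        StringBuilder text = new StringBuilder(initialCapacity);
        int c;
        while ((c = in.read()) != -1) {
            if (c == '@') {
                messages.add(text.toString());
                text.setLength(0); // empty the buffer but keep its capacity
            } else {
                text.append((char) c);
            }
        }
        if (text.length() > 0) {
            messages.add(text.toString());
        }
        return messages;
    }
}
```

The starting capacity is only a hint: guessing too small costs a few reallocations, guessing too large wastes some memory.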
From: saif al islam on
Hi,
There are two things you can do to improve that code:
1. Use ByteBuffer and FileChannel.
2. Instead of adding char by char, create each string in one shot.

Sample code is below. For a 100M file it reduces the time from 32 sec
to 2 sec on my PC (not counting the time to add to a vector).

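import java.io.FileInputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

FileInputStream f = new FileInputStream("fileName");
final int SIZE = 1000;

FileChannel ch = f.getChannel();
ByteBuffer bb = ByteBuffer.allocateDirect(SIZE);
byte[] barray = new byte[SIZE];
// Tail of a message that continues into the next chunk.
String leftOver = "";
int nRead;
while ((nRead = ch.read(bb)) != -1) {
    if (nRead == 0)
        continue;
    bb.flip();                     // switch the buffer from filling to draining
    int nGet = bb.remaining();     // never more than SIZE
    bb.get(barray, 0, nGet);
    int start = 0;
    for (int i = 0; i < nGet; i++) {
        if (barray[i] == '@') {    // message separator (ASCII 64)
            // new String(byte[], int, int) decodes with the platform default
            // charset; this assumes a single-byte encoding, since a
            // multi-byte character could be split across two chunks.
            _msgList.add(leftOver + new String(barray, start, i - start));
            start = i + 1;
            leftOver = "";
        }
    }
    if (nGet > start) {
        // Append rather than overwrite, so a message spanning more than
        // two chunks keeps its earlier pieces.
        leftOver += new String(barray, start, nGet - start);
    }
    bb.clear();
}
f.close();
if (leftOver.length() > 0) {
    // The last message has no trailing separator.
    _msgList.add(leftOver);
}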

Hope it helps

Saif
On Jan 21, 1:35 pm, bugbear <bugbear(a)trim_papermule.co.uk_trim> wrote:
> Try working out (as near as you can) what the line
>
>   text += (char)c;
>
> does.
>
>      BugBear