Unable to read Unicode file into SAS file

From: Murphy Choy on 19 Jan 2010 11:00

Hi,

Have you tried using UTF-8 encoding? It works for some Unicode files.
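
A minimal sketch of what that suggestion might look like, assuming the file really is UTF-8 (the thread does not confirm which Unicode encoding the export used). The path is the placeholder from the original post quoted below:

```sas
/* Assumption: the export is UTF-8. The path is the placeholder */
/* used in the original post, not a real location.              */
filename chr "File_LOCATION\FILENAME.csv" encoding="utf-8";
```

If the file is in a different Unicode encoding, this setting would not be the right one.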

------Original Message------
From: Cherish K
Sender: SAS(r) Discussion
To: SAS-L(a)LISTSERV.UGA.EDU
ReplyTo: Cherish K
Subject: Unable to read Unicode file into SAS file - able to do after using 'convert into DOS'. Please help
Sent: Jan 19, 2010 11:36 PM

I extracted a huge file (2.5 GB) from MS SQL, but the file was Unicode
encoded, so the usual import function wasn't working.

In the archives I found code written by DATA_NULL that uses a FILENAME
statement with the ENCODING option.

Below is the code given by DATA_NULL:

filename chr "File_LOCATION\FILENAME.csv" encoding = Unicode;

proc import datafile = chr out = chk
DATAFILE=chr
DBMS=DLM REPLACE;
DELIMITER = '|';
GETNAMES=YES;
DATAROW=2;
RUN;

What happened is that only the first few columns were imported properly:
out of 97 columns, 25 were imported properly, 5-7 were partially imported,
and the rest were blank.

When I opened the file in TextPad, I could see that all the fields were
populated. What might be the problem?

I used the "convert to DOS" option in TextPad and then imported the file
with the usual import option, and it worked well, i.e. all columns were
imported properly.

But to do this I had to split the file into smaller ones so that I could
open them in TextPad and then use the "convert to DOS" option.

Is there a way in SAS to read the Unicode CSV file directly into a SAS
data set (reading all the columns)?

Regards,
Cherish


--
Regards,
Murphy Choy

Certified Advanced Programmer for SAS V9
Certified Basic Programmer for SAS V9

From: Arthur Tabachneck on 19 Jan 2010 11:43

Another possibility is that proc import isn't choosing the right
formats and informats. If you already know what those formats and
informats should be, you can always copy, paste and edit (from the log)
the data step that resulted from the proc import run, adjusting any
incorrect formats and informats, and then just run it as a data step.
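
As a rough sketch of that approach, an edited data step might look like the one below. The three variables (id, name, created) and their informats are hypothetical stand-ins for the 97 real columns, which the thread never lists. The fileref, delimiter, and header row come from the original post, and lrecl=32767 is the value reported later in the thread:

```sas
/* Fileref as given in the original post (placeholder path). */
filename chr "File_LOCATION\FILENAME.csv" encoding="unicode";

data chk;
    /* Pipe-delimited, header on row 1, data from row 2. */
    infile chr dlm='|' dsd truncover firstobs=2 lrecl=32767;

    /* Hypothetical columns: replace with the real names and informats. */
    informat id 8. name $50. created yymmdd10.;
    format created yymmdd10.;

    input id name created;
run;
```

TRUNCOVER keeps a short record from pulling values from the next line, and DSD treats two consecutive delimiters as a missing value.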

Also, while I don't think it would make any difference, your proc import
specifies "datafile=chr" twice.

Art

From: Cherish K on 19 Jan 2010 11:57

Your first solution worked. PROC IMPORT wasn't able to read the file because
of the default LRECL setting. Specifying lrecl=32767 solved the problem.
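
For readers landing on this thread, here is one way the fix could be written. The post does not say where LRECL= was specified, so putting it on the FILENAME statement is an assumption (the INFILE statement of a data step accepts it too). The PROC IMPORT below also lists DATAFILE= only once:

```sas
/* Assumption: LRECL= is set on the FILENAME statement. */
/* The path is the placeholder from the original post.  */
filename chr "File_LOCATION\FILENAME.csv" encoding="unicode" lrecl=32767;

proc import datafile=chr out=chk dbms=dlm replace;
    delimiter='|';
    getnames=yes;
    datarow=2;
run;
```

Records longer than the record length in effect get cut off, which would fit the symptom of the leading columns importing correctly and the later ones coming through partial or blank.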

Thanks a lot Arthur.

Regards,
Cherish

