From: Murphy Choy on 19 Jan 2010 11:00

Hi,

Have you tried using UTF-8 encoding? It works for some Unicode files.

------Original Message------
From: Cherish K
Sender: SAS(r) Discussion
To: SAS-L(a)LISTSERV.UGA.EDU
ReplyTo: Cherish K
Subject: Unable to read Unicode file into SAS file - able to do after using 'convert into DOS'. Please help
Sent: Jan 19, 2010 11:36 PM

From MS SQL, I extracted a huge file, 2.5GB in size. The file was Unicode encoded, so the usual import function wasn't working.

In the archives I found code written by DATA_NULL which says to use a filename statement with the encoding option. Below is the code given by data null:

    filename chr "File_LOCATION\FILENAME.csv" encoding = Unicode;

    proc import datafile = chr out = chk
    DATAFILE=chr
    DBMS=DLM REPLACE;
    DELIMITER = '|';
    GETNAMES=YES;
    DATAROW=2;
    RUN;

What happened is that only the first few columns were imported properly: out of 97 columns, 25 were imported properly, 5-7 were partially imported, and the rest were blank.

When I opened the file in TextPad, I could see that all the fields are populated. What might be the problem?

I used the "convert to DOS" option in TextPad and then ran the usual import, and it worked well, i.e. all columns were imported properly. But to do this I had to split the file into smaller ones so that I could open each one in TextPad and use the "convert to DOS" option.

Is there a way in SAS to read the Unicode csv file directly into a SAS data set (reading all the columns)?

Regards,
Cherish

--
Regards,
Murphy Choy
Certified Advanced Programmer for SAS V9
Certified Basic Programmer for SAS V9
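Which ENCODING= value fits depends on how the extract was actually written. A minimal sketch for checking, assuming the file begins with a byte-order mark (not every Unicode file has one) and using the placeholder path from the post:

```sas
/* Sketch only: print the first three bytes of the file in hex.    */
/* RECFM=N reads the file as a byte stream, ignoring line breaks.  */
data _null_;
  infile "File_LOCATION\FILENAME.csv" recfm=n;
  input bom $char3.;
  put bom= $hex6.;
  stop;
run;

/* A value starting with FFFE suggests UTF-16 little-endian;       */
/* EFBBBF suggests UTF-8. Pick the matching ENCODING= value:       */
filename chr "File_LOCATION\FILENAME.csv" encoding="utf-8";
/* filename chr "File_LOCATION\FILENAME.csv" encoding="utf-16le";  */
```

If the log shows neither pattern, the file probably has no byte-order mark and the encoding has to be established some other way.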
From: Arthur Tabachneck on 19 Jan 2010 11:43

Another possibility is that proc import isn't choosing the right formats and informats. If you already know what they should be, you can copy the data step that the proc import run wrote to the log, paste it into the editor, correct any wrong formats and informats, and then run it as an ordinary data step.

Also, while I don't think it makes any difference, your proc import specifies "datafile=chr" twice.

Art
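The kind of data step Art describes might look roughly like the sketch below. The variable names, lengths, and informats are hypothetical stand-ins; the real ones would come from the code that proc import writes to the log.

```sas
/* Sketch only: id, name, and created are hypothetical columns.     */
/* The LRECL= value is an assumption meant to allow for wide rows.  */
filename chr "File_LOCATION\FILENAME.csv" encoding="unicode" lrecl=32767;

data chk;
  /* Pipe-delimited, header on line 1, data from line 2 onward */
  infile chr dlm='|' dsd missover firstobs=2;

  /* Edit these to the informats and formats the columns really need */
  informat id best32. name $50. created yymmdd10.;
  format created yymmdd10.;

  input id name $ created;
run;
```

Writing the step by hand also means the 97 columns are typed once and stay fixed, rather than being guessed again on every import.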
From: Cherish K on 19 Jan 2010 11:57
Your first solution worked. proc import wasn't able to read the file because of the default lrecl setting. Specifying lrecl = 32767 solved the problem.

Thanks a lot, Arthur.

Regards,
Cherish
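For reference, a sketch of what the working import might look like. The post does not say where lrecl was specified, so placing it on the filename statement is an assumption; the path is the placeholder from the original question, and the duplicate datafile= has been dropped.

```sas
/* Sketch only: LRECL= on the FILENAME statement is an assumption.   */
/* A larger record length keeps long rows from being cut short.      */
filename chr "File_LOCATION\FILENAME.csv" encoding="unicode" lrecl=32767;

proc import datafile=chr out=chk dbms=dlm replace;
  delimiter='|';
  getnames=yes;
  datarow=2;
run;
```

This is consistent with the symptom in the original question: if each row is cut off at the default record length, the leading columns import correctly, one column is cut partway through, and everything after it comes out blank.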