From: Elvira Sojli on 8 Jan 2010 06:02

Dear Art and Tom,

Thanks again for the quick replies. I have spent the morning going through different things, trying to locate the problem more specifically.

1. Like Art, I created a dummy file using the first lines of my problematic csv files and zipped it with WinZip (this is essential for saszipam to work). I can unzip and read that file without a problem. The problem recurs when the file is very long. The SAS log messages are below, for the file that works and the one that doesn't.

2. When I unzip the file first and then run the code, I can read the CSV without problems regardless of the length of the file. I don't want to do this for the thousands of files I have, which are all very large even when zipped.

3. I tried to use the PIPE command with the WinZip unzip utility, and it does not give any useful output, as it is designed to read TXT files.

4. Tom, your suggestions of using TRUNCOVER or the delete option don't work. I also tried dlm='2C0D'x TERMSTR=CRLF and eof=lastrec, but they don't work either. So I guess reading the csv as text might be my only option left.

Thanks again for your time and effort.

Elvira

NOTE: The infile library IN is:
      Stream=F:\Cash\EuronextAmsterdam-BBO-200404-2.zip

NOTE: The infile IN(EuronextAmsterdam-BBO-200404-2.csv) is:
      File Name=EuronextAmsterdam-BBO-200404-2.csv,
      Compressed Size=34188101,
      Uncompressed Size=404818647,
      Compression Level=-1,Clear Text=No

ERROR: Invalid data length.
FATAL: Unrecoverable I/O error detected in the execution of the data step program.
       Aborted during the EXECUTION phase.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TEST may be incomplete. When this step was stopped there were 0 observations and 13 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds


NOTE: The infile library IN is:
      Stream=F:\Cash\bosh.zip

NOTE: The infile IN(bosh.csv) is:
      File Name=bosh.csv,
      Compressed Size=314,Uncompressed Size=1117,
      Compression Level=-1,Clear Text=Yes

NOTE: A total of 11 records were read from the infile library IN.
NOTE: 11 records were read from the infile IN(bosh.csv).
NOTE: The data set WORK.TEST has 11 observations and 13 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.00 seconds

======================================================================

Elvira - If you copy the error message and paste it into an email, I can be more specific about how to fix the problem. For now I will make some assumptions. Perhaps it is giving an error about reading past the end of the line? Did you try changing MISSOVER to TRUNCOVER? Perhaps it is giving you an error about invalid numeric data? If you want to skip the empty lines, then just make your program a little more complex.

   * read line and hold it;
   input @;
   * ignore blank lines ;
   if _infile_=' ' then delete;
   input ..... ;

You might not know where the end of the file is, but SAS does, so use the END= option on the INFILE statement to set a temporary variable with a boolean T/F value for whether you have reached the end of the file. Try this little program to dump the last line into the log:

   data _null_;
     infile ........ END=EOF ;
     input ;
     if eof then list;
   run;

To make a copy without the last line in a temporary file, use something like this (the FILE statement directs PUT to the temporary file rather than to the log):

   filename new temp;
   data _null_;
     infile .... END=EOF;
     file new;
     input;
     if not eof then put _infile_;
   run;

======================================================================

Elvira,

I'd be glad to show an example of my suggestion but, before doing so, I have a comment, a question and an alternative approach (although you could end up still needing to do what I had suggested or something else).

I put together and compressed a small csv file and, to my surprise, saszipam wasn't able to read it, regardless of whether it contained an extra line or not.

For some alternative command-line possibilities, including freeware and shareware compression routines, take a look at:

www2.sas.com/proceedings/sugi31/155-31.pdf

That is, before I suggest how you might skip the last line (which, by the way, may only entail using something like eof=lastrec in your infile statement): what kind of error message are you receiving? It may be that saszipam simply no longer works.

Art
-------

On Thu, 7 Jan 2010 21:26:46 +0100, Elvira Sojli <ESojli(a)RSM.NL> wrote:

>Dear Tom and Art,
>
>Thank you for your prompt replies. Indeed the problem is with the csv
>file itself, i.e. if I try to just open the csv without using the unzip
>procedure, the same error message occurs. If I manually delete the last
>line then SAS can read the file easily.
>
>Art, thanks for the suggestion but I am not sure how to implement it. I
>need some help on how to do the following: 'simply read the file, with
>saszipam, as lines of text and write all but the last line of each file
>to a new csv file'. To complicate things further, I don't know where the
>last line of each file occurs, and each file could have more than 3
>million lines.
>
>Thanks for the time :).
>
>Kind regards,
>Elvira
>
>-----Original Message-----
>From: SAS(r) Discussion on behalf of Tom Abernathy
>Sent: Thu 07/01/2010 18:35
>To: SAS-L(a)LISTSERV.UGA.EDU
>Subject: Re: Error in reading compressed file
>
>Doesn't sound like a problem with the compression but with the file.
>What is the actual error message? What happens if you unzip one of the
>files and read the unzipped file directly. Do you get the same error
>message?
>Debug the program using the unzipped file and then try again reading
>from the zip file.
>
>
>On Jan 7, 11:12 am, eso...(a)RSM.NL (Elvira Sojli) wrote:
>> Dear all,
>>
>> I am using saszipam to read zipped csv files (code at the end of email).
>> I am getting an error message because there is an extra empty line in
>> the csv file. If I remove this line then I can unzip and read the csv,
>> otherwise I get an error message. I don't mind removing the line but I
>> have over 1000 files to unzip. Is there a different way to remove the
>> extra line from the zipped csv file?
>>
>> Thanks for the time.
>> Elvira
>>
>> filename in saszipam 'F:\Cash\EuronextAmsterdam-BBO-200404-1.zip';
>> data test;
>>   infile in(EuronextAmsterdam-BBO-200404-1.csv)
>>     dlm = ';' MISSOVER DSD lrecl=32767 firstobs=2 ;
>>   informat Internal_code best32. ;
>>   informat ISIN_code $9.;
>>   informat Instrument_name $35. ;
>>   informat Quotation_place $9.;
>>   informat BBO_date yymmdd10.;
>>   informat BBO_time time12.;
>>   informat BBO_number best32.;
>>   informat Best_bid_price best32.;
>>   informat Best_ask_price best32.;
>>   informat Size_best_bid best32. ;
>>   informat Size_best_ask best32. ;
>>   informat Number_orders_best_bid best32. ;
>>   informat Number_orders_best_ask best32. ;
>>
>>   input
>>     Internal_code
>>     ISIN_code $
>>     Instrument_name $
>>     Quotation_place $
>>     BBO_date $
>>     BBO_time $
>>     BBO_number
>>     Best_bid_price
>>     Best_ask_price
>>     Size_best_bid
>>     Size_best_ask
>>     Number_orders_best_bid
>>     Number_orders_best_ask
>>   ;
>> run;
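
For anyone following up on the PIPE idea from the 8 Jan message: a command-line unzipper that can write an archive member to standard output can feed the DATA step directly, bypassing saszipam. The sketch below is only that, a sketch. It assumes Info-ZIP's unzip (not WinZip) is installed and on the PATH, and that the SAS session permits pipes (the XCMD system option). It also folds in the blank-line skip suggested earlier in the thread. Nothing in the thread confirms that this avoids the "Invalid data length" error on the large files. The `$` after BBO_date and BBO_time is left out here because those variables carry numeric informats; that is this sketch's choice, not something stated in the thread.

```sas
/* Assumption: Info-ZIP unzip is on the PATH and XCMD is enabled.        */
/* "unzip -p" writes the named archive member to standard output         */
/* instead of extracting it to disk.                                     */
filename in pipe
  'unzip -p "F:\Cash\EuronextAmsterdam-BBO-200404-2.zip" "EuronextAmsterdam-BBO-200404-2.csv"';

data test;
  infile in dlm=';' dsd truncover lrecl=32767 firstobs=2;

  informat Internal_code best32.
           ISIN_code $9.
           Instrument_name $35.
           Quotation_place $9.
           BBO_date yymmdd10.
           BBO_time time12.
           BBO_number best32.
           Best_bid_price best32.
           Best_ask_price best32.
           Size_best_bid best32.
           Size_best_ask best32.
           Number_orders_best_bid best32.
           Number_orders_best_ask best32.;

  input @;                        /* read the record and hold it        */
  if _infile_ = ' ' then delete;  /* skip blank lines, incl. a last one */

  input Internal_code
        ISIN_code $
        Instrument_name $
        Quotation_place $
        BBO_date
        BBO_time
        BBO_number
        Best_bid_price
        Best_ask_price
        Size_best_bid
        Size_best_ask
        Number_orders_best_bid
        Number_orders_best_ask;
run;
```

With over 1000 archives, the FILENAME statement and DATA step could be wrapped in a macro that takes the zip path and member name as parameters. Other tools that can stream a member to standard output should slot into the FILENAME PIPE command in the same way.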