From: Tom Abernathy on 8 Jan 2010 07:34

Try the PIPE again. First figure out the syntax that works from the command prompt on your machine. You want the syntax that makes the unzip program write the contents of the unzipped file to standard output. For example, on Unix I would use this syntax:

unzip -p EuronextAmsterdam-BBO-200404-2.zip EuronextAmsterdam-BBO-200404-2.csv

Once you have a command that works from the command prompt, use it in SAS, in quotes, in a FILENAME statement:

filename in pipe "....." ;
data ..... ;
  infile in ..... ;
  .....
run;

It is possible that the Winzip program is running out of disk space. There should be ways to tell it which directory to use for writing its temporary files. This does not seem likely, as you can unzip the file.

It is also possible that SAS is running out of disk space. I would expect a different error message in that case, but you should check. Where is your SASWORK library pointed? Does it have enough space to store the dataset you are trying to create?

- Tom

On Jan 8, 6:02 am, ESo...(a)RSM.NL (Elvira Sojli) wrote:
> Dear Art and Tom,
>
> Thanks again for the quick replies. I have spent the morning going
> through different things, trying to locate the problem more
> specifically.
>
> 1. So like Art I created a dummy file using the first lines of my
> problematic csv's and zipped with WINZIP (this is essential for saszipam
> to work). I can unzip and read the file without a problem. The problem
> reoccurs when the file is very long. The SAS error messages are below,
> for the file that works and the one that doesn't.
>
> 2. When I unzip the file and run the code, I can read the CSV without
> problems regardless of the length of the file. I don't want to do this
> for the thousands of files I have, which are all very large even zipped.
>
> 3. I tried to use the PIPE command with winzip unzip and it does not
> give any useful output as it is designed to read TXT files.
>
> 4. Tom, your suggestions of using TRUNCOVER or the delete option don't
> work. I also tried *dlm='2C0D'x TERMSTR=CRLF* end eof=lastrec but they
> also don't work. So I guess reading the csv as text might be my only
> option left.
>
> Thanks again for your time and effort.
>
> Elvira
>
> NOTE: The infile library IN is:
>       Stream=F:\Cash\EuronextAmsterdam-BBO-200404-2.zip
>
> NOTE: The infile IN(EuronextAmsterdam-BBO-200404-2.csv) is:
>       File Name=EuronextAmsterdam-BBO-200404-2.csv,
>       Compressed Size=34188101,
>       Uncompressed Size=404818647,
>       Compression Level=-1,Clear Text=No
>
> ERROR: Invalid data length.
> FATAL: Unrecoverable I/O error detected in the execution of the data step program.
>        Aborted during the EXECUTION phase.
> NOTE: The SAS System stopped processing this step because of errors.
> WARNING: The data set WORK.TEST may be incomplete. When this step was stopped there were 0
>          observations and 13 variables.
> NOTE: DATA statement used (Total process time):
>       real time 0.01 seconds
>       cpu time 0.01 seconds
>
> NOTE: The infile library IN is:
>       Stream=F:\Cash\bosh.zip
>
> NOTE: The infile IN(bosh.csv) is:
>       File Name=bosh.csv,
>       Compressed Size=314,Uncompressed Size=1117,
>       Compression Level=-1,Clear Text=Yes
>
> NOTE: A total of 11 records were read from the infile library IN.
> NOTE: 11 records were read from the infile IN(bosh.csv).
> NOTE: The data set WORK.TEST has 11 observations and 13 variables.
> NOTE: DATA statement used (Total process time):
>       real time 0.01 seconds
>       cpu time 0.00 seconds
>
> ======================================================================
>
> Elvira -
>
> If you copy the error message and paste it into an email I can be more
> specific in how to fix the problem.
>
> For now I will make some assumptions.
>
> Perhaps it is giving an error about reading past the end of the line?
> Did you try changing MISSOVER to TRUNCOVER?
>
> Perhaps it is giving you an error about invalid numeric data?
>
> If you want to skip the empty lines then just make your program a
> little more complex.
>
>   * read line and hold it;
>   input @;
>   * ignore blank lines ;
>   if _infile_=' ' then delete;
>   input ..... ;
>
> You might not know where the end of the file is, but SAS does, so use
> the END= option on the infile statement to set a temporary variable with
> a boolean T/F value for whether you have reached the end of the file.
> Try this little program to dump the last line into the log:
>
>   data _null_;
>     infile ........ END=EOF ;
>     input ;
>     if eof then list;
>   run;
>
> To make a copy without the last line into a temporary file use
> something like this:
>
>   filename new temp;
>   data _null_;
>     infile .... END=EOF;
>     input;
>     if not eof then put _infile_;
>   run;
>
> ======================================================================
>
> Elvira,
>
> I'd be glad to show an example of my suggestion but, before doing so, I
> have a comment, a question and an alternative approach (although you
> could end up still needing to do what I had suggested or something
> else).
>
> I put together and compressed a small csv file and, to my surprise,
> saszipam wasn't able to read it regardless of whether it contained an
> extra line or not.
>
> For some alternative command line possibilities, including freeware and
> shareware compression routines, take a look at:
>
> www2.sas.com/proceedings/sugi31/155-31.pdf
>
> i.e., before suggesting how you might skip the last line (which, by the
> way, may only entail using something like eof=lastrec in your infile
> statement), what kind of error message are you receiving? It may be
> that saszipam simply no longer works.
>
> Art
> -------
>
> On Thu, 7 Jan 2010 21:26:46 +0100, Elvira Sojli <ESo...(a)RSM.NL> wrote:
>
> >Dear Tom and Art,
> >
> >Thank you for your prompt replies. Indeed the problem is with the csv
> >file itself, i.e. if I try to just open the csv without using the unzip
> >procedure the same error message occurs. If I manually delete the last
> >line then SAS can read the file easily.
> >
> >Art, thanks for the suggestion but I am not sure how to implement it. I
> >need some help on how to do the following: 'simply read the file, with
> >saszipam, as lines of text and write all but the last line of each file
> >to a new csv file'. To complicate things further, I don't know where the
> >last line of each file occurs, and each file could have more than 3
> >million lines.
> >
> >Thanks for the time :).
> >
> >Kind regards,
> >Elvira
> >
> >-----Original Message-----
> >From: SAS(r) Discussion on behalf of Tom Abernathy
> >Sent: Thu 07/01/2010 18:35
> >To: SA...(a)LISTSERV.UGA.EDU
> >Subject: Re: Error in reading compressed file
> >
> >Doesn't sound like a problem with the compression but with the file.
> >What is the actual error message? What happens if you unzip one of the
> >files and read the unzipped file directly? Do you get the same error
> >message?
> >Debug the program using the unzipped file and then try again reading
> >from the zip file.
> >
> >On Jan 7, 11:12 am, eso...(a)RSM.NL (Elvira Sojli) wrote:
> >> Dear all,
> >>
> >> I am using saszipam to read zipped csv files (code at the end of
> >> email). I am getting an error message because there is an extra empty
> >> line in the csv file. If I remove this line then I can unzip and read
> >> the csv, otherwise I get an error message. I don't mind removing the
> >> line but I have over 1000 files to unzip. Is there a different way to
> >> remove the extra line from the zipped csv file?
> >>
> >> Thanks for the time.
> >> Elvira
> >>
> >> filename in saszipam 'F:\Cash\EuronextAmsterdam-BBO-200404-1.zip';
> >> data test;
> >> infile in(EuronextAmsterdam-BBO-200404-1.csv)
> >>   dlm = ';' MISSOVER DSD lrecl=32767 firstobs=2 ;
> >> informat Internal_code best32. ;
> >> informat ISIN_code $9.;
> >> informat Instrument_name $35. ;
> >> informat Quotation_place $9.;
> >> informat BBO_date yymmdd10.;
> >> informat BBO_time time12.;
> >> informat BBO_number best32.;
> >> informat Best_bid_price best32.;
> >> informat Best_ask_price best32.;
> >> informat Size_best_bid best32. ;
> >> informat Size_best_ask best32. ;
> >> informat Number_orders_best_bid best32. ;
> >> informat Number_orders_best_ask best32. ;
> >>
> >> input
> >>   Internal_code
> >>   ISIN_code $
> >>   Instrument_name $
> >>   Quotation_place $
> >>   BBO_date $
> >>   BBO_time $
> >>   BBO_number
> >>   Best_bid_price
> >>   Best_ask_price
> >>   Size_best_bid
> >>   Size_best_ask
> >>   Number_orders_best_bid
> >>   Number_orders_best_ask
> >> ;
> >> run;
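
For reference, here is a filled-in sketch of the pipe approach Tom describes at the top of this post, combined with the blank-line check from his earlier message. It is only a sketch under stated assumptions: it presumes an Info-ZIP style unzip program is on the PATH and accepts -p on the machine in question (the thread confirms that syntax only for Unix), and the F:\Cash paths are taken from Elvira's log. BBO_date and BBO_time are read as numeric date and time values here, which is a choice made for this sketch. Whether this avoids the "Invalid data length" error is not established anywhere in the thread.

```sas
/* Assumption: an Info-ZIP style "unzip" is on the PATH and supports -p  */
/* (write the archive member to standard output). Paths are from the log. */
filename in pipe
  'unzip -p "F:\Cash\EuronextAmsterdam-BBO-200404-2.zip" "EuronextAmsterdam-BBO-200404-2.csv"';

data test;
  /* semicolon-delimited file, header on line 1, TRUNCOVER as Tom suggested */
  infile in dlm=';' dsd truncover lrecl=32767 firstobs=2;

  informat Internal_code best32.
           ISIN_code $9.
           Instrument_name $35.
           Quotation_place $9.
           BBO_date yymmdd10.
           BBO_time time12.
           BBO_number Best_bid_price Best_ask_price
           Size_best_bid Size_best_ask
           Number_orders_best_bid Number_orders_best_ask best32.;
  format BBO_date yymmdd10. BBO_time time12.;

  input @;                        /* read the record and hold it      */
  if _infile_ = ' ' then delete;  /* skip blank lines before parsing  */

  input Internal_code ISIN_code Instrument_name Quotation_place
        BBO_date BBO_time BBO_number
        Best_bid_price Best_ask_price
        Size_best_bid Size_best_ask
        Number_orders_best_bid Number_orders_best_ask;
run;
```

If the command inside the quotes does not print the csv contents when typed at the command prompt, it will not work from the FILENAME statement either, so test it there first.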
From: Arthur Tabachneck on 8 Jan 2010 07:38

Elvira,

I think a pipe that runs winzip can be used to uncompress any kind of file, and definitely a csv file (although yours has that extension, it is not really a comma-separated file; as I recall, it is separated with semicolons). Take a look at the example at:

http://support.sas.com/kb/26/011.html

Let us know if you still need to know how to rewrite any of the files (i.e., deleting an extra line).

Art
-------
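
On the rewriting question, a minimal sketch of copying a zipped csv to a new file without its empty lines might look like the following. The unzip command is the same assumption as in Tom's Unix example (an Info-ZIP style unzip that accepts -p), and the output file name is hypothetical. Unlike the "if not eof" version quoted earlier in the thread, this drops only records that are actually blank, so a file that has no extra line is not shortened. Note also the FILE statement: without it, PUT writes to the SAS log rather than to the new file.

```sas
/* Assumption: Info-ZIP style unzip on the PATH; paths follow Elvira's post. */
filename zipin pipe
  'unzip -p "F:\Cash\EuronextAmsterdam-BBO-200404-1.zip" "EuronextAmsterdam-BBO-200404-1.csv"';

/* Hypothetical output name, chosen for this sketch only. */
filename clean 'F:\Cash\EuronextAmsterdam-BBO-200404-1_clean.csv';

data _null_;
  infile zipin lrecl=32767 end=eof;
  file clean lrecl=32767;             /* send PUT output to the new file */
  input;                              /* load the record into _infile_   */
  if _infile_ = ' ' then dropped + 1; /* count blank records, do not copy */
  else put _infile_;                  /* copy everything else unchanged  */
  if eof then putlog 'NOTE: blank lines dropped: ' dropped;
run;
```

Because each record is treated as plain text, the step does not need to know the delimiter or the number of lines in the file.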
From: Patrick on 8 Jan 2010 09:44

Elvira

Using filename pipe together with Winzip works. I have done this already with SAS under Windows. You can zip/unzip any file, but of course the unzipped file must then be something which you can also read with a SAS data step (not some jpg or so).

You need the Winzip Command Line Support Add-On installed to use Winzip from within SAS ( http://www.winzip.com/prodpagecl.htm ).

I think Tom gave you good advice: "First figure out the syntax on your machine that works from the command prompt", and then copy this syntax into the pipe. Quoting the "unzip" syntax can be a bit tricky; have a look at "full code" in the link Art gave you.

To test whether unzipping works as such, you could use the following syntax:

filename test pipe '.....';
data _null_;
  infile test;
  file print;
  input;
  put _infile_;
run;

This will show you the unzipped file "1:1" and allow you to better decide how to map variables against the "raw data" and what character should be used as the delimiter.

HTH
Patrick
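
A variation on Patrick's test that may be more practical for a 400 MB member is to look at only the first few records and print each record's length next to it. This is a sketch, again assuming an Info-ZIP style unzip that accepts -p; bosh.zip and bosh.csv are the small test files named in Elvira's log, and the variable name len is arbitrary.

```sas
/* Assumption: Info-ZIP style unzip on the PATH; file names from Elvira's log. */
filename test pipe 'unzip -p "F:\Cash\bosh.zip" "bosh.csv"';

data _null_;
  /* OBS=5 stops after five records; LENGTH= stores each record's length */
  infile test obs=5 lrecl=32767 length=len;
  input;                    /* load the raw record into _infile_ */
  putlog len= _infile_;     /* write length and raw text to the log */
run;
```

Seeing the raw records this way shows the delimiter actually used (semicolons, according to Art) and whether a header line is present before any variables are mapped.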