From: Claus Yeh on 10 Mar 2010 16:35

Dear SAS users,

I have an ASCII file that looks like this (about 2 million columns and 300 rows):

A C G T...
C T C A...

Each letter is separated by a space. However, I want my SAS dataset to look like this:

Var1  Var2
A C   G T
C T   C A

where each variable holds two letters. Is there a way to force SAS to read in increments of two letters even though they are all separated by a space? I am a little hesitant to use @ because there are 2 million columns.

Thank you so much,
claus
From: Alex on 11 Mar 2010 10:17

Hi Claus,

I'm not an expert in reading raw data, but you could create your variables directly from the input buffer. I'm not sure how this will perform with millions of variables, but it works with a small example file. Please see the code below.

Best,
Alex

filename have 'd:\temp\test.txt' ;

data _null_;
  file have ;
  put 'A C G T' ;
  put 'C T C A' ;
  put 'A T C T' ;
  put 'T T C A' ;
run;

%let n_vars = 2 ;

data want (keep = Var: );
  infile have ;
  input ;

  length Var1-Var&n_vars $ 3 ;
  array Var ( &n_vars ) $;

  do i = 1 to dim(Var) ;
    Var(i) = catx(' ', scan( _infile_, i, ' ' ), scan( _infile_, i+1, ' ' ) );
  end;
run;

proc print;
run;
From: Alex on 11 Mar 2010 10:43

Oops, the position argument in the scan() calls was incorrect. Please use this code instead:

filename have 'd:\temp\test.txt' ;

data _null_;
  file have ;
  put 'A C G T' ;
  put 'C T C A' ;
  put 'A T C T' ;
  put 'T T C A' ;
run;

%let n_vars = 2 ;

data want (keep = Var: );
  infile have ;
  input ;

  length Var1-Var&n_vars $ 3 ;
  array Var ( &n_vars ) $;

  do i = 1 to dim(Var) ;
    Var(i) = catx(' ', scan( _infile_, i*2-1, ' ' ), scan( _infile_, i*2, ' ' ) );
  end;
run;

proc print;
run;
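
The index arithmetic behind that correction can be checked on a single line without touching any file. The step below is only a small sketch: the variable names line, first_try and corrected are made up for illustration and do not appear in the posts above. It compares the word positions i and i+1 with the positions i*2-1 and i*2 for the sample record 'A C G T'.

```sas
data _null_;
  line = 'A C G T' ;                      /* sample record from the thread */
  do i = 1 to 2 ;
    /* words i and i+1: consecutive pairs overlap */
    first_try = catx(' ', scan(line, i, ' '),     scan(line, i+1, ' ')) ;
    /* words 2i-1 and 2i: each pair starts where the previous one ended */
    corrected = catx(' ', scan(line, i*2-1, ' '), scan(line, i*2, ' ')) ;
    put i= first_try= corrected= ;        /* written to the SAS log */
  end;
run;
```

For i = 2 the first version pairs the second and third letters (C G), while the corrected version pairs the third and fourth (G T), which is the non-overlapping split the original question asks for.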
From: Claus Yeh on 11 Mar 2010 14:31
Thank you so much, Alex. Your method works really well.

_infile_ has a limit of about 32000 characters, so maybe we can add a nested loop? The record length (LRECL) is about 2.4 million (600K variables times 4).

Thanks,
claus
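
One possible way around the _infile_ limit, offered as a sketch under assumptions and not tested at the full 2.4-million-character width, is to skip the _infile_ buffer and let the INPUT statement do the pairing with formatted input. The $char3. informat reads three columns (letter, space, letter) and the +1 pointer control skips the separating space. This assumes every value is exactly one letter followed by exactly one space, and that the SAS release and host accept an LRECL this large. The lrecl= value is an illustrative choice. The fileref have, the test file and the macro variable n_vars are reused from the earlier posts.

```sas
filename have 'd:\temp\test.txt' ;   /* same small test file as in the earlier posts */

data _null_;
  file have ;
  put 'A C G T' ;
  put 'C T C A' ;
  put 'A T C T' ;
  put 'T T C A' ;
run;

%let n_vars = 2 ;                    /* number of two-letter variables per record */

data want ;
  /* lrecl= has to cover the whole record; 3000000 is an assumed value */
  infile have lrecl=3000000 truncover ;
  /* read 3 columns (e.g. "A C"), skip 1 blank, repeat for every variable */
  input (Var1-Var&n_vars) ($char3. +1) ;
run;

proc print;
run;
```

Because the informat list is applied directly to the variable list, no loop over the record is needed, and the 32K length cap on _infile_ does not come into play. Whether a data set with several hundred thousand character variables performs acceptably is a separate question that this sketch does not address.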