From: Joe Matise on 7 Oct 2009 16:13

1. Put the keep statement as a dataset option in the set statement - not sure if it saves time, but it might.

2. Try PROC COPY, again using a dataset option for the keep statement. Very similar to the PROC DATASETS copy statement, but I don't think the latter supports dataset options (though it might?).

3. Try creating views instead of datasets; if you're just doing it for some temporary purpose, this may be better. Doesn't help if you need to transport this file somewhere else, though.

-Joe

On Wed, Oct 7, 2009 at 2:45 PM, Claus Yeh <phoebe.caulfield42(a)gmail.com> wrote:

> Dear all,
>
> I have a very large SAS dataset - 500,000 variables and 4000
> observations.
>
> I want to create smaller datasets that contains about 1000 to 10,000
> variables of the original 500,000 variable dataset.
>
> I used data step to do this but it was very very slow (I need to
> create multiple smaller steps).
>
> ie. data small;
>       set large;
>       keep var1-var1000;
>     run;
>
> Is there a way to do it in Proc Dataset that can output the smaller
> dataset much quicker? If there are other efficient ways, please let
> me know too.
>
> thank you so much,
> claus
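Joe's first suggestion can be sketched as follows, using the dataset and variable names from Claus's post. This is only a minimal illustration, not a benchmarked fix: the KEEP= dataset option on SET limits which variables are brought into the DATA step, whereas a KEEP statement only limits what is written out. Whether that saves noticeable time on a 500,000-variable file is something to test.

```sas
/* Subset on input: only var1-var1000 are read into the DATA step. */
/* LARGE, SMALL and var1-var1000 are the names from the original post. */
data small;
    set large(keep=var1-var1000);
run;
```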
From: Michael Raithel on 7 Oct 2009 16:22

Dear SAS-L-ers,

Claus Yeh, posted the following:

> Dear all,
>
> I have a very large SAS dataset - 500,000 variables and 4000
> observations.
>
> I want to create smaller datasets that contains about 1000 to 10,000
> variables of the original 500,000 variable dataset.
>
> I used data step to do this but it was very very slow (I need to
> create multiple smaller steps).
>
> ie. data small;
>       set large;
>       keep var1-var1000;
>     run;
>
> Is there a way to do it in Proc Dataset that can output the smaller
> dataset much quicker? If there are other efficient ways, please let
> me know too.
>

Claus, yeh, I can think of a way of doing this that will run so fast, that you will hear a sonic boom as the DATA Step reaches Mach I! And, it won't cost you one bit more of storage, to boot!

How about using a DATA Step view? You could code:

    data smallarge/view=smallarge;
      set large;
      keep var1-var1000;
    run;

...which would create a DATA Step view file in the blink of an eye. Thereafter, you could use that view to surface only Var1 - Var1000 in future SAS PROCs or DATA Steps.

Would that work for you, or are you going to wait for some other SAS-L-sharpie's clever-er suggestion?

Claus, best of luck in all of your SAS endeavors!

I hope that this suggestion proves helpful now, and in the future!

Of course, all of these opinions and insights are my own, and do not reflect those of my organization or my associates. All SAS code and/or methodologies specified in this posting are for illustrative purposes only and no warranty is stated or implied as to their accuracy or applicability. People deciding to use information in this posting do so at their own risk.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Michael A. Raithel
"The man who wrote the book on performance"
E-mail: MichaelRaithel(a)westat.com

Author: Tuning SAS Applications in the MVS Environment

Author: Tuning SAS Applications in the OS/390 and z/OS Environments, Second Edition
http://www.sas.com/apps/pubscat/bookdetails.jsp?catid=1&pc=58172

Author: The Complete Guide to SAS Indexes
http://www.sas.com/apps/pubscat/bookdetails.jsp?catid=1&pc=60409
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
...fire all of your guns at once and explode into space... - Steppenwolf, Born to be Wild
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
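To show what "using the view" looks like in practice, here is a small sketch. The view definition is the one from the post above; the PROC MEANS step and the choice of var1-var10 are assumptions added purely for illustration (they presume those variables are numeric). Note that defining the view copies no data, so the real work is deferred: each step that reads the view re-executes the stored DATA step against LARGE.

```sas
/* Define the view (as in the post above); no data is copied here. */
data smallarge / view=smallarge;
    set large;
    keep var1-var1000;
run;

/* Hypothetical consumer: any PROC or DATA step can read the view   */
/* like a dataset. The underlying DATA step runs at this point.     */
proc means data=smallarge n mean;
    var var1-var10;   /* assumed numeric; chosen only as an example */
run;
```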
From: Claus Yeh on 7 Oct 2009 18:12

On Oct 7, 1:22 pm, michaelrait...(a)WESTAT.COM (Michael Raithel) wrote:

> How about using a DATA Step view? You could code:
>
>     data smallarge/view=smallarge;
>       set large;
>       keep var1-var1000;
>     run;
>
> ...which would create a DATA Step view file in the blink of an eye. Thereafter, you could use that view to surface only Var1 - Var1000 in future SAS PROCs or DATA Steps.

Hi Michael,

Thank you so much. I will do some test runs with the view by running PROC LOGISTIC on it.

thanks again,
claus
From: Claus Yeh on 7 Oct 2009 18:45

On Oct 7, 1:22 pm, michaelrait...(a)WESTAT.COM (Michael Raithel) wrote:

> How about using a DATA Step view? You could code:
>
>     data smallarge/view=smallarge;
>       set large;
>       keep var1-var1000;
>     run;
>
> ...which would create a DATA Step view file in the blink of an eye. Thereafter, you could use that view to surface only Var1 - Var1000 in future SAS PROCs or DATA Steps.

Hi Mike,

I tried the view. The step that creates it finished fast, but the subsequent procedures took quite a long time. I was wondering if indexing the dataset would help?

thanks,
claus
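On the indexing question: a SAS index is built on key variables to locate observations (WHERE or BY processing), so it is unlikely to help when the goal is to select a subset of variables. One alternative worth testing, sketched below, is to skip the intermediate step and put the KEEP= dataset option directly on the procedure's DATA= input. The response variable `outcome` and the five-predictor model are hypothetical placeholders, not anything from the thread.

```sas
/* Read only the needed variables straight from LARGE.            */
/* OUTCOME is a hypothetical binary response; var1-var5 stand in  */
/* for whichever predictors the real model uses.                  */
proc logistic data=large(keep=outcome var1-var5);
    model outcome = var1-var5;
run;
```

If several procedures need the same subset, writing it once to a real dataset with the same KEEP= option may turn out cheaper than re-reading the wide file each time; that trade-off depends on the data and is best timed directly.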