From: rjf2 on 29 Jan 2010 11:48 > From: jiyuan123 > Sent: Friday, January 01, 2010 2:28 AM > To: SAS-L > Subject: Who can give me some advices SASHELP View and Dictionary > tables? > > Dear SAS-Lers, > > I have some doubt about SASHELP View and Dictionary tables ? > > what is different about these? > > and What should I use? > > thanks a lot. sashelp.views are deprecated this page contains code that shows dictionary tables are an order of magnitude faster than sashelp.views http://www.sascommunity.org/wiki/List_Processing_Development#Speed_Consi derations Ron Fehd the module/routine/subroutine maven CDC Atlanta GA USA RJF2 at cdc dot gov
From: Jack Hamilton on 29 Jan 2010 21:19 On Jan 29, 2010, at 8:48 am, Fehd, Ronald J. (CDC/CCHIS/NCPHI) wrote: > sashelp.views are deprecated Where does SAS Institute say that sashelp views are deprecated? -- Jack Hamilton jfh(a)alumni.stanford.org Caelum non animum mutant qui trans mare currunt. On Jan 29, 2010, at 8:48 am, Fehd, Ronald J. (CDC/CCHIS/NCPHI) wrote: >> From: jiyuan123 >> Sent: Friday, January 01, 2010 2:28 AM >> To: SAS-L >> Subject: Who can give me some advices SASHELP View and Dictionary >> tables? >> >> Dear SAS-Lers, >> >> I have some doubt about SASHELP View and Dictionary tables ? >> >> what is different about these? >> >> and What should I use? >> >> thanks a lot. > > sashelp.views are deprecated > > this page contains code that shows > dictionary tables are an order of magnitude faster > than sashelp.views > > http://www.sascommunity.org/wiki/List_Processing_Development#Speed_Consi > derations > > Ron Fehd the module/routine/subroutine maven CDC Atlanta GA USA RJF2 > at cdc dot gov
From: Frank DiIorio on 31 Jan 2010 09:30 On Jan 29, 9:19 pm, j...(a)STANFORDALUMNI.ORG (Jack Hamilton) wrote: > On Jan 29, 2010, at 8:48 am, Fehd, Ronald J. (CDC/CCHIS/NCPHI) wrote: > > > sashelp.views are deprecated > > Where does SAS Institute say that sashelp views are deprecated? > I never heard that term per se, but it's clear from the V9 documentation that they encourage use of SQL (using Tables) over non- SQL (using Views). Back in the V6 days, when the Tables were introduced, Nancy Michal (now Nancy Cole) and I wrote a paper describing the structure and uses of some of the Tables. We also did some comparisons of efficiency running VMS, Unix, Windows, and OS/2, comparing timings of creating a report from CATALOGS using SQL, a DATA step, PROC PRINT, etc. The SQL-based scenario (create a WORK dataset, then use PRINT) ran, on average, in 1/4 the time of the next-fastest approach. Here's an excerpt from the V9.1 documentation: DICTIONARY Tables and Performance When you query a DICTIONARY table, SAS gathers information that is pertinent to that table. Depending on the DICTIONARY table that is being queried, this process can include searching libraries, opening tables, and executing views. Unlike other SAS procedures and the DATA step, PROC SQL can improve this process by optimizing the query before the select process is launched. Therefore, although it is possible to access DICTIONARY table information with SAS procedures or the DATA step by using the SASHELP views, it is often more efficient to use PROC SQL instead. For example, the following programs both produce the same result, but the PROC SQL step runs much faster because the WHERE clause is processed prior to opening the tables that are referenced by the SASHELP.VCOLUMN view: data mytable; set sashelp.vcolumn; where libname='WORK' and memname='SALES'; run; proc sql; create table mytable as select * from sashelp.vcolumn where libname='WORK' and memname='SALES'; quit; Note: SAS does not maintain DICTIONARY table information between queries. Each query of a DICTIONARY table launches a new discovery process. If you are querying the same DICTIONARY table several times in a row, you can get even faster performance by creating a temporary SAS data set (with the DATA step SET statement or PROC SQL CREATE TABLE AS statement) with the information that you desire and run your query against that data set.
From: Jack Hamilton on 31 Jan 2010 12:58 Yes, it is clear that SAS Institute thinks that dictionary views execute more quickly, but I cannot read "X is faster than Y" as "You should not use Y". There are various reasons why execution speed is not the only factor in choosing a particular coding solution. And, dictionary tables are only a partial replacement for sashelp views. The complete replacement is dictionary views + a PROC SQL step to create a view that can be used elsewhere. That's more code to write and keep track of. ===== 47 data test1; 48 set sashelp.vlibnam; 49 run; NOTE: There were 20 observations read from the data set SASHELP.VLIBNAM. NOTE: The data set WORK.TEST1 has 20 observations and 11 variables. NOTE: DATA statement used (Total process time): real time 0.04 seconds cpu time 0.03 seconds 50 51 proc sql; 52 create table test2 as 53 select * from dictionary.libnames; NOTE: Table WORK.TEST2 created, with 20 rows and 11 columns. 54 quit; NOTE: PROCEDURE SQL used (Total process time): real time 0.03 seconds cpu time 0.01 seconds ===== In this case, use of the dictionary table saves one one-hundredth of a second of execution time but requires four times as much code. It is hard to see the why the dictionary table solution might be considered superior. -- Jack Hamilton jfh(a)alumni.stanford.org Caelum non animum mutant qui trans mare currunt. On Jan 31, 2010, at 6:30 am, Frank DiIorio wrote: > On Jan 29, 9:19 pm, j...(a)STANFORDALUMNI.ORG (Jack Hamilton) wrote: >> On Jan 29, 2010, at 8:48 am, Fehd, Ronald J. (CDC/CCHIS/NCPHI) wrote: >> >>> sashelp.views are deprecated >> >> Where does SAS Institute say that sashelp views are deprecated? >> > > I never heard that term per se, but it's clear from the V9 > documentation that they encourage use of SQL (using Tables) over non- > SQL (using Views). Back in the V6 days, when the Tables were > introduced, Nancy Michal (now Nancy Cole) and I wrote a paper > describing the structure and uses of some of the Tables. We also did > some comparisons of efficiency running VMS, Unix, Windows, and OS/2, > comparing timings of creating a report from CATALOGS using SQL, a DATA > step, PROC PRINT, etc. The SQL-based scenario (create a WORK dataset, > then use PRINT) ran, on average, in 1/4 the time of the next-fastest > approach. > > Here's an excerpt from the V9.1 documentation: > > DICTIONARY Tables and Performance > > When you query a DICTIONARY table, SAS gathers information that is > pertinent to that table. Depending on the DICTIONARY table that is > being queried, this process can include searching libraries, opening > tables, and executing views. Unlike other SAS procedures and the DATA > step, PROC SQL can improve this process by optimizing the query before > the select process is launched. Therefore, although it is possible to > access DICTIONARY table information with SAS procedures or the DATA > step by using the SASHELP views, it is often more efficient to use > PROC SQL instead. > > For example, the following programs both produce the same result, but > the PROC SQL step runs much faster because the WHERE clause is > processed prior to opening the tables that are referenced by the > SASHELP.VCOLUMN view: > > data mytable; > set sashelp.vcolumn; > where libname='WORK' and memname='SALES'; > run; > > proc sql; > create table mytable as > select * from sashelp.vcolumn > where libname='WORK' and memname='SALES'; > quit; > > Note: SAS does not maintain DICTIONARY table information between > queries. Each query of a DICTIONARY table launches a new discovery > process. > > If you are querying the same DICTIONARY table several times in a row, > you can get even faster performance by creating a temporary SAS data > set (with the DATA step SET statement or PROC SQL CREATE TABLE AS > statement) with the information that you desire and run your query > against that data set.
From: rjf2 on 1 Feb 2010 10:27
> From: Frank DiIorio > Subject: Re: Who can give me some advices SASHELP View and Dictionary > > Note: SAS does not maintain DICTIONARY table information between > queries. Each query of a DICTIONARY table launches a new discovery > process. > > If you are querying the same DICTIONARY table several times in a row, > you can get even faster performance by creating a temporary SAS data > set (with the DATA step SET statement or PROC SQL CREATE TABLE AS > statement) with the information that you desire and run your query > against that data set. I agree w/that opinion, Frank. Several years ago I had to convert a large everything-is-in-this-one program of about 800 lines. my write-one-to-throw-away solution had a continuous loop where I was querying dictionary.tables to find which of three data sets had just been created. worked great in unit and integration testing but the end-time kept receeding toward the horizon when I put it in production because of the growth of the number of library and work data sets and sql generating (yet another) update of the list of data names. my eventual solution eliminated calls to the (re)build of dictionary.tables by setting macro variables for each of the three data sets to value=not-exist and checking later whether they had been updated to value=exist. There is always another way to program a solution. Ron Fehd the module/routine/subroutine maven CDC Atlanta GA USA RJF2 at cdc dot gov |