Prev: Windows client connecting to Linux server (SAS/Connect)
Next: Excel Workbook Setting Error Message
From: Chang Chung on 9 Dec 2009 12:12 On Tue, 8 Dec 2009 21:38:21 -0500, Keintz, H. Mark <mkeintz(a)WHARTON.UPENN.EDU> wrote: >Paul: > >Your DATA step and proc sql results match each other, but they don't actually reproduce the merge specified by the OP, which was: > > data c; > merge a b; > by code; > run; > >The feature to note is that data set B has three records with code=145, and data set A has only two such records. The OP's merged dataset will have three records for this code, as does yours. But, in the OP's program, the last code-145 record of A (i.e. the 2nd record) is matched with all the "extra" records in dataset B. > >As a result, variables in the last (i.e. 2nd) code-145 record from A will have their values propagated to those "extra" code-145 records (i.e. the 3rd) generated by B. But your results will have missing values for the variables from A in such extra records. > >(I wonder if the OP realizes this property of many-to-many merge in the data step). > >Anyhow, I imagine it would be a very recherche' SQL program that would reproduce that property, and I hereby pre-emptively lavish high praise on the individual who can do it in a single CREATE TABLE statement, (and no preparatory DATA steps either). Hi, Mark, Thanks for your pre-emptive lavishing high praise! Refined and tasteful, I don't know about, though. Below ran on 9.2 TS1M0, W32_VSPRO platform. Cheers, Chang /* test data */ data a; input code manager$; cards; 145 max 145 xam 155 paul 165 kobe ; run; data b; input code assistant$; cards; 145 jerry 145 tracy 145 wade 155 chen 170 yi ; run; /* assumes that a and b are sorted by code */ data c_merge; merge a b; by code; run; /* also assumes that the undocumented monotonic() works */ proc sql; create table c_sql as select coalesce(a3.code, b3.code) as code, a3.manager, b3.assistant from (select a2.id, a2.obs, max(a2.obs) as max, a2.code, a2.manager from ( select monotonic() as id, min(calculated id) as min, calculated id - calculated min as obs, * from (select * from a union (select distinct(b.code) as code, ' ' as manager from b except select distinct(a.code) as code, ' ' as manager from a)) group by code) as a2 group by code) as a3, (select b2.id, b2.obs, max(b2.obs) as max, b2.code, b2.assistant from ( select monotonic() as id, min(calculated id) as min, calculated id - calculated min as obs, * from (select * from b union (select distinct(a.code) as code, ' ' as assistant from a except select distinct(b.code) as code, ' ' as assistant from b)) group by code) as b2 group by code) as b3 where (a3.code = b3.code) and ( (a3.obs = b3.obs) or (a3.obs>b3.obs and b3.obs=b3.max) or (a3.obs<b3.obs and a3.obs=a3.max)) order by a3.id, b3.id; quit; proc compare base=c_merge compare=c_sql; run; /* on log NOTE: No unequal values were found. All values compared are exactly equal. */
From: "Keintz, H. Mark" on 9 Dec 2009 12:28 So, after all the email exchanges, Chang has provided an answer to the OP's request, which was how to use PROC SQL to reproduce the behavior of the SAS MERGE statement when matching many-to-many. ALL HAIL CHANG We now know it can be done, (and, looking at the code below, I know I never want to do it.) Chang's solution once again demonstates the high level of skill and sense of humor intrinsic to this listserve. Regards, Mark > -----Original Message----- > From: Chang Chung [mailto:chang_y_chung(a)HOTMAIL.COM] > Sent: Wednesday, December 09, 2009 12:13 PM > To: SAS-L(a)LISTSERV.UGA.EDU; Keintz, H. Mark > Subject: Re: about full join problem > > On Tue, 8 Dec 2009 21:38:21 -0500, Keintz, H. Mark > <mkeintz(a)WHARTON.UPENN.EDU> wrote: > > >Paul: > > > >Your DATA step and proc sql results match each other, but they don't > actually reproduce the merge specified by the OP, which was: > > > > data c; > > merge a b; > > by code; > > run; > > > >The feature to note is that data set B has three records with > code=145, and > data set A has only two such records. The OP's merged dataset will > have > three records for this code, as does yours. But, in the OP's program, > the > last code-145 record of A (i.e. the 2nd record) is matched with all the > "extra" records in dataset B. > > > >As a result, variables in the last (i.e. 2nd) code-145 record from A > will > have their values propagated to those "extra" code-145 records (i.e. > the > 3rd) generated by B. But your results will have missing values for the > variables from A in such extra records. > > > >(I wonder if the OP realizes this property of many-to-many merge in > the > data step). > > > >Anyhow, I imagine it would be a very recherche' SQL program that would > reproduce that property, and I hereby pre-emptively lavish high praise > on > the individual who can do it in a single CREATE TABLE statement, (and > no > preparatory DATA steps either). > > Hi, Mark, > Thanks for your pre-emptive lavishing high praise! Refined and > tasteful, I > don't know about, though. Below ran on 9.2 TS1M0, W32_VSPRO platform. > Cheers, > Chang > > /* test data */ > data a; > input code manager$; > cards; > 145 max > 145 xam > 155 paul > 165 kobe > ; > run; > data b; > input code assistant$; > cards; > 145 jerry > 145 tracy > 145 wade > 155 chen > 170 yi > ; > run; > > /* assumes that a and b are sorted by code */ > data c_merge; > merge a b; > by code; > run; > > /* also assumes that the undocumented monotonic() works */ > proc sql; > create table c_sql as > select coalesce(a3.code, b3.code) as code, a3.manager, b3.assistant > from > (select a2.id, a2.obs, max(a2.obs) as max, a2.code, a2.manager > from ( > select monotonic() as id, min(calculated id) as min, > calculated id - calculated min as obs, * > from (select * from a > union > (select distinct(b.code) as code, ' ' as manager from b > except > select distinct(a.code) as code, ' ' as manager from a)) > group by code) as a2 > group by code) as a3, > (select b2.id, b2.obs, max(b2.obs) as max, b2.code, b2.assistant > from ( > select monotonic() as id, min(calculated id) as min, > calculated id - calculated min as obs, * > from (select * from b > union > (select distinct(a.code) as code, ' ' as assistant from a > except > select distinct(b.code) as code, ' ' as assistant from b)) > group by code) as b2 > group by code) as b3 > where (a3.code = b3.code) and ( > (a3.obs = b3.obs) or > (a3.obs>b3.obs and b3.obs=b3.max) or > (a3.obs<b3.obs and a3.obs=a3.max)) > order by a3.id, b3.id; > quit; > > proc compare base=c_merge compare=c_sql; > run; > /* on log > NOTE: No unequal values were found. All values compared are exactly > equal. > */
From: Sigurd Hermansen on 9 Dec 2009 18:18 Mark: I'll join the chorus. Even though I have gone to great lengths in the past to demonstrate SQL solutions to tricky database programming problems, I would not have had the patience to piece together this solution. I've admired Chang's programming skills for some time. This feat adds luster to his laurels. I'll agree with your evaluation and will go beyond it to suggest a caveat: don't use Chang's solution for production work. Aside from the fact that it requires the undocumented and deprecated MONOTONIC() function that I once carelessly exposed on SAS-L, it implements a really bad idea; that is, a many-many MERGE of two datasets. Ian Whitlock posted some years ago the only coherent explanation of what the SAS MERGE does with a many-many join problem that I have seen. His explanation made sense, but I found the MERGE itself in this context both overly complicated and difficult to remember. As best I recall, a long thread on the topic in SAS-L concluded with a near consensus (including Ian) rule. Don't use the SAS MERGE for a many-many join. I'll argue the case for that rule in more depth if anyone can give me sufficient reason for doing so. Though an interesting technical challenge, replicating the many-many MERGE result in a logical programming context (e.g. SQL) has as much value to database programmers as replicating the predictions of Ptolemaic astronomy with Copernican astronomy: http://galileo.rice.edu/sci/theories/ptolemaic_system.html . Thanks, Chang, for demonstrated just how convoluted the solution becomes. Reminds me of something that I might have done some years ago if only I had had the patience and skill .... In any event, now that I have the solution on paper, I may be able to use your parts of your solution in a SAS SQL course. This construct looks particularly interesting: select * from a union (select distinct(b.key) as key, ' ' as d from b except select distinct(a.key) as key, ' ' as d from a) It sets up a BY variable group for dataset that has the key intersection set b complement of a appended to distinct rows of a, with data values from b set to missing. (A longer list of non-key variables would require more assignments of missing values.) So in effect Chang is substituting a SAS BY group structure for the original many-many structure prior to a Cartesian product of two BY groups. Bravo! How well does the two dataset solution generalize to multiple variable key or three or more dataset cases? Even the mighty Chang may pass on that challenge! S -----Original Message----- From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Keintz, H. Mark Sent: Wednesday, December 09, 2009 12:29 PM To: SAS-L(a)LISTSERV.UGA.EDU Subject: Re: about full join problem So, after all the email exchanges, Chang has provided an answer to the OP's request, which was how to use PROC SQL to reproduce the behavior of the SAS MERGE statement when matching many-to-many. ALL HAIL CHANG We now know it can be done, (and, looking at the code below, I know I never want to do it.) Chang's solution once again demonstates the high level of skill and sense of humor intrinsic to this listserve. Regards, Mark > -----Original Message----- > From: Chang Chung [mailto:chang_y_chung(a)HOTMAIL.COM] > Sent: Wednesday, December 09, 2009 12:13 PM > To: SAS-L(a)LISTSERV.UGA.EDU; Keintz, H. Mark > Subject: Re: about full join problem > > On Tue, 8 Dec 2009 21:38:21 -0500, Keintz, H. Mark > <mkeintz(a)WHARTON.UPENN.EDU> wrote: > > >Paul: > > > >Your DATA step and proc sql results match each other, but they don't > actually reproduce the merge specified by the OP, which was: > > > > data c; > > merge a b; > > by code; > > run; > > > >The feature to note is that data set B has three records with > code=145, and > data set A has only two such records. The OP's merged dataset will > have > three records for this code, as does yours. But, in the OP's program, > the > last code-145 record of A (i.e. the 2nd record) is matched with all the > "extra" records in dataset B. > > > >As a result, variables in the last (i.e. 2nd) code-145 record from A > will > have their values propagated to those "extra" code-145 records (i.e. > the > 3rd) generated by B. But your results will have missing values for the > variables from A in such extra records. > > > >(I wonder if the OP realizes this property of many-to-many merge in > the > data step). > > > >Anyhow, I imagine it would be a very recherche' SQL program that would > reproduce that property, and I hereby pre-emptively lavish high praise > on > the individual who can do it in a single CREATE TABLE statement, (and > no > preparatory DATA steps either). > > Hi, Mark, > Thanks for your pre-emptive lavishing high praise! Refined and > tasteful, I > don't know about, though. Below ran on 9.2 TS1M0, W32_VSPRO platform. > Cheers, > Chang > > /* test data */ > data a; > input code manager$; > cards; > 145 max > 145 xam > 155 paul > 165 kobe > ; > run; > data b; > input code assistant$; > cards; > 145 jerry > 145 tracy > 145 wade > 155 chen > 170 yi > ; > run; > > /* assumes that a and b are sorted by code */ > data c_merge; > merge a b; > by code; > run; > > /* also assumes that the undocumented monotonic() works */ > proc sql; > create table c_sql as > select coalesce(a3.code, b3.code) as code, a3.manager, b3.assistant > from > (select a2.id, a2.obs, max(a2.obs) as max, a2.code, a2.manager > from ( > select monotonic() as id, min(calculated id) as min, > calculated id - calculated min as obs, * > from (select * from a > union > (select distinct(b.code) as code, ' ' as manager from b > except > select distinct(a.code) as code, ' ' as manager from a)) > group by code) as a2 > group by code) as a3, > (select b2.id, b2.obs, max(b2.obs) as max, b2.code, b2.assistant > from ( > select monotonic() as id, min(calculated id) as min, > calculated id - calculated min as obs, * > from (select * from b > union > (select distinct(a.code) as code, ' ' as assistant from a > except > select distinct(b.code) as code, ' ' as assistant from b)) > group by code) as b2 > group by code) as b3 > where (a3.code = b3.code) and ( > (a3.obs = b3.obs) or > (a3.obs>b3.obs and b3.obs=b3.max) or > (a3.obs<b3.obs and a3.obs=a3.max)) > order by a3.id, b3.id; > quit; > > proc compare base=c_merge compare=c_sql; > run; > /* on log > NOTE: No unequal values were found. All values compared are exactly > equal. > */
From: siyuan li on 9 Dec 2009 23:57 On 12ÔÂ10ÈÕ, ÉÏÎç1ʱ12·Ö, chang_y_ch...(a)HOTMAIL.COM (Chang Chung) wrote: > On Tue, 8 Dec 2009 21:38:21 -0500, Keintz, H. Mark > > <mkei...(a)WHARTON.UPENN.EDU> wrote: > >Paul: > > >Your DATA step and proc sql results match each other, but they don't > > actually reproduce the merge specified by the OP, which was: > > > data c; > > merge a b; > > by code; > > run; > > >The feature to note is that data set B has three records with code=145, and > > data set A has only two such records. The OP's merged dataset will have > three records for this code, as does yours. But, in the OP's program, the > last code-145 record of A (i.e. the 2nd record) is matched with all the > "extra" records in dataset B. > > >As a result, variables in the last (i.e. 2nd) code-145 record from A will > > have their values propagated to those "extra" code-145 records (i.e. the > 3rd) generated by B. But your results will have missing values for the > variables from A in such extra records. > > >(I wonder if the OP realizes this property of many-to-many merge in the > data step). > > >Anyhow, I imagine it would be a very recherche' SQL program that would > > reproduce that property, and I hereby pre-emptively lavish high praise on > the individual who can do it in a single CREATE TABLE statement, (and no > preparatory DATA steps either). > > Hi, Mark, > Thanks for your pre-emptive lavishing high praise! Refined and tasteful, I > don't know about, though. Below ran on 9.2 TS1M0, W32_VSPRO platform. > Cheers, > Chang > > /* test data */ > data a; > input code manager$; > cards; > 145 max > 145 xam > 155 paul > 165 kobe > ; > run; > data b; > input code assistant$; > cards; > 145 jerry > 145 tracy > 145 wade > 155 chen > 170 yi > ; > run; > > /* assumes that a and b are sorted by code */ > data c_merge; > merge a b; > by code; > run; > > /* also assumes that the undocumented monotonic() works */ > proc sql; > create table c_sql as > select coalesce(a3.code, b3.code) as code, a3.manager, b3.assistant > from > (select a2.id, a2.obs, max(a2.obs) as max, a2.code, a2.manager > from ( > select monotonic() as id, min(calculated id) as min, > calculated id - calculated min as obs, * > from (select * from a > union > (select distinct(b.code) as code, ' ' as manager from b > except > select distinct(a.code) as code, ' ' as manager from a)) > group by code) as a2 > group by code) as a3, > (select b2.id, b2.obs, max(b2.obs) as max, b2.code, b2.assistant > from ( > select monotonic() as id, min(calculated id) as min, > calculated id - calculated min as obs, * > from (select * from b > union > (select distinct(a.code) as code, ' ' as assistant from a > except > select distinct(b.code) as code, ' ' as assistant from b)) > group by code) as b2 > group by code) as b3 > where (a3.code = b3.code) and ( > (a3.obs = b3.obs) or > (a3.obs>b3.obs and b3.obs=b3.max) or > (a3.obs<b3.obs and a3.obs=a3.max)) > order by a3.id, b3.id; > quit; > > proc compare base=c_merge compare=c_sql; > run; > /* on log > NOTE: No unequal values were found. All values compared are exactly equal.. > */ thank you very much.thanks for helping me
From: giorgio taddia on 22 Dec 2009 12:47 good morning, i ve got a environment SAS 9.13 on unix, i 'd like to create another ENV , it s enough to copy the folder /meta/env/SASmain... to /meta/env2/SASmain... and then change all the config files like metadataserver.sh sas9.cfg...ect? regards
First
|
Prev
|
Next
|
Last
Pages: 1 2 3 4 Prev: Windows client connecting to Linux server (SAS/Connect) Next: Excel Workbook Setting Error Message |