Prev: Windows client connecting to Linux server (SAS/Connect)
Next: Excel Workbook Setting Error Message
From: "Choate, Paul on 8 Dec 2009 17:14 That wasn't cheating, it was creativity. Lateral thinking is a key component of good programming. ;p Paul Choate DDS Data Extraction (916) 654-2160 -----Original Message----- From: Toby Dunn [mailto:tobydunn(a)HOTMAIL.COM] Sent: Tuesday, December 08, 2009 1:35 PM To: SAS-L(a)LISTSERV.UGA.EDU; Choate, Paul(a)DDS Subject: Re: about full join problem Paul, You cheated in so much as you also used the Unique key on the data step merge. If you try to use your data in SQL and mimic the posters original data step your SQL code wont work. On Tue, 8 Dec 2009 13:24:59 -0800, Choate, Paul(a)DDS <Paul.Choate(a)DDS.CA.GOV> wrote: >As Toby said - SQL and the data step work differently - and while it's not to hard to make a data step mimic a SQL join the reverse isn't always so easy. > >The key is the uniqueness of the rows - so the most straightforward solution is to make your rows unique... > >data a; >input code manager $; >cards; >145 max >145 xam >155 paul >165 kobe >; > >data b; >input code assistant $; >cards; >145 jerry >145 tracy >145 wade >155 chen >170 yi >; > >data a; > set a; > by code; > if first.code then n=0; > n+1; >run; > >data b; > set b; > by code; > if first.code then n=0; > n+1; >run; > >data C_DATASTEP(drop=n); > merge a b; > by code n; >run; > >proc sql; > create TABLE C_SQL as > select COALESCE(a.code,b.code) as code, > a.manager, b.assistant > from a full join b > on a.code=b.code and a.n=b.n; >quit; > > > >Paul Choate >DDS Data Extraction >(916) 654-2160 > >-----Original Message----- >From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of siyuan li >Sent: Tuesday, December 08, 2009 4:44 AM >To: SAS-L(a)LISTSERV.UGA.EDU >Subject: about full join problem > >hi, I am sorry first for that I ask the same problem. >here has 2 datasets: > data a; >input code manager$; >cards; >145 max >145 xam >155 paul >165 kobe >; >run; >data b; >input code assistant$; >cards; >145 jerry >145 tracy >145 wade >155 chen >170 yi >; >run; >data c; >merge a b; >by code; >run; >the question is that I want to yield the same output to use proc sql >full join procedure, I try to use the procudure below,but it did not >yield the same output,can anyone help me? > proc sql; >select a.code,a.manager,b.assistant from a full join b >on a.code=b.code; >quit; > >thanks in advance
From: "Keintz, H. Mark" on 8 Dec 2009 21:38 Paul: Your DATA step and proc sql results match each other, but they don't actually reproduce the merge specified by the OP, which was: data c; merge a b; by code; run; The feature to note is that data set B has three records with code=145, and data set A has only two such records. The OP's merged dataset will have three records for this code, as does yours. But, in the OP's program, the last code-145 record of A (i.e. the 2nd record) is matched with all the "extra" records in dataset B. As a result, variables in the last (i.e. 2nd) code-145 record from A will have their values propagated to those "extra" code-145 records (i.e. the 3rd) generated by B. But your results will have missing values for the variables from A in such extra records. (I wonder if the OP realizes this property of many-to-many merge in the data step). Anyhow, I imagine it would be a very recherche' SQL program that would reproduce that property, and I hereby pre-emptively lavish high praise on the individual who can do it in a single CREATE TABLE statement, (and no preparatory DATA steps either). Regards, Mark > -----Original Message----- > From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of > Choate, Paul(a)DDS > Sent: Tuesday, December 08, 2009 5:15 PM > To: SAS-L(a)LISTSERV.UGA.EDU > Subject: Re: about full join problem > > That wasn't cheating, it was creativity. Lateral thinking is a key > component of good programming. ;p > > Paul Choate > DDS Data Extraction > (916) 654-2160 > > -----Original Message----- > From: Toby Dunn [mailto:tobydunn(a)HOTMAIL.COM] > Sent: Tuesday, December 08, 2009 1:35 PM > To: SAS-L(a)LISTSERV.UGA.EDU; Choate, Paul(a)DDS > Subject: Re: about full join problem > > Paul, > > You cheated in so much as you also used the Unique key on the data step > merge. If you try to use your data in SQL and mimic the posters > original > data step your SQL code wont work. > > On Tue, 8 Dec 2009 13:24:59 -0800, Choate, Paul(a)DDS > <Paul.Choate(a)DDS.CA.GOV> wrote: > > >As Toby said - SQL and the data step work differently - and while it's > not to hard to make a data step mimic a SQL join the reverse isn't > always > so easy. > > > >The key is the uniqueness of the rows - so the most straightforward > solution is to make your rows unique... > > > >data a; > >input code manager $; > >cards; > >145 max > >145 xam > >155 paul > >165 kobe > >; > > > >data b; > >input code assistant $; > >cards; > >145 jerry > >145 tracy > >145 wade > >155 chen > >170 yi > >; > > > >data a; > > set a; > > by code; > > if first.code then n=0; > > n+1; > >run; > > > >data b; > > set b; > > by code; > > if first.code then n=0; > > n+1; > >run; > > > >data C_DATASTEP(drop=n); > > merge a b; > > by code n; > >run; > > > >proc sql; > > create TABLE C_SQL as > > select COALESCE(a.code,b.code) as code, > > a.manager, b.assistant > > from a full join b > > on a.code=b.code and a.n=b.n; > >quit; > > > > > > > >Paul Choate > >DDS Data Extraction > >(916) 654-2160 > > > >-----Original Message----- > >From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of > siyuan li > >Sent: Tuesday, December 08, 2009 4:44 AM > >To: SAS-L(a)LISTSERV.UGA.EDU > >Subject: about full join problem > > > >hi, I am sorry first for that I ask the same problem. > >here has 2 datasets: > > data a; > >input code manager$; > >cards; > >145 max > >145 xam > >155 paul > >165 kobe > >; > >run; > >data b; > >input code assistant$; > >cards; > >145 jerry > >145 tracy > >145 wade > >155 chen > >170 yi > >; > >run; > >data c; > >merge a b; > >by code; > >run; > >the question is that I want to yield the same output to use proc sql > >full join procedure, I try to use the procudure below,but it did not > >yield the same output,can anyone help me? > > proc sql; > >select a.code,a.manager,b.assistant from a full join b > >on a.code=b.code; > >quit; > > > >thanks in advance
From: Craig Johnson on 8 Dec 2009 22:33 I'll heap some praise on someone if they can figure this out because I'm convinced it's impossible. Frankly, I'm confused on how one knows if the original data step merge is correct. If you look at dataset A you have code 145 for max and xam. In dataset B you have 145 for jerry, tracy, and wade. After the merge they end up with max with jerry and xam with tracy and wade. How is max only being grouped with jerry while xam is grouped with two? IE how does SAS know that xam has two and max only has one? IMO there is no way for SAS how to know what to merge what manager together with what assistant. Here is an example, if you add another case in for code 145 in dataset B (145 bob in line one of the cards) it is grouped with Max and Jerry is now grouped with xam. To me that indicates that SAS is merging the first 145 in A with the first 145 in B and then merging the rest of the 145s in B with the second 145 in A. In other words, the example was a fluke and the merge that is wanted can't be done because there is no PK/FK relationship between manager and assistant. Instead a secondary FK (code) is being improperly used to try and do a full join on a many-to-many relationship. In which case, the Cartesian product is technically correct.
From: siyuan li on 9 Dec 2009 08:24 On 12æ9æ¥, ä¸å5æ¶24å, Paul.Cho....(a)DDS.CA.GOV ("Choate, Paul(a)DDS") wrote: > As Toby said - SQL and the data step work differently - and while it's not to hard to make a data step mimic a SQL join the reverse isn't always so easy. > > The key is the uniqueness of the rows - so the most straightforward solution is to make your rows unique... > > data a; > input code manager $; > cards; > 145 max > 145 xam > 155 paul > 165 kobe > ; > > data b; > input code assistant $; > cards; > 145 jerry > 145 tracy > 145 wade > 155 chen > 170 yi > ; > > data a; >    set a; >    by code; >    if first.code then n=0; >    n+1; > run; > > data b; >    set b; >    by code; >    if first.code then n=0; >    n+1; > run; > > data C_DATASTEP(drop=n); >    merge a b; >    by code n; > run; > > proc sql; >  create TABLE C_SQL as >    select COALESCE(a.code,b.code) as code, >       a.manager, b.assistant >    from a full join b >    on a.code=b.code and a.n=b.n; > quit; > > Paul Choate > DDS Data Extraction > (916) 654-2160 > > > > -----Original Message----- > From: SAS(r) Discussion [mailto:SA...(a)LISTSERV.UGA.EDU] On Behalf Of siyuan li > Sent: Tuesday, December 08, 2009 4:44 AM > To: SA...(a)LISTSERV.UGA.EDU > Subject: about full join problem > > hi, I am sorry first for that I ask the same problem. > here has 2 datasets: >  data a; > input code manager$; > cards; > 145 max > 145 xam > 155 paul > 165 kobe > ; > run; > data b; > input code assistant$; > cards; > 145 jerry > 145 tracy > 145 wade > 155 chen > 170 yi > ; > run; > data c; > merge a b; > by code; > run; > the question is that I want to yield the same output to use proc sql > full join procedure, I try to use the procudure below,but it did not > yield the same output,can anyone help me? >  proc sql; > select a.code,a.manager,b.assistant from a full join b > on a.code=b.code; > quit; > > thanks in advance- éè被å¼ç¨æå - > > - æ¾ç¤ºå¼ç¨çæå - hi,paul, this n variables are then generated by your own, the original data set has no n variable, so the significance of such a merge merger is very slim. In addition, you such a situation, than those without when the merge should be simple n a lot of programs from your sql view, you can illustrate this point
From: "Keintz, H. Mark" on 9 Dec 2009 12:01 Proper is in the eyes of the beholder. And I agree that the behavior of me= rge In this case may not be what the OP wanted. But I can imagine cases where the behavior is appropriate. Maybe you have = two Datasets sorted by ID and DATE, and both datasets have the same first (say)= 4 dates for a given ID. But dataset B can have extra records, and you want = the most recent record from A to be matched with those records. I'm not saying this is why SAS made merge work the way it does, but it does= have its uses. From: Craig Johnson [mailto:cjohns38(a)gmail.com] Sent: Wednesday, December 09, 2009 11:49 AM To: Keintz, H. Mark Cc: SAS-L(a)LISTSERV.UGA.EDU Subject: Re: Do we need a NORETAIN statement? was: about full join problem That's true but it still doesn't answer my original question, how do we kno= w that this is properly merging the manager with the appropriate assistant?= Assuming that this is a subset of a dataset it seems unlikely that every = time there is more than one manager that the last manager should be matched= with all the remaining cases in the second data set. My gut tells me that= this type of merge isn't correct but without knowing how they are suppose= to properly merge (Manager to Assistant) I can't prove it.
First
|
Prev
|
Next
|
Last
Pages: 1 2 3 4 Prev: Windows client connecting to Linux server (SAS/Connect) Next: Excel Workbook Setting Error Message |