From: jiji on 7 Mar 2010 10:56 Hi, I am having difficulty with a many-to-many merge between two datasets. The two datasets actually have 2 variables in common (ID and VISNO), but I want to merge on ID only since I would lose some data from the first dataset if I merge with ID and VISNO. Here are both my merge and procSQL codes: proc sort data=dataset1; by ID; proc sort data=dataset2; by ID; Data want; Merge dataset1 (IN=in1) dataset2; By ID; if in1; Run; OR PROC SQL; CREATE TABLE want AS SELECT * FROM dataset1,dataset2 WHERE dataset1.ID=dataset2.ID ; QUIT; Here is a snapshot of how the data looks: dataset1: ID VISNO Varx 1 1 N 1 2 N 1 3 Y 1 4 N 1 5 N 2 1 N 2 2 Y 2 3 N 2 4 N 2 5 N dataset2: ID VISNO Varx 1 0 N 1 3 N 1 4 Y 1 6 N 1 8 N 2 0 N 2 4 Y 2 6 N 2 8 N 2 10 N
From: jiji on 7 Mar 2010 11:49 On Mar 7, 9:56 am, jiji <joelleahal...(a)gmail.com> wrote: > Hi, > > I am having difficulty with a many-to-many merge between two > datasets. > The two datasets actually have 2 variables in common (ID and VISNO), > but I want to merge on ID only since I would lose some data from the > first dataset if I merge with ID and VISNO. Here are both my merge and > procSQL codes: > > proc sort data=dataset1; by ID; > proc sort data=dataset2; by ID; > Data want; > Merge > dataset1 (IN=in1) > dataset2; > By ID; > if in1; > Run; > > OR > > PROC SQL; > CREATE TABLE want AS > SELECT * > FROM dataset1,dataset2 > WHERE dataset1.ID=dataset2.ID > ; > QUIT; > > Here is a snapshot of how the data looks: > dataset1: > > ID VISNO Varx > 1 1 N > 1 2 N > 1 3 Y > 1 4 N > 1 5 N > 2 1 N > 2 2 Y > 2 3 N > 2 4 N > 2 5 N > > dataset2: > > ID VISNO Vary > 1 0 N > 1 3 N > 1 4 Y > 1 6 N > 1 8 N > 2 0 N > 2 4 Y > 2 6 N > 2 8 N > 2 10 N I should have changed VarX to Var Y in dataset2. The two datasets do not have variables in common. I want the final dataset to look something like this ID VISNO Varx Vary 1 0 . N 1 1 N . 1 2 N . 1 3 Y N 1 4 N Y 1 5 N . 1 6 . N 1 8 . N
From: Arthur Tabachneck on 7 Mar 2010 11:39 jiji, Given the sample data you provided, what do you want the resulting file to look like? There are many possibilities and, for both ids, visno 4 has conflicting values for varx. Art ------------- On Mar 7, 10:56 am, jiji <joelleahal...(a)gmail.com> wrote: > Hi, > > I am having difficulty with a many-to-many merge between two > datasets. > The two datasets actually have 2 variables in common (ID and VISNO), > but I want to merge on ID only since I would lose some data from the > first dataset if I merge with ID and VISNO. Here are both my merge and > procSQL codes: > > proc sort data=dataset1; by ID; > proc sort data=dataset2; by ID; > Data want; > Merge > dataset1 (IN=in1) > dataset2; > By ID; > if in1; > Run; > > OR > > PROC SQL; > CREATE TABLE want AS > SELECT * > FROM dataset1,dataset2 > WHERE dataset1.ID=dataset2.ID > ; > QUIT; > > Here is a snapshot of how the data looks: > dataset1: > > ID VISNO Varx > 1 1 N > 1 2 N > 1 3 Y > 1 4 N > 1 5 N > 2 1 N > 2 2 Y > 2 3 N > 2 4 N > 2 5 N > > dataset2: > > ID VISNO Varx > 1 0 N > 1 3 N > 1 4 Y > 1 6 N > 1 8 N > 2 0 N > 2 4 Y > 2 6 N > 2 8 N > 2 10 N
From: jiji on 7 Mar 2010 12:42 On Mar 7, 10:39 am, art...(a)NETSCAPE.NET (Arthur Tabachneck) wrote: > jiji, > > Given the sample data you provided, what do you want the resulting > file to look like? There are many possibilities and, for both ids, > visno 4 has conflicting values for varx. > > Art > ------------- > On Mar 7, 10:56 am, jiji <joelleahal...(a)gmail.com> wrote: > > > Hi, > > > I am having difficulty with a many-to-many merge between two > > datasets. > > The two datasets actually have 2 variables in common (ID and VISNO), > > but I want to merge on ID only since I would lose some data from the > > first dataset if I merge with ID and VISNO. Here are both my merge and > > procSQL codes: > > > proc sort data=dataset1; by ID; > > proc sort data=dataset2; by ID; > > Data want; > > Merge > > dataset1 (IN=in1) > > dataset2; > > By ID; > > if in1; > > Run; > > > OR > > > PROC SQL; > > CREATE TABLE want AS > > SELECT * > > FROM dataset1,dataset2 > > WHERE dataset1.ID=dataset2.ID > > ; > > QUIT; > > > Here is a snapshot of how the data looks: > > dataset1: > > > ID VISNO Varx > > 1 1 N > > 1 2 N > > 1 3 Y > > 1 4 N > > 1 5 N > > 2 1 N > > 2 2 Y > > 2 3 N > > 2 4 N > > 2 5 N > > > dataset2: > > > ID VISNO Varx > > 1 0 N > > 1 3 N > > 1 4 Y > > 1 6 N > > 1 8 N > > 2 0 N > > 2 4 Y > > 2 6 N > > 2 8 N > > 2 10 N Now I see it in the discussion board: I should have changed VarX to Var Y in dataset2. The two datasets do not have variables in common. I want the final dataset to look something like this ID VISNO Varx Vary 1 0 . N 1 1 N . 1 2 N . 1 3 Y N 1 4 N Y 1 5 N . 1 6 . N 1 8 . N I also tried running using the if first.ID and if last.ID commands in the merge statement. That did not work either. Thanks much,
From: jiji on 7 Mar 2010 13:32
On Mar 7, 11:42 am, jiji <joelleahal...(a)gmail.com> wrote: > On Mar 7, 10:39 am, art...(a)NETSCAPE.NET (Arthur Tabachneck) wrote: > > > > > > > jiji, > > > Given the sample data you provided, what do you want the resulting > > file to look like? There are many possibilities and, for both ids, > > visno 4 has conflicting values for varx. > > > Art > > ------------- > > On Mar 7, 10:56 am, jiji <joelleahal...(a)gmail.com> wrote: > > > > Hi, > > > > I am having difficulty with a many-to-many merge between two > > > datasets. > > > The two datasets actually have 2 variables in common (ID and VISNO), > > > but I want to merge on ID only since I would lose some data from the > > > first dataset if I merge with ID and VISNO. Here are both my merge and > > > procSQL codes: > > > > proc sort data=dataset1; by ID; > > > proc sort data=dataset2; by ID; > > > Data want; > > > Merge > > > dataset1 (IN=in1) > > > dataset2; > > > By ID; > > > if in1; > > > Run; > > > > OR > > > > PROC SQL; > > > CREATE TABLE want AS > > > SELECT * > > > FROM dataset1,dataset2 > > > WHERE dataset1.ID=dataset2.ID > > > ; > > > QUIT; > > > > Here is a snapshot of how the data looks: > > > dataset1: > > > > ID VISNO Varx > > > 1 1 N > > > 1 2 N > > > 1 3 Y > > > 1 4 N > > > 1 5 N > > > 2 1 N > > > 2 2 Y > > > 2 3 N > > > 2 4 N > > > 2 5 N > > > > dataset2: > > > > ID VISNO Varx > > > 1 0 N > > > 1 3 N > > > 1 4 Y > > > 1 6 N > > > 1 8 N > > > 2 0 N > > > 2 4 Y > > > 2 6 N > > > 2 8 N > > > 2 10 N > > Now I see it in the discussion board: > I should have changed VarX to Var Y in dataset2. The two datasets do > not have variables in common. > > I want the final dataset to look something like this > ID VISNO Varx Vary > 1 0 . N > 1 1 N . > 1 2 N . > 1 3 Y N > 1 4 N Y > 1 5 N . > 1 6 . N > 1 8 . N > > I also tried running using the if first.ID and if last.ID commands in > the merge statement. That did not work either. > Thanks much,- Hide quoted text - > > - Show quoted text - Thank you Arthur, I actually did try that. and the data merged however I lost data fron all the visits from the first dataset. when I merge with ID and VISNO the data looked like this: ID VISNO Varx Vary 1 3 N N 1 4 Y Y I want to replace the visits that do not match between the two datasets with missing points so that I can then recode as needed. What do you think? Thanks!! |