From: jiji on
Hi,

I am having difficulty with a many-to-many merge between two
datasets.
The two datasets actually have 2 variables in common (ID and VISNO),
but I want to merge on ID only since I would lose some data from the
first dataset if I merge with ID and VISNO. Here are both my merge and
procSQL codes:

proc sort data=dataset1; by ID;
proc sort data=dataset2; by ID;
Data want;
Merge
dataset1 (IN=in1)
dataset2;
By ID;
if in1;
Run;

OR

PROC SQL;
CREATE TABLE want AS
SELECT *
FROM dataset1,dataset2
WHERE dataset1.ID=dataset2.ID
;
QUIT;

Here is a snapshot of how the data looks:
dataset1:

ID VISNO Varx
1 1 N
1 2 N
1 3 Y
1 4 N
1 5 N
2 1 N
2 2 Y
2 3 N
2 4 N
2 5 N

dataset2:

ID VISNO Varx
1 0 N
1 3 N
1 4 Y
1 6 N
1 8 N
2 0 N
2 4 Y
2 6 N
2 8 N
2 10 N
From: jiji on
On Mar 7, 9:56 am, jiji <joelleahal...(a)gmail.com> wrote:
> Hi,
>
> I am having difficulty with a many-to-many merge between two
> datasets.
> The two datasets actually have 2 variables in common (ID and VISNO),
> but I want to merge on ID only since I would lose some data from the
> first dataset if I merge with ID and VISNO. Here are both my merge and
> procSQL codes:
>
> proc sort data=dataset1; by ID;
> proc sort data=dataset2; by ID;
> Data want;
> Merge
> dataset1 (IN=in1)
> dataset2;
> By ID;
> if in1;
> Run;
>
> OR
>
> PROC SQL;
> CREATE TABLE want AS
> SELECT *
> FROM dataset1,dataset2
> WHERE dataset1.ID=dataset2.ID
> ;
> QUIT;
>
> Here is a snapshot of how the data looks:
> dataset1:
>
> ID    VISNO      Varx
> 1        1             N
> 1        2             N
> 1        3             Y
> 1        4             N
> 1        5             N
> 2        1             N
> 2        2             Y
> 2        3             N
> 2        4             N
> 2        5             N
>
> dataset2:
>
> ID    VISNO      Vary
> 1        0             N
> 1        3             N
> 1        4             Y
> 1        6             N
> 1        8             N
> 2        0             N
> 2        4             Y
> 2        6             N
> 2        8             N
> 2      10             N

I should have changed VarX to Var Y in dataset2. The two datasets do
not have variables in common.

I want the final dataset to look something like this

ID VISNO Varx Vary
1 0 . N
1 1 N .
1 2 N .
1 3 Y N
1 4 N Y
1 5 N .
1 6 . N
1 8 . N
From: Arthur Tabachneck on
jiji,

Given the sample data you provided, what do you want the resulting
file to look like? There are many possibilities and, for both ids,
visno 4 has conflicting values for varx.

Art
-------------
On Mar 7, 10:56 am, jiji <joelleahal...(a)gmail.com> wrote:
> Hi,
>
> I am having difficulty with a many-to-many merge between two
> datasets.
> The two datasets actually have 2 variables in common (ID and VISNO),
> but I want to merge on ID only since I would lose some data from the
> first dataset if I merge with ID and VISNO. Here are both my merge and
> procSQL codes:
>
> proc sort data=dataset1; by ID;
> proc sort data=dataset2; by ID;
> Data want;
> Merge
> dataset1 (IN=in1)
> dataset2;
> By ID;
> if in1;
> Run;
>
> OR
>
> PROC SQL;
> CREATE TABLE want AS
> SELECT *
> FROM dataset1,dataset2
> WHERE dataset1.ID=dataset2.ID
> ;
> QUIT;
>
> Here is a snapshot of how the data looks:
> dataset1:
>
> ID VISNO Varx
> 1 1 N
> 1 2 N
> 1 3 Y
> 1 4 N
> 1 5 N
> 2 1 N
> 2 2 Y
> 2 3 N
> 2 4 N
> 2 5 N
>
> dataset2:
>
> ID VISNO Varx
> 1 0 N
> 1 3 N
> 1 4 Y
> 1 6 N
> 1 8 N
> 2 0 N
> 2 4 Y
> 2 6 N
> 2 8 N
> 2 10 N
From: jiji on
On Mar 7, 10:39 am, art...(a)NETSCAPE.NET (Arthur Tabachneck) wrote:
> jiji,
>
> Given the sample data you provided, what do you want the resulting
> file to look like?  There are many possibilities and, for both ids,
> visno 4 has conflicting values for varx.
>
> Art
> -------------
> On Mar 7, 10:56 am, jiji <joelleahal...(a)gmail.com> wrote:
>
> > Hi,
>
> > I am having difficulty with a many-to-many merge between two
> > datasets.
> > The two datasets actually have 2 variables in common (ID and VISNO),
> > but I want to merge on ID only since I would lose some data from the
> > first dataset if I merge with ID and VISNO. Here are both my merge and
> > procSQL codes:
>
> > proc sort data=dataset1; by ID;
> > proc sort data=dataset2; by ID;
> > Data want;
> > Merge
> > dataset1 (IN=in1)
> > dataset2;
> > By ID;
> > if in1;
> > Run;
>
> > OR
>
> > PROC SQL;
> > CREATE TABLE want AS
> > SELECT *
> > FROM dataset1,dataset2
> > WHERE dataset1.ID=dataset2.ID
> > ;
> > QUIT;
>
> > Here is a snapshot of how the data looks:
> > dataset1:
>
> > ID    VISNO      Varx
> > 1        1             N
> > 1        2             N
> > 1        3             Y
> > 1        4             N
> > 1        5             N
> > 2        1             N
> > 2        2             Y
> > 2        3             N
> > 2        4             N
> > 2        5             N
>
> > dataset2:
>
> > ID    VISNO      Varx
> > 1        0             N
> > 1        3             N
> > 1        4             Y
> > 1        6             N
> > 1        8             N
> > 2        0             N
> > 2        4             Y
> > 2        6             N
> > 2        8             N
> > 2      10             N

Now I see it in the discussion board:
I should have changed VarX to Var Y in dataset2. The two datasets do
not have variables in common.

I want the final dataset to look something like this
ID VISNO Varx Vary
1 0 . N
1 1 N .
1 2 N .
1 3 Y N
1 4 N Y
1 5 N .
1 6 . N
1 8 . N

I also tried running using the if first.ID and if last.ID commands in
the merge statement. That did not work either.
Thanks much,
From: jiji on
On Mar 7, 11:42 am, jiji <joelleahal...(a)gmail.com> wrote:
> On Mar 7, 10:39 am, art...(a)NETSCAPE.NET (Arthur Tabachneck) wrote:
>
>
>
>
>
> > jiji,
>
> > Given the sample data you provided, what do you want the resulting
> > file to look like?  There are many possibilities and, for both ids,
> > visno 4 has conflicting values for varx.
>
> > Art
> > -------------
> > On Mar 7, 10:56 am, jiji <joelleahal...(a)gmail.com> wrote:
>
> > > Hi,
>
> > > I am having difficulty with a many-to-many merge between two
> > > datasets.
> > > The two datasets actually have 2 variables in common (ID and VISNO),
> > > but I want to merge on ID only since I would lose some data from the
> > > first dataset if I merge with ID and VISNO. Here are both my merge and
> > > procSQL codes:
>
> > > proc sort data=dataset1; by ID;
> > > proc sort data=dataset2; by ID;
> > > Data want;
> > > Merge
> > > dataset1 (IN=in1)
> > > dataset2;
> > > By ID;
> > > if in1;
> > > Run;
>
> > > OR
>
> > > PROC SQL;
> > > CREATE TABLE want AS
> > > SELECT *
> > > FROM dataset1,dataset2
> > > WHERE dataset1.ID=dataset2.ID
> > > ;
> > > QUIT;
>
> > > Here is a snapshot of how the data looks:
> > > dataset1:
>
> > > ID    VISNO      Varx
> > > 1        1             N
> > > 1        2             N
> > > 1        3             Y
> > > 1        4             N
> > > 1        5             N
> > > 2        1             N
> > > 2        2             Y
> > > 2        3             N
> > > 2        4             N
> > > 2        5             N
>
> > > dataset2:
>
> > > ID    VISNO      Varx
> > > 1        0             N
> > > 1        3             N
> > > 1        4             Y
> > > 1        6             N
> > > 1        8             N
> > > 2        0             N
> > > 2        4             Y
> > > 2        6             N
> > > 2        8             N
> > > 2      10             N
>
> Now I see it in the discussion board:
> I should have changed VarX to Var Y in dataset2. The two datasets do
> not have variables in common.
>
> I want the final dataset to look something like this
> ID    VISNO     Varx      Vary
> 1        0             .           N
> 1        1             N          .
> 1        2             N          .
> 1        3             Y         N
> 1        4             N         Y
> 1        5             N         .
> 1        6             .          N
> 1        8             .          N
>
> I also tried running using the if first.ID and if last.ID commands in
> the merge statement. That did not work either.
> Thanks much,- Hide quoted text -
>
> - Show quoted text -

Thank you Arthur,
I actually did try that. and the data merged however I lost data fron
all the visits from the first dataset.
when I merge with ID and VISNO
the data looked like this:

ID VISNO Varx Vary
1 3 N N
1 4 Y Y

I want to replace the visits that do not match between the two
datasets with missing points so that I can then recode as needed.
What do you think?

Thanks!!