From: Robin R High on 21 Dec 2009 14:18

Robert,

Correlation depends on the discordant proportions of the _outcomes_ (not the matching), that is, on the off-diagonal cell proportions p10 and p01 of the following table.

                        Mother 1 (control)
                        Yes=1   No=0     (mortality)
                       -------------
 Mother 2      Yes=1   | p11 | p10 |  .010 = pt   1% of mothers in treatment group experience mortality
 (treatment)           |-----------|
               No=0    | p01 | p00 |  .990        99% of mothers in treatment group do not experience mortality
                       -------------
                        .015   .985    1.00
                        = pc

 1.5% of mothers in control group experience mortality
 98.5% of mothers in control group do not experience mortality

The following formula connects the cell probabilities with the marginal values, pt and pc:

   corr = (p11*p00 - p10*p01)/SQRT((pc*(1-pc)*pt*(1-pt)));

Also note that these two differences are the same:

   pt-pc = p10-p01

With some manipulation of these equations, and PROC MODEL to solve them, one can find cell values for various scenarios. For example, assuming 10000 pairs, here are the resulting cell counts and percents for several correlations (.05, .25, and .5), with the marginal values held the same:

correlation=.05

Frequency|
Percent  |       1|       2|  Total
---------+--------+--------+
       1 |      8 |     92 |    100
         |   0.08 |   0.92 |   1.00
---------+--------+--------+
       2 |    142 |   9758 |   9900
         |   1.42 |  97.58 |  99.00
---------+--------+--------+
Total         150     9850    10000
             1.50    98.50   100.00

correlation=.25

Frequency|
Percent  |       1|       2|  Total
---------+--------+--------+
       1 |     32 |     68 |    100
         |   0.32 |   0.68 |   1.00
---------+--------+--------+
       2 |    118 |   9782 |   9900
         |   1.18 |  97.82 |  99.00
---------+--------+--------+
Total         150     9850    10000
             1.50    98.50   100.00

correlation=.5

Frequency|
Percent  |       1|       2|  Total
---------+--------+--------+
       1 |     62 |     38 |    100
         |   0.62 |   0.38 |   1.00
---------+--------+--------+
       2 |     88 |   9812 |   9900
         |   0.88 |  98.12 |  99.00
---------+--------+--------+
Total         150     9850    10000
             1.50    98.50   100.00

To determine a correlation, make a 2x2 table of counts (like the one above for corr=.50) that is your best guess of mortality, and then run the counts through proc freq:

   DATA one;
      input i j count;   /* i = row, j = column, count = cell frequency */
      cards;
   1 1 62
   1 2 38
   2 1 88
   2 2 9812
   ;

   proc freq;
      table i*j / measures;   /* MEASURES prints the correlation statistics */
      weight count;           /* treat count as the cell frequency */
   run;

produces:

   Pearson Correlation  = 0.5002
   Spearman Correlation = 0.5002

I hesitate to say "details are left to the reader", but I recently worked through this interesting problem in a similar project. It also helps to work with larger proportions than yours (e.g., 35/100 vs 30/100) to make further sense of this. And there are some interesting connections here between matched pairs (depending on the correlation) and independent samples. Also, read chapter 3 of Paul Allison's SAS book on "Fixed Effects" for other approaches to the McNemar test, especially when you have explanatory variables.

Robin High
UNMC

From: Robert Feyerharm <robertf(a)HEALTH.OK.GOV>
To: SAS-L(a)LISTSERV.UGA.EDU
Date: 12/18/2009 03:57 PM
Subject: proc power question for McNemar test
Sent by: "SAS(r) Discussion" <SAS-L(a)LISTSERV.UGA.EDU>

I have a question regarding the power procedure for a paired case-control design using the McNemar test.

I'm using proc power to estimate the necessary sample size for a proposed public health study that will compare infant mortality rates between two groups, a control group of mothers who received no public health intervention & a treatment group who participated in the Children First or Healthy Start programs. Mothers will be matched based on similar demographic variables (race, age, education, etc.). We want to detect a reduction in infant mortality from say 15 deaths per 1,000 live births (p0=.015) to 10 deaths per 1,000 live births (p1=.010), with power=.80 and alpha=.05.

Here's my code:

   proc power;
      pairedfreq dist=normal method=connor
         test=mcnemar
         corr=???
         alpha=.05
         relativerisk = .67
         refproportion = 0.015
         npairs = .
         power = .8;
   run;

My question: What is the correct value to use for the correlation coefficient for exposure between cases and their matched controls? Since every matched pair will be discordant (the case mother participates in the health program & her control doesn't), is corr=0 appropriate?

Thanks in advance!

Robert Feyerharm
Oklahoma State Department of Health
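For anyone who wants to check the tables in Robin's reply without PROC MODEL, here is a minimal sketch in Python. It is not Robin's method, only one way to get the same numbers: substituting p10 = pt - p11, p01 = pc - p11 and p00 = 1 - pt - pc + p11 into the correlation formula reduces the numerator to p11 - pt*pc, so p11 can be solved for directly. The function names cells_from_corr and phi_from_counts are made up for this illustration.

```python
import math


def cells_from_corr(pt, pc, corr):
    """Solve the paired 2x2 table for (p11, p10, p01, p00).

    pt   = treatment marginal P(mortality), pc = control marginal.
    Uses corr = (p11 - pt*pc) / sqrt(pc*(1-pc)*pt*(1-pt)), which follows
    from the formula in the post once the marginals are substituted in.
    """
    sd = math.sqrt(pc * (1 - pc) * pt * (1 - pt))
    p11 = pt * pc + corr * sd
    p10 = pt - p11            # treatment yes, control no
    p01 = pc - p11            # treatment no, control yes
    p00 = 1 - p11 - p10 - p01
    if min(p11, p10, p01, p00) < 0:
        raise ValueError("this correlation is not attainable with these marginals")
    return p11, p10, p01, p00


def phi_from_counts(n11, n10, n01, n00):
    """Pearson (phi) correlation of a 2x2 table of counts."""
    row1, row0 = n11 + n10, n01 + n00
    col1, col0 = n11 + n01, n10 + n00
    return (n11 * n00 - n10 * n01) / math.sqrt(row1 * row0 * col1 * col0)


if __name__ == "__main__":
    n_pairs = 10000
    for corr in (0.05, 0.25, 0.5):
        counts = [round(n_pairs * p) for p in cells_from_corr(0.010, 0.015, corr)]
        print(corr, counts, round(phi_from_counts(*counts), 4))
```

With pt=.010, pc=.015 and corr=.5 the rounded counts come out as 62, 38, 88 and 9812, matching the last table, and phi_from_counts on those counts gives about 0.5002, the same value proc freq reports. The correlations recovered for the other two tables differ slightly from .05 and .25 because the counts are rounded to whole pairs.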
From: Jeff on 21 Dec 2009 19:42

FYI: Page 29 of Stokes et al., Categorical Data Analysis Using the SAS System, has this correlation formula and a couple of other equivalent formulas.
From: "Feyerharm, Robert W." on 23 Dec 2009 12:17

Thanks Robin, this is very helpful!

Robert