From: Dale McLerran on 10 Nov 2009 14:20 --- On Tue, 11/10/09, Bijay Tamang <tamangbijay(a)GMAIL.COM> wrote: > From: Bijay Tamang <tamangbijay(a)GMAIL.COM> > Subject: Zero-inflated Poisson for correlated data > To: SAS-L(a)LISTSERV.UGA.EDU > Date: Tuesday, November 10, 2009, 8:44 AM > I am trying to analyze my windbreak > root data using ZIP model. > > I have three windbreaks and two trenches within each > windbreak. 1mx1m trench > face was divided into 10 cm x 10cm grids. I counted the > number of > roots(Nxtot) exiting from each grid and 60 percent of my > count is zero. > > Here is what I did > > 1. Sorted the data by windbreak (WBA) , Location (trench), > depth > 2. WBA is a categorical variable, so I created dummy > variables > 3. Created a new random variable as follows as NLMIXED > allows only one > random variable > > wba_loc_dep = COMPRESS(wba||"-"||location||"-"||depth); > wba_loc = COMPRESS(wba||"-"||location); > > 1. The following code runs fine but does not have the > random statement but > the "inflation probability is only 0.05" > > proc nlmixed data=Nx3; > parms a0=0 a11=0 a12=0 a2=0 a3=0 b0=0 b11=0 b12=0 b2=0 > b3=0; > logitpi=a0 + a11*WB_dum1 + a12*wb_dum2 + a2*depth1 + > a3*BD; > pi=exp(logitpi)/(1+exp(logitpi)); > loglambda=b0 + b11*WB_dum1 + b12*wb_dum2 + b2*depth1 > + b3*BD; > lambda=exp(loglambda); > if nxtot=0 then > ll=log(pi+(1-pi)*exp(-lambda)); > else ll=log(1-pi) + > Nxtot*log(lambda)-lambda-lgamma(Nxtot+1); > model Nxtot~general (ll); > ESTIMATE "inflation probability" pi ; > ESTIMATE "lambda" lambda ; > predict pi out=ZIP_pi_preds; > predict lambda out = ZIP_lambda_preds; > run; > > 2. Now, I added the random statement in the following code > but gives me the > following message " The final Hessian matrix is not > positive definite, and > therefore the estimated > covariance matrix is > not full rank and may be unreliable. The > variance of some > parameter estimates > is zero or some parameters are linearly related > to other > parameters. > > and "inflation probability is 168E-31" > > PROC SORT DATA=nx3; > BY wba location depth pos; > TITLE 'ZIP model with RANDOM effect for positions nested > in > location*windbreak*depth'; > proc nlmixed data=Nx3; > parms a0=4.3 a11=-2.8 a12=-2.8 a2=1.4 a3=-4.3 b0=0.91 > b11=0.77 b12=1.1 > s2u=4; > logitpi=a0 + a11*WB_dum1 + a12*wb_dum2 + a2*depth1 + > a3*BD; > pi=exp(logitpi)/(1+exp(logitpi)); > loglambda=b0 + b11*WB_dum1 + b12*wb_dum2 + u; > lambda=exp(loglambda); > if nxtot=0 then ll=log(pi+(1-pi)*exp(-lambda)); > else ll=log(1-pi) + > Nxtot*log(lambda)-lambda-lgamma(Nxtot+1); > model Nxtot~general (ll); > RANDOM u ~ normal(0,s2u) SUBJECT=wba_loc_dep OUT=out_Rand; > ESTIMATE "inflation probability" pi ; > predict pi out=ZIP_pi_preds; > predict lambda out = ZIP_lambda_preds; > run; > > 3. If replace "SUBJECT=wba_loc_dep" in the above code > with > "SUBJECT=wba_loc" the code runs fine and the "inflation > probability is > 0.017" > > I believe the inflation probability should be closer to 60. > Am I doing the > right analysis? Is this the right way to analyze my data? > > thanks in Advance. Bijay > Bijay, Random subjects should be independent of each other. Using wba_loc_dep as your random subject certainly violate this assumption. You would expect the number of roots in the first 10 cm of trench 1/location 1 to be correlated with the number of roots in the second 10 cm, third 10 cm, ... of trench 1/location 1. In truth, your random subjects should probably be specified as location, with trench within location a random effect. However, a case might be made for specifying the trench as independent units. Given the limited number of locations in your data, this is all the more reasonable. Now, whether the trench effect on log(lambda) is the same at all depths might be in question. It would be possible to fit a model in which trench was the random subject and the between-trench variance in log(lambda) depended on depth. You would have to include 10 random effects (u1, u2, u3, ..., u10), with each of these random effects associated with a specific depth. You would also have to allow for covariance between the u{i} effect estimates. Given the small number of trenches in your investigation, I don't think that you should try to model variation (and covariance) in log(lambda) associated with trench depth. You have only 6 trenches, and the covariance structure would have 55 elements (although some might be constrained to be the same). As for what you regard as a low zero-inflation probability, remember that the zero-inflation probability is the excess zero count after the number of zero values determined by the Poisson distribution are accounted for. That is, the Poisson distribution itself would lead one to expect a certain frequency of zero values. The zero-inflation probability represents zero values above the number that are expected under the Poisson distribution. In short, your third model is not unreasonable and is probably the best that you can do with the limited number of trenches which you have for your study. Dale --------------------------------------- Dale McLerran Fred Hutchinson Cancer Research Center mailto: dmclerra(a)NO_SPAMfhcrc.org Ph: (206) 667-2926 Fax: (206) 667-5977 ---------------------------------------
|
Pages: 1 Prev: combinations problem Next: Getting Script Full File Path using |