From: Steven Raimi on 2 Dec 2009 09:09

I have developed 600+ potential predictors for use in a logistic regression model I'm working on. I want to screen each as efficiently as possible for predictive power (using the c-statistic). We have a brute-force method to generate the c-statistics (proc logistic on yvar=xvar_in_question, then numerically integrate the ROC curve to estimate), but there has to be a more straightforward (and efficient) way to perform this task, right?

Also, I want to identify variables/groups of variables that are collinear, so I can leave out all but the most sensible one(s) (per subject matter knowledge). I could use PROC CORR, but that will be overwhelmed trying to do 600*600 combinations. Again, isn't there a better way to attack this?

FYI - I have both SAS and JMP available. Only about 5% of the dataset can fit in JMP - but we'll be developing the regression there (using all target outcomes, and a few percent of the other records so there's a minimum of two non-target records per target one).

Thanks for the guidance!

Steve
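For reference, the brute-force step described above (fit a one-variable model, then integrate the ROC curve) can usually be replaced by a rank calculation, because the area under the ROC curve equals the Mann-Whitney U statistic divided by the number of target/non-target pairs. The sketch below is only an illustration in plain Python, not SAS or JMP code. The names `c_statistic` and `screen` are hypothetical, and it assumes a 0/1 outcome with no missing values. Taking `max(auc, 1 - auc)` assumes the one-variable logistic fit orients the predictor in its better-discriminating direction, which is the usual case but not guaranteed.

```python
def c_statistic(x, y):
    """Rank-based c-statistic (ROC area) of one predictor x against a 0/1 outcome y.

    Assumes no missing values; ties in x receive average ranks.
    """
    pairs = sorted(zip(x, y), key=lambda p: p[0])
    n = len(pairs)
    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one target and one non-target record")

    # Sum the (average) ranks of the target records, one tie group at a time.
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        pos_in_group = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * pos_in_group
        i = j

    # Mann-Whitney U for the target group, scaled to [0, 1].
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    auc = u / (n_pos * n_neg)

    # Report discrimination regardless of direction (assumption, see lead-in).
    return max(auc, 1.0 - auc)


def screen(predictors, y):
    """Return (name, c_statistic) pairs sorted from strongest to weakest."""
    scores = {name: c_statistic(values, y) for name, values in predictors.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Each predictor costs one sort, so 600+ variables can be screened in a single pass without fitting any models. The same rank idea should carry over to SAS, but the procedure calls for that are not shown here.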