From: Alexis Lelex on
Hi again,

I plotted the lift chart, but the curve looks pretty much the same as the ROC
one. I wonder whether we can still interpret the odds ratios when the
predictive power of the model looks poor?
From: Murphy Choy on
Hi,

If you are doing data mining, it should be fine. For research, I would recommend that you try other models.

------Original Message------
From: Alexis Lelex
Sender: SAS(r) Discussion
To: SAS-L(a)LISTSERV.UGA.EDU
ReplyTo: Alexis Lelex
Subject: Re: Quality of logistic regression model
Sent: Oct 22, 2009 11:21 PM

Hi again,

I plotted the lift chart, but the curve looks pretty much the same as the ROC
one. I wonder whether we can still interpret the odds ratios when the
predictive power of the model looks poor?



--
Regards,
Murphy Choy

Certified Advanced Programmer for SAS V9
Certified Basic Programmer for SAS V9
From: Sigurd Hermansen on
Alexis:
I understand your English far better than you would understand my French, so don't worry about language deficits. We have a greater challenge in understanding the more cryptic language of statistics.

The relatively small contribution of the covariates to the fit of the model to the data (e.g., AIC for the intercept-only model = 91616.461; with covariates = 83858.175) would worry me as well. The p-values for individual covariates don't mean much. Predictions of an outcome in a large sample will typically explain something. How well a model predicts in samples not used to estimate its parameters matters much more. A c statistic of 0.715 suggests some predictive value of the model within the sample, but that may not carry over to another sample.

My 2008 SESUG papers on predictive modeling focus on the c statistic (area under the ROC curve) and what it fails to take into account. The statistics that you cite indicate that your model is predicting better than chance (not a high standard). I would recommend validating the model by applying it to a new sample, if you have one.
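As a rough cross-check of the association figures in the quoted output, the c statistic, Somers' D and Gamma can be recomputed from the concordant, discordant and tied percentages. This is only a minimal Python sketch (not SAS code); the function name `association_stats` is hypothetical, and it assumes the usual definitions: c = (concordant + 0.5 * tied) / total, Somers' D = (concordant - discordant) / total, Gamma = (concordant - discordant) / (concordant + discordant).

```python
def association_stats(pct_concordant, pct_discordant, pct_tied):
    """Rank-association measures from pair percentages.

    Hypothetical helper: inputs are percentages of all event/non-event
    pairs, as printed in the association table of the logistic output.
    """
    # c statistic: concordant pairs plus half of the tied pairs
    c_stat = (pct_concordant + 0.5 * pct_tied) / 100.0
    # Somers' D: net concordance over all pairs (equals 2c - 1)
    somers_d = (pct_concordant - pct_discordant) / 100.0
    # Gamma: net concordance over untied pairs only
    gamma = (pct_concordant - pct_discordant) / (pct_concordant + pct_discordant)
    return c_stat, somers_d, gamma


# Percentages taken from the output quoted in this thread
c_stat, somers_d, gamma = association_stats(71.2, 28.1, 0.7)
print(f"c = {c_stat:.4f}, Somers' D = {somers_d:.4f}, Gamma = {gamma:.4f}")
```

Under those assumptions the results agree with the printed c, Somers' D and Gamma to within rounding. All three are within-sample measures, so they say nothing yet about how the model would rank cases in a new sample.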
S

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Alexis Lelex
Sent: Thursday, October 22, 2009 8:53 AM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: Quality of logistic regression model

Hi,

This is my first post here and my English is not very good, so I'll do my
best to make myself understood.
I'm fitting a logistic regression on more than 120,000 individuals, and
I get some very interesting results with my odds ratios, and all p-values
are <0.0001.
But some figures in the SAS output make the quality of the model look bad:
high AIC and SC, low R2 and Tau-a, and a Hosmer and Lemeshow test indicating
a lack of fit...
Here's part of my output:

Model Fit Statistics

              Intercept    Intercept and
Criterion     Only         Covariates
AIC           91616.461    83858.175
SC            91626.215    84043.487
-2 Log L      91614.461    83820.175


R-Square 0.0595    Max-rescaled R-Square 0.1158


Association of Predicted Probabilities and Observed Responses

Percent Concordant    71.2          Somers' D    0.431
Percent Discordant    28.1          Gamma        0.434
Percent Tied          0.7           Tau-a        0.089
Pairs                 1666460055    c            0.715


Hosmer and Lemeshow Goodness-of-Fit Test

Chi-Square    DF    Pr > ChiSq

78.2312       8     <.0001


Is it possible to interpret the odds ratios even though there is a lack of
fit and the model isn't predictive?
In other words, what conclusions can we draw (or not) from a model like this
one?

If someone can help me with this one, it would be really great!

Thanks

PS: by the way, this is a very good SAS forum; I learn a lot of things
reading you all!
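For what it is worth, the fit statistics in the output above are tied together by simple arithmetic, sketched here in Python rather than SAS. The helper name `lr_summary` is hypothetical; the sketch assumes only the standard relation AIC = -2 Log L + 2k (k = number of estimated parameters) and that the likelihood-ratio chi-square is the drop in -2 Log L between the intercept-only and full models.

```python
def lr_summary(neg2logl_null, neg2logl_full, aic_null, aic_full):
    """Likelihood-ratio chi-square and parameter counts from fit statistics.

    Hypothetical helper, assuming AIC = -2 Log L + 2k for each model.
    """
    # Drop in -2 Log L when the covariates are added
    lr_chi2 = neg2logl_null - neg2logl_full
    # Number of estimated parameters implied by each AIC
    k_null = round((aic_null - neg2logl_null) / 2)
    k_full = round((aic_full - neg2logl_full) / 2)
    # Degrees of freedom = number of covariate parameters added
    return lr_chi2, k_null, k_full, k_full - k_null


# Values copied from the fit-statistics table in this post
lr_chi2, k_null, k_full, df = lr_summary(91614.461, 83820.175,
                                         91616.461, 83858.175)
print(f"LR chi-square = {lr_chi2:.3f} on {df} df "
      f"({k_null} vs {k_full} parameters)")
```

Read this way, the covariates clearly improve on the intercept-only model, but with a sample this large a big chi-square mainly reflects the sample size and is not evidence of good predictive quality.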
From: Alexis Lelex on
Hi,

I'm working on job statistics, and was modelling the probability of switching
to another profession "family". So I'll try with another sample, but it's
quite difficult because our data cover only a limited study period.
I just found your papers on the c statistic. Thanks a lot, I'm sure they will
help me improve my understanding.

Alexis
From: Alexis Lelex on
Hi,

I'm working on the probability that an unemployed person switches to another
profession "family" within a specific interval of years.
So I was thinking of fitting a time-to-event model, using PROC LIFETEST and
PROC PHREG, but I would think the results of the second one will be much the
same as those of the logistic regression, won't they?