Very large point estimates in Logistic Regression


Hi

  I have some very strong linear independent variables that are not
collinear (all tolerances > .8). Note also that all the coefficients are
positive, as expected in the real world, and have small coefficients of
variation (and standard errors). It seems to me that the point estimates
of the odds ratios tend to be large when you have powerful independent
variables that segment the binary response nicely.

Should I be overly concerned about the large point estimates, and would
better bucketing and dummification help?
I had a 50% holdout sample (75,000 observations), and the model results
were indistinguishable from the development model. I have also had
good results applying the model to similar populations.

At least we have lower bounds on the point estimates.


                             Standard        Wald
Parameter    DF  Estimate       Error  Chi-Square  Pr > ChiSq

Intercept     1   -7.1677      0.1343   2847.2526      <.0001
AGE           1    7.8295      0.6147    162.2461      <.0001
GENDER        1    7.8149      0.9116     73.4927      <.0001
INC_TOTAL     1    5.9408      0.4784    154.2415      <.0001
SCHOOL        1    6.2133      0.5420    131.4006      <.0001
EQUITY        1    7.3206      0.5799    159.3500      <.0001
IVY_LEAGUE    1    3.9864      0.2986    178.2598      <.0001
STOCKS        1    7.2808      0.2462    874.5103      <.0001


                 Odds Ratio Estimates

                 Point          95% Wald
Effect        Estimate      Confidence Limits

AGE           >999.999     753.493    >999.999
GENDER        >999.999     414.983    >999.999
INC_TOTAL      380.251     148.901     971.053
SCHOOL         499.333     172.590    >999.999
EQUITY        >999.999     484.899    >999.999
IVY_LEAGUE      53.858      29.999      96.693
STOCKS        >999.999     896.237    >999.999
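
For anyone checking the numbers: each odds ratio is just exp(estimate), and the 95% Wald limits are exp(estimate -/+ z * standard error). Below is a minimal Python sketch of that arithmetic using the estimates and standard errors from the parameter table. The function names and the ">999.999" display cap in `fmt` are my own illustration of how the printed output appears to behave, not SAS code, and small differences in the last digit are expected because the inputs are rounded to four decimals.

```python
import math

# (estimate, standard error) pairs copied from the parameter table
ESTIMATES = {
    "AGE": (7.8295, 0.6147),
    "GENDER": (7.8149, 0.9116),
    "INC_TOTAL": (5.9408, 0.4784),
    "SCHOOL": (6.2133, 0.5420),
    "EQUITY": (7.3206, 0.5799),
    "IVY_LEAGUE": (3.9864, 0.2986),
    "STOCKS": (7.2808, 0.2462),
}

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_with_wald_limits(beta, se, z=Z_95):
    """Return (point estimate, lower limit, upper limit) on the odds-ratio scale."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def fmt(value):
    """Mimic the display cap seen in the listing (an assumption about the output format)."""
    return ">999.999" if value > 999.999 else f"{value:.3f}"


for name, (beta, se) in ESTIMATES.items():
    point, lower, upper = odds_ratio_with_wald_limits(beta, se)
    print(f"{name:<12}{fmt(point):>10}{fmt(lower):>12}{fmt(upper):>12}")
```

Since the odds ratio is per one-unit change in the predictor, a coefficient near 7 or 8 necessarily gives an odds ratio in the thousands, whatever the standard error.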
Reply xlr82sas 2/8/2011 10:13:51 PM

> I have some very strong linear independent variables that are not
> collinear, all tolerances>.8. Also note that all the coefficients are
> positive, as expected in the real world and have small coef of
> variation (and standard error). It seems to me that the point estimates
> of the odds ratio tend to be large when you have powerful independent
> variables which can nicely segment the binary response.

Is this a case of complete or quasi-complete separation? In simple
terms, the model is predicting 'too well'.
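
One rough way to look for this, one predictor at a time, is to compare the range of each predictor among the 0 responses with its range among the 1 responses. The Python sketch below does that; `separation_status` is a hypothetical helper name, and the check is only a screen: separation can also occur on a linear combination of predictors, which a single-variable check like this will not catch.

```python
def separation_status(x, y):
    """Classify one predictor x against a 0/1 response y.

    Returns "complete" if some threshold splits the two response groups with
    no overlap, "quasi-complete" if the groups only touch at a boundary value,
    and "none" otherwise. Hypothetical helper for illustration only.
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("response must contain both 0 and 1")
    # Strict gap between the groups, in either direction
    if max(x0) < min(x1) or max(x1) < min(x0):
        return "complete"
    # Groups share only the boundary value
    if max(x0) <= min(x1) or max(x1) <= min(x0):
        return "quasi-complete"
    return "none"


print(separation_status([1, 2, 3, 4], [0, 0, 1, 1]))
print(separation_status([1, 2, 2, 3], [0, 0, 1, 1]))
print(separation_status([1, 3, 2, 4], [0, 0, 1, 1]))
```

For a 0/1 dummy variable, "quasi-complete" here corresponds to an empty cell in the 2x2 table of predictor by response.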
Reply BruceBrad 2/11/2011 2:15:45 AM

