Omnibus Tests in Logistic Regression





In statistics, logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (a dependent variable that can take on a limited number of categories) or a dichotomous dependent variable, based on one or more predictor variables. The probabilities describing the possible outcomes of a single trial are modeled, as a function of the explanatory (independent) variables, using a logistic function or a multinomial distribution. Logistic regression measures the relationship between a categorical or dichotomous dependent variable and a continuous independent variable (or several), by converting the dependent variable to probability scores. The probabilities can be retrieved using the logistic function or the multinomial distribution, and, as probabilities in probability theory, they take on values between zero and one:






$$p(y_i)=\frac{e^{\beta_0+\beta_1 x_{i1}+\cdots+\beta_k x_{ik}}}{1+e^{\beta_0+\beta_1 x_{i1}+\cdots+\beta_k x_{ik}}}=\frac{1}{1+e^{-(\beta_0+\beta_1 x_{i1}+\cdots+\beta_k x_{ik})}}$$


so that the model tested can be defined by:






$$f(y_i)=\ln\frac{p(y_i)}{1-p(y_i)}=\beta_0+\beta_1 x_{i1}+\cdots+\beta_k x_{ik}$$


where yi is the category of the dependent variable for the i-th observation, xij is the j-th independent variable (j = 1, 2, ..., k) for that observation, and βj is the j-th coefficient of xij, indicating its influence on the expected fitted model.


Note: the independent variables in logistic regression can also be continuous.
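To make the two formulas above concrete, here is a minimal numerical sketch in Python with NumPy; the coefficient and predictor values are arbitrary illustrations, not taken from any dataset. It evaluates the logistic function and checks that the logit (log-odds) transform recovers the linear predictor.

import numpy as np

def logistic_probability(x, beta):
    """p(y_i) via the logistic function of the linear predictor."""
    eta = beta[0] + np.dot(beta[1:], x)  # beta_0 + beta_1*x_i1 + ... + beta_k*x_ik
    return 1.0 / (1.0 + np.exp(-eta))

def logit(p):
    """f(y_i) = ln(p / (1 - p)), the log-odds; inverse of the logistic function."""
    return np.log(p / (1.0 - p))

# Illustrative values only: two predictors with arbitrary coefficients.
beta = np.array([-1.5, 0.8, 0.3])
x = np.array([2.0, 1.0])
p = logistic_probability(x, beta)  # always lies strictly between 0 and 1
print(p, logit(p))                 # logit(p) recovers beta_0 + beta_1*x_1 + beta_2*x_2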


The omnibus test relates to the hypotheses

H0: β1 = β2 = ... = βk

H1: at least one pair βj ≠ βj′


Model fitting: maximum likelihood method

The omnibus test, among the other parts of the logistic regression procedure, is a likelihood-ratio test based on the maximum likelihood method. Unlike the linear regression procedure, in which estimates of the regression coefficients can be derived from the least squares procedure (or by minimizing the sum of squared residuals), in logistic regression there is no such analytical solution or set of equations from which one can derive a solution to estimate the regression coefficients. Logistic regression therefore uses the maximum likelihood procedure to estimate the coefficients that maximize the likelihood of the regression coefficients given the predictors and criterion.[6] The maximum likelihood solution is an iterative process that begins with a tentative solution, revises it slightly to see if it can be improved, and repeats this process until the improvement is minute, at which point the model is said to have converged.[6] Applying the procedure is conditioned on convergence (see the following remarks and other considerations).
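As a hedged illustration of this iterative process, the following is a minimal Newton-Raphson sketch in Python with NumPy; it is one common way such fitting is implemented, not necessarily the exact algorithm of any particular package. It starts from a tentative solution (all zeros), revises it, and stops once the improvement is minute.

import numpy as np

def fit_logistic(X, y, tol=1e-8, max_iter=25):
    """Maximum likelihood for logistic regression via Newton-Raphson.

    X: (n, k+1) design matrix including an intercept column; y: (n,) 0/1 outcomes.
    """
    beta = np.zeros(X.shape[1])                    # tentative starting solution
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))        # fitted probabilities
        score = X.T @ (y - p)                      # gradient of the log-likelihood
        info = X.T @ (X * (p * (1 - p))[:, None])  # observed information matrix
        step = np.linalg.solve(info, score)
        beta += step
        if np.max(np.abs(step)) < tol:             # improvement is minute: converged
            return beta
    raise RuntimeError("Newton-Raphson did not converge")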


In general, regarding simple hypotheses on a parameter θ (for example, H0: θ = θ0 vs. H1: θ = θ1), the likelihood-ratio test statistic can be written as:



$$\lambda(y_i)=\frac{L(y_i\mid\theta_0)}{L(y_i\mid\theta_1)}$$


where L(yi | θ) is the likelihood function, which refers to the specific θ.


The numerator corresponds to the maximum likelihood of the observed outcome under the null hypothesis. The denominator corresponds to the maximum likelihood of the observed outcome with the parameters varying over the whole parameter space. The numerator of this ratio is therefore no greater than the denominator, so the likelihood ratio lies between 0 and 1.


Lower values of the likelihood ratio mean that the observed result is much less likely to occur under the null hypothesis than under the alternative. Higher values of the statistic mean that the observed outcome is nearly as likely, or more likely, to occur under the null hypothesis as under the alternative, and the null hypothesis cannot be rejected.


The likelihood-ratio test provides the following decision rule:


If λ(yi) > c, do not reject H0;

otherwise, if λ(yi) < c, reject H0;

and reject H0 with probability q if λ(yi) = c,


where the critical values c and q are chosen to obtain a specified significance level α, through the relation:



$$q \cdot P\left(\lambda(y_i)=c \mid H_0\right) + P\left(\lambda(y_i)<c \mid H_0\right) = \alpha.$$


Thus, the likelihood-ratio test rejects the null hypothesis if the value of this statistic is too small. How small is too small depends on the significance level of the test, i.e., on what probability of Type I error is considered tolerable. The Neyman-Pearson lemma states that this likelihood-ratio test is the most powerful among all level-α tests for this problem.


Test statistic and its distribution: Wilks' theorem

First, define the test statistic as the deviance

$$D = -2\ln\lambda(y_i)$$

which indicates testing the ratio:






$$D=-2\ln\lambda(y_i)=-2\ln\frac{\text{likelihood under the fitted model if the null hypothesis is true}}{\text{likelihood under the saturated model}}$$


The saturated model is a model with a theoretically perfect fit. Given that deviance is a measure of the difference between a given model and the saturated model, smaller values indicate better fit, as the fitted model deviates less from the saturated model. When assessed upon a chi-square distribution, non-significant chi-square values indicate very little unexplained variance and, thus, good model fit. Conversely, a significant chi-square value indicates that a significant amount of the variance is unexplained.

Two measures of deviance D are particularly important in logistic regression: the null deviance and the model deviance. The null deviance represents the difference between a model with only the intercept and no predictors and the saturated model. The model deviance represents the difference between a model with at least one predictor and the saturated model.[3] In this respect, the null model provides a baseline upon which to compare predictor models. Therefore, to assess the contribution of a predictor or set of predictors, one can subtract the model deviance from the null deviance and assess the difference on a chi-square distribution, with degrees of freedom equal to the number of predictors tested (one degree of freedom for a single predictor). If the model deviance is significantly smaller than the null deviance, one can conclude that the predictor or set of predictors significantly improved model fit. This is analogous to the F-test used in linear regression analysis to assess the significance of prediction.

In most cases, the exact distribution of the likelihood ratio corresponding to specific hypotheses is very difficult to determine. A convenient result, attributed to Samuel S. Wilks, says that as the sample size n approaches infinity, the test statistic has asymptotically a chi-square distribution with degrees of freedom equal to the difference in dimensionality of the two models, i.e., the number of β coefficients tested by the omnibus test. For example, if n is large enough and the fitted model assuming the null hypothesis consists of 3 predictors while the saturated (full) model consists of 5 predictors, the Wilks statistic is approximately chi-square distributed with 2 degrees of freedom. This means that we can retrieve the critical value c from a chi-squared distribution with 2 degrees of freedom under the specified significance level.
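A minimal sketch of this computation, assuming Python with statsmodels and SciPy and using fabricated illustrative data (the coefficients and sample size below are arbitrary, not from any study): it fits a full model and an intercept-only model, forms the deviance difference D, and assesses it against the chi-square distribution as Wilks' theorem prescribes.

import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(0)                   # fabricated illustrative data
X = rng.normal(size=(200, 3))
true_eta = 0.5 + X @ np.array([1.0, 0.0, -0.7])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_eta)))

full = sm.Logit(y, sm.add_constant(X)).fit(disp=0)    # model with 3 predictors
null = sm.Logit(y, np.ones((len(y), 1))).fit(disp=0)  # intercept-only model

D = -2 * (null.llf - full.llf)   # null deviance minus model deviance
df = int(full.df_model)          # difference in dimensionality: 3 here
print(D, chi2.sf(D, df))         # omnibus p-value under Wilks' theorem
print(chi2.ppf(0.95, df))        # critical value c at significance level 0.05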


Remarks and other considerations

Example 1 of logistic regression

Spector and Mazzeo examined the effect of a teaching method known as PSI on the performance of students in a course, intermediate macroeconomics. The question was whether students exposed to the method scored higher on exams in the class. They collected data from students in two classes, one in which PSI was used and another in which a traditional teaching method was employed. For each of 32 students, they gathered data on:


Independent variables

• GPA: grade point average before taking the class.
• TUCE: the score on an exam given at the beginning of the term to test entering knowledge of the material.
• PSI: a dummy variable indicating the teaching method used (1 = PSI used, 0 = other method).


Dependent variable

• GRADE: coded 1 if the final grade was an A, 0 if the final grade was a B or C.


The particular interest of the research was whether PSI had a significant effect on GRADE. TUCE and GPA were included as control variables.


A statistical analysis using logistic regression of GRADE on GPA, TUCE, and PSI was conducted in SPSS, using stepwise logistic regression.


In the output, the "block" line relates to the chi-square test on the set of independent variables that are tested and included in the model fitting. The "step" line relates to the chi-square test on the step level, as variables are included in the model step by step. Note that in this output, the step chi-square is the same as the block chi-square, since both are testing the same hypothesis that the variables entered on this step have non-zero coefficients. If you were doing stepwise regression, however, the results would be different. Using forward stepwise selection, the researchers divided the variables into two blocks (see METHOD on the syntax below).


LOGISTIC REGRESSION VAR=GRADE
/METHOD=FSTEP PSI / FSTEP GPA TUCE
/CRITERIA PIN(.50) POUT(.10) ITERATE(20) CUT(.5).


The default PIN value is .05; it was changed by the researchers to .5 so that the insignificant TUCE would make it in. In the first block, PSI alone gets entered, so the block and step chi-square tests relate to the hypothesis H0: βPSI = 0. The results of the omnibus chi-square tests imply that PSI is significant for predicting that GRADE is more likely to be a final grade of A.


Block 1: Method = Forward Stepwise (Conditional)
Omnibus tests of model coefficients

Then, in the next block, the forward selection procedure causes GPA to be entered first, then TUCE (see the METHOD command on the syntax above).


Block 2: Method = Forward Stepwise (Conditional)
Omnibus tests of model coefficients

The first step on block 2 indicates that GPA is significant (p-value = 0.003 < 0.05, α = 0.05).


So, looking at the final entries on step 2 in block 2:



• The step chi-square, .474, tells you whether the effect of the variable that was entered in the final step, TUCE, differs from zero. It is the equivalent of an incremental F test of the parameter, i.e., it tests H0: βTUCE = 0.
• The block chi-square, 9.562, tests whether either or both of the variables included in this block (GPA and TUCE) have effects that differ from zero. It is the equivalent of an incremental F test, i.e., it tests H0: βGPA = βTUCE = 0.
• The model chi-square, 15.404, tells you whether any of the three independent variables has significant effects. It is the equivalent of a global F test, i.e., it tests H0: βGPA = βTUCE = βPSI = 0.

All three statistics can be reproduced as differences in -2 log-likelihood between nested models, as shown in the sketch below.
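The sketch assumes Python with statsmodels, whose bundled spector dataset is this same Spector and Mazzeo data (columns GPA, TUCE, PSI, GRADE); the printed values should approximately match the .474, 9.562, and 15.404 reported above.

import statsmodels.api as sm
from scipy.stats import chi2

spector = sm.datasets.spector.load_pandas().data  # GPA, TUCE, PSI, GRADE
y = spector["GRADE"]

def fit(cols):
    """Logit of GRADE on an intercept plus the listed predictors."""
    return sm.Logit(y, sm.add_constant(spector[cols])).fit(disp=0)

m1 = fit(["PSI"])                  # end of block 1
m2 = fit(["PSI", "GPA"])           # block 2, step 1
m3 = fit(["PSI", "GPA", "TUCE"])   # block 2, step 2 (all three predictors)

step = -2 * (m2.llf - m3.llf)      # step chi-square: contribution of TUCE
block = -2 * (m1.llf - m3.llf)     # block chi-square: GPA and TUCE together
model = m3.llr                     # model chi-square vs. the intercept-only model
print(round(step, 3), chi2.sf(step, 1))
print(round(block, 3), chi2.sf(block, 2))
print(round(model, 3), m3.llr_pvalue)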

Tests of individual parameters are shown on the "variables in the equation" table, where the Wald test (W = (b/sb)², where b is the β estimate and sb is its standard error estimate) tests whether an individual parameter equals zero. You can, if you want, do an incremental LR chi-square test instead. That, in fact, is the best way to do it, since the Wald test referred to here is biased under certain situations. When parameters are tested separately, controlling for the other parameters, we see that the effects of GPA and PSI are statistically significant, but the effect of TUCE is not. Both have Exp(β) greater than 1, implying that the probability of getting an A grade is greater than that of getting another grade, depending upon the teaching method PSI and the former grade point average GPA.
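Continuing from the sketch above (again assuming statsmodels, with m3 the full fit from the previous block), the Wald statistic for each coefficient and the Exp(B) column can be read off the same fitted model.

import numpy as np

z = m3.params / m3.bse     # b / s_b for each coefficient
wald = z ** 2              # W = (b / s_b)^2, chi-square with 1 df
exp_b = np.exp(m3.params)  # Exp(B): odds ratio per unit change in the predictor
print(m3.summary())        # statsmodels reports the z values and p-values directly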


Variables in the equation

a. Variable(s) entered on step 1: PSI.


Example 2 of logistic regression

Research subject: "The effects of employment, education, rehabilitation and seriousness of offense on re-arrest".[8] A social worker in a criminal justice probation agency tends to examine whether some of these factors lead to re-arrest of those managed by the agency over the past five years who were convicted and then released. The data consist of 1,000 clients with the following variables:


Dependent variable (coded as a dummy variable)

• Re-arrested vs. not re-arrested (0 = not re-arrested; 1 = re-arrested) - categorical, nominal


Independent variables (coded as dummy variables)

• Whether or not the client was adjudicated for a second criminal offense (1 = adjudicated, 0 = not).
• Seriousness of the first offense (1 = felony vs. 0 = misdemeanor) - categorical, nominal
• High school graduate vs. not (0 = not graduated; 1 = graduated) - categorical, nominal
• Whether or not the client completed a rehabilitation program after the first offense (0 = no rehab completed; 1 = rehab completed) - categorical, nominal
• Employment status after the first offense (0 = not employed; 1 = employed)

Note: continuous independent variables were not measured in this scenario.


The null hypothesis for the overall model fit: the overall model does not predict re-arrest. That is, the independent variables as a group are not related to being re-arrested. (And for the individual independent variables: a given separate independent variable is not related to the likelihood of re-arrest.)


The alternative hypothesis for the overall model fit: the overall model predicts the likelihood of re-arrest. (The meaning with respect to the separate independent variables: having committed a felony (vs. a misdemeanor), not completing high school, not completing a rehab program, and being unemployed are related to the likelihood of being re-arrested.)


Logistic regression was applied to the data in SPSS, since the dependent variable is categorical (dichotomous) and the researcher wants to examine the odds ratio of potentially being re-arrested vs. not being re-arrested.


Omnibus tests of model coefficients

The table above shows the omnibus test of the model coefficients based on a chi-square test, which implies that the overall model is predictive of re-arrest (we are concerned with row three, "Model"): χ²(4 degrees of freedom) = 41.15, p < .001, so the null can be rejected. We are testing the null that the model, i.e., the group of independent variables taken together, does not predict the likelihood of being re-arrested. This result means that the model for expecting re-arrest is more suitable to the data.


Variables in the equation

As shown on the "variables in the equation" table below, we can also reject the null that the B coefficients for having committed a felony, completing a rehab program, and being employed are equal to zero: they are statistically significant and predictive of re-arrest. Education level, however, was not found to be predictive of re-arrest. Controlling for the other variables, having committed a felony for the first offense increases the odds of being re-arrested by 33% (p = .046), compared with having committed a misdemeanor. Completing a rehab program and being employed after the first offense each decrease the odds of re-arrest by more than 50% (p < .001).

The last column, Exp(B) (obtained by taking the B value and calculating the inverse natural log of B, i.e., exponentiating it), indicates the odds ratio: the probability of an event occurring, divided by the probability of the event not occurring. An Exp(B) value over 1.0 signifies that the independent variable increases the odds of the dependent variable occurring. An Exp(B) under 1.0 signifies that the independent variable decreases the odds of the dependent variable occurring, depending on the coding described in the variable details above. A negative B coefficient will result in an Exp(B) less than 1.0, and a positive B coefficient will result in an Exp(B) greater than 1.0. The statistical significance of each B is tested by the Wald chi-square, testing the null that the B coefficient = 0 (the alternative hypothesis being that it is not 0). P-values lower than alpha are significant, leading to rejection of the null. Here, the independent variables felony, rehab, and employment are significant (p-value < 0.05).

Examining the odds ratio of being re-arrested vs. not re-arrested means examining the odds ratio for the comparison of two groups (re-arrested = 1 in the numerator, and re-arrested = 0 in the denominator) for the felony group, compared with the baseline misdemeanor group. Exp(B) = 1.327 for "felony" indicates that having committed a felony vs. a misdemeanor increases the odds of re-arrest by 33%. For "rehab", having completed the rehab program reduces the likelihood (or odds) of being re-arrested by 51%.
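The original 1,000-client dataset is not reproduced here, so the sketch below fabricates dummy-coded predictors, with hypothetical coefficients chosen only to echo the reported directions (for instance exp(0.283) ≈ 1.33, a 33% increase in odds). It shows how the Exp(B) column is obtained by exponentiating the fitted B coefficients, assuming Python with statsmodels.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1000
felony = rng.binomial(1, 0.4, n)      # 1 = felony, 0 = misdemeanor
highschool = rng.binomial(1, 0.6, n)  # 1 = graduated
rehab = rng.binomial(1, 0.5, n)       # 1 = completed rehab
employed = rng.binomial(1, 0.5, n)    # 1 = employed

# Hypothetical true effects, chosen only to mirror the reported directions.
eta = -0.3 + 0.283 * felony - 0.05 * highschool - 0.71 * rehab - 0.8 * employed
rearrest = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

X = sm.add_constant(np.column_stack([felony, highschool, rehab, employed]))
res = sm.Logit(rearrest, X).fit(disp=0)
print(np.exp(res.params))  # Exp(B); a value near 1.33 would mean +33% odds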







