Dear Moderators,
This question relates to NLOGIT analysis. My DCE study explores user preferences for a mobile health app.
The design has 4 attributes: training (Tr), typing (Ty), monitoring (M) and health education (He), each with 3 levels.
Each participant answered 8 choice tasks, each with 3 alternatives: mobile app A, app B and neither.
When a participant chose neither, they were then forced to select either app A or app B, with the same attribute-level combinations as in the original choice task.
So I have two datasets: (1) a combined dataset with responses to both the conditional and unconditional choice tasks, and (2) an unconditional dataset.
My panel MMNL model for the combined dataset indicates that the coefficients for all attribute levels except TR2 and TY2 are statistically significant. Please see below.
|-> sample ;all $
|-> Nlogit
;lhs= choice,cset,alt
;choices= appA, appB, neither, appC, appD
;rpl
;fcn = tr2(n), tr3(n), ty2(n), ty3(n), m2(n), m3(n), he2(n), he3(n)
;pts=500 ;halton
;pds=Pan2
;model:
U(appA) = ASC_A + TR2*tr2 + TR3*tr3 + TY2*ty2 + TY3*ty3 + M2*m2 + M3*m3 + HE2*he2 + HE3*he3 /
U(appB) = ASC_B + TR2*tr2 + TR3*tr3 + TY2*ty2 + TY3*ty3 + M2*m2 + M3*m3 + HE2*he2 + HE3*he3 /
U(appC) = ASC_C + TR2*tr2 + TR3*tr3 + TY2*ty2 + TY3*ty3 + M2*m2 + M3*m3 + HE2*he2 + HE3*he3 /
U(appD) = TR2*tr2 + TR3*tr3 + TY2*ty2 + TY3*ty3 + M2*m2 + M3*m3 + HE2*he2 + HE3*he3
$
Normal exit: 31 iterations. Status=0, F= 2435.644
-----------------------------------------------------------------------------
Random Parameters Logit Model
Dependent variable CHOICE
Log likelihood function -2435.64402
Restricted log likelihood -4511.25447
Chi squared [ 19 d.f.] 4151.22089
Significance level .00000
McFadden Pseudo R-squared .4600961
Estimation based on N = 2803, K = 19
Inf.Cr.AIC = 4909.3 AIC/N = 1.751
Model estimated: Nov 19, 2023, 19:21:57
Constants only must be computed directly
Use NLOGIT ;...;RHS=ONE$
At start values -2650.0815 .0809******
Response data are given as ind. choices
Replications for simulated probs. = 500
Halton sequences used for simulations
RPL model with panel has 302 groups
Variable number of obs./group =PAN2
Number of obs.= 2803, skipped 0 obs
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
CHOICE| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Random parameters in utility functions
TR2| .00500 .09337 .05 .9573 -.17801 .18801
TR3| -.81923*** .13418 -6.11 .0000 -1.08223 -.55624
TY2| -.04009 .12429 -.32 .7470 -.28370 .20351
TY3| .40498*** .12948 3.13 .0018 .15121 .65876
M2| 1.10921*** .14963 7.41 .0000 .81594 1.40247
M3| 1.35348*** .18585 7.28 .0000 .98922 1.71775
HE2| .21667** .09269 2.34 .0194 .03499 .39834
HE3| .67798*** .11630 5.83 .0000 .45003 .90593
|Nonrandom parameters in utility functions
ASC_A| .18888 .18619 1.01 .3104 -.17606 .55381
ASC_B| -.08429 .18404 -.46 .6469 -.44502 .27643
ASC_C| .67803*** .15667 4.33 .0000 .37097 .98509
|Distns. of RPs. Std.Devs or limits of triangular
NsTR2| .74171*** .15390 4.82 .0000 .44009 1.04334
NsTR3| 1.47791*** .13593 10.87 .0000 1.21150 1.74433
NsTY2| .72191*** .12957 5.57 .0000 .46796 .97587
NsTY3| .99317*** .11326 8.77 .0000 .77119 1.21514
NsM2| 1.09848*** .10748 10.22 .0000 .88781 1.30914
NsM3| 1.83987*** .15884 11.58 .0000 1.52855 2.15120
NsHE2| .59425*** .17010 3.49 .0005 .26085 .92764
NsHE3| .90148*** .11224 8.03 .0000 .68150 1.12147
--------+--------------------------------------------------------------------
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
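To make the heterogeneity in the output above more concrete, here is a minimal Python sketch (not NLOGIT code) of what the estimated means and standard deviations imply. It assumes each random parameter is normally distributed, as requested by the (n) entries in ;fcn, so the share of respondents with a positive coefficient is Phi(mean / sd). The helper name share_positive is hypothetical; the numbers are copied from the table.

```python
import math


def share_positive(mean, sd):
    """Share of the population with beta > 0 when beta ~ Normal(mean, sd**2)."""
    return 0.5 * (1.0 + math.erf(mean / (sd * math.sqrt(2.0))))


# (mean, standard deviation) pairs copied from the RPL output above
estimates = {
    "TR2": (0.00500, 0.74171),
    "TR3": (-0.81923, 1.47791),
    "TY2": (-0.04009, 0.72191),
    "TY3": (0.40498, 0.99317),
    "M2": (1.10921, 1.09848),
    "M3": (1.35348, 1.83987),
    "HE2": (0.21667, 0.59425),
    "HE3": (0.67798, 0.90148),
}

for name, (mean, sd) in estimates.items():
    share = share_positive(mean, sd)
    print(f"{name}: share with a positive coefficient = {share:.3f}")
```

A share far from both 0 and 1 means that respondents are split on the sign of that attribute level, which is the kind of heterogeneity I expected the LCM to pick up.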
However, my two-class LCM for the combined dataset gives estimated class probabilities of PrbCls1 = 0.0 and PrbCls2 = 1.0. Please see below.
|-> sample ;all $
|-> Nlogit
;lhs=choice,cset,alt
;choices= appA, appB, neither, appC, appD
;lcm
;pts=2
;pds=pan2
;model:
U(appA) = ASC_A + TR2*tr2 + TR3*tr3 + TY2*ty2 + TY3*ty3 + M2*m2 + M3*m3 + HE2*he2 + HE3*he3 /
U(appB) = ASC_B + TR2*tr2 + TR3*tr3 + TY2*ty2 + TY3*ty3 + M2*m2 + M3*m3 + HE2*he2 + HE3*he3 /
U(appC) = ASC_C + TR2*tr2 + TR3*tr3 + TY2*ty2 + TY3*ty3 + M2*m2 + M3*m3 + HE2*he2 + HE3*he3 /
U(appD) = TR2*tr2 + TR3*tr3 + TY2*ty2 + TY3*ty3 + M2*m2 + M3*m3 + HE2*he2 + HE3*he3
$
Normal exit: 5 iterations. Status=0, F= 2650.082
Line search at iteration 58 does not improve fn. Exiting optimization.
-----------------------------------------------------------------------------
Latent Class Logit Model
Dependent variable CHOICE
Log likelihood function -2399.17591
Restricted log likelihood -4511.25447
Chi squared [ 23 d.f.] 4224.15712
Significance level .00000
McFadden Pseudo R-squared .4681799
Estimation based on N = 2803, K = 23
Inf.Cr.AIC = 4844.4 AIC/N = 1.728
Model estimated: Nov 24, 2023, 09:27:00
Constants only must be computed directly
Use NLOGIT ;...;RHS=ONE$
At start values -2650.0370 .0947******
Response data are given as ind. choices
Number of latent classes = 2
Average Class Probabilities
.472 .528
LCM model with panel has 302 groups
Variable number of obs./group =PAN2
Number of obs.= 2803, skipped 0 obs
--------+--------------------------------------------------------------------
| Standard Prob. 95% Confidence
CHOICE| Coefficient Error z |z|>Z* Interval
--------+--------------------------------------------------------------------
|Utility parameters in latent class -->> 1
ASC_A|1| .70351*** .24393 2.88 .0039 .22542 1.18160
TR2|1| .03834 .07570 .51 .6125 -.11003 .18671
TR3|1| -.45627*** .07965 -5.73 .0000 -.61239 -.30016
TY2|1| .47226*** .12344 3.83 .0001 .23033 .71420
TY3|1| .88861*** .13413 6.62 .0000 .62571 1.15150
M2|1| 1.52117*** .15935 9.55 .0000 1.20885 1.83348
M3|1| 2.00652*** .18045 11.12 .0000 1.65283 2.36020
HE2|1| .33145*** .07495 4.42 .0000 .18456 .47835
HE3|1| .65084*** .08541 7.62 .0000 .48345 .81824
ASC_B|1| .49572** .24586 2.02 .0438 .01384 .97759
ASC_C|1| .08365 .28282 .30 .7674 -.47066 .63796
|Utility parameters in latent class -->> 2
ASC_A|2| -.16553 .25222 -.66 .5116 -.65987 .32880
TR2|2| -.11543 .11546 -1.00 .3174 -.34172 .11086
TR3|2| -.68659*** .12960 -5.30 .0000 -.94061 -.43257
TY2|2| -.24102 .14812 -1.63 .1037 -.53133 .04928
TY3|2| -.16998 .15433 -1.10 .2707 -.47246 .13249
M2|2| -.05793 .16353 -.35 .7232 -.37844 .26259
M3|2| -.28550 .18659 -1.53 .1260 -.65121 .08021
HE2|2| -.06837 .11835 -.58 .5635 -.30033 .16359
HE3|2| -.15197 .12901 -1.18 .2388 -.40483 .10090
ASC_B|2| -.19388 .25305 -.77 .4436 -.68985 .30208
ASC_C|2| .38469*** .13191 2.92 .0035 .12615 .64322
|Estimated latent class probabilities
PrbCls1| 0.0 .4136D-08 .00 1.0000 -.81071D-08 .81071D-08
PrbCls2| 1.00000*** .4136D-08 ******** .0000 1.00000 1.00000
--------+--------------------------------------------------------------------
Note: nnnnn.D-xx or D+xx => multiply by 10 to -xx or +xx.
Note: ***, **, * ==> Significance at 1%, 5%, 10% level.
-----------------------------------------------------------------------------
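For comparing the two models, the fit statistics in both outputs can be reproduced from the reported log likelihoods. The Python sketch below uses the standard formulas (chi-squared = 2(LL - LL0), McFadden pseudo R-squared = 1 - LL/LL0, AIC = -2LL + 2K); the function name fit_stats is hypothetical and the inputs are copied from the two outputs.

```python
def fit_stats(ll, ll_restricted, k, n):
    """Return the summary fit statistics implied by a log likelihood."""
    aic = -2.0 * ll + 2.0 * k
    return {
        "chi2": 2.0 * (ll - ll_restricted),      # likelihood-ratio statistic
        "pseudo_r2": 1.0 - ll / ll_restricted,   # McFadden pseudo R-squared
        "aic": aic,
        "aic_per_n": aic / n,
    }


# Values copied from the RPL (MMNL) and LCM outputs above
rpl = fit_stats(ll=-2435.64402, ll_restricted=-4511.25447, k=19, n=2803)
lcm = fit_stats(ll=-2399.17591, ll_restricted=-4511.25447, k=23, n=2803)

for label, stats in (("RPL", rpl), ("LCM", lcm)):
    print(label, {key: round(value, 4) for key, value in stats.items()})
```

The results agree with the reported values, so on AIC the LCM (4844.4) fits slightly better than the MMNL (4909.3), even though its estimated class probabilities are degenerate.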
When I ran the model with ;pts=3, the class probabilities stayed degenerate, with a probability of 1.0 for one class and 0.0 for the others, as shown below.
|Estimated latent class probabilities
PrbCls1| 0.0 .1464D-07 .00 1.0000 -.28690D-07 .28690D-07
PrbCls2| 1.00000*** .2872D-05 ******** .0000 .99999 1.00001
PrbCls3| 0.0 .2872D-05 .00 1.0000 -.56295D-05 .56298D-05
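To show what a 0.0/1.0 split would mean numerically, the Python sketch below assumes that the class shares follow a multinomial logit over class-specific constants, with the last class as the reference (constant fixed at 0). This is a common latent class parameterization, but I have not confirmed that it is exactly how NLOGIT computes PrbCls. The function name class_probabilities and the constants passed to it are hypothetical, chosen only for illustration.

```python
import math


def class_probabilities(thetas):
    """Multinomial-logit class shares; the last class is the reference (theta = 0)."""
    values = list(thetas) + [0.0]
    largest = max(values)
    # Subtract the maximum before exponentiating for numerical stability
    weights = [math.exp(v - largest) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


# A small constant gives an interior split close to the reported
# "Average Class Probabilities" of .472 and .528
print(class_probabilities([-0.112]))

# A very negative constant pushes the first class share to essentially zero
print(class_probabilities([-25.0]))

# Three classes with two constants drifting to large negative values
print(class_probabilities([-25.0, -20.0]))
```

Under this assumption, shares of exactly 0.0 and 1.0 would correspond to class constants that have drifted to very large magnitudes, which contrasts with the interior "Average Class Probabilities" of .472 and .528 printed in the same two-class output.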
My questions are:
1. Why does the LCM indicate that the probability of participants belonging to Class 2 is 100%, when the MMNL model indicates significant preference heterogeneity for the attribute levels among users?
Is there a fault in my LCM code, or could this result be plausible?
2. If certain demographic variables such as age and sex are statistically non-significant in all classes, do you recommend removing those variables and re-running the model, or keeping all socio-demographic variables in the model irrespective of their significance?
Thank you so much for your time.
Kind regards,
Sumudu