
CLM of DCE pilot data: results = 'NA/0.00000', issue is unkn

Posted: Wed Sep 20, 2023 12:55 am
by neridaf
Hello all,

When estimating the conditional logit model for the pilot phase of the discrete choice experiment I am conducting, the results are returning NA values, and I am trying to discern whether the issue lies with my analysis, my data or the design of my DCE (or something else). I designed the DCE using Ngene software, delivered the questionnaire via Qualtrics and used R to conduct the analysis.

I cannot find an example of so many NA/0.00 values in the literature, and the explanations posted in various online forums have suggested issues with sample size or with the creation of the dummy coding for the attribute levels. My pilot sample size is 67, which is more than 20% of the target sample of 300 observations suggested for DCEs.

Has anyone encountered this before, and if so were you able to identify the cause of the anomaly?

Some information:

Design:
Unlabelled sequential orthogonal factorial design with four blocks, comprised of five attributes with four levels and one attribute with two levels.

The syntax used in Ngene:
Design
;alts = alt1, alt2
;rows = 36
;orth = seq
;block = 4
;model:
U(alt1) = b1 + b2 * A[0,1,2,3] + b3 * B[4,5,6,7] + b4 * C[8,9,10,11] + b5 * D[12,13,14,15] + b6 * E[16,17,18,19] + b7 * F[20,21] /
U(alt2) = b2 * A + b3 * B + b4 * C + b5 * D + b6 * E + b7 * F $

Please note: The levels were numbered 0-21 in the design; however, during data transformation the levels were renumbered 1-4 within each attribute (1-2 for the two-level attribute), and this is the numbering used in the analysis and results.
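
For readers reproducing this step, the renumbering described above might be sketched as follows. This assumes each attribute's design codes are consecutive, as in the Ngene syntax; `design_code_to_level` and `first_code` are hypothetical names for illustration, not the code used for the pilot data.

```r
# Hypothetical helper: map Ngene design codes (0-21) to within-attribute
# levels, given the first design code of that attribute.
design_code_to_level <- function(code, first_code) {
  code - first_code + 1L
}

# First design code of each attribute, read off the Ngene syntax.
first_code <- c(A = 0L, B = 4L, C = 8L, D = 12L, E = 16L, F = 20L)

# Attribute B (codes 4-7) becomes levels 1-4.
levels_b <- design_code_to_level(c(4L, 5L, 6L, 7L), first_code[["B"]])

# Attribute F (codes 20-21) becomes levels 1-2.
levels_f <- design_code_to_level(c(20L, 21L), first_code[["F"]])
```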

R code for CLM analysis DCE data
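library(survival)  # provides clogit() and strata()

## Run CLM analysis ----
dt <- dataset
# Every level dummy of every attribute is entered, as in the run that
# produced the results below.
results.CLM <- clogit(
  choice2 ~ A1.L1 + A1.L2 + A1.L3 + A1.L4 +
    A2.L1 + A2.L2 + A2.L3 + A2.L4 +
    A3.L1 + A3.L2 + A3.L3 + A3.L4 +
    A4.L1 + A4.L2 + A4.L3 + A4.L4 +
    A5.L1 + A5.L2 + A5.L3 + A5.L4 +
    A6.L1 + A6.L2 + strata(cs),
  method = "efron", data = dt
)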

Results:
n= 1206, number of events= 597

          coef exp(coef) se(coef)      z Pr(>|z|)
A1.L1 -2.47472   0.08419  0.39695 -6.234 4.54e-10 ***
A1.L2 -1.63359   0.19523  0.52368 -3.119  0.00181 **
A1.L3 -1.04222   0.35267  0.35822 -2.909  0.00362 **
A1.L4       NA        NA  0.00000     NA       NA
A2.L1  0.89475   2.44672  0.38371  2.332  0.01971 *
A2.L2 -0.11876   0.88802  0.60879 -0.195  0.84534
A2.L3  0.61404   1.84787  0.43516  1.411  0.15823
A2.L4       NA        NA  0.00000     NA       NA
A3.L1  1.64544   5.18330  0.36104  4.558 5.18e-06 ***
A3.L2 -0.46289   0.62946  0.41341 -1.120  0.26285
A3.L3  0.20944   1.23299  0.41485  0.505  0.61365
A3.L4       NA        NA  0.00000     NA       NA
A4.L1 -0.08974   0.91417  0.45581 -0.197  0.84392
A4.L2  0.42250   1.52577  0.30532  1.384  0.16642
A4.L3       NA        NA  0.00000     NA       NA
A4.L4       NA        NA  0.00000     NA       NA
A5.L1       NA        NA  0.00000     NA       NA
A5.L2       NA        NA  0.00000     NA       NA
A5.L3       NA        NA  0.00000     NA       NA
A5.L4       NA        NA  0.00000     NA       NA
A6.L1       NA        NA  0.00000     NA       NA
A6.L2       NA        NA  0.00000     NA       NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

      exp(coef) exp(-coef) lower .95 upper .95
A1.L1   0.08419    11.8784   0.03867    0.1833
A1.L2   0.19523     5.1222   0.06995    0.5449
A1.L3   0.35267     2.8355   0.17477    0.7117
A1.L4        NA         NA        NA        NA
A2.L1   2.44672     0.4087   1.15337    5.1904
A2.L2   0.88802     1.1261   0.26929    2.9284
A2.L3   1.84787     0.5412   0.78752    4.3359
A2.L4        NA         NA        NA        NA
A3.L1   5.18330     0.1929   2.55442   10.5177
A3.L2   0.62946     1.5887   0.27995    1.4153
A3.L3   1.23299     0.8110   0.54681    2.7802
A3.L4        NA         NA        NA        NA
A4.L1   0.91417     1.0939   0.37415    2.2336
A4.L2   1.52577     0.6554   0.83869    2.7757
A4.L3        NA         NA        NA        NA
A4.L4        NA         NA        NA        NA
A5.L1        NA         NA        NA        NA
A5.L2        NA         NA        NA        NA
A5.L3        NA         NA        NA        NA
A5.L4        NA         NA        NA        NA
A6.L1        NA         NA        NA        NA
A6.L2        NA         NA        NA        NA

Concordance= 0.73 (se = 0.026 )
Likelihood ratio test= 180.6 on 11 df, p=<2e-16
Wald test = 115.3 on 11 df, p=<2e-16
Score (logrank) test = 158.3 on 11 df, p=<2e-16
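
The counts in this output can be cross-checked by hand. Assuming each of the 67 respondents answered one block of 9 choice sets (36 rows in 4 blocks) with 2 alternatives each, 67 x 9 x 2 = 1206 rows, i.e. 603 choice sets. With 597 events, 6 choice sets would then have no recorded choice. A minimal sketch of how such sets might be located, using toy data in the same long format (the `toy` data frame is invented for illustration):

```r
# Toy data in the thread's long format: two rows per choice set (cs),
# choice2 = 1 for the chosen alternative. Choice set 3 has no recorded choice.
toy <- data.frame(
  cs      = c(1, 1, 2, 2, 3, 3),
  choice2 = c(0, 1, 1, 0, 0, 0)
)

# Number of chosen rows per choice set; every entry should equal 1.
chosen_per_set <- tapply(toy$choice2, toy$cs, sum)

# Choice sets with no (or more than one) chosen alternative.
incomplete_sets <- names(chosen_per_set)[chosen_per_set != 1]
incomplete_sets
```

Running the same two lines on `dt$choice2` and `dt$cs` would list the affected choice sets in the real data, if that reading of the counts is right.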

Example dataset
cs PartID qname alt PartChoice Choice choice2 A1 A2 A3 A4 A5 A6 A1.L1 A1.L2 A1.L3 A1.L4 A2.L1 A2.L2 A2.L3 A2.L4 A3.L1 A3.L2 A3.L3 A3.L4 A4.L1 A4.L2 A4.L3 A4.L4 A5.L1 A5.L2 A5.L3 A5.L4 A6.L1 A6.L2
1 1 Q8.1_1 1 2 FALSE 0 1 2 2 3 4 1 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 1 1 0
1 1 Q8.1_1 2 2 TRUE 1 1 2 3 2 2 1 1 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1 0
2 3 Q8.1_1 1 1 TRUE 1 1 2 2 3 4 1 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 1 1 0
2 3 Q8.1_1 2 1 FALSE 0 1 2 3 2 2 1 1 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1 0
3 4 Q7.1_1 1 2 FALSE 0 1 3 4 1 1 2 1 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 1 0 0 0 0 1
3 4 Q7.1_1 2 2 TRUE 1 2 1 2 4 2 2 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 1
4 7 Q6.1_1 1 2 FALSE 0 4 2 4 2 4 2 0 0 0 1 0 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 1
4 7 Q6.1_1 2 2 TRUE 1 2 3 3 4 4 1 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 1 0
5 9 Q7.1_1 1 1 TRUE 1 1 3 4 1 1 2 1 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 1 0 0 0 0 1
5 9 Q7.1_1 2 1 FALSE 0 2 1 2 4 2 2 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 1
6 10 Q6.1_1 1 2 FALSE 0 4 2 4 2 4 2 0 0 0 1 0 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 1
6 10 Q6.1_1 2 2 TRUE 1 2 3 3 4 4 1 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 1 0
7 11 Q8.1_1 1 2 FALSE 0 1 2 2 3 4 1 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 1 1 0
7 11 Q8.1_1 2 2 TRUE 1 1 2 3 2 2 1 1 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1 0
8 12 Q9.1_1 1 1 TRUE 1 4 1 1 1 1 1 0 0 0 1 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0
8 12 Q9.1_1 2 1 FALSE 0 3 3 2 2 3 1 0 0 1 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 1 0
9 13 Q7.1_1 1 1 TRUE 1 1 3 4 1 1 2 1 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 1 0 0 0 0 1
9 13 Q7.1_1 2 1 FALSE 0 2 1 2 4 2 2 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 1
10 14 Q7.1_1 1 2 FALSE 0 1 3 4 1 1 2 1 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 1 0 0 0 0 1
10 14 Q7.1_1 2 2 TRUE 1 2 1 2 4 2 2 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 1

Data columns - key
cs unique number for each individual choice set answered by a participant
PartID participant identifier
qname question name as listed by qualtrics in exported data, indicates block and question
alt indicates the option in the choice set (1 = option A; 2 = option B)
PartChoice the option selected by the participant (1 = option A; 2 = option B)
Choice indicates whether the participant selected the option described in the row (TRUE = participant selected that option; FALSE = participant selected alternative option)
choice2 same as 'Choice', represented numerically (1 = participant selected that option; 0 = participant selected alternative option)
A1 indicates the level of attribute 'A1' for that option of the choice set (1-4)
A2 indicates the level of attribute 'A2' for that option of the choice set (1-4)
A3 indicates the level of attribute 'A3' for that option of the choice set (1-4)
A4 indicates the level of attribute 'A4' for that option of the choice set (1-4)
A5 indicates the level of attribute 'A5' for that option of the choice set (1-4)
A6 indicates the level of attribute 'A6' for that option of the choice set (1-2)
A1.L1 dummy coding for levels of attribute 'A1'
A1.L2 dummy coding for levels of attribute 'A1'
A1.L3 dummy coding for levels of attribute 'A1'
A1.L4 dummy coding for levels of attribute 'A1'
A2.L1 dummy coding for levels of attribute 'A2'
A2.L2 dummy coding for levels of attribute 'A2'
A2.L3 dummy coding for levels of attribute 'A2'
A2.L4 dummy coding for levels of attribute 'A2'
A3.L1 dummy coding for levels of attribute 'A3'
A3.L2 dummy coding for levels of attribute 'A3'
A3.L3 dummy coding for levels of attribute 'A3'
A3.L4 dummy coding for levels of attribute 'A3'
A4.L1 dummy coding for levels of attribute 'A4'
A4.L2 dummy coding for levels of attribute 'A4'
A4.L3 dummy coding for levels of attribute 'A4'
A4.L4 dummy coding for levels of attribute 'A4'
A5.L1 dummy coding for levels of attribute 'A5'
A5.L2 dummy coding for levels of attribute 'A5'
A5.L3 dummy coding for levels of attribute 'A5'
A5.L4 dummy coding for levels of attribute 'A5'
A6.L1 dummy coding for levels of attribute 'A6'
A6.L2 dummy coding for levels of attribute 'A6'


Please let me know if you would prefer to see the above information in a different format, and I will endeavour to provide it.

I will be extremely grateful for any guidance or suggestions that anyone can provide.

Kind Regards,

Nerida

Re: CLM of DCE pilot data: results = 'NA/0.00000', issue is

Posted: Wed Sep 20, 2023 3:27 pm
by Michiel Bliemer
It looks like an over-specification issue due to incorrect data coding.

You appear to be including every attribute level as a variable in the utility function, but you cannot estimate all of their coefficients. Did you apply dummy coding? For an attribute with 4 levels, you should estimate only 3 coefficients, while the coefficient of the remaining level, the base level, is normalised to zero.
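
The over-specification can be illustrated with a small base-R sketch. The two toy attributes below are invented for illustration and are not the pilot data: with a full set of dummies per attribute, the columns of each attribute sum to 1 in every row, so the design matrix loses rank, whereas dropping one base level per attribute restores full rank.

```r
# Two toy attributes with four levels each, one row per alternative.
level_a <- factor(c(1, 2, 3, 4, 2, 3, 1, 4))
level_b <- factor(c(1, 1, 2, 2, 3, 3, 4, 4))

# Full dummy sets: one 0/1 column per level, no base level dropped.
full_a <- model.matrix(~ 0 + level_a)
full_b <- model.matrix(~ 0 + level_b)

# Each full set sums to 1 in every row ...
rowSums(full_a)

# ... so the 8 columns together are linearly dependent (rank 7, not 8).
both_full <- cbind(full_a, full_b)
rank_full <- qr(both_full)$rank

# Dropping the first level of each attribute (the base level) leaves
# 3 + 3 = 6 columns, all of which are linearly independent.
reduced <- model.matrix(~ level_a + level_b)[, -1]
rank_reduced <- qr(reduced)$rank
```

In the variable names used in this thread, that would mean a formula such as `choice2 ~ A1.L2 + A1.L3 + A1.L4 + ... + A6.L2 + strata(cs)`, with level 1 of each attribute as the base (the choice of base level is arbitrary). Whether every remaining coefficient is then estimable still depends on the variation present in the pilot data.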

Michiel

Re: CLM of DCE pilot data: results = 'NA/0.00000', issue is

Posted: Thu Sep 21, 2023 12:44 am
by neridaf
Thank you for your response Michiel. I will adjust accordingly.

Kind Regards,

Nerida