choice-metrics.com

by **Fatoumata** » Wed Jun 03, 2020 2:13 am

Hello,

First of all, thank you for your avaibility and your investment in this forum !

After a pre-test to determine the prior, I’ve ran several analyses implying mainly the MNL and MMNL models. With regards to the results some coefficients were not significant and few had wrong signs. I got however better results with the MMNL specification especially when I’ve included very specific interactions.

Furthermore, I’ve found that the IIA property is violated, that there’s indeed observable heterogeneity and that stastically the MMNL perform better. So the MMNL panel is the main model that I want to estimate in the final stage (there’s also latent class model).

However, after the lecture of the manual and posts I still have some interrogations, if you can help me please :

- Should I use the results of the MMNL or MNL estimation when I determine design in NGene under a MNL specification ? Or should I use the mmnl results combined ? In both cases is it better to use bayesian specification ?

- When can it be estimated that the priors are known with certitude ? When there’s a pilot study with significant and coherent coefficient ? For example in the case of MMNL when assumption are made about the law followed by the distributions, but that the choice is not sure mainly based on what is mainly used in the general litterature and what fit better the data, is it better to use bayesian design ?

- Regarding the inclusion of interactions, I have two kind of covariates.

A continuous « Q8r1 » that I’ve interacted with an attribute PS : « + b8[(n,0.1, 0.04 )]* Q8r1.covar[1,2,3,4,5,6]* PS[4,2,0] ». It’s included in all my alternatives except in the status quo, I precise that my status quo contains specific level for some attributes plus an ASC.

Another categorical with 3 levels that has to be interacted with a specific continuous variables. For this subject, I have a doubt the method presented in the manual is appropriate. Since it seems to be more appropriate for the study of potential heterogeneity at the model scale (kind of potential segmentation) rather than at the attribute scale.

For remember, I want to include at the design stage these interactions because they’ve allowed me in the pre-test analysis to get more coherent results. Thus, I assume that I might used them in the final estimation. So it’s not a question of main interest, it’s rather a new input based on statistical analysis on a sample of about 100 individuals where there’s an overepresentation of a certain type of people which may explain this need. So, do you think it is better to continue with these interactions or to use bayesian specification such as ([u,-1,0]), ([u,0,1]), ([n,1,0]) depending on the assumption on the sign.

- Do I’ll need to increase the number of choice cards cause of the inclusion of these covariates ?

- In the case of an unlabelled experiment I have included one ASC for a status quo alternative. I was wondering if effects of the dummies references are eventually captured in it, or nowhere since there is no another constant ?

Below, an example of one of what I actually try to implement to illustrate, there’s also the code used for the pretest. Note that my design is constituted of an unforced and forced choices and that I want to optimize mainly on the unforced choice. For the categorical covariates (of 3 levels) I’ve tried to followed what I’ve understand based on the manual and integrated it with an interaction. Additionnaly I’ve adapted the prior based on the coefficients obtained during the estimation and not kept the same like it is in the manual for gender.

TRIAL CODE
Design
;alts(M1a) = OptionA*, OptionB*, sq
;alts(M1b) = OptionA*, OptionB*, sq
;alts(M1c) = OptionA*, OptionB*, sq
;alts(M2a) = OptionA*, OptionB*
;alts(M2b) = OptionA*, OptionB*
;alts(M2c) = OptionA*, OptionB*
;eff = 3*M1a(mnl,d) + 3*M1b(mnl,d) + 3*M1c(mnl,d) + M2a(mnl,d) + M2b(mnl,d) + M2c(mnl,d)
;rows = 21
;block = 3
;alg = mfederov(candidates = 1000)
;fisher(F1) = des1(M1a[0.23], M1b[0.23], M1c[0.24], M2a[0.1], M2b[0.1],M2c[0.1])

;reject:
OptionA.der=20 and OptionA.eva=20,
OptionA.der=10 and OptionA.eva=20,
OptionA.der=20 and OptionA.eva=10,
OptionA.der=10 and OptionA.eva=10,
OptionB.der=20 and OptionB.eva=20,
OptionB.der=10 and OptionB.eva=20,
OptionB.der=20 and OptionB.eva=10,
OptionB.der=10 and OptionB.eva=10

;require:
sq.IP = 1,
sq.CT = 1

;model(M1a):
U(OptionA) = b1[(n,-0.008,0.018)]*DER[20,10,1] + b2[(n,-0.001,0.03)]*EVA[20,10,0] + b3[(n,0.01,0.028)]*RF [75,50,25] + b4.dummy[(n,0.93,0.08)|(n,0.16,0.3)]*IP[3,2,1] + b5.dummy[(n,0.75,0.6)|(n,0.37,0.57)]*CT[3,2,1] + b6[(n,-0.2,0.2)]*PS[3,2,0] + b7[(n,-0.00003, 0.00001)]*P[13000,10000,7000,50000,2500](3-5,3-5,3-4,3-4,3-4) + b8[(n,0.1, 0.04)]*Q8r1.covar[1,2,3,4,5,6]*PS + IT[(n,-0.016,0.008)]*RF*H2.covar[1]/
U(OptionB)= b1*DER + b2*EVA + b3*RF + b4*IP+ b5*CT+ b6*PS + b7*P + b8*Q8r1*PS + IT*RF*H2/
U(sq)= b0[0] + b1*DERsq[0] + b2*EVAsq[0] + b3*RFsq[0] + b4*IP + b5*CT + b6*PSsq[0] + b7*Psq[0] + b8*Q8r1*PSsq

;model(M1b):
U(OptionA) = b1[(n,-0.008,0.018)]*DER[20,10,1] + b2[(n,-0.001,0.03)]*EVA[20,10,0] + b3[(n,0.01,0.028)]*RF [75,50,25] + b4.dummy[(n,0.93,0.08)|(n,0.16,0.3)]*IP[3,2,1] + b5.dummy[(n,0.75,0.6)|(n,0.37,0.57)]*CT[3,2,1] + b6[(n,-0.2,0.2)]*PS[3,2,0] + b7[(n,-0.00003, 0.00001)]*P[13000,10000,7000,50000,2500](3-5,3-5,3-4,3-4,3-4) + b8[(n,0.1, 0.04)]*Q8r1.covar[1,2,3,4,5,6]*PS + IT[(n,-0.007,0.01)]*RF*H2.covar[2]/
U(OptionB)= b1*DER + b2*EVA + b3*RF + b4*IP+ b5*CT+ b6*PS + b7*P + b8*Q8r1*PS + IT*RF*H2/
U(sq)= b0[0] + b1*DERsq[0] + b2*EVAsq[0] + b3*RFsq[0] + b4*IP + b5*CT + b6*PSsq[0] + b7*Psq[0] + b8*Q8r1*PSsq

;model(M1c):
U(OptionA) = b1[(n,-0.008,0.018)]*DER[20,10,1] + b2[(n,-0.001,0.03)]*EVA[20,10,0] + b3[(n,0.01,0.028)]*RF [75,50,25] + b4.dummy[(n,0.93,0.08)|(n,0.16,0.3)]*IP[3,2,1] + b5.dummy[(n,0.75,0.6)|(n,0.37,0.57)]*CT[3,2,1] + b6[(n,-0.2,0.2)]*PS[3,2,0] + b7[(n,-0.00003, 0.00001)]*P[13000,10000,7000,50000,2500](3-5,3-5,3-4,3-4,3-4) + b8[(n,0.1, 0.04)]*Q8r1.covar[1,2,3,4,5,6]*PS + IT[0]*RF*H2.covar[0]/
U(OptionB)= b1*DER + b2*EVA + b3*RF + b4*IP+ b5*CT+ b6*PS + b7*P + b8*Q8r1*PS + IT*RF*H2/
U(sq)= b0[0] + b1*DERsq[0] + b2*EVAsq[0] + b3*RFsq[0] + b4*IP + b5*CT + b6*PSsq[0] + b7*Psq[0] + b8*Q8r1*PSsq

;model(M2a):
U(OptionA) = b1[(n,-0.008,0.018)]*DER[20,10,1] + b2[(n,-0.001,0.03)]*EVA[20,10,0] + b3[(n,0.01,0.028)]*RF [75,50,25] + b4.dummy[(n,0.93,0.08)|(n,0.16,0.3)]*IP[3,2,1] + b5.dummy[(n,0.75,0.6)|(n,0.37,0.57)]*CT[3,2,1] + b6[(n,-0.2,0.2)]*PS[3,2,0] + b7[(n,-0.00003, 0.00001)]*P[13000,10000,7000,50000,2500](3-5,3-5,3-4,3-4,3-4) + b8[(n,0.1, 0.04)]*Q8r1.covar[1,2,3,4,5,6]*PS + IT[(n,-0.016,0.008)]*RF*H2.covar[1]/
U(OptionB)= b1*DER + b2*EVA + b3*RF + b4*IP+ b5*CT+ b6*PS + b7*P + b8*Q8r1*PS + IT*RF*H2

;model(M2b):
U(OptionA) = b1[(n,-0.008,0.018)]*DER[20,10,1] + b2[(n,-0.001,0.03)]*EVA[20,10,0] + b3[(n,0.01,0.028)]*RF [75,50,25] + b4.dummy[(n,0.93,0.08)|(n,0.16,0.3)]*IP[3,2,1] + b5.dummy[(n,0.75,0.6)|(n,0.37,0.57)]*CT[3,2,1] + b6[(n,-0.2,0.2)]*PS[3,2,0] + b7[(n,-0.00003, 0.00001)]*P[13000,10000,7000,50000,2500](3-5,3-5,3-4,3-4,3-4) + b8[(n,0.1, 0.04)]*Q8r1.covar[1,2,3,4,5,6]*PS + IT[(n,-0.007,0.01)]*RF*H2.covar[2]/
U(OptionB)= b1*DER + b2*EVA + b3*RF + b4*IP+ b5*CT+ b6*PS + b7*P + b8*Q8r1*PS + IT*RF*H2

;model(M2c):
U(OptionA) = b1[(n,-0.008,0.018)]*DER[20,10,1] + b2[(n,-0.001,0.03)]*EVA[20,10,0] + b3[(n,0.01,0.028)]*RF [75,50,25] + b4.dummy[(n,0.93,0.08)|(n,0.16,0.3)]*IP[3,2,1] + b5.dummy[(n,0.75,0.6)|(n,0.37,0.57)]*CT[3,2,1] + b6[(n,-0.2,0.2)]*PS[3,2,0] + b7[(n,-0.00003, 0.00001)]*P[13000,10000,7000,50000,2500](3-5,3-5,3-4,3-4,3-4) + b8[(n,0.1, 0.04)]*Q8r1.covar[1,2,3,4,5,6]*PS + IT[0]*RF*H2.covar[0]/
U(OptionB)= b1*DER + b2*EVA + b3*RF + b4*IP+ b5*CT+ b6*PS + b7*P + b8*Q8r1*PS + IT*CT*H2$

[b]DESIGN for pretest
[/b]

Design
;alts(M1) = OptionA*, OptionB*, sq
;alts(M2) = OptionA*, OptionB*
;eff = 2*M1(mnl,d)+ M2(mnl,d)
;rows = 45
;block = 7
;alg = mfederov(candidates = 1000)

;reject:
OptionA.der=40 and OptionA.eva=40,
OptionA.der=20 and OptionA.eva=40,
OptionA.der=40 and OptionA.eva=20,
OptionA.der=20 and OptionA.eva=20,
OptionB.der=40 and OptionB.eva=40,
OptionB.der=20 and OptionB.eva=40,
OptionB.der=40 and OptionB.eva=20,
OptionB.der=20 and OptionB.eva=20

;require:
sq.IP=1,
sq.CT=1

;model(M1):
U(OptionA) = b1[-0.03]*DER[40,20,1] + b2[-0.04]*EVA[40,20,0] + b3.effects[0.59|0.39]*IP[3,2,1] + b4.effects[0.47|0.31]*CT[3,2,1] + b5[-0.25]*PS[4,2,0] + + b6[0.01]*RF [60,30,20] + b7[-0.00005]*P[27000,20000,15000,10000,5000](5-12,8-15,10-20,10-20,10-20) /
U(OptionB)= b1*DER + b2*EVA + b3*IP+ b4*CT+ b5*PS + b6*RF + b7*P /
U(sq)= b0[0] + b1*DERsq[0] + b2*EVAsq[0] + b3*IP + b4*CT + b5*PSsq[0]+ b6*RFsq[0] + b7*Psq[0]

;model(M2):
U(OptionA) = b1[-0.03]*DER[40,20,1] + b2[-0.04]*EVA[40,20,0] + b3.effects[0.59|0.39]*IP[3,2,1] + b4.effects[0.47|0.31]*CT[3,2,1] + b5[-0.25]*PS[4,2,0] + + b6[0.01]*RF [60,30,20] + b7[-0.00005]*P[27000,20000,15000,10000,5000](5-12,8-15,10-20,10-20,10-20) /
U(OptionB)= b1*DER + b2*EVA + b3*IP+ b4*CT+ b5*PS + b6*RF + b7*P $

by **Michiel Bliemer** » Wed Jun 03, 2020 10:01 am

Some answers:

- Optimising for the (panel) mixed logit model will not be possible in your case because you have too many coefficients and optimising would require running Ngene for months or years. Therefore, I always recommend using Bayesian priors based on MNL model estimates. If some have the wrong sign, then I would suggest making small manual corrections, these priors are merely best guesses.

- I am not sure I understand the question, but using a Bayesian efficient design optimised for the MNL model is state-of-the-art and becoming state-of-the-practice, there are not many people optimising for the MMNL model

- it is up to you to specify the utility function that you believe is most appropriate, this is different for every study. Note that it is rare to optimise for covariates in the design, I generally only optimise for attribute levels in the design and only add sociodemographics in the model estimation phase. I would recommend to make things simple: specify an MNL model without covariates and optimise using Bayesian priors and sufficient rows (choice cards as you call it) so that you can always estimate additional interaction effects in the model estimation stage.

- The more coefficients you want to estimate, the more rows you need in the design. You may want to choose a larger number of rows and blocks.\

- If the reference level of the dummy variables does not appear in optionA or optionB, then the ASC for the SQ will also contain the utilities of the reference levels. If the reference levelof the dummy variables does appear in option A/B, then the ASC will not contain the utility of the reference level.

Given that your syntax is very complex, it is not easy for me to give you advise. But please note that for Bayesian priors, as you have specified, you need to use ;eff = (mnl,d,mean), i.e. you need to add "mean" or "median", and you need to specify the number of draws. Given the large number of Bayesian priors, I suggest using ;bdraws = sobol(5000).

Michiel

by **Fatoumata** » Wed Jun 03, 2020 5:49 pm

Thank you for your reply !

choice-metrics.com

final design for MMNL model with covariates interaction

final design for MMNL model with covariates interaction

Re: final design for MMNL model with covariates interaction

Re: final design for MMNL model with covariates interaction

Who is online