
How to choose among different designs

Posted: Tue Nov 18, 2014 8:02 pm
by agracia
Dear all,

First of all, this forum has been very useful to me because I found answers to many problems that I could not find in the manual. However, I am still struggling with my design, although it is a very simple one. I am a choice experiment user and previously used Burgess and Street designs because they are quite easy to generate, and in several past empirical applications I obtained very nice results that were eventually published in good agri-econ journals. However, I read several papers that mentioned the limitations of those designs, mainly in terms of dominance, and I decided to buy NGENE to generate an efficient or Bayesian design. I read the manual, which was very clear, and I worked from the easier designs to the more complicated ones.

I decided to post in the forum because, after two weeks of running different designs, I have obtained many different ones (many times the program ran for hours without producing any feasible design), but I do not know whether they are appropriate and I have no guideline on how to choose among them.


I want to produce an unlabeled DCE design with three alternatives, where the last one is a no-choice alternative. There are three attributes. The first is price, which is continuous with a fixed (not random) coefficient and four levels (3, 5, 7 and 9). The other two attributes are dummy coded with 4 and 2 levels. I have no priors yet, but I plan to conduct a pilot with 10 to 20 respondents. Meanwhile, I am using priors from other papers on agri-food products in my study region or neighbouring regions. The priors are as follows (I am not convinced they are fine, but they are the only ones I have available now):

Attribute 1 (price fixed coefficient for the continuous attribute): -1

Attribute 2 with three dummy-coded parameters (4 levels)

The mean estimates for the three levels are 0.9, 0.5 and 0.3, with standard errors of 0.10, 0.046 and 0.04
The standard deviations of the mean estimates are 0.77, 0.7 and 0, with standard errors of 0.118, 0.05 and 0

Attribute 3 with one dummy-coded parameter (2 levels)

The mean estimate is 1, with a standard error of 0.12
The standard deviation of the mean estimate is 1.44, with a standard error of 0.13

I want to generate 8 choice situations in two blocks of 4, and I plan to estimate an rppanel model.

I have run hundreds of syntax files, but following your advice in other forum posts, my final two (Bayesian) designs are as follows:


Design
;alts (model1) = alt1*, alt2*,alt3
;rows =8
;block=2
;eff=model1(mnl,d,mean)
;REP=1000
;rdraws=gauss(3)
;bdraws=gauss(3)
;model (model1):
U(alt1) = b1[4.15] + b2[-1] * A[3,5,7,9] + b3.dummy[n,(n,0.9,0.1),(n,0.118,0.77)|n,(n,0.5,0.05),(n,0.05,0.7)|(n,0.3,0.04)] * B[3,2,1,0] +b4.dummy[n,(n,1,0.12),(n,0.13,1.4)]*c[1,0] /
U(alt2) = b2 * A + b3 * B +b4*C
;alg=swap(stop=total(10mins))
$

Design
;alts (model1) = alt1*, alt2*,alt3
;alts (model2) = alt1*, alt2*,alt3
;rows =8
;block=2
;eff=model1(mnl,d,mean)
;REP=1000
;rdraws=gauss(3)
;bdraws=gauss(3)
;model (model1):
U(alt1) = b1[4.15] + b2[-1] * A[3,5,7,9] + b3.dummy[(n,0.9,0.1)|(n,0.5,0.046)|(n,0.3,0.04)] * B[3,2,1,0] +b4.dummy[(n,1,0.12)]*c[1,0] /
U(alt2) = b2 * A + b3 * B +b4*C

;model (model2):
U(alt1) = b1[4.15] + b2[-1] * A[3,5,7,9] + b3.dummy[n,0.9,0.77|n,0.5,0.7|n,0.3,0] * B[3,2,1,0] +b4.dummy[n,1,1.44]*c[1,0] /
U(alt2) = b2 * A + b3 * B +b4*C
;alg=swap(stop=total(10mins))
$

My questions or doubts are:

• Is the syntax OK for my design?
• Are those designs appropriate, at least for use in a pilot to get better priors?
• In general, I have no guideline on how to choose among designs because the different measures (D-error, A-error, probabilities, etc.) do not mean anything to me (perhaps there is a book or some papers I should read). It was very easy to choose a Burgess and Street design using D-optimality (the closer to 100, the better), but now, for lack of knowledge, I do not know which criteria to apply to the output when choosing among different designs.

Sorry for my long post, but I am so frustrated that I need some help and relief. Yesterday I was so frustrated that I was about to give up and use the Burgess and Street design as before.

Thank you very much

Re: How to choose among different designs

Posted: Tue Nov 18, 2014 9:16 pm
by Michiel Bliemer
I quickly ran your syntax and generated a Street & Burgess design, and the best design it could find had a D-efficiency of 0.0032%, which is of course very bad. The D-error is pretty large. There is no target D-error, because the D-error is case specific, so the lower the better. The D-efficiency measure is not useful, because this measure is only valid when all coefficients are zero. The best measure to optimise on is the D-error, but it is difficult to interpret. So when you inspect a design, the easiest measure to look at is the S-estimates, as they give you an idea of how many design replications you need in order to get statistically significant parameter estimates for each parameter. If this value is large (say, >100, or >1000), then you know that it may be difficult to get a good parameter estimate (assuming that your priors are correct).
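For readers wondering what an S-estimate measures, here is a minimal Python sketch; it is not Ngene output. It assumes the commonly used definition in which the S-estimate of a parameter is the squared ratio of a critical t-value times the one-replication standard error to the prior. The function name and the example numbers are hypothetical.

```python
def s_estimate(prior, se_one_replication, t_value=1.96):
    """Approximate number of design replications needed for a parameter
    to reach the given t-value, assuming the prior is the true value.

    se_one_replication is the asymptotic standard error of the parameter
    when the whole design is answered once (hypothetical input here).
    """
    if prior == 0:
        raise ValueError("the S-estimate is undefined for a zero prior")
    return (t_value * se_one_replication / prior) ** 2


# Hypothetical example: price prior of -0.5 with a one-replication SE of 0.8.
replications_needed = s_estimate(-0.5, 0.8)
print(round(replications_needed, 2))
```

Because the prior sits in the denominator, a parameter with a prior close to zero, or with a large standard error, pushes the required number of replications up quickly.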

The main problem you have in your design is your priors. I do not think your priors are consistent with your attribute levels. A constant of 4.15 is very large, and a prior of -1 with attribute levels 3,5,7,9 makes the A attribute very dominant. I would suspect that your priors are too large, which causes the D-error to become quite high because one of the alternatives is likely to become dominant. If you are unsure about the priors, it is best to take conservative values (i.e. closer to zero). Remember that Street and Burgess implicitly assume that all priors are zero, so any value closer to the true value will already improve the efficiency of your design.
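To see how priors of this size can make one alternative dominant, the MNL choice probabilities of a single choice situation can be computed by hand. The Python sketch below does that for one hypothetical choice situation using the priors from the first post. It assumes that the dummy priors 0.9, 0.5 and 0.3 belong to levels 3, 2 and 1 of B (with level 0 as the base) and that the no-choice alternative has utility zero.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for one choice situation."""
    largest = max(utilities)  # subtract the maximum for numerical stability
    exp_v = [math.exp(v - largest) for v in utilities]
    total = sum(exp_v)
    return [e / total for e in exp_v]


# Priors from the first post (assumed mapping of dummy priors to levels).
ASC = 4.15
B_PRICE = -1.0
B_DUMMY = {3: 0.9, 2: 0.5, 1: 0.3, 0: 0.0}
C_DUMMY = {1: 1.0, 0: 0.0}


def utility(price, b_level, c_level, asc=0.0):
    """Observed utility of one alternative under the priors above."""
    return asc + B_PRICE * price + B_DUMMY[b_level] + C_DUMMY[c_level]


# Hypothetical choice situation: cheap alt1 versus expensive alt2 versus no choice.
v = [utility(3, 3, 1, asc=ASC), utility(9, 0, 0), 0.0]
probs = mnl_probabilities(v)
for name, p in zip(["alt1", "alt2", "no choice"], probs):
    print(f"{name}: {p:.4f}")
```

In this situation almost all of the probability mass ends up on alt1, which is the kind of near 0% or 100% outcome that carries very little information for estimation.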

So I would suggest the following:
1. Carefully look at your priors
2. Run your syntax and check the S-estimates, and check the choice probabilities (you do not want to have many alternatives that are chosen almost 0% or 100%)
3. If you would like to compare, you can generate a Street and Burgess design by setting ;orth = ood and compare the D-error of that design with your other designs

You can also consider making your syntax and design less complicated. I would simply opt for a Bayesian design for the MNL model and not bother to evaluate for a panel mixed logit model, especially for a pilot study.

I hope this helps. Note that I will be teaching an experimental design course the first week of December in Leeds, UK, if you are around :)

Re: How to choose among different designs

Posted: Tue Dec 16, 2014 7:16 am
by agracia
Dear Michiel Bliemer:

Thank you very much for your quick response. I hope you had a very good time in Leeds. I would have liked to join the course, but I had two meetings in the Netherlands the week before and could not be away from home two weeks in a row.

I followed your suggestions and generated a Bayesian design for the MNL model with a different prior for price, and I think that my design improved quite a lot.
I used the following syntax:

Design
? This will generate a bayesian design for MNL
;alts = alt1*, alt2*,alt3
;rows=12
;BLOCK=3
;eff=(mnl,d)
;model:
U(alt1) = b2[-0.5] * A[3,5,7,9] + b3.dummy[(n,1,0.05)|(n,0.8,0.05)|(n,0.7,0.05)] * B[3,2,1,0] +b4.dummy[(n,1.1,0.1)]*c[1,0] /
U(alt2) = b2 * A + b3 * B +b4*C
;alg=swap(stop=total(10mins))
$

I ran a pilot (a real experiment with real products) and realized that one of the levels of the four-level dummy attribute was not realistic for participants (I also checked the regulation, and this level will disappear in a few months because of a new regulation). I therefore need to reduce this attribute to three levels. I also reduced the price levels to three because the highest price was too high for participants. However, I cannot change the two-level attribute to three levels because it represents the presence of a label on the product, so it has only two levels (with label and without label). The new attributes and levels are:
Price (continuous and fixed) with three levels (3, 5 and 7).
Dummy coded with 3 levels (produced in the county, produced in the region, produced in Spain)
Dummy coded with 2 levels (with label, without label).

The new syntax is:

Design
? This will generate a bayesian design for MNL
;alts = alt1*, alt2*,alt3
;rows=12
;BLOCK=3
;eff=(mnl,d)
;model:
U(alt1) = b2[-0.5] * A[3,5,7] + b3.dummy[(n,1,0.05)|(n,0.8,0.05)] * B[2,1,0] +b4.dummy[(n,1.1,0.1)]*c[1,0] /
U(alt2) = b2 * A + b3 * B +b4*C
;alg=swap(stop=total(10mins))
$

Looking at the statistics, I think that the last design is better than the first one and more realistic. However, I read in the NGENE manual that you suggest not mixing attributes with odd and even numbers of levels.
My questions are:
• Is the last design, which mixes odd and even numbers of levels, appropriate? In other words, what kind of problems could I have with this design?
• If important problems exist, what alternative design could I use to avoid them while maintaining the realism of the design?

I plan to conduct another pilot before designing the final choice sets.

Best regards

Re: How to choose among different designs

Posted: Tue Dec 16, 2014 7:40 am
by Michiel Bliemer
You can safely mix odd and even attribute level numbers, the only thing that you usually keep in mind is attribute level balance, so that each level appears an equal number of times within the design. If you mix odd and even attribute levels, you usually need a larger design to satisfy attribute level balance. With 12 choice tasks, you can mix 2 levels with 3 levels, 4 levels and 6 levels without any problems.
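The arithmetic behind this is simple: exact attribute level balance requires the number of rows to be divisible by the number of levels of every attribute. A minimal Python sketch of that check follows; the function name is hypothetical.

```python
def balance_possible(rows, level_counts):
    """For each number of levels, report whether every level can appear
    an equal number of times in a design with the given number of rows."""
    return {levels: rows % levels == 0 for levels in level_counts}


# 12 choice tasks: attributes with 2, 3, 4 and 6 levels can all be balanced.
twelve_rows = balance_possible(12, [2, 3, 4, 6])
print(twelve_rows)

# 8 choice tasks: a 3-level attribute cannot be perfectly balanced.
eight_rows = balance_possible(8, [2, 3, 4])
print(eight_rows)
```

This is why a design with 3-level and 2-level attributes fits 12 rows, while the earlier 8-row setup would not allow exact balance for a 3-level attribute.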

Re: How to choose among different designs

Posted: Fri Jan 23, 2015 2:31 am
by agracia
Dear Michiel Bliemer:

Thank you very much for your quick response.

In the end, I used the following attributes and levels to design my choice experiment:

Price (continuous and fixed) with three levels (3, 5 and 8).
Dummy coded with 3 levels (produced in the county, produced in the region, produced in Spain)
Dummy coded with 2 levels (with label, without label).

The syntax used was:
Design
? This will generate a bayesian design for MNL
;alts = alt1*, alt2*,alt3
;rows=12
;eff=(mnl,d)
;model:
U(alt1) = b2[-0.1] * A[3,5,8] + b3.dummy[(n,0.5,0.05)|(n,0.4,0.05)] * B[2,1,0] +b4.dummy[(n,0.3,0.01)]*c[1,0] /
U(alt2) = b2 * A + b3 * B +b4*C
;alg=swap(stop=total(10mins))
$

I thought that this design was quite good in terms of its statistics, and I used it in a pilot with 30 people. I estimated my choice models (MNL and MMNL) using these data.

Then I used the estimated parameters from these models to generate an efficient design, first for an MNL and then for an MMNL, and I got very frustrated because the statistics (mainly the D-error and the choice probabilities) were worse than those of the previous design.

My estimated parameters for the MNL and MMNL were statistically significant with the following values:



For the MNL
Fixed parameters
ASC (constant) = 3.32
Price (b2)= -0.496
Random parameters
Dummy for produced in the county (b3)=0.527
Dummy for produced in the region (b3)=0.78
Dummy for label (b4)=0.81

For the MMNL

Fixed parameters
ASC(constant) = 4.39
Price (b2)=-0.737
Random parameters
Estimation of the Mean
Dummy for produced in the county (b3) =0.868
Dummy for produced in the region (b3) = 1.326
Dummy for label (b4)=1.416

Standard deviation of the mean parameters

Dummy for produced in the county (b3)=1.326
Dummy for produced in the region (b3)=1.054
Dummy for label (b4)=1.979

My designs are:

Design
? This will generate an efficient design for MNL
;alts = alt1*, alt2*,alt3
;rows=12
;block=3
;eff=(mnl,d)
;model:
U(alt1) = b1[3.32]+b2[-0.5] * A[3,5,8] + b3.dummy[0.52|0.78] * B[2,1,0] +b4.dummy[0.81]*c[1,0] /
U(alt2) = b2 * A + b3 * B +b4*C
;alg=swap(stop=total(10mins))
$


And

Design
? This will generate an efficient design for MMNL
;alts = alt1*, alt2*,alt3
;rows=12
;block=3
;eff=(rppanel,d)
;rep=500
;rdraws=halton(100)
;model:
U(alt1) = b1[4.4]+b2[-0.74] * A[3,5,8] + b3.dummy[n,0.868,1.326|n,1.326,1.05] * B[2,1,0] +b4.dummy[n,1.41,1.979]*c[1,0] /
U(alt2) = b2 * A + b3 * B +b4*C
;alg=swap(stop=total(60mins))
$

My questions are:

• Are the designs correct? I wonder whether the standard deviations you mention in the NGENE manual are the standard deviations of the mean parameters. This is my understanding from my reading, and these are the ones I used in the design.
• Which of my designs would you use for my final choice experiment? My next step is to conduct the experiment with 200 participants. The experiment is real, so I will pay the participants, and they will attend sessions of around 10 people. This is very expensive and time consuming, and I am very afraid of using a bad design and wasting the money and the time.
Best regards

Azucena

Re: How to choose among different designs

Posted: Sat Jan 24, 2015 12:15 pm
by Michiel Bliemer
I am not sure I understand your questions.

You generate your designs based on some priors. Then you collected some data and estimated the coefficients again, giving you new priors. Clearly, the statistics such as D-error and S-estimates etc are only correct if your priors are correct. If your coefficients that you obtain are different, then you get different outcomes. You cannot compare designs based on different priors.
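The dependence of the D-error on the priors can be illustrated with a small Python sketch. It uses the standard MNL Fisher information for a single respondent and defines the D-error as the determinant of its inverse raised to the power 1/K. The four-row, two-alternative design and the non-zero prior values below are hypothetical and chosen only for illustration; this is not a reproduction of Ngene's internals.

```python
import math


def mnl_information(design, priors):
    """MNL Fisher information matrix for one respondent answering the design."""
    k = len(priors)
    info = [[0.0] * k for _ in range(k)]
    for alternatives in design:
        utilities = [sum(b * x for b, x in zip(priors, alt)) for alt in alternatives]
        largest = max(utilities)
        exp_v = [math.exp(v - largest) for v in utilities]
        total = sum(exp_v)
        probs = [e / total for e in exp_v]
        # Probability-weighted mean attribute vector in this choice situation.
        mean_x = [sum(p * alt[i] for p, alt in zip(probs, alternatives)) for i in range(k)]
        for p, alt in zip(probs, alternatives):
            dev = [alt[i] - mean_x[i] for i in range(k)]
            for i in range(k):
                for j in range(k):
                    info[i][j] += p * dev[i] * dev[j]
    return info


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def d_error(design, priors):
    """D-error: det(AVC)^(1/K), where AVC is the inverse information matrix."""
    det = determinant(mnl_information(design, priors))
    if det <= 0.0:
        raise ValueError("information matrix is singular; the model is not identified")
    return det ** (-1.0 / len(priors))


# Hypothetical design: each row holds (price, label) for alt1 and alt2.
design = [
    [(3, 0), (7, 1)],
    [(7, 0), (3, 1)],
    [(5, 1), (3, 0)],
    [(3, 1), (5, 0)],
]

d_zero_priors = d_error(design, [0.0, 0.0])
d_other_priors = d_error(design, [-0.5, 0.8])
print(d_zero_priors, d_other_priors)
```

The same design yields two different D-errors, one per set of priors, so a D-error computed under the old priors says little about how the design ranks under the new ones.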

I do not see price, country, etc. attributes in your design syntax, so it is difficult for me to understand what is what. It seems that your coefficients are quite different from your priors, which means you lost some efficiency, but you can still estimate your model, so no problem there.

I do not understand what you mean by standard deviation. For the MMNL model you need the mean parameter and the standard deviation parameter, which is a separate parameter that you did not provide in your message. Please do not confuse the standard deviation with the standard error of the parameter! You use standard errors for Bayesian priors, and you use standard deviations in randomly distributed coefficients.
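The distinction can be made concrete with a small simulation. The Python sketch below uses values quoted elsewhere in this thread for one dummy parameter (mean 0.868, standard error 0.379, standard deviation 1.326) and draws from both distributions. The variable names, seed and number of draws are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(2015)  # fixed seed so the sketch is repeatable

PRIOR_MEAN = 0.868          # estimated mean of the random coefficient
STANDARD_ERROR = 0.379      # analyst's uncertainty about that mean
STANDARD_DEVIATION = 1.326  # spread of the coefficient across respondents

# Bayesian prior: possible values of the population mean itself.
prior_draws = [rng.gauss(PRIOR_MEAN, STANDARD_ERROR) for _ in range(20000)]

# Random coefficient: individual tastes scattered around the population mean.
respondent_betas = [rng.gauss(PRIOR_MEAN, STANDARD_DEVIATION) for _ in range(20000)]

spread_of_prior = statistics.pstdev(prior_draws)
spread_of_tastes = statistics.pstdev(respondent_betas)
print(round(spread_of_prior, 3), round(spread_of_tastes, 3))
```

Both sets of draws are centred on the same mean, but the first describes how unsure the analyst is about a single number, while the second describes how much respondents differ from each other.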

Just use your estimated coefficients to generate a new design using Bayesian priors and you should get a good design.

Re: How to choose among different designs

Posted: Mon Jan 26, 2015 8:33 am
by agracia
Dear Michiel Bliemer:

Thank you very much for your quick response. I am sorry that you did not understand my questions. I will explain again.

In the end, I used the following attributes and levels to design my choice experiment:

Price (continuous and fixed) with three levels A (3, 5 and 8).
Dummy coded with 3 levels (produced in the county, produced in the region, produced in Spain) B (2,1,0)
Dummy coded with 2 levels (with label, without label) C(1,0)
I used the following syntax:
Design
? This will generate a bayesian design for MNL
;alts = alt1*, alt2*,alt3
;rows=12
;eff=(mnl,d)
;model:
U(alt1) = b2[-0.1] * A[3,5,8] + b3.dummy[(n,0.5,0.05)|(n,0.4,0.05)] * B[2,1,0] +b4.dummy[(n,0.3,0.01)]*c[1,0] /
U(alt2) = b2 * A + b3 * B +b4*C
;alg=swap(stop=total(10mins))
$

I got a design that was quite good in terms of its statistics, and I used it in a pilot with 30 people. I estimated my choice models (MNL and MMNL) using these data.


My estimated parameters for the MNL and MMNL were statistically significant with the following values:

For the MNL
Fixed parameters
ASC (constant) = 3.32 (standard error 0.363)
Price (b2)= -0.496 (standard error 0.052)
Random parameters
Dummy for produced in the county (b3 for level 2)=0.527 (standard error 0.206)
Dummy for produced in the region (b3 for level 1)=0.78 (standard error 0.24)
Dummy for label (b4 for level 1)=0.81 (standard error 0.17)

For the MMNL

Fixed parameters
ASC(constant) = 4.39 (standard error 0.509)
Price (b2)=-0.737 (0.082)
Random parameters
Estimation of the Mean
Dummy for produced in the county (b3 for level 2) =0.868 (standard error 0.379)
Dummy for produced in the region (b3 for level 1) = 1.326 (standard error 0.39)
Dummy for label (b4 for level 1)=1.416 (standard error 0.43)

Standard deviation of the mean parameters

Dummy for produced in the county (b3 for level 2)=1.326 (standard error 0.372)
Dummy for produced in the region (b3 for level 1)=1.054 (standard error 0.334)
Dummy for label (b4 for level 1)=1.979 (standard error 0.38)

Because you suggested in your last response that I use a Bayesian design, I ran the following syntax:

Design
;alts (model1) = alt1*, alt2*,alt3
;rows =12
;block=3
;eff=model1(rppanel,d,mean)
;REP=1000
;rdraws=gauss(3)
;bdraws=gauss(3)
;model (model1):
U(alt1) = b1[4.39] + b2[(n,0.763,0.082)] * A[3,5,8] + b3.dummy[n,(n,0.868,0.379),(n,1.32,0.372)|n,(n,1.32,0.39),(n,1.05,0.334)] * B[2,1,0] +b4.dummy[n,(n,1.416,0.43),(n,1.979,0.38)]*c[1,0] /
U(alt2) = b2 * A + b3 * B +b4*C
;alg=swap(stop=total(10mins))

$

But I got an undefined result.

Then I read one of the forum posts and followed your suggestion to optimise a Bayesian design for the MNL model while also evaluating it for an rppanel model, and I ran the following syntax:

Design
;alts (model1) = alt1*, alt2*,alt3
;alts (model2) = alt1*, alt2*,alt3
;rows =12
;block=3
;eff=model1(mnl,d,mean)
;REP=1000
;rdraws=gauss(3)
;bdraws=gauss(3)
;model (model1):
U(alt1) = b1[3.32] + b2[(n,-0.495,0.052)] * A[3,5,8] + b3.dummy[(n,0.5266,0.206)|(n,0.78,0.24)] * B[2,1,0] +b4.dummy[(n,0.81,0.17)]*c[1,0] /
U(alt2) = b2 * A + b3 * B +b4*C

;model (model2):
U(alt1) = b1[4.39] + b2[-0.7631] * A[3,5,8] + b3.dummy[n,0.868,1.32|n,1.32,1.05] * B[2,1,0] +b4.dummy[n,1.416,1.979]*c[1,0] /
U(alt2) = b2 * A + b3 * B +b4*C
;alg=swap(stop=total(10mins))

$

The design looks nice, but the choice probabilities are not balanced across alternatives.

I also ran another syntax with the constant (b1) removed from the previous one, since you mention that the ASC is not important for unlabeled designs, and the D-error is a bit better.
My questions are:

• Are the different syntaxes correct?
• Which design would you use for my final experiment?
Thank you very much

Re: How to choose among different designs

Posted: Mon Jan 26, 2015 12:06 pm
by Michiel Bliemer
To me it seems illogical to have such a huge ASC in one of the alternatives, while you claim they are generic. It should not have a constant there, the constant should be in alt3 to make sense. The unbalanced probabilities likely are due to this ASC, which is very large and drives the probabilities.

The syntax all seems fine; you use the standard errors and standard deviations correctly. I am just worried about the model you are trying to estimate, which to me makes no sense, but you know better what you would like to estimate.

Re: How to choose among different designs

Posted: Mon Jan 26, 2015 12:08 pm
by Michiel Bliemer
Btw, the likely reason that you got an undefined result is that you are using a normally distributed Bayesian prior for the standard deviation of a normally distributed coefficient. Clearly, the standard deviation cannot be negative, and as such you should use a uniform distribution for the standard deviation prior.
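A quick way to check how much of a normal prior falls below zero is sketched in Python below, using the standard deviation priors from the rppanel syntax in this thread. The uniform bounds it prints (mean plus or minus two standard errors, clipped at zero) are one arbitrary illustration of a non-negative range and are not a recommendation from the manual.

```python
from statistics import NormalDist

# (mean, standard error) of the standard deviation parameters in the thread.
sd_priors = {
    "county": (1.32, 0.372),
    "region": (1.05, 0.334),
    "label": (1.979, 0.38),
}


def share_below_zero(mean, se):
    """Probability mass that a normal prior places on negative values."""
    return NormalDist(mean, se).cdf(0.0)


def clipped_uniform_bounds(mean, se, width=2.0):
    """Hypothetical uniform range around the mean, never going below zero."""
    return max(0.0, mean - width * se), mean + width * se


negative_share = {name: share_below_zero(m, s) for name, (m, s) in sd_priors.items()}
uniform_bounds = {name: clipped_uniform_bounds(m, s) for name, (m, s) in sd_priors.items()}

for name in sd_priors:
    low, high = uniform_bounds[name]
    print(f"{name}: P(sd < 0) = {negative_share[name]:.6f}, uniform range = ({low:.3f}, {high:.3f})")
```

How much negative mass is present depends on the mean relative to its standard error, whereas a uniform prior with a non-negative lower bound rules out negative standard deviations by construction.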