choice-metrics.com

by **Steven Guu** » Sun Jun 19, 2022 12:11 pm

Dear fellow researchers

Currently, I am designing a DCE pilot study in Ngene.
I plan to use a Bayesian efficient experimental design for the main survey, optimized for the mean D-efficiency of the MNL model. The design will include 24 choice tasks in total, split into 6 blocks of choice cards, so each respondent will receive 4 choice cards. In the utility functions, all attributes are dummy coded except cost, which is continuous. The priors for the efficient design will be based on the pilot study and will be generic across the three cities.
However, since I haven't found similar studies, the pilot study cannot use priors based on the results of earlier studies.

Therefore, I am designing the DCE pilot study in Ngene using the D-error measure to find an efficient design for the MNL model. I opted to use near-zero priors for all attributes rather than a mixture of near-zero and literature priors.

Now I have an unlabeled experiment with 2 alternatives + status quo (current situation).

- Three alternatives (two plus one status quo (sq))
- 4 attributes, with 4, 3, 4, 6 levels respectively
- The sq is defined by the same attributes, but its levels are fixed and give a fixed utility

For two of the attributes, sorting[1,2,4,7] (dummy) and collected[1,2,3] (dummy), I chose the lowest level, 1, as the level of the status quo; level 1 is also a level for alt1 and alt2.
For the other two attributes, point[0,1,2,3] (dummy) and cost[0,20,40,70,100,200], I chose the lowest level, 0, as the level of the status quo, but this level appears only in the status quo and not in the other two alternatives.
Below is my code:
Design
;alts = alt1*,alt2*,sq*
;rows = 24
;block=6,minsum
;eff = 2*(mnl, d) + 1*(imbalance)
;alg = mfederov(candidates=1000)
;require:
alt1.cost > 0, alt2.cost > 0, alt1.point > 0, alt2.point > 0,
sq.sorting = 1,
sq.collected = 1,
sq.point = 0,
sq.cost = 0
;model:
U(alt1) = b1.dummy[0.001|0.002|0.003] * sorting[1,2,4,7]
+ b2.dummy[0.001|0.002] * collected[1,2,3]
+ b3.dummy[0.001|0.003|0,002] * point[0,1,2,3]
+ b4[-.001] * cost[0,20,40,70,100,200]
/
U(alt2) = b1*sorting+b2*collected+b3*point+b4*cost
/
U(sq) =b1*sorting+b2*collected+b3*point+b4*cost
$

Could I ask some questions about my design?
First, do you think the priors I chose are appropriate for the pilot study?
Second, the B-estimate is 73%, which is quite low. Is there a way to fix this?
Third, since I give the sq alternative the same utility function, do I need to add a constant?
Fourth, since I want each respondent to face 4 choice cards, do you think 24 rows and 6 blocks are good for the design?
Thank you for your help.
Best
Steve

by **Michiel Bliemer** » Mon Jun 20, 2022 11:31 am

To answer your four questions:

1) Yes it is fine to use noninformative (near-zero) priors in a pilot study where you only have knowledge about the preference ranking of the attribute levels to avoid dominant alternatives.

2) The B-estimate indicates utility balance and will be 100% when using zero priors. With non-zero priors, a good value is between 70-90% (since 0% would indicate dominant alternatives and 100% would mean equal utilities and hence random choices), but in most cases I would not look at the B-estimate at all. In your case, it is strange that you did not get near-100% utility balance, and I found a mistake in your syntax: you used a prior of 0,002 instead of 0.002. Ngene seems to interpret 0,002 as 2. Setting it to 0.002 will give near-100% utility balance.

3) Preferably you have a constant in the SQ alternative. However, in your case this constant is not identifiable because point=0 only appears in the SQ alternative, and therefore the first level of the dummy coded variable is confounded with the alternative-specific constant. Only if you also allow point=0 in alt1 and alt2 will you be able to estimate the constant.

4) 24 rows should be sufficient, and blocking it in 6 is fine, although I think that respondents may be able to respond to 6 choice tasks without an issue, which would collect 50% more data from each respondent.
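
The effect of the 0,002 typo described in point 2 can be illustrated with a small Python sketch. It assumes one common definition of utility balance (the product over alternatives of each MNL probability divided by 1/J, averaged over choice tasks); whether Ngene computes its B-estimate in exactly this way is an assumption here. The two choice tasks, their utilities, and the function names are made up for illustration.

```python
import math

def mnl_probs(utilities):
    """MNL choice probabilities for one choice task."""
    exps = [math.exp(u) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def utility_balance(tasks):
    """Assumed balance measure: mean over tasks of prod_j (P_j * J), in percent."""
    scores = []
    for utilities in tasks:
        probs = mnl_probs(utilities)
        score = 1.0
        for p in probs:
            score *= p * len(probs)
        scores.append(score)
    return 100.0 * sum(scores) / len(scores)

def example_tasks(third_point_prior):
    """Two hypothetical tasks with utilities for (alt1, alt2, sq)."""
    return [
        [third_point_prior, 0.001, 0.0],  # alt1 shows the level carrying the typo'd prior
        [0.001, 0.003, 0.0],              # the typo'd level does not appear in this task
    ]

balance_intended = utility_balance(example_tasks(0.002))  # prior as intended
balance_typo = utility_balance(example_tasks(2.0))        # 0,002 read as 2
print(round(balance_intended, 2), round(balance_typo, 2))
```

With near-zero priors every alternative has a probability close to 1/3, so the measure stays close to 100%. A single prior of 2 makes one alternative clearly more attractive in every task where that level appears, which pulls the average down.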

Other things to consider:
* I simplified the syntax somewhat.
* I added attribute level constraints to the cost attribute.
* There is usually no need to optimise for imbalance for dummy coded variables.
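
On point 3 above, the confounding between the SQ constant and the point=0 dummy can be seen in a minimal Python sketch. The rows below are hypothetical (alternative, point level) pairs, not output from Ngene, and the function name is made up for illustration.

```python
def design_columns(rows):
    """Return (SQ constant column, point=0 dummy column) for (alternative, point) rows."""
    sq_constant = [1 if alt == "sq" else 0 for alt, _ in rows]
    point0_dummy = [1 if point == 0 else 0 for _, point in rows]
    return sq_constant, point0_dummy

# Hypothetical rows: point=0 is reserved for the status quo, as in the original syntax.
restricted = [("alt1", 1), ("alt2", 3), ("sq", 0),
              ("alt1", 2), ("alt2", 1), ("sq", 0)]
# Same rows, but point=0 is also allowed in alt1.
relaxed = [("alt1", 0), ("alt2", 3), ("sq", 0),
           ("alt1", 2), ("alt2", 1), ("sq", 0)]

asc_r, dummy_r = design_columns(restricted)
asc_x, dummy_x = design_columns(relaxed)
confounded_restricted = asc_r == dummy_r  # identical columns: constant not identifiable
confounded_relaxed = asc_x == dummy_x     # columns differ once alt1 can show point=0
print(confounded_restricted, confounded_relaxed)
```

When the two columns are identical, no estimation procedure can separate the constant from the dummy parameter, which is why point=0 has to appear in alt1 or alt2 as well.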

Further, note that the LAST level in Ngene is the base level in dummy coding, so perhaps b1.dummy[0.001|0.002|0.003] * sorting[1,2,4,7] should be b1.dummy[0.001|0.002|0.003] * sorting[2,4,7,1], if you mean to say that level 7 is better than level 4, which is better than level 2, which is better than level 1?
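
The consequence of the last listed level being the base can be sketched in Python. The helper below is purely illustrative (it is not part of Ngene); it pairs each non-base level with its prior in the order given and assigns a utility of 0 to the last level.

```python
def prior_by_level(levels, priors):
    """Pair each non-base level with its prior; the last level is the base (utility 0)."""
    if len(priors) != len(levels) - 1:
        raise ValueError("need one prior per non-base level")
    mapping = dict(zip(levels[:-1], priors))
    mapping[levels[-1]] = 0.0
    return mapping

def ranking(mapping):
    """Levels sorted from least to most preferred according to the priors."""
    return sorted(mapping, key=mapping.get)

priors = [0.001, 0.002, 0.003]
as_posted = prior_by_level([1, 2, 4, 7], priors)  # level 7 becomes the base
reordered = prior_by_level([2, 4, 7, 1], priors)  # level 1 becomes the base
print(ranking(as_posted))
print(ranking(reordered))
```

With sorting[1,2,4,7] the priors imply that level 7 is the least preferred level, whereas sorting[2,4,7,1] gives the ordering 1, 2, 4, 7 from worst to best.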

Below some suggestions for your syntax.

Design
;alts = alt1*,alt2*,sq*
;rows = 24
;block = 4,minsum
;eff = (mnl,d)
;alg = mfederov(candidates=1000)
;require:
alt1.point > 0, alt2.point > 0,
sq.sorting = 1,
sq.collected = 1,
sq.point = 0
;model:
U(alt1) = b1.dummy[0.001|0.002|0.003] * sorting[1,2,4,7]
+ b2.dummy[0.001|0.002] * collected[1,2,3]
+ b3.dummy[0.001|0.003|0.002] * point[0,1,2,3]
+ b4[-.001] * cost[20,40,70,100,200](4-6,4-6,4-6,4-6,4-6)
/
U(alt2) = b1*sorting+b2*collected+b3*point+b4*cost
/
U(sq) =b1*sorting+b2*collected+b3*point+b4*cost_sq[0]
$

Michiel

by **Steven Guu** » Tue Jun 21, 2022 1:47 am

Dear Michiel

Thank you very much for your prompt and very helpful response.

When I run your syntax, Ngene shows the following warnings:
Warning: Two alternatives were specified for alternative repetition checking, but do not have the same attribute names, and so will not be checked. 'alt1', 'sq'
Warning: Two alternatives were specified for alternative repetition checking, but do not have the same attribute names, and so will not be checked. 'alt2', 'sq'
Can I ignore these warnings?

Secondly, do you think 16 rows would be sufficient for the design, with 4 blocks, so that people still face 4 choice cards?

Thirdly, why did you say that there is usually no need to optimise for imbalance for dummy coded variables? Am I missing some details from the manual?

Lastly, for all attributes:
Sorting [1,2,4,7]: level 7 is better than level 4, which is better than level 2, which is better than level 1.
Collected [1,2,3]: level 3 is better than level 2, which is better than level 1.
Point [0,1,2,3]: level 3 is better than level 2, which is better than level 1, which is better than level 0.

Do I need to change the levels of all the attributes as follows?

U(alt1) = b1.dummy[0.001|0.002|0.003] * sorting[2,4,7,1]
+ b2.dummy[0.001|0.002] * collected[2,3,1]
+ b3.dummy[0.001|0.003|0.002] * point[1,2,3,0]
+ b4[-.001] * cost[20,40,70,100,200](4-6,4-6,4-6,4-6,4-6)

Thank you for your time and help.

Best Regards
Steve

by **Michiel Bliemer** » Tue Jun 21, 2022 10:02 am

- Yes, you can ignore those warnings. There simply will not be repetition checking, but there is still dominance checking, which is most important. Repetitions hardly ever occur.

- Yes 16 rows would be sufficient. I usually prefer a larger design for the sake of variation in the data, but 16 would be sufficient.

- The reason that you do not need to worry about optimising for attribute level balance for dummy coded variables is the following. Suppose that level 1 does not appear in the design, or only appears once in the design. Then the information captured for estimating the parameter for this level will be zero or very little, which means that the D-error will be very large or even infinite. Minimising the D-error therefore automatically makes sure that each level is more or less equally represented. This is not the case for numerical variables such as cost, where only a single parameter is estimated for all levels and it is optimal to mainly use the outer levels and not the inner levels. Therefore, it is important to impose level constraints on numerical variables but not so important to impose level constraints on categorical (dummy coded) variables.

- Yes, you need to change the attribute levels to [2,4,7,1], [2,3,1], etc.
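
The argument about dummy coded levels and the D-error can be sketched numerically in Python. The example assumes a two-alternative MNL model with zero priors (so each alternative has probability 0.5) and a single three-level dummy coded attribute with the last level as base. The designs and function names are made up for illustration, and this is not Ngene's implementation.

```python
import math

def dummy_code(level, levels):
    """Dummy-code a level; the last entry of `levels` is the base (all zeros)."""
    return [1.0 if level == lv else 0.0 for lv in levels[:-1]]

def information_matrix(tasks, levels):
    """MNL Fisher information for two alternatives with zero priors (P = 0.5 each)."""
    k = len(levels) - 1
    info = [[0.0] * k for _ in range(k)]
    for left, right in tasks:
        diff = [a - b for a, b in zip(dummy_code(left, levels), dummy_code(right, levels))]
        for i in range(k):
            for j in range(k):
                info[i][j] += 0.25 * diff[i] * diff[j]
    return info

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    det = 1.0
    for col in range(len(m)):
        pivot = max(range(col, len(m)), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, len(m)):
            factor = m[r][col] / m[col][col]
            for c in range(col, len(m)):
                m[r][c] -= factor * m[col][c]
    return det

def d_error(tasks, levels):
    """D-error = det(information)^(-1/K); infinite when a parameter has no information."""
    info = information_matrix(tasks, levels)
    det = determinant(info)
    return math.inf if det <= 0.0 else det ** (-1.0 / len(info))

def cost_information(pairs):
    """Information for one linear cost parameter: 0.25 * sum of squared differences."""
    return sum(0.25 * (c1 - c2) ** 2 for c1, c2 in pairs)

levels = ["a", "b", "c"]  # "c" is the base level
balanced = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")]
level_a_missing = [("b", "c"), ("c", "b"), ("b", "c"), ("c", "b")]

d_balanced = d_error(balanced, levels)
d_missing = d_error(level_a_missing, levels)
info_outer = cost_information([(20, 200), (200, 20)])
info_inner = cost_information([(70, 100), (100, 70)])
print(d_balanced, d_missing, info_outer, info_inner)
```

A design in which level "a" never appears carries no information on its parameter, so the D-error is infinite and such a design would never be selected. For the linear cost parameter, pairs using the outer levels give far more information than pairs using the inner levels, which is why level constraints matter for numerical attributes.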

Michiel

by **Steven Guu** » Tue Jun 21, 2022 10:45 pm

Dear Michiel,

Thank you very much for your prompt and thorough explanation as always. I have learned a great deal from my correspondence with you.

Best Regards
Steven

by **Steven Guu** » Fri Jul 29, 2022 10:26 am

Dear Michiel

Thank you very much for answering all my questions and for looking over my code last time. I discussed my design with my colleagues, and they suggested that since I don't know participants' base levels for attributes 1 and 2, I had better use "I prefer none of these options" for my SQ.

Now I have an unlabeled experiment with 2 alternatives + status quo ("I prefer none of these options").

- Three alternatives (two plus one status quo (sq))
- 4 attributes, with 4, 3, 3, 6 levels respectively
- The sq is "none"

Below is my code:
Design
;alts = alt1*,alt2*,sq*
;rows = 16
;block = 4,minsum
;eff = (mnl,d)
;alg = mfederov(candidates=1000)

;model:
U(alt1) = b1.dummy[0.001|0.002|0.003] * sorting[2,4,7,1]
+ b2.dummy[0.001|0.002] * collected[2,3,1]
+ b3.dummy[0.001|0.003] * point[2,3,1]
+ b4[-.001] * cost[20,40,60,80,100,200](2-3,2-3,2-3,2-3,2-3,2-3)
/
U(alt2) = b1*sorting+b2*collected+b3*point+b4*cost

$

Could I ask some questions about my design?
First, when I run the syntax, I find that not every block covers all the cost levels. Does it matter for the later data analysis if respondents do not see all the cost levels?
Second, if I include the sq utility function in my design, do I need to show respondents all the information for the SQ option that appears in the SQ utility function?
Third, is there another way to design the SQ if I want to include an SQ utility function but am not sure of participants' baseline?
Thank you for your help.
Best
Steve

by **Michiel Bliemer** » Fri Jul 29, 2022 11:13 am

Note that the optout (none) alternative is NOT the same as a status quo alternative. The optout alternative needs a constant, or alt1 and alt2 need a constant. I suggest you add b0 to your utility functions of both alt1 and alt2 to denote the constant. If you also want to optimise the design for the constant, add ;con. If you use a status quo alternative, you need to show all the attribute levels of the status quo alternative in the survey. If you want a status quo alternative, you could simply create an efficient design using zero priors or an orthogonal design for alt1 and alt2 only. Then in the survey instrument you ask for the status quo levels of the respondent in the first part of the survey and then show these levels again later on in the choice tasks for this respondent. You can do this with dynamic referencing within survey instruments such as Qualtrics or SurveyEngine. Instead, you could also create a library of designs where you pre-generate different designs for different segments of the population with different attribute levels.
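
The idea of showing each respondent their own reported status quo next to the designed alternatives can be sketched in plain Python. This is only an illustration of the logic; it does not use the Qualtrics or SurveyEngine interfaces, and the design rows, attribute values, and function name are hypothetical.

```python
def build_tasks(design_rows, status_quo):
    """Attach a respondent's reported status quo levels to each designed choice task."""
    tasks = []
    for alt1, alt2 in design_rows:
        tasks.append({"alt1": dict(alt1), "alt2": dict(alt2), "sq": dict(status_quo)})
    return tasks

# Hypothetical design rows for alt1 and alt2 only (the SQ is not part of the design).
design_rows = [
    ({"sorting": 2, "cost": 40}, {"sorting": 7, "cost": 100}),
    ({"sorting": 4, "cost": 20}, {"sorting": 1, "cost": 60}),
]
# Levels the respondent reported in the first part of the survey (hypothetical).
reported_sq = {"sorting": 1, "cost": 0}

tasks = build_tasks(design_rows, reported_sq)
for task in tasks:
    print(task)
```

Because the status quo levels differ between respondents, they are recorded with each response and used as the SQ attribute levels when the model is estimated.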

You cannot guarantee attribute level balance within each block in an efficient design; this can only be guaranteed in an orthogonal design (but an orthogonal design cannot avoid dominant alternatives). Not having attribute level balance does not really affect the data analysis. It is usually considered "nice to have" in a design, but it cannot always be guaranteed.

Michiel

by **Steven Guu** » Fri Jul 29, 2022 12:29 pm

Dear Michiel

Thank you very much for your suggestions; I really appreciate your very helpful response.

When I run the syntax, Ngene shows the following warnings:
Warning: Two alternatives were specified for alternative dominance checking, but do not have the same attribute priors, and so will not be checked. 'alt1', 'none'
Warning: Two alternatives were specified for alternative dominance checking, but do not have the same attribute priors, and so will not be checked. 'alt2', 'none'
Warning: One or more attributes will not have level balance with the number of rows specified: alt1.collected, alt1.point, alt1.cost, alt2.collected, alt2.point, alt2.cost
Warning: The value of the prior ‘b0’ is specified again in the same model, the first specification will be used while the repeated one ignored.

Can I ignore these warnings?

This is my code:
Design
;alts = alt1*,alt2*,none*
;rows = 16
;block = 4,minsum
;eff = (mnl,d)
;con
;alg = mfederov(candidates=1000)

;model:
U(alt1) =b0[0]+ b1.dummy[0.001|0.002|0.003] * sorting[2,4,7,1]
+ b2.dummy[0.001|0.002] * collected[2,3,1]
+ b3.dummy[0.001|0.003] * point[2,3,1]
+ b4[-.001] * cost[20,40,60,80,100,200](2-3,2-3,2-3,2-3,2-3,2-3)
/
U(alt2) =b0[0]+ b1*sorting+b2*collected+b3*point+b4*cost

$

In addition, do I need to give the same (near-zero) prior for b0 in both utility functions, and is the code above correct, or does it need further improvement?

Thank you for your time and help.

Best Regards
Steve

by **Michiel Bliemer** » Fri Jul 29, 2022 2:02 pm

You can of course only apply dominance checks for unlabelled alternatives alt1 and alt2, so please remove the asterisk for none.
You have a 6-level attribute, so attribute level balance cannot be achieved with 16 rows. Adjust the number of rows if you want to have attribute level balance.
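
The level balance condition can be checked with simple arithmetic: an attribute can only be perfectly balanced if the number of rows is a multiple of its number of levels. The Python sketch below applies this necessary condition to the level counts in the posted syntax; the function names are made up for illustration.

```python
from functools import reduce
from math import gcd

def balance_report(rows, level_counts):
    """For each attribute, report whether rows is a multiple of its number of levels."""
    return {name: rows % n_levels == 0 for name, n_levels in level_counts.items()}

def smallest_balanced_rows(level_counts):
    """Least common multiple of all level counts."""
    return reduce(lambda a, b: a * b // gcd(a, b), level_counts.values(), 1)

level_counts = {"sorting": 4, "collected": 3, "point": 3, "cost": 6}
report_16 = balance_report(16, level_counts)
report_24 = balance_report(24, level_counts)
min_rows = smallest_balanced_rows(level_counts)
print(report_16, report_24, min_rows)
```

With 16 rows only sorting (4 levels) can be balanced, which matches the attributes listed in the warning. Any multiple of 12 rows, such as 24, satisfies the condition for all four attributes.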

You can specify b0 in alt2 without repeating the prior value.

Michiel

by **Steven Guu** » Fri Jul 29, 2022 3:12 pm

Dear Michiel

Thank you very much for pointing out the problems of my code.
Three of my attributes (collected with 3 levels, point with 3 levels, and cost with 6 levels) will not have level balance, but the estimate is 99.4577. Do you think this design is good enough for analysing the data from the pilot study?

Design
;alts = alt1*,alt2*,none
;rows = 16
;block = 4,minsum
;eff = (mnl,d)
;con
;alg = mfederov(candidates=1000)

;model:
U(alt1) =b0[0]+ b1.dummy[0.001|0.002|0.003] * sorting[2,4,7,1]
+ b2.dummy[0.001|0.002] * collected[2,3,1]
+ b3.dummy[0.001|0.003] * point[2,3,1]
+ b4[-.001] * cost[20,40,60,80,100,200](2-3,2-3,2-3,2-3,2-3,2-3)
/
U(alt2) =b0+ b1*sorting+b2*collected+b3*point+b4*cost

$

Thanks again.

Best Regards
Steve

Pilot study design for bayesian efficient design
