## Pilot study design for bayesian efficient design

### Re: Pilot study design for bayesian efficient design

Dear Michiel

I also have another question about the none option in the survey. Can I label the none option as "your own current situation", or can I only label it "I prefer none of these"?

Thanks again.

Best
Steve
Steven Guu

Posts: 13
Joined: Wed Jun 15, 2022 12:54 am

### Re: Pilot study design for bayesian efficient design

Ngene reports a UTILITY balance of 99%, which has nothing to do with ATTRIBUTE LEVEL balance. The design is essentially 100% utility balanced because you are using zero priors such that the choice probabilities become 33-33-33% across the three alternatives.

You do not need utility balance in the design, nor do you need attribute level balance in the design, to be able to estimate your model. If your D-errors are not infinite, then you can estimate the model. It should be fine for the pilot study.
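To see why zero priors lead to an almost perfectly utility-balanced design, here is a minimal Python sketch of the MNL choice probabilities. The function name `mnl_probabilities` and the near-zero utilities are illustrative assumptions, not Ngene output.

```python
import math

def mnl_probabilities(utilities):
    """Return MNL (softmax) choice probabilities for a list of utilities."""
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

# With zero priors every alternative has utility 0, whatever its attribute levels.
zero_prior_probs = mnl_probabilities([0.0, 0.0, 0.0])
print([round(p, 3) for p in zero_prior_probs])

# With near-zero priors (hypothetical utilities) the probabilities barely move.
near_zero_probs = mnl_probabilities([0.004, 0.001, 0.0])
print([round(p, 3) for p in near_zero_probs])
```

Because the probabilities are equal by construction, the reported utility balance says nothing about how often each attribute level appears in the design.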

Michiel
Michiel Bliemer

Posts: 1642
Joined: Tue Mar 31, 2009 4:13 pm

### Re: Pilot study design for bayesian efficient design

Dear Michiel

Thank you very much for your advice, and sorry for asking such simple questions.
Following your suggestions, to achieve attribute level balance I am considering using 12 rows in 3 blocks, so that people still face 4 choice cards. Is that right?
Do you think 16 rows is sufficient for the design?
In addition, if I change the number of blocks from 3 to 4, people will be shown 3 choice cards. Which design do you think is better?
This is my code:
Design
;alts = alt1*,alt2*,none
;rows = 12
;block = 3,minsum
;eff = (mnl,d)
;con
;alg = mfederov(candidates=1000)

;model:
U(alt1) =b0+ b1.dummy[0.001|0.002|0.003] * sorting[2,4,7,1]
+ b2.dummy[0.001|0.002] * collected[2,3,1]
+ b3.dummy[0.001|0.003] * point[2,3,1]
+ b4[-.001] * cost[20,40,60,80,100,200](1-2,1-2,1-2,1-2,1-2,1-2)
/
U(alt2) =b0+ b1*sorting+b2*collected+b3*point+b4*cost

\$

Thank you very much for your time and help.

Best wishes
Steve
Steven Guu

Posts: 13
Joined: Wed Jun 15, 2022 12:54 am

### Re: Pilot study design for bayesian efficient design

If you ask me, I would use the syntax below. It uses the default swapping algorithm, so it has full attribute level balance. It uses 24 rows so that it has sufficient variation, and it is blocked in 4 because people can easily answer 6 choice tasks. You could block in more parts if you like, but if you show only 3 choice tasks to each respondent, you will need double the sample size to capture the same amount of information.

```
Design
;alts = alt1*,alt2*,none
;rows = 24
;block = 4,minsum
;eff = (mnl,d)
;con
;model:
U(alt1) =b0+ b1.dummy[0.001|0.002|0.003] * sorting[2,4,7,1]
+ b2.dummy[0.001|0.002] * collected[2,3,1]
+ b3.dummy[0.001|0.003] * point[2,3,1]
+ b4[-.001] * cost[20,40,60,80,100,200]
/
U(alt2) =b0+ b1*sorting+b2*collected+b3*point+b4*cost
$
```
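The blocking trade-off mentioned in this reply can be checked with a little arithmetic. The sketch below is only an illustration: the helper names and the target of 900 answered choice tasks are hypothetical, and it counts observations without accounting for which tasks end up in which block.

```python
def tasks_per_respondent(rows, blocks):
    """Number of choice tasks each respondent answers when the design is blocked."""
    if rows % blocks != 0:
        raise ValueError("rows must be divisible by the number of blocks")
    return rows // blocks

def respondents_needed(target_observations, rows, blocks):
    """Respondents needed to collect a target number of answered choice tasks."""
    per_person = tasks_per_respondent(rows, blocks)
    return -(-target_observations // per_person)  # ceiling division

rows = 24
target = 900  # hypothetical target number of answered choice tasks

six_tasks = respondents_needed(target, rows, 4)    # 4 blocks -> 6 tasks each
three_tasks = respondents_needed(target, rows, 8)  # 8 blocks -> 3 tasks each
print(six_tasks, three_tasks)
```

Halving the number of tasks per respondent doubles the number of respondents needed for the same number of observations.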

Michiel
Michiel Bliemer

Posts: 1642
Joined: Tue Mar 31, 2009 4:13 pm

### Re: Pilot study design for bayesian efficient design

Dear Michiel

Thank you very much for your prompt and thorough explanation. Clear now. I really appreciate your time and help.

Best regards
Steven
Steven Guu

Posts: 13
Joined: Wed Jun 15, 2022 12:54 am

### Re: Pilot study design for bayesian efficient design

Dear Michiel

Some months ago I asked you for advice regarding the pilot design for my research. I really appreciate your time and help. My data collection (150 respondents) went very well and all parameters were significant. I am very thankful for having been able to count on your expertise!

Now I am going to design the choice cards for my main survey.

I have an unlabeled experiment with 2 alternatives plus an opt-out choice:
- Three alternatives (two plus one none)
- 4 attributes with 4, 3, 3 and 6 levels respectively: method of sorting, collection, point and cost
- The none alternative has no utility function

I tried to use the priors to construct a Bayesian efficient design, based on the following steps:
- Estimate an MNL model with ln (1) and without ln (1) (each with dummy coding and with effects coding) and obtain the parameter estimates and standard errors using Stata. Call this model1.
- Specify a syntax for model_mnl in Ngene for generating a Bayesian efficient design for this MNL model, using normal distributions with the mean set to the parameter estimate and the standard deviation set to the standard error.
- Estimate an MMNL model with ln (1) and without ln (1) (each with dummy coding and with effects coding) and obtain the parameter estimates (means and standard deviations, with all random parameters normally distributed). Call this model2.
- Specify a syntax for model_rppanel in Ngene for evaluating an efficient design (with fixed priors) for this RPPANEL model using the parameter estimates.
- Optimise on model_mnl, and evaluate in model_rppanel.

Then I read one of the forum posts and followed your suggestion to optimise for a Bayesian MNL model while evaluating the design for an rppanel model.

I found that I get results when using the effects-coded priors, but undefined results when using the dummy-coded priors.

The syntax looks like this:
1. Model 1: MNL ln(1) (effects code) and Model 2: MMNL ln(1) (effects code)

Design
;alts (model1)= alt1*, alt2*, none
;alts (model2)= alt1*, alt2*, none
;rows=24
;block=4,minsum
;eff = model1(mnl,d,mean)
;rdraws= gauss(3)
;bdraws= gauss(3)
;rep= 1000
;model(model1):
U(alt1)=b0 [2.1776]+b1.effects[(n,0.4817, 0.0982) |(n, 0.6718, 0.0911)|(n,0.0766,0.0801)]*sorting[2,4,7,1]+b2.effects[(n,0.2491, 0.1283)|(n,0.0986,0.2066)]*collected[2,3,1]+b3.effects[(n,0.1382, 0.0551)|(n,0.1964,0.0659)]*point[2,3,1]+b4[(n,-0.0011, 0.0009)]*cost[20,40,60,80,100,200]/
U(alt2) =b0+ b1*sorting+b2*collected+b3*point+b4*cost
;model(model2):
U(alt1)=b0 [2.6594]+b1.effects[n,0.9518, 1.1526 |n, 1.1794, 1.1526|n,0.1787,0.3335]*sorting[2,4,7,1]+b2.effects[n,0.3696, 0.5529|n,0.0850,0.9935]*collected[2,3,1]+b3.effects[n,0.1528, 0.2692|n,0.4018,0.4141]*point[2,3,1]+b4[n,-0.0048, 0.0192]*cost[20,40,60,80,100,200]/
U(alt2) =b0+ b1*sorting+b2*collected+b3*point+b4*cost
;alg=swap(stop=total(10mins))
\$

2. Model 1: MNL (effects) and Model 2: MMNL (effects)

Design
;alts (model1)= alt1*, alt2*, none
;alts (model2)= alt1*, alt2*, none
;rows=24
;block=4,minsum
;eff = model1(mnl,d,mean)
;rdraws= gauss(3)
;bdraws= gauss(3)
;rep= 1000
;model(model1):
U(alt1)=b0 [2.1776]+b1.effects[(n,0.4817, 0.0982) |(n, 0.6718, 0.0911)|(n,0.0766,0.0801)]*sorting[2,4,7,1]+b2.effects[(n,0.2491, 0.1283)|(n,0.0986,0.2066)]*collected[2,3,1]+b3.effects[(n,0.1382, 0.0551)|(n,0.1964,0.0659)]*point[2,3,1]+b4[(n,-0.0011, 0.0009)]*cost[20,40,60,80,100,200]/
U(alt2) =b0+ b1*sorting+b2*collected+b3*point+b4*cost

;model(model2):
U(alt1)=b0 [2.5666]+b1.effects[n,1.0219, 1.2793 |n, 1.2695, 1.1102|n,0.2339,0.5641]*sorting[2,4,7,1]+b2.effects[n,0.4472, 0.4141|n,0.2192,1.1468]*collected[2,3,1]+b3.effects[n,0.1577, 0.1206|n,0.3696,0.4963]*point[2,3,1]+b4[n,-0.0022, 0.0122]*cost[20,40,60,80,100,200]/
U(alt2) =b0+ b1*sorting+b2*collected+b3*point+b4*cost
;alg=swap(stop=total(10mins))
\$

3. Model 1: MNL (dummy) and Model 2: MMNL (dummy)

Design
;alts (model1)= alt1*, alt2*, none
;alts (model2)= alt1*, alt2*, none
;rows=24
;block=4,minsum
;eff = model1(mnl,d,mean)
;rdraws= gauss(3)
;bdraws= gauss(3)
;rep= 1000
;model(model1):
U(alt1)=b0 [0.6128]+b1.dummy[(n,1.7119, 0.1953) |(n, 1.9020, 0.1963)|(n,1.3068,0.1579)]*sorting[2,4,7,1]+b2.dummy[(n,0.2491, 0.1283)|(n,0.1505,0.1123)]*collected[2,3,1]+b3.dummy[(n,0.4728, 0.1089)|(n,0.5309,0.1257)]*point[2,3,1]+b4[(n,-0.0011, 0.0009)]*cost[20,40,60,80,100,200]/
U(alt2) =b0+ b1*sorting+b2*collected+b3*point+b4*cost

;model(model2):
U(alt1)=b0 [0.5318]+b1.dummy[n,2.2813, 1.1474 |n, 2.5509, 0.1700|n,1.7945,0.4958]*sorting[2,4,7,1]+b2.dummy[n,0.2876, 0.5755|n,-0.2078,0.8792]*collected[2,3,1]+b3.dummy[n,0.5745, 0.1680|n,0.6267,0.0042]*point[2,3,1]+b4[n,-0.0022, 0.0090]*cost[20,40,60,80,100,200]/
U(alt2) =b0+ b1*sorting+b2*collected+b3*point+b4*cost
;alg=swap(stop=total(10mins))
\$

4. Model 1: MNL ln(1) (dummy) and Model 2: MMNL ln(1) (dummy)

Design
;alts (model1)= alt1*, alt2*, none
;alts (model2)= alt1*, alt2*, none
;rows=24
;block=4,minsum
;eff = model1(mnl,d,mean)
;rdraws= gauss(3)
;bdraws= gauss(3)
;rep= 1000
;model(model1):
U(alt1)=b0 [0.6128]+b1.dummy[(n,1.7119, 0.1953) |(n, 1.9020, 0.1963)|(n,1.3068,0.1579)]*sorting[2,4,7,1]+b2.dummy[(n,0.2491, 0.1283)|(n,0.1505,0.1123)]*collected[2,3,1]+b3.dummy[(n,0.4728, 0.1089)|(n,0.5309,0.1257)]*point[2,3,1]+b4[(n,-0.0011, 0.0009)]*cost[20,40,60,80,100,200]/
U(alt2) =b0+ b1*sorting+b2*collected+b3*point+b4*cost

;model(model2):
U(alt1)=b0 [0.7429]+b1.dummy[n,2.1945, 1.0682 |n, 2.4467, 0.1767|n,1.7065,0.4894]*sorting[2,4,7,1]+b2.dummy[n,0.2059, 0.5370|n,0.2604,0.8903]*collected[2,3,1]+b3.dummy[n,0.5844, 0.2577|n,0.6827,0.0183]*point[2,3,1]+b4[n,-0.0042, 0.0154]*cost[20,40,60,80,100,200]/
U(alt2) =b0+ b1*sorting+b2*collected+b3*point+b4*cost
;alg=swap(stop=total(10mins))
\$

However, since my third alternative is none (the opt-out choice), it seems illogical to have an ASC in alternatives 1 and 2. The constant should be in alt3 to make sense.

Therefore, I tried to move the constant b0 from alt1 and alt2 to alt3, using the same mean but with a negative sign.
5. Model 1: MNL ln(1) (effects) and Model 2: MMNL ln(1) (effects)

Design
;alts (model1)= alt1*, alt2*, none
;alts (model2)= alt1*, alt2*, none
;rows=24
;block=4,minsum
;eff = model1(mnl,d,mean)
;rdraws= gauss(3)
;bdraws= gauss(3)
;rep= 1000
;model(model1):
U(alt1)=b1.effects[(n,0.4817, 0.0982) |(n, 0.6718, 0.0911)|(n,0.0766,0.0801)]*sorting[2,4,7,1]+b2.effects[(n,0.2491, 0.1283)|(n,0.0986,0.2066)]*collected[2,3,1]+b3.effects[(n,0.1382, 0.0551)|(n,0.1964,0.0659)]*point[2,3,1]+b4[(n,-0.0011, 0.0009)]*cost[20,40,60,80,100,200]/
U(alt2) =b1*sorting+b2*collected+b3*point+b4*cost/
U(none)=b0 [-2.1776]
;model(model2):
U(alt1)=b1.effects[n,0.9518, 1.1526 |n, 1.1794, 1.1526|n,0.1787,0.3335]*sorting[2,4,7,1]+b2.effects[n,0.3696, 0.5529|n,0.0850,0.9935]*collected[2,3,1]+b3.effects[n,0.1528, 0.2692|n,0.4018,0.4141]*point[2,3,1]+b4[n,-0.0048, 0.0192]*cost[20,40,60,80,100,200]/
U(alt2) =b1*sorting+b2*collected+b3*point+b4*cost/
U(none)=b0 [-2.6594]
;alg=swap(stop=total(10mins))

\$

6. Model 1: MNL (effects) and Model 2: MMNL (effects)

Design
;alts (model1)= alt1*, alt2*, none
;alts (model2)= alt1*, alt2*, none
;rows=24
;block=4,minsum
;eff = model1(mnl,d,mean)
;rdraws= gauss(3)
;bdraws= gauss(3)
;rep= 1000
;model(model1):
U(alt1)=b1.effects[(n,0.4817, 0.0982) |(n, 0.6718, 0.0911)|(n,0.0766,0.0801)]*sorting[2,4,7,1]+b2.effects[(n,0.2491, 0.1283)|(n,0.0986,0.2066)]*collected[2,3,1]+b3.effects[(n,0.1382, 0.0551)|(n,0.1964,0.0659)]*point[2,3,1]+b4[(n,-0.0011, 0.0009)]*cost[20,40,60,80,100,200]/
U(alt2) =b1*sorting+b2*collected+b3*point+b4*cost/
U(none) =b0[-2.1776]

;model(model2):
U(alt1)=b1.effects[n,1.0219, 1.2793 |n, 1.2695, 1.1102|n,0.2339,0.5641]*sorting[2,4,7,1]+b2.effects[n,0.4472, 0.4141|n,0.2192,1.1468]*collected[2,3,1]+b3.effects[n,0.1577, 0.1206|n,0.3696,0.4963]*point[2,3,1]+b4[n,-0.0022, 0.0122]*cost[20,40,60,80,100,200]/
U(alt2) =b1*sorting+b2*collected+b3*point+b4*cost/
U(none) =b0[-2.5666]
;alg=swap(stop=total(10mins))
\$

If possible, could I ask some questions about my designs? I want to pick either design 1 or design 5 for my main survey, because their priors are significant and the most reasonable. The others are just for comparison.
First, is the way I changed the constant above correct? If not, how can I delete the ASC in alt1 and alt2 and add a constant to alt3?

Secondly, how do I compare efficiency between those models, and how do I compare the MNL and rppanel results within one design? I have not found anything about this in the manual.

Thirdly, do I need to add ;alg = mfederov(candidates = 1000) and delete ;alg=swap(stop=total(10mins)) to keep the design balanced?

Lastly, I added ;alg=swap(stop=total(10mins)) after reading one of the forum posts. I assumed it is just for quickly checking whether the model works, and I do not know whether it is good for the design, or why.

Best regards
Steven
Steven Guu

Posts: 13
Joined: Wed Jun 15, 2022 12:54 am

### Re: Pilot study design for bayesian efficient design

Putting the same constant in alt1 and alt2 or putting a constant in none with inverse sign is exactly the same model, so it does not matter where you put the constant.
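The equivalence follows because MNL probabilities depend only on utility differences. The Python sketch below illustrates this under stated assumptions: `mnl_probabilities` is a hypothetical helper, the constant is the MNL prior from the scripts in the question, and the attribute-based utility parts are made-up numbers.

```python
import math

def mnl_probabilities(utilities):
    """Return MNL (softmax) choice probabilities for a list of utilities."""
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

b0 = 2.1776                   # constant prior from the scripts in the question
v_alt1, v_alt2 = 0.35, -0.10  # hypothetical attribute-based utility parts

# Specification A: the same constant in alt1 and alt2, none has utility 0.
probs_a = mnl_probabilities([b0 + v_alt1, b0 + v_alt2, 0.0])

# Specification B: no constant in alt1 and alt2, constant -b0 in none.
probs_b = mnl_probabilities([v_alt1, v_alt2, -b0])

print([round(p, 4) for p in probs_a])
print([round(p, 4) for p in probs_b])
```

Specification B is specification A with `b0` subtracted from every utility, so both print the same probabilities.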

You can only compare efficiency across designs for the same model with the same priors (e.g. an orthogonal design and an efficient design). You cannot compare efficiency across models with different specification, e.g. dummy versus effects coding or mnl versus rppanel.

The default swapping algorithm will try to maintain attribute level balance, mfederov will not.

I never use a stopping criterion and 10 minutes is usually not enough; I usually run scripts overnight when I have Bayesian priors and observe if the D-error has stabilised.

Note that you are using 3^8 = 6,561 draws for the random parameters, and for the rppanel model the default is also a sample of 500 (via ;rep = 500), so the total number of computations Ngene is doing for evaluating the mmnl model is more than 3 million! It will take a long time for Ngene to evaluate the resulting design. I generally omit such evaluations but you can try.
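The size of this evaluation can be reproduced with simple arithmetic. The sketch below follows the reasoning in the reply; the variable names are illustrative, and counting the work as draws times sample size is an assumption taken from that reasoning, not a description of Ngene internals.

```python
# Random parameters in the rppanel model: b1 (3 levels), b2 (2), b3 (2), b4 (1).
random_parameters = 3 + 2 + 2 + 1
points_per_dimension = 3   # from ;rdraws = gauss(3)
respondent_sample = 500    # default sample size mentioned in the reply (;rep = 500)

draws = points_per_dimension ** random_parameters
evaluations = draws * respondent_sample
print(draws, evaluations)
```

With `;rep = 1000`, as set in the scripts in the question, the same product doubles to 6,561,000.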

Michiel
PS: try to keep the questions short as otherwise I may not be able to answer them
Michiel Bliemer

Posts: 1642
Joined: Tue Mar 31, 2009 4:13 pm

### Re: Pilot study design for bayesian efficient design

Dear Michiel

Thank you very much for your prompt and very helpful response last time. I got very good results for the main survey.

Currently, I am designing a new DCE pilot study. The design will include 24 choice tasks in total, split into 4 blocks, so each respondent will receive 6 choice cards. I am designing it in Ngene using the D-error measure to find an efficient design for the MNL model, and I opted to use near-zero priors for all attributes.

I have an unlabeled experiment with 2 alternatives plus an opt-out:

- Three alternatives (two plus none)
- 4 attributes with 4, 3, 2 and 5 levels respectively

Design
;alts = alt1*,alt2*,none
;rows = 24
;block = 4,minsum
;eff = (mnl,d)
;con

;model:
U(alt1) =b0+ b1[0.002] * sonn[0,10,25,50] + b2[0.001] * gas[10,20,30] + b3.dummy[0.001] * reem[2,1] + b4[-.001] * compensation[100,200,300,400,500]/
U(alt2) =b0+ b1*sonn+b2*gas+b3*reem+b4*compensation
\$

When I run the above syntax, Ngene reports: "A valid random design could not be generated after approximately 10 seconds. In this time, of the 411110 attempts made, there were 0 row repetitions, 357015 alternative repetitions, and 56984 cases of dominance. There are a number of possible causes for this, including the specification of too many constraints, not having enough attribute levels for the number of rows required, and the use of too many scenario attributes."

I then changed from the default swapping algorithm to ;alg=mfederov(candidates=5000) and the syntax works well.

Why can I not get a result with the default swapping algorithm? Is the problem caused by having too many continuous variables?
If I use ;alg=mfederov(candidates=5000), do I need to worry about utility balance and dominance issues?
Thank you for your time and help.

Best regards
Steve
Steven Guu

Posts: 13
Joined: Wed Jun 15, 2022 12:54 am

### Re: Pilot study design for bayesian efficient design

The reason that the default swapping algorithm cannot find a feasible design is because of the dominance checks/constraints. You only have 4 attributes and the likelihood of a dominant alternative is high. The swapping algorithm is a column-based algorithm and it is likely that one of the 24 rows will have a dominant alternative. The mfederov algorithm can remove choice tasks with dominant alternatives in advance because it is a row-based algorithm.
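How likely a dominant alternative is with these four attributes can be gauged by enumerating every possible pair of profiles. This Python sketch is an illustration only and not Ngene's dominance check: it assumes every attribute has a clear preference ordering, represents each level by its rank in that ordering, and treats the 24 rows as independent random draws, which the swapping algorithm does not do.

```python
from itertools import combinations, product

# Number of levels per attribute in the script from the question.
# Each level is represented by its rank (0 = least preferred); the direction of
# preference is an assumption and does not affect the counts below.
level_counts = {"sonn": 4, "gas": 3, "reem": 2, "compensation": 5}

def dominates(a, b):
    """True if profile a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

profiles = list(product(*(range(n) for n in level_counts.values())))
pairs = list(combinations(profiles, 2))
dominant_pairs = [(a, b) for a, b in pairs if dominates(a, b) or dominates(b, a)]

share = len(dominant_pairs) / len(pairs)
prob_clean_design = (1 - share) ** 24  # 24 independently drawn rows, none dominant

print(len(profiles), len(pairs), len(dominant_pairs))
print(round(share, 3), prob_clean_design)
```

Under these assumptions roughly a third of all possible choice tasks contain a dominant alternative, so a 24-row design with none is rare unless such tasks are removed beforehand, which is what a row-based algorithm allows.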

When using (near-)zero priors, I generally recommend dummy coding all attributes, because otherwise the most efficient design will only pair the extreme levels and the inner levels (this is most efficient, but often not desirable). When dummy coding all attributes, you usually do not need to impose attribute level balance constraints, because an efficient design will by definition need all levels to appear more or less equally often to reduce the D-error.

My recommended script would be below. Usually the default of 2000 candidates is sufficient, with 5000 the algorithm takes quite long, but it is okay.

```
Design
;alts = alt1*,alt2*,none
;rows = 24
;block = 4,minsum
;eff = (mnl,d)
;alg = mfederov
;con
;model:
U(alt1) = b0
        + b1.dummy[0.01|0.02|0.03]          * sonn[10,25,50,0]
        + b2.dummy[0.01|0.02]               * gas[20,30,10]
        + b3.dummy[0.01]                    * reem[2,1]
        + b4.dummy[-0.01|-0.02|-0.03|-0.04] * compensation[200,300,400,500,100]
        /
U(alt2) = b0
        + b1 * sonn
        + b2 * gas
        + b3 * reem
        + b4 * compensation
$
```

Michiel

PS: Please create a new post for a new topic, you are now asking a question in an old and unrelated topic.
Michiel Bliemer

Posts: 1642
Joined: Tue Mar 31, 2009 4:13 pm

### Re: Pilot study design for bayesian efficient design

Dear Michiel

If I understand correctly, you suggest that I initially assume dummy coding for all attribute levels in a pilot study when using (near-)zero priors. I should then calculate the pilot results and utilize real priors for continuous variables (attributes 1, 2, and 4) and a dummy variable for attribute 3.

Or, do I need to consistently assume dummy coding for all attribute levels in the pilot study? In my research, I aim to investigate the coefficients for continuous variables (attributes 1, 2, and 4). I'm concerned whether this assumption of dummy coding for all attributes might impact the results we are seeking.

Thanks again for your time and help.

Best regards
Steve
Steven Guu

Posts: 13
Joined: Wed Jun 15, 2022 12:54 am
