choice-metrics.com

by **Andrew** » Thu Sep 12, 2024 8:18 pm

Hello,

I am a fan of the software, but I only need it every few months to create experimental designs. Therefore, I occasionally have to ask for help again.

For a research study, we would like to conduct a DCE (as Best-Worst Scaling Case 3). Our client wants us to keep the option open to analyze all possible 2-way interaction effects. We have a total of 3 attributes with 8, 6, and 6 levels. Four unlabeled alternatives are to be used. When calculating the minimum required size, I arrive at the following:

Attr1: 8-1 = 7
Attr2: 6-1 = 5
Attr3: 6-1 = 5
Attr1xAttr2: (8-1)*(6-1) = 35
Attr1xAttr3: (8-1)*(6-1) = 35
Attr2xAttr3: (6-1)*(6-1) = 25
Constant: 3 (to account for left-right bias)
Total = 115 * 3 = 345 rows
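
The tally above can be checked with a short script. This is a minimal Python sketch, not Ngene syntax, and the helper name `count_parameters` is hypothetical. It reproduces the parameter count from the level counts (8, 6, 6) plus the three constants, and the 345-row figure used in the post. The last line adds, purely for comparison and as an assumption not stated in the post, the usual degrees-of-freedom lower bound of parameters divided by (alternatives - 1).

```python
from itertools import combinations
from math import ceil


def count_parameters(levels, n_constants=0):
    """Count main-effect and two-way interaction parameters
    for categorical attributes with the given numbers of levels."""
    main = sum(l - 1 for l in levels)
    two_way = sum((a - 1) * (b - 1) for a, b in combinations(levels, 2))
    return main, two_way, main + two_way + n_constants


main, two_way, total = count_parameters([8, 6, 6], n_constants=3)
print(main, two_way, total)  # 17 95 115

# Row figure as computed in the post (parameters times 3).
rows_in_post = total * 3  # 345

# Assumption for comparison: each task with 4 alternatives contributes
# 3 independent observations, so rows >= parameters / (alternatives - 1).
min_rows_dof = ceil(total / (4 - 1))  # 39
```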

Here is our experimental design:

Code:

    design
    ;alts = opt1, opt2, opt3, opt4
    ;eff = (mnl, d, mean)
    ;rows = 350
    ;block = 25
    ;bdraws = halton(500)
    ;model:
    U(opt1) = b0
            + b1.effects[(n,-0.5,0.1)|(n,-0.4,0.1)|(n,-0.4,0.1)|(n,-0.2,0.1)|(n,0.5,0.1)|(n,0.4,0.1)|(n,0.3,0.1)] * region[0,1,2,3,4,5,6,7]
            + b2.effects[(n,0.4,0.1)|(n,0.3,0.1)|(n,0.2,0.1)|(n,-0.2,0.1)|(n,-0.3,0.1)] * risk[0,1,2,3,4,5]
            + b3.effects[(n,0.4,0.1)|(n,0.3,0.1)|(n,0.2,0.1)|(n,-0.2,0.1)|(n,-0.3,0.1)] * incentive[0,1,2,3,4,5] /
    U(opt2) = b0 + b1 * region + b2 * risk + b3 * incentive /
    U(opt3) = b0 + b1 * region + b2 * risk + b3 * incentive /
    U(opt4) = b1 * region + b2 * risk + b3 * incentive
    $

We have specified 350 rows in 25 blocks. We are using effect coding. The priors are based on good guesses. Unfortunately, we only have a sample size of 150 participants. Therefore, we did not conduct a pilot study to collect priors. It is possible that, especially for the attribute "region," the priors may not reflect reality. At first glance, the results and design properties look quite good.
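
To make the blocking arithmetic concrete, here is a small Python sketch (the function name `block_summary` is hypothetical). It assumes every respondent completes exactly one block and that respondents are spread evenly across blocks; neither assumption is stated in the post.

```python
def block_summary(rows, blocks, respondents):
    """Return (tasks per respondent, respondents per block),
    assuming one block per respondent and an even spread."""
    if rows % blocks:
        raise ValueError("rows must be divisible by the number of blocks")
    tasks_per_block = rows // blocks
    respondents_per_block = respondents / blocks
    return tasks_per_block, respondents_per_block


tasks, per_block = block_summary(rows=350, blocks=25, respondents=150)
print(tasks, per_block)  # 14 6.0

# Total choice observations collected under these assumptions.
total_choices = 150 * tasks  # 2100
```

Under these assumptions each respondent answers 14 choice tasks and each row of the design is answered by about 6 respondents.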

Our questions:

1) Is the design good (efficient) enough to analyze all main and all possible 2-way interaction effects?

2) How much could the relatively small sample size of 150 affect the estimation accuracy?

3) In the design, b0 is specified. The constant is only needed to test for possible left-right bias. Does the constant still need to be considered in the utility function when creating the design? Does including the constant affect the efficiency of the design?

4) Since we did not conduct a pilot study and the priors are based on 'good guesses' (especially for the 'Region' attribute), how sensitive is the design to possible incorrect assumptions about the priors? How can I minimize the risk of incorrect priors?

5) What could be changed or improved in the syntax?

I look forward to any feedback. Thank you very much!

Andrew

by **Michiel Bliemer** » Thu Sep 12, 2024 9:00 pm

1) Maybe, but since you did not include the interaction effects in your utility functions, you cannot guarantee that all interaction effects can be estimated. To make sure of this, you would need to add all interaction effects explicitly to the utility functions, as in the example below where I added the region * risk interactions. However, in this model the interaction parameters would greatly outnumber the main-effect parameters, so the design would be mostly efficient for the interaction effects and less efficient for the main effects. To be honest, I think the number of interaction effects is far too large and you may need to accept the risk of not being able to estimate all of them. See also my next response.

2) It is unlikely that you will be able to estimate statistically significant interaction effects with only 150 respondents. You will need a huge sample size to estimate 115 parameters. But hopefully all main effects will come out statistically significant. I would focus on the main effects only if you have such a small sample size, and if you are lucky you will be able to obtain SOME statistically significant interaction effects.

3) No, you do not optimise for estimating any left-to-right bias coefficients in your experimental design, so just remove b0. In model estimation, you will add 3 DIFFERENT constants.

4) Only (near) zero priors are risk free. Using a Bayesian efficient design as you do here also mitigates the risk to some extent, but if the priors are really off then your design could become inefficient.

5) If there is any chance of dominance, then you need to add an * to each alternative in the ;alts command, and also use ;alg = mfederov.

Code:

    design
    ;alts = opt1, opt2, opt3, opt4
    ;eff = (mnl, d, mean)
    ;rows = 350
    ;block = 25
    ;bdraws = halton(500)
    ;model:
    U(opt1) = b1.effects[(n,-0.5,0.1)|(n,-0.4,0.1)|(n,-0.4,0.1)|(n,-0.2,0.1)|(n,0.5,0.1)|(n,0.4,0.1)|(n,0.3,0.1)] * region[0,1,2,3,4,5,6,7]
            + b2.effects[(n,0.4,0.1)|(n,0.3,0.1)|(n,0.2,0.1)|(n,-0.2,0.1)|(n,-0.3,0.1)] * risk[0,1,2,3,4,5]
            + b3.effects[(n,0.4,0.1)|(n,0.3,0.1)|(n,0.2,0.1)|(n,-0.2,0.1)|(n,-0.3,0.1)] * incentive[0,1,2,3,4,5]
            + i00 * region.effects[0] * risk.effects[0] + i01 * region.effects[0] * risk.effects[1] + i02 * region.effects[0] * risk.effects[2] + i03 * region.effects[0] * risk.effects[3] + i04 * region.effects[0] * risk.effects[4]
            + i10 * region.effects[1] * risk.effects[0] + i11 * region.effects[1] * risk.effects[1] + i12 * region.effects[1] * risk.effects[2] + i13 * region.effects[1] * risk.effects[3] + i14 * region.effects[1] * risk.effects[4]
            + i20 * region.effects[2] * risk.effects[0] + i21 * region.effects[2] * risk.effects[1] + i22 * region.effects[2] * risk.effects[2] + i23 * region.effects[2] * risk.effects[3] + i24 * region.effects[2] * risk.effects[4]
            + i30 * region.effects[3] * risk.effects[0] + i31 * region.effects[3] * risk.effects[1] + i32 * region.effects[3] * risk.effects[2] + i33 * region.effects[3] * risk.effects[3] + i34 * region.effects[3] * risk.effects[4]
            + i40 * region.effects[4] * risk.effects[0] + i41 * region.effects[4] * risk.effects[1] + i42 * region.effects[4] * risk.effects[2] + i43 * region.effects[4] * risk.effects[3] + i44 * region.effects[4] * risk.effects[4]
            + i50 * region.effects[5] * risk.effects[0] + i51 * region.effects[5] * risk.effects[1] + i52 * region.effects[5] * risk.effects[2] + i53 * region.effects[5] * risk.effects[3] + i54 * region.effects[5] * risk.effects[4]
            + i60 * region.effects[6] * risk.effects[0] + i61 * region.effects[6] * risk.effects[1] + i62 * region.effects[6] * risk.effects[2] + i63 * region.effects[6] * risk.effects[3] + i64 * region.effects[6] * risk.effects[4] /
    U(opt2) = b1 * region + b2 * risk + b3 * incentive
            + i00 * region.effects[0] * risk.effects[0] + i01 * region.effects[0] * risk.effects[1] + i02 * region.effects[0] * risk.effects[2] + i03 * region.effects[0] * risk.effects[3] + i04 * region.effects[0] * risk.effects[4]
            + i10 * region.effects[1] * risk.effects[0] + i11 * region.effects[1] * risk.effects[1] + i12 * region.effects[1] * risk.effects[2] + i13 * region.effects[1] * risk.effects[3] + i14 * region.effects[1] * risk.effects[4]
            + i20 * region.effects[2] * risk.effects[0] + i21 * region.effects[2] * risk.effects[1] + i22 * region.effects[2] * risk.effects[2] + i23 * region.effects[2] * risk.effects[3] + i24 * region.effects[2] * risk.effects[4]
            + i30 * region.effects[3] * risk.effects[0] + i31 * region.effects[3] * risk.effects[1] + i32 * region.effects[3] * risk.effects[2] + i33 * region.effects[3] * risk.effects[3] + i34 * region.effects[3] * risk.effects[4]
            + i40 * region.effects[4] * risk.effects[0] + i41 * region.effects[4] * risk.effects[1] + i42 * region.effects[4] * risk.effects[2] + i43 * region.effects[4] * risk.effects[3] + i44 * region.effects[4] * risk.effects[4]
            + i50 * region.effects[5] * risk.effects[0] + i51 * region.effects[5] * risk.effects[1] + i52 * region.effects[5] * risk.effects[2] + i53 * region.effects[5] * risk.effects[3] + i54 * region.effects[5] * risk.effects[4]
            + i60 * region.effects[6] * risk.effects[0] + i61 * region.effects[6] * risk.effects[1] + i62 * region.effects[6] * risk.effects[2] + i63 * region.effects[6] * risk.effects[3] + i64 * region.effects[6] * risk.effects[4] /
    U(opt3) = b1 * region + b2 * risk + b3 * incentive
            + i00 * region.effects[0] * risk.effects[0] + i01 * region.effects[0] * risk.effects[1] + i02 * region.effects[0] * risk.effects[2] + i03 * region.effects[0] * risk.effects[3] + i04 * region.effects[0] * risk.effects[4]
            + i10 * region.effects[1] * risk.effects[0] + i11 * region.effects[1] * risk.effects[1] + i12 * region.effects[1] * risk.effects[2] + i13 * region.effects[1] * risk.effects[3] + i14 * region.effects[1] * risk.effects[4]
            + i20 * region.effects[2] * risk.effects[0] + i21 * region.effects[2] * risk.effects[1] + i22 * region.effects[2] * risk.effects[2] + i23 * region.effects[2] * risk.effects[3] + i24 * region.effects[2] * risk.effects[4]
            + i30 * region.effects[3] * risk.effects[0] + i31 * region.effects[3] * risk.effects[1] + i32 * region.effects[3] * risk.effects[2] + i33 * region.effects[3] * risk.effects[3] + i34 * region.effects[3] * risk.effects[4]
            + i40 * region.effects[4] * risk.effects[0] + i41 * region.effects[4] * risk.effects[1] + i42 * region.effects[4] * risk.effects[2] + i43 * region.effects[4] * risk.effects[3] + i44 * region.effects[4] * risk.effects[4]
            + i50 * region.effects[5] * risk.effects[0] + i51 * region.effects[5] * risk.effects[1] + i52 * region.effects[5] * risk.effects[2] + i53 * region.effects[5] * risk.effects[3] + i54 * region.effects[5] * risk.effects[4]
            + i60 * region.effects[6] * risk.effects[0] + i61 * region.effects[6] * risk.effects[1] + i62 * region.effects[6] * risk.effects[2] + i63 * region.effects[6] * risk.effects[3] + i64 * region.effects[6] * risk.effects[4] /
    U(opt4) = b1 * region + b2 * risk + b3 * incentive
            + i00 * region.effects[0] * risk.effects[0] + i01 * region.effects[0] * risk.effects[1] + i02 * region.effects[0] * risk.effects[2] + i03 * region.effects[0] * risk.effects[3] + i04 * region.effects[0] * risk.effects[4]
            + i10 * region.effects[1] * risk.effects[0] + i11 * region.effects[1] * risk.effects[1] + i12 * region.effects[1] * risk.effects[2] + i13 * region.effects[1] * risk.effects[3] + i14 * region.effects[1] * risk.effects[4]
            + i20 * region.effects[2] * risk.effects[0] + i21 * region.effects[2] * risk.effects[1] + i22 * region.effects[2] * risk.effects[2] + i23 * region.effects[2] * risk.effects[3] + i24 * region.effects[2] * risk.effects[4]
            + i30 * region.effects[3] * risk.effects[0] + i31 * region.effects[3] * risk.effects[1] + i32 * region.effects[3] * risk.effects[2] + i33 * region.effects[3] * risk.effects[3] + i34 * region.effects[3] * risk.effects[4]
            + i40 * region.effects[4] * risk.effects[0] + i41 * region.effects[4] * risk.effects[1] + i42 * region.effects[4] * risk.effects[2] + i43 * region.effects[4] * risk.effects[3] + i44 * region.effects[4] * risk.effects[4]
            + i50 * region.effects[5] * risk.effects[0] + i51 * region.effects[5] * risk.effects[1] + i52 * region.effects[5] * risk.effects[2] + i53 * region.effects[5] * risk.effects[3] + i54 * region.effects[5] * risk.effects[4]
            + i60 * region.effects[6] * risk.effects[0] + i61 * region.effects[6] * risk.effects[1] + i62 * region.effects[6] * risk.effects[2] + i63 * region.effects[6] * risk.effects[3] + i64 * region.effects[6] * risk.effects[4]
    $
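
Typing 35 interaction terms per alternative by hand is error-prone. As a convenience, the terms could be generated with a few lines of Python and pasted into the script. This is only a sketch: the helper `interaction_terms` is hypothetical, and it simply mirrors the naming pattern used in the syntax above (`i<a><b> * region.effects[a] * risk.effects[b]`).

```python
def interaction_terms(attr_a, n_a, attr_b, n_b, prefix="i"):
    """Build effects-coded interaction terms following the naming
    pattern of the script above; n_a and n_b are levels minus one."""
    return [
        f"{prefix}{a}{b} * {attr_a}.effects[{a}] * {attr_b}.effects[{b}]"
        for a in range(n_a)
        for b in range(n_b)
    ]


# region has 8 levels (7 effects), risk has 6 levels (5 effects).
terms = interaction_terms("region", 7, "risk", 5)
print(len(terms))  # 35
print(" + ".join(terms[:2]))
```

The same call with other attribute pairs (for example `"region", 7, "incentive", 5`) would need a different `prefix` so that parameter names stay unique.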

by **Andrew** » Thu Sep 12, 2024 11:08 pm

Michiel, thank you very much. This was very helpful and eye-opening. Given our limited sample size of 150 participants, we are inclined to focus on estimating main effects and possibly some key interaction effects. We will need to discuss the possibility with our client, especially to see if they have specific hypotheses in mind.

Also, since we do not know about the heterogeneity of choice decisions in the sample, we are considering using a design without priors. What alternative design would be suitable? Given the unequal number of levels across all attributes (8, 6, 6), would an OOD still be feasible? If not, what other design strategies would you recommend that do not rely heavily on priors? The aim is still to be able to analyze the data with effect coding.

by **Andrew** » Fri Sep 20, 2024 7:58 pm

We have removed the interaction effects from the design. Estimating the main effects is more important. Nevertheless, we would use a large number of rows to cover many level combinations. Perhaps we will find indications of possible interaction effects (or perhaps just artifacts). We came up with the design shown below. Divided into 30 blocks, it has good level balance. Ngene repeatedly reports “A valid initial random design could not be generated after approximately 10 seconds.”, but then goes on to find any number of designs. We would use the design for a pilot test with n=15. The idea is that if the data already look good, we would continue with this design for the main study. I doubt that even after the pilot we will get clear priors, especially for the first categorical attribute. Does this approach seem reasonable? I am not completely sure if we can (or should) use the efficient design without priors.

Code:

    design
    ;alts = opt1*, opt2*, opt3*, opt4*
    ;block = 30
    ;eff = (mnl, d)
    ;alg = swap
    ;rows = 420
    ;model:
    U(opt1) = b1.dummy[0|0|0|0|0|0|0] * region[0,1,2,3,4,5,6,7]
            + b2.dummy[0|0|0|0|0] * risk[0,1,2,3,4,5]
            + b3.dummy[0|0|0|0|0] * incentive[0,1,2,3,4,5] /
    U(opt2) = b1 * region + b2 * risk + b3 * incentive /
    U(opt3) = b1 * region + b2 * risk + b3 * incentive /
    U(opt4) = b1 * region + b2 * risk + b3 * incentive
    $

by **Michiel Bliemer** » Sat Sep 21, 2024 8:36 am

The swapping algorithm would not be able to find a feasible design because the problem is highly constrained: since you only have 3 attributes, it is very likely that the same combination of attribute levels appears in multiple alternatives. The solution is to switch to the Modified Fedorov algorithm:

;alg = mfederov

Using zero priors is not a problem, it is similar to using an orthogonal design (which also implicitly assumes zero priors) but with the ability to remove dominant alternatives etc. If your attributes have a clear preference order, then you may want to use small near-zero priors to indicate this order, such that Ngene can automatically avoid dominant alternatives.

If dominant alternatives are not an issue, you could also consider using an orthogonal design with foldover, which has 2*144 = 288 rows.
;orth = ood
;foldover

The foldover doubles the choice tasks in the orthogonal design and ensures that all two-way interaction effects have low (zero or near-zero) correlations with the main effects. This is especially useful if you may want to estimate interaction effects later.

Michiel
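
The foldover idea can be illustrated with a small numeric sketch. The Python below is not Ngene's construction: it assumes one common definition of a foldover (each level x is mirrored to L - 1 - x) and uses a centred linear coding rather than dummy or effects coding, with a random 20-row base design instead of a real OOD. Under those assumptions, the cross-products between every main effect and every two-way interaction cancel exactly over the base design plus its foldover.

```python
import random
from itertools import combinations

LEVELS = [8, 6, 6]  # region, risk, incentive


def foldover(design, levels):
    """Mirror every level: x -> (L - 1 - x)."""
    return [[l - 1 - x for x, l in zip(row, levels)] for row in design]


def centered(row, levels):
    """Linear contrast code 2x - (L - 1), symmetric around zero."""
    return [2 * x - (l - 1) for x, l in zip(row, levels)]


rng = random.Random(0)
base = [[rng.randrange(l) for l in LEVELS] for _ in range(20)]
full = base + foldover(base, LEVELS)
coded = [centered(row, LEVELS) for row in full]

# Cross-product of each main effect m with each interaction a*b.
cross = {
    (m, a, b): sum(r[m] * r[a] * r[b] for r in coded)
    for m in range(len(LEVELS))
    for a, b in combinations(range(len(LEVELS)), 2)
}
print(all(v == 0 for v in cross.values()))  # True
```

Mirroring flips the sign of every centred code, so each product of three codes in the foldover half is the negative of its counterpart in the base half. With dummy or effects coding the cancellation is generally not exact, which is consistent with the "zero or near-zero" wording above.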

by **Andrew** » Mon Sep 23, 2024 6:28 pm

Thank you very much again!

We cannot establish a clear preference order for risk and incentives, as there appear to be dependencies on the region/field. Without a pilot, this will be hard to determine. I will test and compare the efficient design (using the Modified Fedorov algorithm) and the OOD foldover in a simulation.

I have one more question about the foldover design. With the foldover property, a foldover block column is added to the design, taking the values 1 and 2. How can I add more blocks to the design? If I also add the block property to the syntax, only the first foldover block is divided into blocks; the second foldover block is assigned only zeros. Can I add the blocks manually? I couldn't find anything in the Ngene manual about whether the foldover and block properties can be used together. Alternatively, we could do without blocking and randomly assign the choice tasks to the respondents. We have no experience with this so far, but we can implement it in practice using SurveyEngine.

by **Michiel Bliemer** » Tue Sep 24, 2024 11:32 am

I would simply randomly assign choice tasks. It will not be possible to use orthogonal blocking with such a large design and other blocking strategies are not perfect.

Michiel
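
A minimal sketch of what random assignment could look like, written in Python rather than in any survey platform's own configuration. The function name `assign_tasks` is hypothetical, and the 14 tasks per respondent is an assumption carried over from the earlier blocked designs (350 rows / 25 blocks); 288 rows and 150 respondents come from the thread.

```python
import random


def assign_tasks(n_tasks, n_respondents, tasks_per_respondent, seed=None):
    """Draw, for each respondent, a random set of distinct
    choice-task indices (0-based) from the full design."""
    rng = random.Random(seed)
    return [
        sorted(rng.sample(range(n_tasks), tasks_per_respondent))
        for _ in range(n_respondents)
    ]


assignment = assign_tasks(n_tasks=288, n_respondents=150,
                          tasks_per_respondent=14, seed=1)
print(len(assignment), len(assignment[0]))  # 150 14
```

Simple random sampling like this does not guarantee that every task is shown equally often; with 150 x 14 = 2100 draws over 288 tasks, each task is shown about 7 times on average, with some variation around that.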

by **Andrew** » Tue Sep 24, 2024 5:46 pm

Thank you very much! This was a great help and took me a big step forward.

Andrew

by **Andrew** » Wed Oct 09, 2024 6:39 pm

Dear Michiel, I have a follow-up question: I happened to stumble upon a comment in the forum where you wrote that there was a bug that only occurs in some cases with 5 or more levels and only affects the D-Optimality message. In our final design with more than 5 levels, “Undefined” is displayed in the output window under OOD D-Optimality. The D error for the design is 0.037378. Is our OOD design still valid? We are using the latest version 1.4.0 (Build: 24011)

Code:

    design
    ;alts = opt1*, opt2*, opt3*, opt4*
    ;orth = ood
    ;rows = 144
    ;foldover
    ;model:
    U(opt1) = b1.dummy[0|0|0|0|0|0|0] * x1[0,1,2,3,4,5,6,7]
            + b2.dummy[0|0|0|0|0] * x2[0,1,2,3,4,5]
            + b3.dummy[0|0|0|0|0] * x3[0,1,2,3,4,5] /
    U(opt2) = b1 * x1 + b2 * x2 + b3 * x3 /
    U(opt3) = b1 * x1 + b2 * x2 + b3 * x3 /
    U(opt4) = b1 * x1 + b2 * x2 + b3 * x3
    $

by **Michiel Bliemer** » Thu Oct 10, 2024 11:14 am

Just ignore the Undefined D-optimality; it is not a useful measure anyway. In the next version of Ngene (2.0) we will omit this information since it is misleading. The finite D-error indicates that this design is fine, so you can use it. Note that the asterisk (*) in the alts property does not do anything in orthogonal designs, as an orthogonal design cannot avoid dominant alternatives.

Experimental design for all 2-way interactions
