choice-metrics.com

by **xiaoxiaoxiao** » Thu Oct 31, 2024 1:56 am

Hi ChoiceMetrics team,

I have a few questions regarding experiment design in Ngene:

1. While developing an efficient experiment design, I generated preliminary estimated choice probabilities for each alternative. After averaging across 36 situations, the probabilities for alt1, alt2, and alt3 are 41%, 35%, and 24%, respectively. However, when checking which alternative had the highest probability in each choice situation, I found that alt1 had the highest probability in 23 situations, alt2 in 11, and alt3 in only 2. This makes me question whether alt3 might be too weak and alt1 too dominant.

2. According to the Ngene documentation, B-error serves as an indicator of utility balance across alternatives, with a recommended range of 0.7 to 0.9. In my design, B-error values range from 0.049 to 0.998, with an average of 0.794. Should I consider removing scenarios with B-error values outside this range, even if doing so reduces the number of scenarios below the predefined 36?

3. Would using the mfederov method instead of swap help avoid dominance by one alternative? However, I noticed that mfederov may not support continuous attributes or mimic attribute levels between alternatives. Is there another approach that might achieve a better balance without these limitations?

4. My design involves 31 parameters in total. Is 36 scenarios sufficient to estimate these parameters reliably? Additionally, if I move to a Bayesian efficient design, how many parameters would be recommended to treat as random?

Thank you very much for your guidance!

by **Michiel Bliemer** » Thu Oct 31, 2024 4:33 pm

1. I would not worry about that too much. An optimal design does not have utility balance; one alternative will always be preferred over the others. The percentages you indicate sound reasonable. If you have alternative-specific parameters, then you will likely need a larger sample size to estimate the parameters for alternative 3. You can only make alt3 more attractive by changing the range of the attribute levels, but that may not always be possible or desirable.

2. I would not remove choice tasks since the design was constructed such that it captures all relevant information across all choice tasks. Removing some choice tasks may lead to issues in capturing information about certain parameters and in the worst case you could end up with data with which you cannot estimate a model. Note that a choice task with probabilities 1%-49%-50% will produce a B-error close to 0, but this is still a useful choice task because it captures information about the trade-offs between alt2 and alt3.

3. Note that only strict dominance with unlabelled alternatives is problematic. A labelled alternative being more preferred than another is not really an issue; it may simply be that the labels of the alternatives have a strong effect, and that is just reality (e.g. car is more preferred than bus). The modified Fedorov algorithm can make the design more efficient, but it will let go of attribute level balance (not utility balance), so there is a trade-off.

4. Yes, that would be sufficient to estimate all parameters, but increasing the number of rows never hurts. I generally use 3*K/(J-1) as a rule of thumb for the minimum number of rows, where K is the number of parameters and J is the number of alternatives. In your case that would be 3*31/2 = 46.5, so 47 rows as a minimum; the multiplier of 3 gives enough variation in the data to perhaps also estimate some interaction effects later. But 36 will work. For a Bayesian efficient design, I would limit the number of random priors to 8-10, focusing on the most important attributes and keeping the other priors fixed. The most important attributes are the ones that have the largest contribution to utility, computed as coefficient * (maximum level - minimum level) in the case of a numerical attribute, or by looking at the largest positive/negative coefficient for a dummy coded attribute.
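
Point 2 can be checked numerically. The sketch below assumes that the utility-balance value of a single choice task is the product over alternatives of P_j / (1/J); that is an assumption about how the B-error discussed here is defined, so the Ngene manual should be consulted for the exact formula. The function name `b_value` is hypothetical.

```python
from math import prod

def b_value(probs):
    """Assumed utility-balance measure for one choice task.

    Multiplies P_j / (1/J) over the J alternatives: 1.0 when all
    probabilities are equal, near 0 when any alternative is rarely chosen.
    """
    num_alts = len(probs)
    return prod(p * num_alts for p in probs)

balanced = b_value([1 / 3, 1 / 3, 1 / 3])
lopsided = b_value([0.01, 0.49, 0.50])  # the 1%-49%-50% task from point 2

print(round(balanced, 3), round(lopsided, 3))
```

Under this assumption the 1%-49%-50% task scores about 0.066, even though alt2 and alt3 are almost perfectly balanced against each other, which is why a low value alone is not a reason to drop the task.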

Michiel
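
The rule of thumb in point 4 and the utility-contribution ranking can be sketched as follows. The attribute names and prior values in `contributions` are made up for illustration and are not taken from the design in this thread; the helper names are hypothetical as well.

```python
import math

def min_rows(num_params, num_alts, multiplier=3):
    """Rule-of-thumb minimum number of rows: multiplier * K / (J - 1), rounded up."""
    return math.ceil(multiplier * num_params / (num_alts - 1))

def numeric_contribution(coef, min_level, max_level):
    """Utility contribution of a numerical attribute: |coef * (max - min)|."""
    return abs(coef * (max_level - min_level))

def dummy_contribution(coefs):
    """Utility contribution of a dummy coded attribute: largest |coefficient|."""
    return max(abs(c) for c in coefs)

rows_needed = min_rows(31, 3)  # 3 * 31 / 2 = 46.5, rounded up to 47

# Hypothetical priors, for illustration only.
contributions = {
    "time": numeric_contribution(-0.05, 10, 40),
    "cost": numeric_contribution(-0.2, 2, 6),
    "comfort": dummy_contribution([0.3, 0.6]),
}
ranking = sorted(contributions, key=contributions.get, reverse=True)

print(rows_needed, ranking)
```

Attributes at the top of `ranking` would be the first candidates for random (Bayesian) priors, with the rest kept fixed.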

by **xiaoxiaoxiao** » Wed Nov 06, 2024 8:09 am

Hi Michiel,

Thank you for your guidance! Below is the script for my experiment, where I am generating a pivot efficient design. I still have a few questions:

1. attrf is a reference attribute ranging from 15 to 60. I use pvattrf1 to pvattrf4 as pivot factors for this attribute. For alternatives 2 and 3, attrf is multiplied by different ratios to represent two other attributes. For alternative 1, attrf is multiplied by co[0.55] to represent attribute "co". This results in multiple interactions within each utility function. Is this setup correct?

2. The attribute "another" only has one level [4]. However, it’s essential to the utility function, so I can’t remove it. Is it correct to include it in this way?

3. The context attributes from context1 to context4 should have the same level across all alternatives, so I’m using the ;require function. However, I encountered the following error:
"Error: The modified Federov candidate set size of 2000 could not be achieved. The percentages of candidates that failed are: 0% due dominance, 100% due constraints, and 0% due repeated alternatives. The candidate set size has been adjusted from 2000 to 0.
[Modified Fedorov] ERROR: The candidate set of the Modified Fedorov algorithm is smaller than the number of rows specified. That is, there are not enough unique choice sets to generate the number required as specificed in the ;rows property. This problem sometimes appears when there are too many reject and/or reject constraints."
Could you please advise on how to resolve this issue?

4. I defined the marginal utility of context2 as b209.dummy[-0.2|-0.1] * context2[1,2,0]. Does this mean the third level is the reference, with prior parameters of -0.2 and -0.1 for the first and second levels?

5. I have 31 parameters to estimate and a total of 48 choice tasks. Could you suggest a recommended sample size?

Thank you very much for reviewing and helping with these questions. I look forward to your reply.

Code:

    Design
    ;alts = alt1, alt2, alt3
    ;rows = 48
    ;eff = (mnl,d)
    ;alg = mfederov
    ;require:
    alt2.context1 = alt3.context1,
    alt2.context2 = alt3.context2,
    alt2.context3 = alt3.context3,
    alt2.context4 = alt3.context4,
    alt1.attrf = alt2.attrf,
    alt1.attrf = alt3.attrf,
    alt2.att6 = alt3.att6
    ;model:
    U(alt1) = b101[-0.8] * pvattrf1[0.8,1.2,1.5] * attrf[15:60:1]
            + b104[-0.2] * pvatt3[0.5,1.1,1.7] * att3[15]
            + b105[-0.2] * pvco[0.3,0.6,0.9] * attrf * co[0.55]
    /
    U(alt2) = b200[-0.3]
            + b201[-0.8] * pvattrf2[0.8,1.1,1.4] * attrf * ratio1[0.5]
            + b202[-0.6] * pvattrf3[0.9,1.1,1.2] * attrf * ratio2[0.6]
            + b205[-0.8] * att6[5,10]
            + b105 * pvco[0.3,0.6,0.9] * attrf * co * ratio1
            + b105 * another[4]
            + b208.dummy[-0.1] * context1[1,0]
            + b209.dummy[-0.2|-0.1] * context2[1,2,0]
            + b210.dummy[-0.1] * context3[1,0]
            + b211.dummy[-0.2|-0.1] * context4[1,2,0]
    /
    U(alt3) = b300[-0.9]
            + b301[-0.6] * pvattrf4[0.8,1.1,1.4] * attrf * ratio3[0.7]
            + b302[-0.6] * pvattrf3[0.9,1.1,1.2] * attrf * ratio2
            + b308[-0.8] * att6
            + b105 * another[4]
            + b310.dummy[-0.1] * context1
            + b311.dummy[-0.2|-0.1] * context2
            + b312.dummy[-0.1] * context3
            + b313.dummy[-0.2|-0.1] * context4
    $

by **Michiel Bliemer** » Wed Nov 06, 2024 8:12 pm

1. Yes, I think so. But note that your priors are too large: multiplying a parameter value of -0.8 by an attribute level as large as 60 yields an enormous utility. You should take prior values from a pilot study or otherwise choose values very close to 0, as otherwise you may obtain a very inefficient design.

2. Yes, but only because you are also using parameter b105 for another interaction effect; otherwise you would simply be adding a constant.

3. The mfederov algorithm creates by default a candidate set of 2000 rows. But given that you have a very large number of levels for attrf, it is unlikely that there will exist choice tasks in this candidate set that satisfy your constraints. You could increase the number of candidates but this will likely need to be very large and impractical. A better option is to use the default swapping algorithm and create scenario constraints directly in the utility function, see script below.

4. Yes

5. Ngene produces sample size estimates in the output based on the priors that you provided. If your priors are not reliable (i.e. not coming from a pilot study), then you should likely ignore the sample size estimates. There is no other way to determine the sample size you need, as it is case specific.
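
The size problem in point 1 can be made concrete with a small calculation. The prior (-0.8), the largest pivot factor (1.5) and the largest attrf level (60) are taken from the posted script; the prior of -0.08 is simply ten times smaller. The binary logit helper is a simplification (the design has three alternatives), and the gap values are illustrative.

```python
import math

def utility_term(prior, pivot, level):
    """Contribution of one prior * pivot factor * attribute level term."""
    return prior * pivot * level

def binary_logit_share(gap):
    """Choice probability of the alternative that is `gap` utility units worse."""
    return 1.0 / (1.0 + math.exp(gap))

original = utility_term(-0.8, 1.5, 60)   # about -72
rescaled = utility_term(-0.08, 1.5, 60)  # about -7.2
print(round(original, 2), round(rescaled, 2))

# Even modest utility gaps push logit probabilities towards 0 or 1.
for gap in (1, 5, 10):
    print(gap, binary_logit_share(gap))
```

When utility differences between alternatives are this large, the predicted choices become nearly deterministic and the choice tasks carry little information about trade-offs.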

Code:

    Design
    ;alts = alt1, alt2, alt3
    ;rows = 48
    ;eff = (mnl,d)
    ;model:
    U(alt1) = b101[-0.08] * pvattrf1[0.8,1.2,1.5] * attrf[15:60:1]
            + b104[-0.02] * pvatt3[0.5,1.1,1.7] * att3[15]
            + b105[-0.02] * pvco[0.3,0.6,0.9] * attrf * co[0.55]
    /
    U(alt2) = b200[-0.3]
            + b201[-0.08] * pvattrf2[0.8,1.1,1.4] * attrf[attrf] * ratio1[0.5]
            + b202[-0.06] * pvattrf3[0.9,1.1,1.2] * attrf[attrf] * ratio2[0.6]
            + b205[-0.08] * att6[5,10]
            + b105 * pvco[0.3,0.6,0.9] * attrf * co * ratio1
            + b105 * another[4]
            + b208.dummy[-0.1] * context1[1,0]
            + b209.dummy[-0.2|-0.1] * context2[1,2,0]
            + b210.dummy[-0.1] * context3[1,0]
            + b211.dummy[-0.2|-0.1] * context4[1,2,0]
    /
    U(alt3) = b300[-0.9]
            + b301[-0.06] * pvattrf4[0.8,1.1,1.4] * attrf[attrf] * ratio3[0.7]
            + b302[-0.06] * pvattrf3[0.9,1.1,1.2] * attrf[attrf] * ratio2
            + b308[-0.08] * att6[att6]
            + b105 * another[4]
            + b310.dummy[-0.1] * context1[context1]
            + b311.dummy[-0.2|-0.1] * context2[context2]
            + b312.dummy[-0.1] * context3[context3]
            + b313.dummy[-0.2|-0.1] * context4[context4]
    $
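
A rough calculation suggests why the candidate set in point 3 came up empty. The sketch below assumes, purely for illustration, that each candidate choice task draws every attribute level independently and uniformly at random; Ngene's actual candidate generation may differ. Level counts come from the posted script (attrf[15:60:1] has 46 levels), and each entry counts how many equality constraints from ;require apply to that attribute.

```python
from fractions import Fraction

# attribute -> (number of levels, number of equality constraints in ;require)
constraints = {
    "attrf": (46, 2),     # alt1 = alt2 and alt1 = alt3
    "context1": (2, 1),   # alt2 = alt3
    "context2": (3, 1),
    "context3": (2, 1),
    "context4": (3, 1),
    "att6": (2, 1),
}

# Under the independence assumption, each equality holds with probability 1/levels.
p_feasible = Fraction(1)
for num_levels, num_equalities in constraints.values():
    p_feasible *= Fraction(1, num_levels) ** num_equalities

candidate_set_size = 2000
expected_feasible = candidate_set_size * p_feasible

print(p_feasible, float(expected_feasible))
```

Under these assumptions roughly one candidate in 152,352 satisfies all constraints, so a set of 2000 is expected to contain far fewer than one feasible choice task, in line with the "100% due constraints" message.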

Michiel
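
The dummy coding confirmed in point 4 can be illustrated with a minimal sketch, assuming the last listed level is the reference as stated there. The helper name `dummy_effect` is hypothetical.

```python
def dummy_effect(level, levels, priors):
    """Utility effect of a dummy coded attribute level.

    `levels` lists the levels in script order; the last one is the
    reference (effect 0), and `priors` holds one value per other level.
    """
    if len(priors) != len(levels) - 1:
        raise ValueError("need one prior per non-reference level")
    if level == levels[-1]:
        return 0.0
    return priors[levels.index(level)]

# b209.dummy[-0.2|-0.1] * context2[1,2,0]
levels = [1, 2, 0]
priors = [-0.2, -0.1]
effects = {lvl: dummy_effect(lvl, levels, priors) for lvl in levels}
print(effects)
```

Here level 1 carries the prior -0.2, level 2 carries -0.1, and level 0 is the reference with no effect on utility.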

by **xiaoxiaoxiao** » Thu Nov 07, 2024 1:14 am

Hi Michiel,

Thank you very much for your reply, which helps me a lot.

Regarding your second comment, b105 represents the cost parameter.

In alt1, the cost levels are defined as "pvco[0.3,0.6,0.9] * attrf * co[0.55]".
In alt2, the cost contains two parts: "pvco[0.3,0.6,0.9] * attrf * co * ratio1" and a constant term "another[4]". In other words, if I write down the cost of alt2 as a mathematical expression, it will be "pvco[0.3,0.6,0.9] * attrf * co * ratio1 + 4". Therefore, I write down the marginal utility of cost as "b105 * pvco[0.3,0.6,0.9] * attrf * co * ratio1 + b105 * another[4]".
In alt3, cost is constant and represented solely as "another[4]".

Since "another[4]" has no multiplied interactions with other attributes, should I include it as "b105 * another[4]" in alt2 and alt3 (making it part of the cost in alt2 and the entire cost in alt3), or would it be better to remove this attribute from both?

Thank you for your consideration!

Best regards,
Xiao

by **Michiel Bliemer** » Thu Nov 07, 2024 1:31 am

I missed that you also used b105 in alt1. Then it is fine to include it as b105 * another[4] in the other utilities to state that it has a fixed cost of 4.
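
The cost structure described in this exchange can be written out as a small sketch. The constants come from the posted script (co = 0.55, ratio1 = 0.5, another = 4, b105 = -0.02 in the revised script); the function names and the example inputs pvco = 0.6 and attrf = 30 are illustrative choices.

```python
CO = 0.55       # co[0.55]
RATIO1 = 0.5    # ratio1[0.5]
FIXED_COST = 4  # another[4]

def cost_alt1(pvco, attrf):
    return pvco * attrf * CO

def cost_alt2(pvco, attrf):
    # Variable part plus the fixed cost of 4.
    return pvco * attrf * CO * RATIO1 + FIXED_COST

def cost_alt3():
    # Cost is the fixed amount only.
    return FIXED_COST

b105 = -0.02
pvco, attrf = 0.6, 30

# b105 * (variable + fixed) equals the two separate b105 terms in the script.
combined = b105 * cost_alt2(pvco, attrf)
separate = b105 * pvco * attrf * CO * RATIO1 + b105 * FIXED_COST

print(cost_alt1(pvco, attrf), cost_alt2(pvco, attrf), cost_alt3())
print(abs(combined - separate) < 1e-12)
```

Because the same b105 multiplies every cost component, writing `b105 * another[4]` as its own term is just the distributed form of applying the cost parameter to the total cost.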


Questions on utility balance in Ngene
