Labelled vs Unlabelled design, constraints and interactions

Posted: Tue Jan 05, 2021 10:40 am
by Michail
Dear Ngene team and all,

First, I would like to take the opportunity to wish you a happy new year, full of health and happiness.

I am a beginner in Ngene and, although I have been reading the software’s manual and following discussions in the forum, I apologise in advance for any silly questions and mistakes.

I am working on generating an efficient design in which each choice task has two alternatives plus an opt-out option. The CE includes six attributes: one (price) is quantitative and the remaining five are qualitative. One attribute has 6 levels, three attributes have 3 levels and two attributes have 2 levels.

The problem I face is that the definition of one attribute, and consequently its levels, depends on another attribute and its levels (similar to the transportation mode and travel time case, I believe). More precisely, say attribute M has two levels (m1, m2). Attribute S is then defined slightly differently depending on whether attribute M takes the value m1 or m2.

I am not sure whether the CE should be labelled or unlabelled (in terms of Ngene design), so I am working on generating both. I am aware of the pros and cons of each type of design.
In the unlabelled CE I treated attribute S as having 6 levels (3 levels when attribute M takes the value m1 and 3 levels when attribute M takes the value m2) and introduced constraints. In the labelled CE I treated attribute S as 2 separate alternative-specific attributes (WS and FS), each with 3 levels, and did not introduce constraints.

Is my reasoning correct, or is it far from Ngene logic? The two designs can be seen below.

Code:
? Labelled Experiment
Design
;alts = W, F, none
;rows = 36
;eff = (mnl, d)
;con:
;block = 6
;model:
U(W) =  bw[0] +
        b2[0] * C[1,2,3,4,5,6] +
        b3.dummy[0] * N[1,2] +
        b4.dummy[0|0] * R[1,2,3] +
        b5.dummy[0|0] * T[1,2,3] +
        b6.dummy[0|0] * WS[1,2,3] +
        b8[0] * N.dummy[1]*T.dummy[1] +
        b9[0] * N.dummy[1]*T.dummy[2] +
        b10[0] * T.dummy[1]*WS.dummy[1] +
        b11[0] * T.dummy[1]*WS.dummy[2] +
        b12[0] * T.dummy[2]*WS.dummy[1] +
        b13[0] * T.dummy[2]*WS.dummy[2] +
        b14[0] * N.dummy[1]*WS.dummy[1] +
        b15[0] * N.dummy[1]*WS.dummy[2] /
U(F) =  bf[0] +
        b2 * C[1,2,3,4,5,6] +
        b3.dummy * N[1,2] +
        b4.dummy * R[1,2,3] +
        b5.dummy * T[1,2,3] +
        b7.dummy[0|0] * FS[1,2,3] +
        b8[0] * N.dummy[1]*T.dummy[1] +
        b9[0] * N.dummy[1]*T.dummy[2] +
        b16[0] * T.dummy[1]*FS.dummy[1] +
        b17[0] * T.dummy[1]*FS.dummy[2] +
        b18[0] * T.dummy[2]*FS.dummy[1] +
        b19[0] * T.dummy[2]*FS.dummy[2] +
        b20[0] * N.dummy[1]*FS.dummy[1] +
        b21[0] * N.dummy[1]*FS.dummy[2]
$

Code:
? Unlabelled Experiment
Design
;alts = alt1*, alt2*, none
;rows = 48
;eff = (mnl, d)
;cond:
if(alt1.M=1, alt1.S=[1,2,3]),
if(alt1.M=2, alt1.S=[4,5,6]),
if(alt2.M=1, alt2.S=[1,2,3]),
if(alt2.M=2, alt2.S=[4,5,6]),
if(alt1.M=1 and alt2.M=1, alt1.S=[1,2,3] and alt2.S=[1,2,3]),
if(alt1.M=2 and alt2.M=2, alt1.S=[4,5,6] and alt2.S=[4,5,6])
;block = 6
;model:
U(alt1) = bA[0] +
          b2[0] * C[1,2,3,4,5,6] +
          b3.dummy[0] * N[1,2] +
          b4.dummy[0|0] * R[1,2,3] +
          b5.dummy[0] * M[1,2] +
          b6.dummy[0|0] * T[1,2,3] +
          b7.dummy[0|0|0|0|0] * S[1,2,3,4,5,6] +
          b8[0] * T.dummy[1]*S.dummy[1] +
          b9[0] * T.dummy[1]*S.dummy[2] +
          b10[0] * T.dummy[1]*S.dummy[3] +
          b11[0] * T.dummy[1]*S.dummy[4] +
          b12[0] * T.dummy[1]*S.dummy[5] +
          b13[0] * T.dummy[2]*S.dummy[1] +
          b14[0] * T.dummy[2]*S.dummy[2] +
          b15[0] * T.dummy[2]*S.dummy[3] +
          b16[0] * T.dummy[2]*S.dummy[4] +
          b17[0] * T.dummy[2]*S.dummy[5] +
          b18[0] * N.dummy[1]*S.dummy[1] +
          b19[0] * N.dummy[1]*S.dummy[2] +
          b20[0] * N.dummy[1]*S.dummy[3] +
          b21[0] * N.dummy[1]*S.dummy[4] +
          b22[0] * N.dummy[1]*S.dummy[5] +
          b23[0] * N.dummy[2]*S.dummy[1] +
          b24[0] * N.dummy[2]*S.dummy[2] +
          b25[0] * N.dummy[2]*S.dummy[3] +
          b26[0] * N.dummy[2]*S.dummy[4] +
          b27[0] * N.dummy[2]*S.dummy[5] +
          b28[0] * N.dummy[1]*T.dummy[1] +
          b29[0] * N.dummy[1]*T.dummy[2] /
U(alt2) = bB[0] +
          b2 * C[1,2,3,4,5,6] +
          b3.dummy * N[1,2] +
          b4.dummy * R[1,2,3] +
          b5.dummy * M[1,2] +
          b6.dummy * T[1,2,3] +
          b7.dummy * S[1,2,3,4,5,6] 
$
 


  1. Would it be preferable to design a labelled CE, where respondents always choose between the two different values of attribute M and where I could see how variation in the attributes alters the choice? Or an unlabelled CE, where I reckon people will be able to compare alternatives with the same value of attribute M, which does make sense in the context of the research? And what is the difference in Ngene coding?
  2. Is the number of rows I specified correct in both designs?
  3. Are the constraints set in the unlabelled CE too many (overlapping) or incorrect?
  4. When I hit “run”, the labelled design runs fine, but I am not sure whether it has been specified correctly, because some indicators are undefined and I am unclear how to interpret others (e.g. D-error=0.805, B estimate=100, S estimate=0, Sp estimates=undefined, Sp t-ratios=0).
  5. When I hit run, the unlabelled CE runs. However, it gives me the following message and I press stop. What does it mean and what do I need to do?
    Warning: Two alternatives were specified for alternative dominance checking, but do not have the same priors, and so will not be checked. 'alt1', 'alt2'

    The conditional statement nesting cluster 1 contains 36 permissible combinations of attribute levels.
    The nesting cluster contains the following if statements:
    * if(alt1.m=1, alt1.s=[1,2,3])
    * if(alt1.m=2, alt1.s=[4,5,6])
    * if(alt1.m=1 and alt2.m=1, alt1.s=[1,2,3] and alt2.s=[1,2,3])
    * if(alt2.m=1, alt2.s=[1,2,3])
    * if(alt2.m=2, alt2.s=[4,5,6])
    * if(alt1.m=2 and alt2.m=2, alt1.s=[4,5,6] and alt2.s=[4,5,6])
    An attempt will be made to balance the frequency of each level in attributes affected by constraints, however complete balance might not be possible.
    Note: Defaulting to assigning blocks with the 'minsum' method.
    Warning: No valid design has been found after 1000 evaluations. There may be a problem with the specification of the design. A common problem is that the choice probabilities are too extreme (close to 1 and 0), perhaps because some or all of the prior values are too large. Also, it is generally a good idea to start with a simple design (MNL, non-Bayesian), then add complexity. If you press stop, a design will be reported, which may assist in diagnosing the problem.

  6. Are the interaction effects specified in the two models correct? Should they be specified in all utility functions (except the opt-out) or only in one or some of them? If the latter, which one(s)? Does it make sense to include them all, or could I include only some of them?
  7. Does the mode of selection affect the Ngene design, or is it something the researcher may decide to include or not? For example, with 2 alternatives plus the opt-out option, the mode of choice could be: (a) choose one only, or (b) if the opt-out has been chosen, state which of the two offerings you would choose if you had to. Does (b) make sense in a CE where 2 alternatives plus the opt-out exist?
  8. After the design is generated, will Ngene generate the formatted scenarios including the opt-out, or does this have to be done manually?
  9. On what basis should one decide the number of blocks?
  10. Is there any way to insert the qualitative attributes in the model, other than introducing dummies?

Your help is invaluable.
Thank you in advance.
Michail

Re: Labelled vs Unlabelled design, constraints and interacti

Posted: Tue Jan 05, 2021 11:41 am
by Michiel Bliemer
1. The choice for labelled or unlabelled depends on what your alternatives represent. Since you have not mentioned this, I cannot answer this question for you. If it is something like "Medication A" and "Medication B", where there cannot exist any difference in preference for each label, then it is unlabelled and the utility functions must be identical. In your unlabelled specification, however, they are NOT identical because (i) you have used different constants, and (ii) you have included interactions in one of them, but not in the other. So perhaps your alternatives are labelled, like "surgery" and "radiation therapy", which allows different utility functions with different constants, different coefficients for attributes, and different interaction terms. If your alternatives are unlabelled, please use the same constant bA for both alternatives, and add the same interaction terms to alt2.

2. You can specify ;rows any way you like, you can set them both to 36 or both to 48, that is a matter of choice. As long as the number of rows satisfies the degrees of freedom it is fine (and Ngene will give an error if it does not). For attribute level balance, you may want to choose a number that is divisible by the number of levels of each attribute.

3. Your model cannot be estimated with the current specification of constraints, dummy variables, and interaction terms in the unlabelled experiment syntax. You can check in the covariance matrix reported by Ngene that certain parameters have an extremely large variance/standard error, meaning that they cannot be estimated. I suggest that you build up your syntax step by step, first adding the attributes, then adding constraints one by one, and then adding interaction terms. At the moment there are too many things that may prevent your model from being estimated, and it is difficult for me to pinpoint where the exact issue is.

4. Yes, this syntax is fine. Everything is defined except, of course, the sample size estimates: because you have specified zero priors, these are undefined, so you can ignore them.

5. There are multiple issues with this syntax. First, your alternatives are not unlabelled according to your specification. You indicate with a * that alt1 and alt2 are unlabelled, but that means that the utility functions need to be exactly the same, while they are not. Then there is a warning about conditional statement clusters, which you can ignore, it just warns you that it may not be possible to obtain attribute level balance (which is fine). Then there is an error because your model parameters are not identifiable (i.e., cannot be estimated) with your current specification and constraints, see my earlier response under 3.

6. You can include interaction effects in both alt1 and alt2, but it depends on what you want to do. If your alternatives are labelled, you may want to put them only in one or in two, or put different interactions in different alternatives depending on your behavioural hypothesis; if your alternatives are unlabelled then they need to be put in both alt1 and alt2.

7. You can optimise for the unconditional / unforced choice only, or you can optimise for both the unconditional/unforced choice and the conditional/forced choice. This requires a much more advanced syntax and the specification of multiple models in the same syntax. For now, I suggest you first work on getting the correct syntax for the model you want to estimate.

8. You can create formatted scenarios manually where you include the opt-out. Note that this is only for checking the choice tasks yourself; when you implement your questions in a survey instrument, this will need to be redone.

9. It depends on how many choice tasks you think a single respondent can handle.

10. Qualitative variables need to be included as dummy coded or effects coded variables, there is no other option.
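
The checks behind points 2 and 9 can be sketched in a few lines of Python (this is not Ngene syntax). It is only a rough illustration: it assumes the common rule of thumb that each choice task with J alternatives contributes J - 1 independent observations, and the figure of 26 parameters is a hand count from the labelled syntax in the first post. The function name `check_rows` is hypothetical.

```python
def check_rows(rows, n_alts, n_params, level_counts, n_blocks):
    """Rough sanity checks on a candidate number of rows (illustrative only)."""
    return {
        # Assumed rule of thumb: rows * (alternatives - 1) >= parameters.
        "enough_degrees_of_freedom": rows * (n_alts - 1) >= n_params,
        # Level balance is only possible if every level count divides rows.
        "balance_possible": all(rows % k == 0 for k in level_counts),
        # Choice tasks each respondent sees when the design is blocked.
        "tasks_per_block": rows // n_blocks if rows % n_blocks == 0 else None,
    }


# Labelled design: 3 alternatives (W, F, none), 26 parameters (hand count),
# attributes with 6, 3 and 2 levels, 6 blocks.
labelled_36 = check_rows(36, 3, 26, [6, 3, 2], 6)
labelled_48 = check_rows(48, 3, 26, [6, 3, 2], 6)
print(labelled_36)
print(labelled_48)
```

Under these assumptions both 36 and 48 rows pass; the practical difference is that 6 blocks give each respondent 6 choice tasks in the first case and 8 in the second.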

The number of questions was a bit overwhelming, so I think you first need to determine whether your alternatives are labelled or not, then specify the utility functions that you want to estimate based on your research questions, and then add any constraints (if needed). Only once you have obtained priors from a pilot study would I consider dominance checks and looking at conditional/unconditional choices.
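
To illustrate why sample size estimates are undefined with zero priors (point 4), here is a minimal Python sketch. It assumes the usual requirement that the t-ratio prior / (se / sqrt(N)) reaches a critical value such as 1.96, which gives N >= (1.96 * se / prior)^2; whether Ngene computes its S estimates in exactly this way is an assumption here. The function name and the example numbers are hypothetical.

```python
import math


def s_estimate(prior, std_error, t_crit=1.96):
    """Minimum sample size for one parameter, or None if undefined.

    Assumes |prior| / (std_error / sqrt(N)) >= t_crit, which rearranges to
    N >= (t_crit * std_error / prior) ** 2. Here std_error is the standard
    error implied by the design for a single respondent.
    """
    if prior == 0:
        # A zero prior can never reach significance, so N is undefined.
        return None
    return math.ceil((t_crit * std_error / prior) ** 2)


zero_prior = s_estimate(0.0, 0.8)      # hypothetical standard error
pilot_prior = s_estimate(-0.5, 1.2)    # hypothetical prior from a pilot
print(zero_prior, pilot_prior)
```

With a zero prior the ratio is a division by zero, which is why these outputs only become informative once priors from a pilot study are available.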

Michiel

Re: Labelled vs Unlabelled design, constraints and interacti

Posted: Wed Jan 27, 2021 1:48 am
by Michail
Dear Michiel,

Many thanks for your response and for the time you dedicated to answering my queries. I do apologise for the number of questions I included in my post.
I want to follow up on three of your points.

Regarding point 1:
My alternatives can be “Option A” and “Option B” (unlabelled scenario) or can also be given a specific name, say “Organic” and “Traditional” (labelled scenario), which does convey a meaning to respondents. This is why I tried to create two designs.
In the labelled scenario, people will always be making choices between two alternatives that differ in the M attribute (used as label).
In the unlabelled scenario, people will be able to compare alternatives even with the same M attribute but differences in the other attributes (if Ngene finds such a design).

Is my reasoning correct?

Regarding point 3:
I took the time to reflect on your comments and built up the unlabelled specification again from scratch. See below. I attempted several different specifications. Some of them run but do not fully satisfy the conditions I want, while others fail (possibly because of the constraints). I am uncertain whether the constraints I set satisfy the conditions I want (namely: whenever M=1 then S=[1,2,3] and whenever M=2 then S=[4,5,6], regardless of the alternative, because the definition of S depends on the level of M), or whether the conditions I want are too strong and Ngene fails to find a design.

Are the conditions I want reflected in the constraints I set? Are there mistakes? I include 2 sets of constraints in the design as an example.
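
For reference, the intended conditions can be checked independently of Ngene with a short Python sketch (not Ngene syntax; all names are hypothetical). It enumerates the level combinations of M and S across the two alternatives, keeps those that satisfy "M=1 implies S in [1,2,3]" and "M=2 implies S in [4,5,6]", and then tests whether the two joint constraints from the original post remove anything further.

```python
from itertools import product

M_LEVELS = [1, 2]
S_LEVELS = [1, 2, 3, 4, 5, 6]
ALLOWED_S = {1: {1, 2, 3}, 2: {4, 5, 6}}


def permissible(m, s):
    """Per-alternative condition: the level of M decides which S levels apply."""
    return s in ALLOWED_S[m]


def joint_ok(m1, s1, m2, s2):
    """The two joint if-statements from the original unlabelled syntax."""
    for m in (1, 2):
        if m1 == m and m2 == m:
            if s1 not in ALLOWED_S[m] or s2 not in ALLOWED_S[m]:
                return False
    return True


# All combinations of (alt1.M, alt1.S, alt2.M, alt2.S).
all_combos = list(product(M_LEVELS, S_LEVELS, M_LEVELS, S_LEVELS))
valid = [c for c in all_combos
         if permissible(c[0], c[1]) and permissible(c[2], c[3])]

# True if the joint constraints never exclude a combination that the
# four per-alternative constraints already allow.
redundant = all(joint_ok(*c) for c in valid)
print(len(all_combos), len(valid), redundant)
```

The count of valid combinations agrees with the 36 permissible combinations reported in the Ngene message in the first post, and the joint constraints turn out to be implied by the four per-alternative ones.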

Code:
? Unlabelled Experiment
Design
;alts = alt1*, alt2*, none
;rows = 48
;eff = (mnl, d)
;cond:
? First set of constraints
if(alt1.M = 1 or alt1.M = alt2.M and alt1.M = 1, alt1.S=[1,2,3]),
if(alt1.M = 2 or alt1.M = alt2.M and alt1.M = 2, alt1.S=[4,5,6]),
if(alt2.M = 1 or alt1.M = alt2.M and alt2.M = 1, alt2.S=[1,2,3]),
if(alt2.M = 2 or alt1.M = alt2.M and alt2.M = 2, alt2.S=[4,5,6])   ? When I add this 4th constraint (with or without interactions), the design fails.

? Second set of constraints (alternative to the first set)
if(alt1.M = 1, alt1.S=[1,2,3]),
if(alt1.M = 2, alt1.S=[4,5,6]),
if(alt2.M = 1, alt2.S=[1,2,3]),
if(alt2.M = 2, alt2.S=[4,5,6])   ? When I add this 4th constraint (with or without interactions), the design fails.
;block = 6
;model:
U(alt1) = bA[0] +
          b2[0] * C[1,2,3,4,5,6] +
          b3.dummy[0] * N[1,2] +
          b4.dummy[0|0] * R[1,2,3] +
          b5.dummy[0] * M[1,2] +
          b6.dummy[0|0] * T[1,2,3] +
          b7.dummy[0|0|0|0|0] * S[1,2,3,4,5,6] +
          b8[0] * T.dummy[1]*S.dummy[1] +
          b9[0] * T.dummy[1]*S.dummy[2] +
          b10[0] * T.dummy[1]*S.dummy[3] +
          b11[0] * T.dummy[1]*S.dummy[4] +
          b12[0] * T.dummy[1]*S.dummy[5] +
          b13[0] * T.dummy[2]*S.dummy[1] +
          b14[0] * T.dummy[2]*S.dummy[2] +
          b15[0] * T.dummy[2]*S.dummy[3] +
          b16[0] * T.dummy[2]*S.dummy[4] +
          b17[0] * T.dummy[2]*S.dummy[5] /
U(alt2) = bA[0] +
          b2[0] * C[1,2,3,4,5,6] +
          b3.dummy[0] * N[1,2] +
          b4.dummy[0|0] * R[1,2,3] +
          b5.dummy[0] * M[1,2] +
          b6.dummy[0|0] * T[1,2,3] +
          b7.dummy[0|0|0|0|0] * S[1,2,3,4,5,6] +
          b8[0] * T.dummy[1]*S.dummy[1] +
          b9[0] * T.dummy[1]*S.dummy[2] +
          b10[0] * T.dummy[1]*S.dummy[3] +
          b11[0] * T.dummy[1]*S.dummy[4] +
          b12[0] * T.dummy[1]*S.dummy[5] +
          b13[0] * T.dummy[2]*S.dummy[1] +
          b14[0] * T.dummy[2]*S.dummy[2] +
          b15[0] * T.dummy[2]*S.dummy[3] +
          b16[0] * T.dummy[2]*S.dummy[4] +
          b17[0] * T.dummy[2]*S.dummy[5] 
$


Regarding point 6:
From reading the manual and other posts in the forum, my understanding is that for interactions between dummy/effects coded variables, all but one of the level combinations must be included in the utility functions.
Is it sensible to include interaction terms in both utility functions in my labelled specification (original post), given that WS and FS are, in essence, the same attribute defined differently in each alternative (similar to the time/transportation mode example, where time takes different levels for different modes)? Does it create any efficiency issues? I understand that this depends on the hypotheses; however, would you advise for or against the inclusion of any interactions?

Many thanks in advance.
Michail

Re: Labelled vs Unlabelled design, constraints and interacti

Posted: Mon Feb 08, 2021 6:37 pm
by Michiel Bliemer
Sorry for the late response, I have been extremely busy. I am catching up on all the forum questions now.

If all alternatives have the same attributes, but may be of different type, then you can choose between a labelled and unlabelled experiment:

Option 1: You create an unlabelled experiment where you put the label (Organic and Traditional) as an attribute in a utility function. So then you have Option A and Option B where you may get comparisons across alternatives like (Organic, Traditional), (Organic, Organic), (Traditional, Organic), (Traditional, Traditional). This is mostly useful if you want to determine the willingness to pay for attributes, including Organic and Traditional.

Option 2: You create a labelled experiment where you use Organic and Traditional as fixed labels. This means that all choice tasks will be comparing Organic and Traditional. This is useful if you are interested in determining market shares for Organic and Traditional products. But if a respondent will only buy Organic, then this respondent will ignore all other attribute levels and always choose the Organic alternative, therefore not making trade-offs on the other attributes. This is avoided in Option 1.

The way you write your syntax is fine, but the fact that Ngene cannot generate a design with a finite D-error means that you are over-constraining your design, such that certain parameters are no longer identifiable. For example, if I add your fourth constraint (and not the last one), then I can see in the covariance matrix reported by Ngene that b5(d0) and the dummy coded coefficients of b7 have essentially infinite standard errors, so perhaps you can look at the covariance matrix to see exactly why your model can no longer be estimated.
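
One plausible reading of why b5 and b7 lose identification (an interpretation, not something Ngene states in these words) is that, with dummy coding and the last level as base, the full set of constraints makes the dummy for M an exact sum of three of the dummies for S. The Python sketch below (not Ngene syntax; helper names are hypothetical) checks this for the fully constrained case and for a relaxed case in which an alternative with M=2 may still show S=1, 2 or 3.

```python
ALLOWED_S = {1: (1, 2, 3), 2: (4, 5, 6)}


def dummies(m, s):
    """Dummy codes with the last level as base (assumed coding)."""
    d_m = 1 if m == 1 else 0                          # M: level 2 is the base
    d_s = [1 if s == k else 0 for k in range(1, 6)]   # S: level 6 is the base
    return d_m, d_s


def m_is_sum_of_s(profiles):
    """True if the M dummy equals S.dummy[1] + S.dummy[2] + S.dummy[3]
    for every profile, i.e. the columns are perfectly collinear."""
    return all(d_m == d_s[0] + d_s[1] + d_s[2]
               for d_m, d_s in (dummies(m, s) for m, s in profiles))


# Every (M, S) profile permitted when all constraints are active.
full = [(m, s) for m, levels in ALLOWED_S.items() for s in levels]

# Relaxed case: one constraint dropped, so M=2 can also appear with S=1,2,3.
relaxed = full + [(2, s) for s in (1, 2, 3)]

collinear_full = m_is_sum_of_s(full)
collinear_relaxed = m_is_sum_of_s(relaxed)
print(collinear_full, collinear_relaxed)
```

If this reading is right, the main effect of M cannot be separated from the main effects of S once every profile obeys the constraints, which would be consistent with the design only failing when the last constraint is added.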

Regarding your last question, in an unlabelled experiment you need to have all interactions appearing in all utility functions, while for labelled experiments you can have different utility functions but you can still add all interactions in all utility functions and this is just as sensible as in an unlabelled experiment. If you only include interactions in one alternative then the interactions say something about the preference towards that label, whereas if you add interactions in all alternatives then these interactions say something about how preferences for certain attributes are affected by the presence of other attributes (or levels). Both are meaningful; it all depends on your hypothesis and research questions.

Michiel