
The trade-off of adding constraints

PostPosted: Wed May 10, 2023 7:28 pm
by yanyanyun
Dear Prof. Bliemer,

I am facing a trade-off when adding an additional attribute that requires constraints, and I would like to seek your suggestions.
My original Ngene code below works fine.
Code:
Design
;alts = A*, B*, neither
;rows = 24
;block = 3
;eff = (mnl,d)
;model:
U(A) =         b1.dummy[0|0|0] * MM[0,1,2,3] +
               b2.dummy[0|0|0] * PP[80,100,140,120] +
               b3.dummy[0|0|0] * OO[1,1.5,2.5,2] +
               b4.dummy[0] * RR[0, 1] +
               b5[0] * Price[5,15,25,35] /

U(B) =         b1 * MM +
               b2 * PP +
               b3 * OO +
               b4 * RR +
               b5 * Price
$

Then I considered adding an attribute “CC”, but this attribute requires several constraints involving the attribute “MM”, listed below.
I first tried cond constraints with the code below:
Code:
Design
;alts = A*, B*, neither
;rows = 24
;block = 3
;eff = (mnl,d)
;cond:
if(A.MM = [0,1,2], A.CC = [0,0.4,0.8]),
if(A.MM = [3],     A.CC = [2]),
if(B.MM = [0,1,2], B.CC = [0,0.4,0.8]),
if(B.MM = [3],     B.CC = [2])
;model:
U(A) =         b1.dummy[0|0|0] * MM[0,1,2,3] +
               b2.dummy[0|0|0] * PP[80,100,140,120] +
               b3.dummy[0|0|0] * OO[1,1.5,2.5,2] +
               b4.dummy[0|0|0] * CC[0,0.4,0.8,2] +
               b5.dummy[0] * RR[0, 1] +
               b6[0] * Price[5,15,25,35] /

U(B) =         b1 * MM +
               b2 * PP +
               b3 * OO +
               b4 * CC +
               b5 * RR +
               b6 * Price
$

I then received the following warning:
“Warning: No valid design has been found after 1000 evaluations. There may be a problem with the specification of the design. A common problem is that the choice probabilities are too extreme (close to 1 and 0), perhaps because some or all of the prior values are too large. Also, it is generally a good idea to start with a simple design (MNL, non-Bayesian), then add complexity. If you press stop, a design will be reported, which may assist in diagnosing the problem.”

Hence, I changed my constraints from cond to reject as below:
Code:
Design
;alts = A*, B*, neither
;rows = 24
;block = 3
;eff = (mnl,d)
;alg = mfederov
;reject:
A.MM <3 and A.CC =2,
A.MM =3 and A.CC <2,
B.MM <3 and B.CC =2,
B.MM =3 and B.CC <2

;model:
U(A) =         b1.dummy[0|0|0] * MM[0,1,2,3] +
               b2.dummy[0|0|0] * PP[80,100,140,120] +
               b3.dummy[0|0|0] * OO[1,1.5,2.5,2] +
               b4.dummy[0|0|0] * CC[0,0.4,0.8,2] +
               b5.dummy[0] * RR[0, 1] +
               b6[0] * Price[5,15,25,35](5-7,5-7,5-7,5-7) /

U(B) =         b1 * MM +
               b2 * PP +
               b3 * OO +
               b4 * CC +
               b5 * RR +
               b6 * Price
$

Issue: the code runs in Ngene, but after 1 hour the only result still has an Undefined MNL D-error.
Now I am thinking of sticking with my original code without constraints, for the reasons below:
1) The new code with constraints did not generate a good result. When I inspected the “Undefined” design, several attributes had the same level in both alternatives A and B.
2) Using the Modified Fedorov algorithm, I feel that if I add level-occurrence constraints for price (5-7,5-7,5-7,5-7), I should do the same for the other attributes, but that would greatly increase the number of constraints.

Questions:
1. May I ask whether you agree that I should stick with my original code, given that I would otherwise have too many constraints?
2. A general question (unrelated to my code): will adding constraints affect the later data analysis (e.g. estimating MNL/mixed logit models, calculating MWTP, etc.), since the attribute levels would no longer be balanced?

Thank you very much!

Yours sincerely,
Yan

Re: The trade-off of adding constraints

PostPosted: Tue May 16, 2023 10:04 am
by Michiel Bliemer
The reason that Ngene cannot find any valid design is that some of your constraints create perfect correlations within the data. I think the offending constraint is:

if(A.MM = [3], A.CC = [2])

This perfectly correlates two dummy-coded variables, and hence one of the parameters will not be identifiable. You need to allow some variation across attributes. If MM and CC are numerical, you could avoid the issue by not applying dummy coding.
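A quick numerical sketch (my own illustration in Python, not Ngene output) of why this breaks identification: under the rule MM = 3 if and only if CC = 2, and with the last listed level as the dummy-coding base, the sum of the MM dummies always equals the sum of the CC dummies, so the design matrix loses full rank.

```python
import numpy as np

# Hypothetical attribute levels obeying the constraint MM = 3 <=> CC = 2,
# with CC free in {0, 0.4, 0.8} whenever MM < 3.
MM = np.array([0, 1, 2, 3, 0, 1, 2, 3])
CC = np.array([0.0, 0.4, 0.8, 2.0, 0.4, 0.8, 0.0, 2.0])

# Dummy coding with the last listed level as the base, so the design
# matrix contains dummies for MM in {0,1,2} and CC in {0,0.4,0.8}.
D_MM = np.column_stack([(MM == v).astype(float) for v in (0, 1, 2)])
D_CC = np.column_stack([(CC == v).astype(float) for v in (0.0, 0.4, 0.8)])
X = np.hstack([D_MM, D_CC])

# The constraint forces: sum of MM dummies = 1{MM != 3} = 1{CC != 2}
#                      = sum of CC dummies, a perfect linear dependence.
print(np.linalg.matrix_rank(X))  # 5, not 6: one parameter is not identifiable
```

With six dummy parameters but only five independent columns, no amount of extra rows can make the model estimable.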

In general, an Undefined D-error means that the D-error is infinite, which implies that one or more parameters in the model cannot be estimated. This is almost always either an identification issue in the model (e.g. adding too many constants) or the result of imposing so many constraints that attributes become perfectly correlated, leading to multicollinearity.

To answer your second question, constraints and attribute level balance have no impact on how you estimate the model later on; you estimate the model in the same way.
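A minimal simulation of my own illustrating this (using a simplified binary A-versus-B logit rather than your full design): choices generated from a constrained, deliberately unbalanced design are estimated with exactly the same likelihood, and the true parameters are still recovered.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)
n = 20000
b_true = np.array([0.5, -0.8])  # hypothetical true coefficients for MM and CC

def draw(n):
    # Attribute levels respecting a constraint MM = 3 <=> CC = 2,
    # so the level distribution is deliberately unbalanced.
    mm = rng.choice([0.0, 1.0, 2.0, 3.0], size=n)
    cc = np.where(mm == 3.0, 2.0, rng.choice([0.0, 0.4, 0.8], size=n))
    return np.column_stack([mm, cc])

dX = draw(n) - draw(n)                   # attribute differences, A minus B
p = 1.0 / (1.0 + np.exp(-dX @ b_true))   # P(choose A) under the binary logit
y = (rng.random(n) < p).astype(float)

def nll(b):
    # The standard logit log-likelihood; level balance plays no role in it.
    s = np.where(y == 1.0, 1.0, -1.0)
    return np.logaddexp(0.0, -s * (dX @ b)).sum()

b_hat = minimize(nll, np.zeros(2), method="BFGS").x
print(np.round(b_hat, 2))  # close to the true values [0.5, -0.8]
```

MWTP calculations, being ratios of estimated coefficients, are likewise unaffected by how the levels were constrained.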

Michiel