Help for partial profile restrictions - undefined model

Posted: Sun Apr 07, 2024 3:37 am
by alicewreford
Good evening,

I am attempting to create an experimental design for a partial profile choice experiment on social activity interventions for my PhD project. The challenge I am facing is imposing restrictions on 'remote' activities (compared to 'in-person' activities).

The choice task consists of 2 unlabeled alternatives, each described by 5 attributes (all dummy coded):
1. Remote [x1/x6 (2 levels)];
2. Distance [x2/x7 (4 levels)];
3. Activity [x3/x8 (6 levels)];
4. Size [x4/x9 (4 levels)];
5. Refreshments [x5/x10 (3 levels)].

Conditions are imposed on Distance [x2/x7 (4 levels)] and Refreshments [x5/x10 (3 levels)] dependent on Remote [x1/x6 (2 levels)], such that:
If Remote=0, Distance = 0 and Refreshments = 0
If Remote=1, Distance = 1,2,3 and Refreshments = 0,1, 2
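
For readers following along, the two rules above can be written as a small filter over the full factorial. This is only a sketch: the level codings are taken from the attribute list and the rules as posted, and the names `LEVELS` and `is_feasible` are made up for illustration. It counts single profiles, so it does not attempt to reproduce the 43k figure quoted below, which presumably depends on further restrictions not spelled out in the post.

```python
from itertools import product

# Level codes per attribute, as listed in the post (hypothetical container name).
LEVELS = {
    "Remote": range(2),
    "Distance": range(4),
    "Activity": range(6),
    "Size": range(4),
    "Refreshments": range(3),
}


def is_feasible(remote, distance, refreshments):
    """Apply the two posted rules to a single alternative."""
    if remote == 0:
        return distance == 0 and refreshments == 0
    return distance in (1, 2, 3) and refreshments in (0, 1, 2)


# Every profile is a tuple (remote, distance, activity, size, refreshments).
all_profiles = list(product(*LEVELS.values()))
feasible = [p for p in all_profiles if is_feasible(p[0], p[1], p[4])]

print(len(all_profiles))       # 576 profiles per alternative
print(len(all_profiles) ** 2)  # 331776 pairs, i.e. the ~332k full factorial
print(len(feasible))           # 240 profiles satisfy the rules
```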

Using conditions/restrictions I was unable to find a design with a defined MNL model. I therefore created a candidate set and evaluated it using alg=mfedorov. The full factorial consisted of 332k rows, with 43k rows remaining once the restrictions were imposed. I have tried evaluating candidate subsets of various sizes (6k-35k) and adding restrictions on Remote ("Remote[0,1](4-6, 18-20)") to try to mitigate attribute level imbalance.

Syntax:
?MNL all dummy coded with candidate set 1
?Setting priors to almost zero:

Design

;alts= alt1*, alt2*
;rows=24
;block=4
;eff= (mnl,d)

;alg = mfedorov(candidates = candidateset1.csv)

;model:
U(alt1)=
bremote.dummy[-0.00001]*Remote[0,1](4-6, 18-20)
+ bdistance.dummy[-0.00003|-0.00002|-0.00001]*Distance[3,2,1,0]
+ bactivity.dummy[0.000001|0.0000011|0.0000012|0.00000111|0.00000112]*Activity[0,1,2,3,4,5]
+ bsize.dummy[0.000001|0.000002|0.0000015]*Size[0,1,2,3]
+ brefreshments.dummy[0.000002|0.000001]*Refreshments[2,1,0] /

U(alt2)= bremote*Remote + bdistance*Distance + bactivity*Activity + bsize*Size + brefreshments*Refreshments
$


When the model does run, I get an MNL D-error of 'Undefined'. Can you please advise on possible causes, how I might overcome this, or alternative approaches?

Re: Help for partial profile restrictions - undefined model

Posted: Sun Apr 07, 2024 4:45 am
by Michiel Bliemer
An undefined D-error means that your model is not identifiable, which is typically caused by over-specification or by multicollinearity. I think that the issue is with your constraints. Remote=0 only appears with Distance=0 and Refreshments=0. This means that you have created perfect correlations across the base levels of these three variables, and hence your model cannot be estimated. This is not a limitation of Ngene but rather a mistake in the formulation of an appropriate utility function.
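
A small check can make the dependency visible. The sketch below assumes the dummy coding implied by the posted syntax (last listed level is the base, so Remote gets a dummy for level 0 and Distance gets dummies for levels 3, 2 and 1); the function names are hypothetical. Under the posted constraints, the Remote dummy plus the three Distance dummies add up to the same value for every feasible profile. A combination of columns that is constant across profiles cancels when utilities are differenced between two unlabeled alternatives, so those parameters cannot all be estimated.

```python
from itertools import product


def is_feasible(remote, distance, refreshments):
    """Constraints as posted: Remote=0 forces Distance=0 and Refreshments=0."""
    if remote == 0:
        return distance == 0 and refreshments == 0
    return distance in (1, 2, 3)


def remote_dummy(remote):
    # Assumed coding: Remote[0,1] has base level 1, so the dummy flags level 0.
    return int(remote == 0)


def distance_dummies(distance):
    # Assumed coding: Distance[3,2,1,0] has base level 0.
    return [int(distance == level) for level in (3, 2, 1)]


# Sum of the four dummies for every feasible (Remote, Distance, Refreshments).
sums = set()
for remote, distance, refreshments in product(range(2), range(4), range(3)):
    if is_feasible(remote, distance, refreshments):
        sums.add(remote_dummy(remote) + sum(distance_dummies(distance)))

# A single value means the four columns are perfectly collinear with a constant.
print(sums)  # {1}
```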

You will need to change your utility function specification so that the parameters become identifiable. This likely means collapsing the base levels of Remote, Distance, and Refreshments into a single variable, e.g. RemoteDistanceRefreshments[0,1], removing base level 0 from Distance and Refreshments, and multiplying them by RemoteDistanceRefreshments so that they disappear from utility when it is 0. So think carefully about how to specify the utility functions in your case.
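
One possible reading of this recoding can be checked numerically. The sketch below is not Ngene syntax and is only an illustration under stated assumptions: it builds the dummy-coded columns for every profile allowed by the posted constraints, adds a constant column, and compares column rank for the original coding and for a recoding with a single combined 0/1 variable plus Distance dummies that only distinguish levels 1, 2 and 3. Refreshments keeps level 0 as its base here because the posted rules allow level 0 when Remote=1. All function names are hypothetical, and full rank is a necessary condition for identification, not a guarantee for any particular design.

```python
from fractions import Fraction
from itertools import product


def feasible_profiles():
    """Profiles (remote, distance, activity, size, refreshments) allowed by the rules."""
    profiles = []
    for r, d, a, s, f in product(range(2), range(4), range(6), range(4), range(3)):
        if r == 0 and (d != 0 or f != 0):
            continue
        if r == 1 and d == 0:
            continue
        profiles.append((r, d, a, s, f))
    return profiles


def shared_columns(a, s, f):
    # Activity (base 5), Size (base 3), Refreshments (base 0), as in the syntax.
    return ([int(a == k) for k in range(5)]
            + [int(s == k) for k in range(3)]
            + [int(f == k) for k in (2, 1)])


def original_row(p):
    r, d, a, s, f = p
    # Constant, Remote dummy (level 0), three Distance dummies (base 0).
    return [1, int(r == 0)] + [int(d == k) for k in (3, 2, 1)] + shared_columns(a, s, f)


def recoded_row(p):
    r, d, a, s, f = p
    combined = int(r == 1)  # hypothetical combined variable, 0 or 1
    # Distance only matters when combined = 1; level 1 is the reference there.
    return ([1, combined]
            + [combined * int(d == k) for k in (3, 2)]
            + shared_columns(a, s, f))


def rank(rows):
    """Column rank by exact Gaussian elimination on fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


profiles = feasible_profiles()
original = [original_row(p) for p in profiles]
recoded = [recoded_row(p) for p in profiles]

print("original:", rank(original), "of", len(original[0]), "columns")  # 14 of 15
print("recoded: ", rank(recoded), "of", len(recoded[0]), "columns")    # 14 of 14
```

In this sketch the original coding is one column short of full rank, while the recoded version has full column rank with one parameter fewer.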

Note that a candidate set does not need to be very large. I would not use a candidate set with 50,000 rows, as it would make the algorithm extremely slow. Usually around 2,000 to 5,000 rows is sufficient (in extreme cases I have used 10,000).
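
As a rough illustration of cutting a large candidate set down to that range, the sketch below draws a reproducible random subset of rows. Random sampling is an assumption made here for simplicity and is not a selection method prescribed above; the helper `subsample` and the placeholder rows are hypothetical.

```python
import random


def subsample(rows, k, seed=0):
    """Return k rows drawn without replacement, or all rows if there are fewer."""
    rows = list(rows)
    if k >= len(rows):
        return rows
    return random.Random(seed).sample(rows, k)


# Placeholder candidate rows: pairs of distinct indices standing in for
# choice tasks built from a hypothetical list of 240 feasible profiles.
candidate_rows = [(i, j) for i in range(240) for j in range(240) if i != j]

subset = subsample(candidate_rows, 3000, seed=42)
print(len(candidate_rows), len(subset))
```

Fixing the seed keeps the subset reproducible, which helps when comparing designs generated from the same candidate rows.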

Michiel