
Efficient Design for Pilot Study

Posted: Fri Dec 18, 2020 10:48 pm
by Sibylle
Dear ChoiceMetrics Team,
I am currently working on a study which will include a discrete choice experiment about support offers for companies. For the final survey I plan to work with a Bayesian efficient design.
To generate the information for the priors, I would like to run a pilot study with the following syntax. May I kindly ask you to take a look at it and answer some questions?

Code: Select all
Design
;alts=alt1*,alt2*,neither
;rows=10
;eff=(mnl,d)
;cond:
if(alt1.CAPITAL=0,alt1.CAPITALFORM=0),
if(alt2.CAPITAL=0,alt2.CAPITALFORM=0),
if(alt1.PARTNER=[2,3],alt1.CAPITALFORM=[0,1]),
if(alt2.PARTNER=[2,3],alt2.CAPITALFORM=[0,1])
;model:
U(alt1)=capital[0.00001]*CAPITAL[0,10000,25000,100000]
+capitalform.dummy[-0.00001|-0.00001|-0.00002]*CAPITALFORM[1,2,3,0]
+networking.dummy[0.00001|0.00001|0.00002]*NETWORKING[1,2,3,0]
+infrastructure.dummy[0.00001|0.00001|0.00002]*INFRASTRUCTURE[1,2,3,0]
+workshops.dummy[0.00001|0.00001|0.00002]*WORKSHOPS[1,2,3,0]
+partner.dummy[0|0|0]*PARTNER[1,2,3,0]
/
U(alt2)=capital*CAPITAL+capitalform*CAPITALFORM+networking*NETWORKING
+infrastructure*INFRASTRUCTURE+workshops*WORKSHOPS+partner*PARTNER
/
U(neither)=neither[0]
$


1) As you see, I have two unlabelled alternatives, each with six attributes of four levels, plus one opt-out alternative. Some levels cannot be combined for logical reasons, so I added some constraints. Is my code correct?
2) I can anticipate the direction of the priors except for the attribute "partner". Is it okay to use zero as a prior for this attribute?
3) As I have limited space for the DCE in my survey and a relatively small sample, I set the rows property to 10, which is not that much and is probably one of the reasons why the D-error is quite high, at 0.4707. Is it still okay to use this design for the pilot study?
4) Would you suggest using a different algorithm or sticking to the default one?

Thanks in advance and best regards,
Sibylle

Re: Efficient Design for Pilot Study

Posted: Sat Dec 19, 2020 11:17 am
by Michiel Bliemer
1. Yes, the syntax looks fine.
2. Yes
3. I recommend that you set the number of rows to a multiple of 4 (in order to obtain attribute level balance in the design), and that you increase the number of rows because of the large number of parameters you are estimating. So perhaps use ;rows = 16 and ;block = 2, which creates two versions of the choice experiment, each with 8 choice tasks, where you randomly assign each respondent to one of the versions. Using 10 rows would work, but more variation in the data would be desirable.
4. The default algorithm works fine for this type of constraint; there is no need to switch to the modified Federov algorithm.
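The multiple-of-4 recommendation is simple arithmetic, which can be sketched in plain Python (illustrative only, not Ngene output): with 4-level attributes, exact level balance requires the number of rows to be divisible by 4, which 16 satisfies and 10 does not.

```python
# Attribute level balance: with `rows` choice tasks and an attribute with
# `n_levels` levels, exact balance requires rows % n_levels == 0.

def level_counts(rows, n_levels):
    """How often each level appears if levels are spread as evenly as possible."""
    base, extra = divmod(rows, n_levels)
    return [base + 1] * extra + [base] * (n_levels - extra)

print(level_counts(16, 4))  # [4, 4, 4, 4] -> perfectly balanced
print(level_counts(10, 4))  # [3, 3, 2, 2] -> 10 rows cannot be balanced over 4 levels
```

With 16 rows each level of a 4-level attribute can appear exactly 4 times; with 10 rows some levels must appear more often than others, which is why a multiple of the number of levels is preferred.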

Michiel

Re: Efficient Design for Pilot Study

Posted: Mon Dec 21, 2020 4:59 pm
by Sibylle
Many thanks for your help!

Re: Efficient Design for Pilot Study

Posted: Sat Jan 09, 2021 1:15 am
by dpotoglou1
Hello Michiel and All,

Could you kindly confirm that my code is also correct?

My experiment has two unlabelled alternatives with 9 attributes (5 with 2 levels, 1 with 3 levels, and 3 with 4 levels).

Code: Select all
Design;
alts = alt1*, alt2*, none; ? The '*' checks for dominance and removes any dominant choice tasks
rows = 60;
block = 12;
eff = (mnl,d);
alg = mfederov(stop=noimprov(120 secs));
model:
U(alt1) = b1.dummy[-0.000001] * demo[1,0]                          ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b2.dummy[-0.000001] * perpref[1,0]                       ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b3.dummy[-0.000001] * psyc[1,0]                          ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b4.dummy[-0.000001] * actmon[1,0]                        ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b5.dummy[-0.000001] * phys[1,0]                          ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b6.dummy[-0.000001 | -0.000001] * anon[0,1,2]            ? 2 'you cannot be personally identified' negative, but no expected order for the rest, setting all other parameters zero
        + b7.dummy[0|0|0] * org[1,2,3,0]                           ? Not sure about direction, all parameters are set to zero;
        + b8.dummy[-0.000001|-0.000001|-0.000001] * other[1,2,3,0] ? 0 base: 'None'; all others are expected to be negative; no particular order
        + b9.dummy[-0.000001|-0.000001|-0.000001] * aut[0,1,2,3] / ? 3 base: 'View and delete own data...'; all others should be negative; no particular order
U(alt2) = b1.dummy  * demo
        + b2.dummy  * perpref
        + b3.dummy  * psyc
        + b4.dummy  * actmon
        + b5.dummy  * phys
        + b6.dummy  * anon
        + b7.dummy  * org
        + b8.dummy  * other
        + b9.dummy  * aut /
U(none) = none[0]
$


• The above design still results in some correlations ranging between 0.5 and 0.6 for ‘demo’, ‘perpref’, ‘psyc’ and ‘phys’.

• When I tried ‘alg = swap(stop=noimprov(120 secs))’ instead of ‘mfederov(stop=noimprov(120 secs))’, I get the following message:
“A valid initial random design could not be generated after approximately 10 seconds. In this time, of the 57499 attempts made, there were 0 row repetitions, 45 alternative repetitions, and 57454 cases of dominance. There are a number of possible causes for this, including the specification of too many constraints, not having enough attributes or attribute levels for the number of rows required, and the use of too many scenario attributes. A design may yet be found, and the search will continue for 10 minutes. Alternatively, you can stop the run and alter the syntax.”

• Another option would be to go with an orthogonal optimal design (orth = ood), but I receive a ‘terminal error’.

I would appreciate your feedback on the above code.

Thank you,
Dimitris

Re: Efficient Design for Pilot Study

Posted: Sat Jan 09, 2021 10:47 am
by Michiel Bliemer
Yes, the syntax looks good.

* Efficient designs are not intended to remove correlations; they aim to increase information and reduce standard errors. Only orthogonal designs aim to remove correlations. Having correlations is not a problem at all and is often even useful. Only (near) perfect correlation (e.g. 0.99 or 1.00) is problematic.

* As the message states, the swapping algorithm leads to many choice tasks that violate the dominance criterion. Finding a design with 60 rows without any dominance is difficult for the swapping algorithm, but much easier for the row-based mfederov algorithm, so it is correct to use mfederov here. This algorithm is slower, so please let it run for a while (e.g. overnight).

* Orthogonal designs cannot avoid dominant alternatives, so I would use your syntax as is. We will investigate why orth = ood generates a terminal error.

Michiel

Re: Efficient Design for Pilot Study

Posted: Mon Jan 11, 2021 9:33 pm
by dpotoglou1
Thank you, Michiel.

That's very useful.

I appreciate that D-efficient designs aim to minimise the standard errors of the parameters, not to eliminate correlations, but I am always a little cautious, especially when correlations are high. I agree we can go with 0.5/0.6.

Having tried different runs with the 'swap' algorithm, I noticed a pattern: the 'higher' correlations occur in the first few attributes and then seem to 'tail off'. I am not sure if there is an explanation for that, although I appreciate it may not matter.

I have now increased the iterations of 'mfederov' to 10,000,000 to give the algorithm a little more time to run, as you suggested.

Finally, I would be keen to know about any updates with regard to ood designs, but no pressure at all.

Dimitris

Re: Efficient Design for Pilot Study

Posted: Wed Jan 13, 2021 10:07 am
by Michiel Bliemer
I am not sure what you mean by correlations "tailing off", but I never really look at correlations in the design because they are not really important. If there is multicollinearity (perfect correlation) then the D-error will be infinite (or very large), so as long as the D-error is finite there will be no issue using the data for model estimation.
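The link between near-perfect correlation and a blow-up in the D-error can be sketched with a toy example in plain Python (purely illustrative, not Ngene's actual computation): for a 2-parameter information matrix [[1, r], [r, 1]] with correlation r, the determinant is 1 - r², so the D-error det^(-1/K) stays moderate for r = 0.5 or 0.9 and only explodes as r approaches 1.

```python
# Toy D-error for a 2x2 information matrix [[1, r], [r, 1]] with correlation r.
# D-error = det(M)^(-1/K), with K = number of parameters.

def d_error(r, k=2):
    det = 1.0 - r * r  # determinant of [[1, r], [r, 1]]
    return det ** (-1.0 / k)

for r in [0.0, 0.5, 0.9, 0.99, 0.9999]:
    print(f"r = {r}: D-error = {d_error(r):.2f}")
```

Moderate correlations barely change the D-error (r = 0.5 gives about 1.15 versus 1.00 for r = 0), while r = 0.99 already gives roughly 7 and r → 1 diverges, which matches the advice that only (near) perfect correlation is a problem.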

Michiel

Re: Efficient Design for Pilot Study

Posted: Wed Feb 03, 2021 3:15 am
by dpotoglou1
Thank you, Michiel and sorry about the belated reply.

I just observed, after 1m and 2m iterations of the above algorithm, that there is a pattern in the correlations between the same attributes across the two alternatives.

These correlations tend to be higher for the first few attributes and then they appear smaller for the remaining ones - hence my 'tailing off' term.

1m iterations: https://cf-my.sharepoint.com/:i:/g/pers ... w?e=bxR9t2

2m iterations: https://cf-my.sharepoint.com/:i:/g/pers ... A?e=gsG10M

Dimitris

Re: Efficient Design for Pilot Study

Posted: Mon Feb 08, 2021 6:16 pm
by Michiel Bliemer
That is because your first five attributes have only 2 levels, while your sixth attribute has 3 levels and your last three attributes have 4 levels. The more levels, the more variation in combinations across alternatives, and the less correlation there will be.

Consider an attribute with two levels. Then you can make the following combinations across two alternatives:
(1,0) - here you get a trade-off
(0,1) - here you get a trade-off
(1,1) - here you do NOT get a trade-off
(0,0) - here you do NOT get a trade-off

An efficient design will have little overlap; therefore (1,0) and (0,1) are more efficient than (1,1) and (0,0), so the values (1,0) and (0,1) will appear more frequently in the design. But this means there will be more correlation, because their appearance is not balanced with that of (1,1) and (0,0). Therefore, correlations are actually good from an efficiency point of view.
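The effect described above can be verified numerically with a small sketch in plain Python (illustrative only, not an Ngene feature): a 2-level attribute that only ever appears as the trade-off pairs (1,0) or (0,1) across the two alternatives is perfectly negatively correlated, while a design that also uses (1,1) and (0,0) has zero correlation but contains choice tasks with no trade-off on that attribute.

```python
# Pearson correlation of one 2-level attribute across two alternatives,
# computed by hand (no external libraries).

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Only trade-off pairs (1,0)/(0,1): efficient, but perfectly negatively correlated.
print(pearson([1, 0, 1, 0], [0, 1, 0, 1]))   # -1.0

# Mixing in (1,1) and (0,0) removes the correlation, but half the tasks
# now offer no trade-off on this attribute.
print(pearson([1, 0, 1, 0], [0, 1, 1, 0]))   # 0.0
```

This is exactly the tension in the explanation: the uncorrelated version wastes half its choice tasks on non-trading combinations, so the efficient design accepts some correlation in exchange for more information per task.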

Michiel

Re: Efficient Design for Pilot Study

Posted: Sun Apr 11, 2021 10:14 pm
by dpotoglou1
Dear Michiel,

Just following up on the above query to ask about the balance of a D-efficient design generated with priors from a small pilot study.

We conducted a pilot with 29 respondents (5 choice tasks each).

I estimated an MNL model after removing observations from individuals who always chose ‘Scenario B’ (1 individual) or ‘Neither’ (2 individuals) in all 5 choice tasks (see estimates below).
https://cf-my.sharepoint.com/:i:/g/personal/potogloud_cardiff_ac_uk/EWCVN_zUsLhKqAKY_yZVi_YBzZA2V1xOqNX5_Gf7wY3U0g?e=wuzOne

I used the priors from the far-right model (see the link above; accepting that some coefficients were not significant and that some of them changed relative to the full-data model) to generate a D-efficient design with the following code:

Code: Select all
? Thermal Comfort_v3.ngs --- Experimental Design  --- 06. 03. 2021
? Total 9 Attributes
? [1] demo[0,1] --- Demographics (e.g. age, gender)
? [2] perpref[0,1]--- Psychological parameters (e.g. personal preferences and attitudes)
? [3] psyc[0,1] --- Physical parameters (e.g. room temperature, noise level, illuminance)
? [4] actmon[0,1] --- Activity monitoring (e.g. presence, interaction with windows)
? [5] phys[0,1] --- Physiological data (e.g. heart rate)

? [6] anon[0,1,2] --- Level of anonymity
? [7] org[0,1,2,3] --- Responsible organisation for data collection and use
? [8] other[0,1,2,3] --- Other uses of the data
? [9] aut[0,1,2,3] --- Level of autonomy

? Each choice card will include 2 unlabelled alternative options and a none alternative
? 60 potential choice cards into 12 blocks so that each respondent receives 5 cards (scenarios)

Design;
alts = alt1*, alt2*, none; ? The '*' checks for dominance and removes any observations
rows = 60;
block = 12;
eff = (mnl,d);
alg = mfederov(stop=total(2000000 iterations));
model:
U(alt1) = b1.dummy[-0.050] * demo[1,0]                             ? 0 base is 'no'; 1 'yes'; prior from pilot MNL
        + b2.dummy[-0.363] * perpref[1,0]                          ? 0 base is 'no'; 1 'yes'; prior from pilot MNL
        + b3.dummy[-0.746] * psyc[1,0]                             ? 0 base is 'no'; 1 'yes'; prior from pilot MNL
        + b4.dummy[0.164]  * actmon[1,0]                           ? 0 base is 'no'; expected negative, but pilot estimate is positive
        + b5.dummy[-0.120] * phys[1,0]                             ? 0 base is 'no'; 1 'yes'; prior from pilot MNL
        + b6.dummy[-0.535 | -0.837] * anon[0,1,2]                  ? 2 'you cannot be personally identified'; priors from pilot MNL
        + b7.dummy[1.490|0.629|-0.844] * org[1,2,3,0]              ? Direction uncertain a priori; priors from pilot MNL
        + b8.dummy[-1.040|1.180|-0.200] * other[1,2,3,0]           ? 0 base: 'None'; priors from pilot MNL
        + b9.dummy[-1.240|-1.710|-1.230] * aut[0,1,2,3] /          ? 3 base: 'View and delete own data...'; priors from pilot MNL
U(alt2) = b1.dummy  * demo
        + b2.dummy  * perpref
        + b3.dummy  * psyc
        + b4.dummy  * actmon
        + b5.dummy  * phys
        + b6.dummy  * anon
        + b7.dummy  * org
        + b8.dummy  * other
        + b9.dummy  * aut /
U(none) = none[-3.140]
$


Very kindly, I have two questions:

* Could you provide me with some feedback on the above code? Should I rather go with a Bayesian design, given that some coefficients are not significant?

* I am slightly concerned because the design is not very well balanced – for example, level ‘3’ of the attribute ‘org’ appears only 8 times, compared to Freq > 15 for the other levels. I would welcome your thoughts on whether this should be anticipated given the new priors.
https://cf-my.sharepoint.com/:i:/g/personal/potogloud_cardiff_ac_uk/EWD-SC4c0chFhE6TBY3RzHcB93d2XhC2z22TVYTyzyzXLQ?e=kHkYfJ


Thank you,
Dimitris