Efficient Design for Pilot Study


Efficient Design for Pilot Study

Postby Sibylle » Fri Dec 18, 2020 10:48 pm

Dear ChoiceMetrics Team,
I am currently working on a study which should include a discrete choice experiment about support offers for companies. For the final survey I plan to work with a baysian efficient design.
For generating the information for the priors I would like to do a pilot study with the following syntax. May I kindly ask you to take a look it and answer some questions?

Code:
Design
;alts=alt1*,alt2*,neither
;rows=10
;eff=(mnl,d)
;cond:
if(alt1.CAPITAL=0,alt1.CAPITALFORM=0),
if(alt2.CAPITAL=0,alt2.CAPITALFORM=0),
if(alt1.PARTNER=[2,3],alt1.CAPITALFORM=[0,1]),
if(alt2.PARTNER=[2,3],alt2.CAPITALFORM=[0,1])
;model:
U(alt1)=capital[0.00001]*CAPITAL[0,10000,25000,100000]
+capitalform.dummy[-0.00001|-0.00001|-0.00002]*CAPITALFORM[1,2,3,0]
+networking.dummy[0.00001|0.00001|0.00002]*NETWORKING[1,2,3,0]
+infrastructure.dummy[0.00001|0.00001|0.00002]*INFRASTRUCTURE[1,2,3,0]
+workshops.dummy[0.00001|0.00001|0.00002]*WORKSHOPS[1,2,3,0]
+partner.dummy[0|0|0]*PARTNER[1,2,3,0]
/
U(alt2)=capital*CAPITAL+capitalform*CAPITALFORM+networking*NETWORKING
+infrastructure*INFRASTRUCTURE+workshops*WORKSHOPS+partner*PARTNER
/
U(neither)=neither[0]
$


1) As you can see, I have two unlabelled alternatives, each with six attributes of four levels, plus one opt-out alternative. Some levels cannot be combined for logical reasons, so I added some constraints. Is my code correct?
2) I know the direction of the priors for all attributes except "partner". Is it okay to use zero as a prior for this attribute?
3) As I have limited space for the DCE in my survey and a relatively small sample, I set the rows property to 10, which is not much and is probably one of the reasons why the D-error is quite high (0.4707). Is it still okay to use this design for the pilot study?
4) Would you suggest using a different algorithm or sticking to the default one?

Thanks in advance and best regards,
Sibylle

Re: Efficient Design for Pilot Study

Postby Michiel Bliemer » Sat Dec 19, 2020 11:17 am

1. Yes, the syntax looks fine.
2. Yes.
3. I recommend that you set the number of rows to a multiple of 4 (in order to obtain attribute level balance in the design) and that you increase the number of rows because of the large number of parameters that you are estimating. So perhaps use ;rows = 16 and ;block = 2, which means creating two versions of the choice experiment, each with 8 choice tasks, where you randomly assign each respondent to one of the versions (see the sketch below). Using 10 rows would work, but more variation in the data would be desirable.
4. The default algorithm works fine for this type of constraint; there is no need to switch to the modified Federov algorithm.
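For example, your syntax with only the design dimensions changed (a sketch; everything else stays exactly as in your post):

Code:
Design
;alts=alt1*,alt2*,neither
;rows=16
;block=2
;eff=(mnl,d)
;cond:
if(alt1.CAPITAL=0,alt1.CAPITALFORM=0),
if(alt2.CAPITAL=0,alt2.CAPITALFORM=0),
if(alt1.PARTNER=[2,3],alt1.CAPITALFORM=[0,1]),
if(alt2.PARTNER=[2,3],alt2.CAPITALFORM=[0,1])
;model:
U(alt1)=capital[0.00001]*CAPITAL[0,10000,25000,100000]
+capitalform.dummy[-0.00001|-0.00001|-0.00002]*CAPITALFORM[1,2,3,0]
+networking.dummy[0.00001|0.00001|0.00002]*NETWORKING[1,2,3,0]
+infrastructure.dummy[0.00001|0.00001|0.00002]*INFRASTRUCTURE[1,2,3,0]
+workshops.dummy[0.00001|0.00001|0.00002]*WORKSHOPS[1,2,3,0]
+partner.dummy[0|0|0]*PARTNER[1,2,3,0]
/
U(alt2)=capital*CAPITAL+capitalform*CAPITALFORM+networking*NETWORKING
+infrastructure*INFRASTRUCTURE+workshops*WORKSHOPS+partner*PARTNER
/
U(neither)=neither[0]
$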

Michiel

Re: Efficient Design for Pilot Study

Postby Sibylle » Mon Dec 21, 2020 4:59 pm

Many thanks for your help!
Sibylle
 

Re: Efficient Design for Pilot Study

Postby dpotoglou1 » Sat Jan 09, 2021 1:15 am

Hello Michiel and All,

May I kindly ask you to confirm that my code is also correct?

My experiment has two unlabelled alternatives with nine attributes (five with 2 levels, one with 3 levels and three with 4 levels).

Code:
Design;
alts = alt1*, alt2*, none; ? The '*' checks for dominance and avoids choice tasks with a dominant alternative
rows = 60;
block = 12;
eff = (mnl,d);
alg = mfederov(stop=noimprov(120 secs));
model:
U(alt1) = b1.dummy[-0.000001] * demo[1,0]                          ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b2.dummy[-0.000001] * perpref[1,0]                       ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b3.dummy[-0.000001] * psyc[1,0]                          ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b4.dummy[-0.000001] * actmon[1,0]                        ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b5.dummy[-0.000001] * phys[1,0]                          ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b6.dummy[-0.000001 | -0.000001] * anon[0,1,2]            ? 2 'you cannot be personally identified' negative, but no expected order for the rest, setting all other parameters zero
        + b7.dummy[0|0|0] * org[1,2,3,0]                           ? Not sure about direction, all parameters are set to zero;
        + b8.dummy[-0.000001|-0.000001|-0.000001] * other[1,2,3,0] ? 0 base: 'None'; all others are expected to be negative; no particular order
        + b9.dummy[-0.000001|-0.000001|-0.000001] * aut[0,1,2,3] / ? 3 base: 'View and delete own data...'; all others should be negative; no particular order
U(alt2) = b1.dummy  * demo
        + b2.dummy  * perpref
        + b3.dummy  * psyc
        + b4.dummy  * actmon
        + b5.dummy  * phys
        + b6.dummy  * anon
        + b7.dummy  * org
        + b8.dummy  * other
        + b9.dummy  * aut /
U(none) = none[0]
$


• The above design still results in some correlations ranging between 0.5 and 0.6 for ‘demo’, ‘perpref’, ‘psyc’ and ‘phys’.

• When I tried ‘alg = swap(stop=noimprov(120 secs))’ instead of ‘mfederov(stop=noimprov(120 secs))’, I got the following message:
“A valid initial random design could not be generated after approximately 10 seconds. In this time, of the 57499 attempts made, there were 0 row repetitions, 45 alternative repetitions, and 57454 cases of dominance. There are a number of possible causes for this, including the specification of too many constraints, not having enough attributes or attribute levels for the number of rows required, and the use of too many scenario attributes. A design may yet be found, and the search will continue for 10 minutes. Alternatively, you can stop the run and alter the syntax.”

• Another option would be to go with an orthogonal optimal design (orth = ood), but I receive a ‘terminal error’.

I would appreciate your feedback on the above code.

Thank you,
Dimitris

Re: Efficient Design for Pilot Study

Postby Michiel Bliemer » Sat Jan 09, 2021 10:47 am

Yes, the syntax looks good.

* Efficient designs do not aim to remove correlations; they aim to increase information and reduce standard errors. Only orthogonal designs aim to remove correlations. Having correlations is not a problem at all and is often even useful. Only (near) perfect correlation (e.g. 0.99 or 1.00) is problematic.

* As the message states, the swapping algorithm leads to many choice tasks that violate the dominance criterion. Finding a design with 60 rows without any dominance is difficult for the swapping algorithm, but much easier for the row-based mfederov algorithm, so it is correct to use mfederov here. This algorithm is slower, so please let it run for a while, e.g. overnight (see the sketch below).

* Orthogonal designs cannot avoid dominant alternatives, so I would use your syntax as is. We will investigate why this generates a terminal error.
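For example, you could simply give the mfederov algorithm a more generous stopping criterion (a sketch; the limits below are placeholders to adjust to your machine):

Code:
? let the modified Federov algorithm search for longer, e.g. overnight
alg = mfederov(stop=noimprov(28800 secs));          ? stop after 8 hours without improvement
? or use an iteration cap instead:
? alg = mfederov(stop=total(2000000 iterations));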

Michiel

Re: Efficient Design for Pilot Study

Postby dpotoglou1 » Mon Jan 11, 2021 9:33 pm

Thank you, Michiel.

That's very useful.

I appreciate that D-efficient designs aim to minimise the standard errors of the parameters rather than to eliminate correlations, but I am always a little cautious, especially when correlations are high. I agree we can go with 0.5/0.6.

Across different runs with the 'swap' algorithm, I noticed a pattern of higher correlations for the first few attributes, with the correlations then 'tailing off'. I am not sure whether there is an explanation for that, although I appreciate one may not be necessary.

I have now increased the iterations of 'mfederov' to 10000000 to give the algorithm a little more time to run, as you suggested.

Finally, I would be keen to know about any updates with regard to ood designs, but no pressure at all.

Dimitris

Re: Efficient Design for Pilot Study

Postby Michiel Bliemer » Wed Jan 13, 2021 10:07 am

I am not sure what you mean by correlations "tailing off", but I never really look at correlations in the design because they are not really important. If there is multicollinearity (perfect correlation) then the D-error will be infinite (or very large), so as long as the D-error is finite there will be no issue using the data for model estimation.

Michiel

Re: Efficient Design for Pilot Study

Postby dpotoglou1 » Wed Feb 03, 2021 3:15 am

Thank you, Michiel, and sorry about the belated reply.

I just observed that, after 1m and 2m iterations of the above algorithm, there is a pattern in the correlations between the same attributes across the two alternatives.

These correlations tend to be higher for the first few attributes and smaller for the remaining ones, hence my 'tailing off' term.

1m iterations: https://cf-my.sharepoint.com/:i:/g/pers ... w?e=bxR9t2

2m iterations: https://cf-my.sharepoint.com/:i:/g/pers ... A?e=gsG10M

Dimitris

Re: Efficient Design for Pilot Study

Postby Michiel Bliemer » Mon Feb 08, 2021 6:16 pm

That is because your first five attributes have only 2 levels, while your sixth attribute has 3 levels and your last three attributes have 4 levels. The more levels, the more variation in combinations across alternatives, and the less correlation there will be.

Consider an attribute with two levels. Then you can make the following combinations across two alternatives:
(1,0) - here you get a trade-off
(0,1) - here you get a trade-off
(1,1) - here you do NOT get a trade-off
(0,0) - here you do NOT get a trade-off

An efficient design will have little overlap; therefore (1,0) and (0,1) are more efficient than (1,1) and (0,0), so the combinations (1,0) and (0,1) will appear more frequently in the design. But this means that there will be more correlation, because their appearance is not balanced by (1,1) and (0,0). Therefore, correlations are actually good from an efficiency point of view.
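As a small numerical illustration: if four choice tasks use only the combinations (1,0), (0,1), (1,0), (0,1), the attribute columns of the two alternatives are (1,0,1,0) and (0,1,0,1), which have a correlation of -1 even though every task offers a trade-off. Replacing two of those tasks with (1,1) and (0,0) brings the correlation back to 0, but then half of the tasks offer no trade-off on that attribute.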

Michiel

Re: Efficient Design for Pilot Study

Postby dpotoglou1 » Sun Apr 11, 2021 10:14 pm

Dear Michiel,

Just following up on the above query to ask about the balance of a D-efficient design generated with priors from a small pilot study.

We conducted a pilot with 29 respondents (5 choice tasks each).

I estimated an MNL model after removing the observations of individuals who always chose 'Scenario B' (1 individual) or 'Neither' (2 individuals) across all 5 choice tasks (see the estimates below).
https://cf-my.sharepoint.com/:i:/g/personal/potogloud_cardiff_ac_uk/EWCVN_zUsLhKqAKY_yZVi_YBzZA2V1xOqNX5_Gf7wY3U0g?e=wuzOne

I used the priors from the far-right model (see the link above; accepting that some coefficients were not significant and some of them changed relative to the full-data model) to generate a D-efficient design with the following code:

Code:
? Thermal Comfort_v3.ngs --- Experimental Design  --- 06. 03. 2021
? Total 9 Attributes
? [1] demo[0,1] --- Demographics (e.g. age, gender)
? [2] perpref[0,1]--- Psychological parameters (e.g. personal preferences and attitudes)
? [3] psyc[0,1] --- Physical parameters (e.g. room temperature, noise level, illuminance)
? [4] actmon[0,1] --- Activity monitoring (e.g. presence, interaction with windows)
? [5] phys[0,1] --- Physiological data (e.g. heart rate)

? [6] anon[0,1,2] --- Level of anonymity
? [7] org[0,1,2,3] --- Responsible organisation for data collection and use
? [8] other[0,1,2,3] --- Other uses of the data
? [9] aut[0,1,2,3] --- Level of autonomy

? Each choice card will include 2 unlabelled alternative options and a none alternative
? 60 potential choice cards into 12 blocks so that each respondent receives 5 cards (scenarios)

Design;
alts = alt1*, alt2*, none; ? The '*' checks for dominance and avoids choice tasks with a dominant alternative
rows = 60;
block = 12;
eff = (mnl,d);
alg = mfederov(stop=total(2000000 iterations));
model:
U(alt1) = b1.dummy[-0.050] * demo[1,0]                             ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b2.dummy[-0.363] * perpref[1,0]                          ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b3.dummy[-0.746] * psyc[1,0]                             ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b4.dummy[0.164]  * actmon[1,0]                           ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b5.dummy[-0.120] * phys[1,0]                             ? 0 base is 'no'; 1 'yes' and parameter is expected to be negative
        + b6.dummy[-0.535 | -0.837] * anon[0,1,2]                  ? 2 'you cannot be personally identified' negative, but no expected order for the rest, setting all other parameters zero
        + b7.dummy[1.490|0.629|-0.844] * org[1,2,3,0]              ? Not sure about direction, all parameters are set to zero;
        + b8.dummy[-1.040|1.180|-0.200] * other[1,2,3,0]          ? 0 base: 'None'; all others are expected to be negative; no particular order
        + b9.dummy[-1.240|-1.710|-1.230] * aut[0,1,2,3] / ? 3 base: 'View and delete own data...'; all others should be negative; no particular order
U(alt2) = b1.dummy  * demo
        + b2.dummy  * perpref
        + b3.dummy  * psyc
        + b4.dummy  * actmon
        + b5.dummy  * phys
        + b6.dummy  * anon
        + b7.dummy  * org
        + b8.dummy  * other
        + b9.dummy  * aut /
U(none) = none[-3.140]
$


I have two questions, if I may:

* Could you provide me with some feedback on the above code? Should I rather go with a Bayesian design, given that some coefficients are not significant? (See the sketch below for what I have in mind.)

* I am slightly concerned because the design is not very well balanced: for example, level '3' of the attribute 'org' appears only 8 times, whereas the other levels appear more than 15 times. I would welcome your thoughts on whether this should be anticipated given the new priors.
https://cf-my.sharepoint.com/:i:/g/personal/potogloud_cardiff_ac_uk/EWD-SC4c0chFhE6TBY3RzHcB93d2XhC2z22TVYTyzyzXLQ?e=kHkYfJ
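In case it is useful, this is roughly the Bayesian version I have in mind, cut down to three attributes just to illustrate the prior syntax (the means are my pilot estimates, while the standard deviations and the number of draws are only placeholders, not values taken from the pilot model):

Code:
? Sketch only: Bayesian D-efficient design with normal priors (n,mean,sd)
Design;
alts = alt1*, alt2*, none;
rows = 12;
eff = (mnl,d,mean);                        ? mean Bayesian D-error instead of (mnl,d)
bdraws = halton(200);                      ? draws used to evaluate the Bayesian D-error
alg = mfederov(stop=noimprov(120 secs));
model:
U(alt1) = b1.dummy[(n,-0.050,0.10)] * demo[1,0]
        + b6.dummy[(n,-0.535,0.20)|(n,-0.837,0.20)] * anon[0,1,2]
        + b7.dummy[(n,1.490,0.50)|(n,0.629,0.50)|(n,-0.844,0.50)] * org[1,2,3,0] /
U(alt2) = b1.dummy * demo
        + b6.dummy * anon
        + b7.dummy * org /
U(none) = none[-3.140]
$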


Thank you,
Dimitris
