On sample size, choice tasks and alternatives

Post by Joy_Lawrence » Mon Aug 19, 2024 11:59 pm

Hello, I have a few questions about a DCE on participants' WTP for a new EV technology, which I am running for my project. I would be very grateful for some initial guidance, as I am new to choice studies.

1) For the initial ethics approval we need to state the minimum sample size and the formula used to derive it. I am thinking of using Orme's rule of thumb, N >= 500*L/(S*A), where L is the largest number of levels of any attribute, S is the number of choice tasks and A is the number of alternatives. I am using it because a proper sample size estimate requires priors from a pilot, and we cannot run a pilot without ethics approval. Is this acceptable?
2) Even to apply this formula, I need the number of choice tasks. I currently have two alternatives and 7 attributes with 2, 3, 4 or 5 levels. By the rule of thumb for attribute level balance, the least common multiple of (2, 3, 4, 5) is 60, so can I use 60 as my number of choice tasks? Is that not too many? Using 60 also gives a sample size of only just over 20 through Orme's rule, which looks unreasonably small. If you suggest blocking, how should I do it?
3) Alternatively, if I use the formula S*(J-1) >= K for the number of choice tasks, I need to know the number of parameters K. Could I get help with counting the parameters when the variables are a mix of dummy-coded and continuous? I have 7 attributes, 6 of which are categorical or yes/no dummies and only 1 of which is continuous. The numbers of levels are as follows (only Attrib-2 is continuous):
A=Attrib-1= 3 levels
B=Attrib-2 (continuous) = 3 levels
C=Attrib-3= 4 levels
D=Attrib-4= 5 levels
E=Attrib-5= 4 levels
F=Attrib-6= 4 levels
G=Attrib-7= 2 levels

So am I right that the number of parameters from the above is K = 19? My understanding is that each categorical attribute contributes (L-1) parameters because one level serves as the base.
Using 19 as S in Orme's rule gives a minimum sample size of around 66.

3) I tried entering two simple utility functions for the two alternatives with the attributes above in Ngene, but it gives a "no design found" error when I request an orthogonal design. Does this mean that an orthogonal design cannot be constructed from this information?
4) In Ngene we need to specify "rows". How exactly do we determine the number of rows?

5) I also have many socio-demographic variables. Am I supposed to include these in the utility functions fed into Ngene? Will these extra variables affect the number of parameters, and therefore the number of choice tasks and the sample size?

6) For an efficient design we are supposed to write small priors such as [-0.00001] or [0.00001] in the Ngene script, but what if we do not know whether some fairly neutral categorical attributes affect the decision positively or negatively? For example, one attribute is the location of the petrol station: near workplace, en route, or at destination. How do we assign a positive or negative sign to such an attribute? Can we leave it without any prior brackets in the script, while keeping the brackets for the other attributes whose signs we do know?
7) Is there any rule for choosing the number of alternatives? We have one new technology whose WTP we want to estimate, which makes one alternative. Can all the past technologies together be Alternative 2, or can we provide more alternatives? Any suggestion or reference on this would be really helpful.
Thanks
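
For readers who want to check the arithmetic in the questions above, here is a minimal Python sketch. It assumes Orme's rule in the form N >= 500*L/(S*A) with L = 5 (the largest number of levels), one parameter for the continuous attribute B, (levels - 1) parameters for each dummy-coded attribute, and no alternative-specific constants. The function names are illustrative only and are not part of Ngene.

```python
import math


def orme_min_sample(max_levels: int, tasks: int, alternatives: int) -> int:
    """Orme's rule of thumb, N >= 500 * L / (S * A), rounded up."""
    return math.ceil(500 * max_levels / (tasks * alternatives))


def count_parameters(levels: dict, continuous: set) -> int:
    """One parameter per continuous attribute, (levels - 1) per dummy-coded one."""
    return sum(1 if name in continuous else n - 1 for name, n in levels.items())


# Number of levels per attribute, as listed in the post.
levels = {"A": 3, "B": 3, "C": 4, "D": 5, "E": 4, "F": 4, "G": 2}

# Assumption: no alternative-specific constants are counted.
k_mixed = count_parameters(levels, continuous={"B"})      # B linear, rest dummy-coded
k_all_dummy = count_parameters(levels, continuous=set())  # every attribute dummy-coded

# Orme's rule for the two task counts discussed in the post.
n_60_tasks = orme_min_sample(max_levels=5, tasks=60, alternatives=2)
n_19_tasks = orme_min_sample(max_levels=5, tasks=19, alternatives=2)

print(k_mixed, k_all_dummy, n_60_tasks, n_19_tasks)
```

Under these assumptions a direct count gives 17 parameters with B linear and 18 if every attribute is dummy-coded, slightly below the 19 quoted in the post. The Orme figures of about 21 and 66 match the ones stated.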

Below is an example of the script I am experimenting with for the model discussed above. Here A, B, C, ... are the attributes; only B is continuous and the rest are dummy/categorical:

design
;alts = Tech1*, Tech2*
;rows = 19
;eff = (mnl,d)
;model:
U(Tech1) = b1[0.00001]  * A[0,1,2]
         + b2[-0.00001] * B[25,57,80]
         + b3[0.00001]  * C[0,1,2,3]
         + b4[0.00001]  * D[0,1,2,3,4]
         + b5[-0.00001] * E[0,1,2,3]
         + b6[0.00001]  * F[0,1,2,3]
         + b7[-0.00001] * G[0,1] /

U(Tech2) = b1 * A + b2 * B + b3 * C + b4 * D + b5 * E + b6 * F + b7 * G
$

Re: On sample size, choice tasks and alternatives

Post by Michiel Bliemer » Tue Aug 20, 2024 9:52 am

1) Unlike for other models, there is no sample size formula for discrete choice models unless you have priors. We have the same problem with our ethics office, and we always need to convince them that our sample size is reasonable by referring to other studies with similar sample sizes. I have never used Orme's rule of thumb myself and I do not know how accurate it is. Sample size and power calculations really depend on how important each effect is in your study: for some attributes you may need only 10 respondents to get statistically significant effects, while for other, less relevant attributes you may need thousands. Orme's rule does not take this into account.

2) You will likely not give 60 choice tasks to a single respondent, so if you split the choice tasks into blocks you need to multiply the sample size by the number of blocks. For example, if you split the design of 60 choice tasks into 6 blocks of 10 choice tasks each (so each respondent faces 10 choice tasks), you need to multiply the sample size estimate by 6. In your case this perhaps means 6*20 = 120, which does not sound unreasonable.

3) That formula determines the MINIMUM required design size. My rule of thumb is to multiply it by 3 to get sufficient variation in your data, and 3*19 = 57 is again close to a design size of 60.

3) Correct, it means that no orthogonal design exists for attributes that have 2, 3, 4, and 5 levels. If you only have 2 or 4 levels then an orthogonal design may exist. Instead, you can dummy code all attributes and use a D-efficient design with (near) zero priors; Ngene will always be able to generate such a design.

4) Use the formula S*(J-1) >= K and then apply my rule of thumb of multiplying S by 3. This again ends up at about rows = 60, and you can add something like ;block = 6.

5) No, you typically omit socio-demographic variables when generating a design. If you believe that behaviour varies widely across groups, you could consider using attribute.covar, as explained in the Ngene manual section on adding socio-demographic variables. Generally there is no need to do this; you add these variables when you start estimating models.

6) If there is no clear preference order for the levels of an attribute, you keep the priors at zero. This typically holds for variables on a nominal measurement scale. You could use b_location.dummy[0|0] * location[1,2,0], where 0 is the base level and the priors are specified for levels 1 and 2. So you will need to specify dummy coding in your script; currently all your attributes are treated as numerical.

7) You need to determine whether you need a labelled or unlabelled experiment. Unlabelled is generally sufficient for determining WTP and you would vary the new technology with generic alternatives. If you want to forecast the market share or demand for a new technology, you usually use a labelled experiment where you have different labels for the new and old technologies.
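
The rules of thumb in answers 2) to 4) can be put together in a short Python sketch. It takes K = 19 and J = 2 from the thread, applies S*(J-1) >= K, multiplies by 3, and then rounds up to a multiple of 60 (the least common multiple of the level counts, for attribute level balance). That rounding step is an assumption made here to reach the 60 rows mentioned above. The sample size line simply scales Orme's rule by the number of blocks and carries the same caveats about its accuracy. The function names are hypothetical and not part of Ngene.

```python
import math
from functools import reduce


def min_rows(num_params: int, num_alts: int) -> int:
    """Smallest S satisfying S * (J - 1) >= K (integer ceiling division)."""
    return -(-num_params // (num_alts - 1))


def lcm_of(values) -> int:
    """Least common multiple of a list of positive integers."""
    return reduce(lambda a, b: a * b // math.gcd(a, b), values)


def round_up_to_multiple(value: int, multiple: int) -> int:
    """Round value up to the next multiple of `multiple`."""
    return -(-value // multiple) * multiple


K, J = 19, 2                   # parameters and alternatives, as used in the thread
level_counts = [2, 3, 4, 5]    # distinct numbers of attribute levels
max_levels = max(level_counts)
blocks = 6                     # as in the ";block = 6" suggestion

s_min = min_rows(K, J)                     # minimum design size
s_target = 3 * s_min                       # rule of thumb: multiply by 3
rows = round_up_to_multiple(s_target, lcm_of(level_counts))  # assumed rounding step
tasks_per_respondent = rows // blocks

# Orme's rule on the full design, scaled by the number of blocks.
sample_size = math.ceil(blocks * 500 * max_levels / (rows * J))

print(s_min, s_target, rows, tasks_per_respondent, sample_size)
```

With these inputs the sketch gives 60 rows, 10 choice tasks per respondent and a sample size of about 125, in line with the rough 6*20 = 120 above.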

Given that you are new to experimental design and have a lot of questions, you may want to register for this online choice modelling course, which starts on 3 September and runs for 8 weeks: https://www.choicemodelling.academy/
I teach the part on experimental design using Ngene. The course essentially answers all the questions you posed above.

Michiel