Choice tasks and sample size for mixed logit
Posted: Fri Dec 29, 2023 7:13 pm
Hello, I'm new to DCEs and have been using this forum to improve my understanding. Could I check a few things about my design:
6 attributes, each with 4 levels (2 continuous, 4 categorical - effects coded)
Single profile choice tasks - binary response mechanism (Yes/No)
Parameters to estimate for MNL model = 15 (2x1 + 4x3 + 1 constant)
I have no prior estimates of the coefficients, so I will run a pilot study first to get these. I'm planning to create a fractional factorial design using an MNL model. I expect there to be preference heterogeneity, so for the main study I would use a mixed logit or latent class model, depending on model performance.
1) Does this mean that, when generating the fractional factorial design at the pilot stage and deciding on the number of choice tasks, I should assume there will be double the number of parameters to estimate, because a mixed logit model has a mean and an SD for each coefficient? In that case, I would need a minimum of 30 choice tasks (s >= k/(j-1), where s = number of choice tasks, k = number of parameters and j = number of alternatives, here 2 for Yes/No)? Presumably I would then need to multiply this by 3 to ensure enough variation?
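To make the arithmetic in question 1 easy to check, here is a minimal Python sketch of the parameter count and the s >= k/(j-1) bound. The function names are just illustrative, and the doubling step assumes every coefficient (including the constant) is treated as random with a mean and an SD, which is my working assumption rather than something I know to be right.

```python
import math

def mnl_parameter_count(n_continuous, n_categorical, levels_per_categorical, n_constants=1):
    """Continuous attributes take 1 parameter each; effects-coded
    categorical attributes take (levels - 1) each; plus any constants."""
    return n_continuous + n_categorical * (levels_per_categorical - 1) + n_constants

def min_choice_tasks(n_params, n_alternatives):
    """Smallest whole number s satisfying s >= k / (j - 1)."""
    return math.ceil(n_params / (n_alternatives - 1))

k_mnl = mnl_parameter_count(n_continuous=2, n_categorical=4, levels_per_categorical=4)
k_mxl = 2 * k_mnl  # assumption: a mean and an SD for every coefficient

print("MNL parameters:", k_mnl)
print("Min tasks (MNL):", min_choice_tasks(k_mnl, n_alternatives=2))
print("Mixed logit parameters (assumed):", k_mxl)
print("Min tasks (mixed logit, assumed):", min_choice_tasks(k_mxl, n_alternatives=2))
```

With j = 2 the denominator is 1, so the minimum number of tasks simply equals the number of parameters.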
2) I understand that sample size estimates aren't very meaningful without known priors, but I want to understand the calculation as I have a limited budget. I have used Orme's approximate formula at this stage (N > 500c/(t*a), where c = largest number of levels, t = number of choice tasks and a = number of alternatives). I know from pre-testing that participants can feasibly complete about 12 choice tasks. My understanding is that the minimum sample size (e.g. N > 500*4 / (12*2), so N > 83.3) applies to each block of the DCE? So, N >= 84 * number of blocks?
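For reference, this is the calculation in question 2 written out as a small Python sketch. The function name is my own, and the per-block multiplication at the end only reflects my current understanding (which is what I'm asking about); the number of blocks used there is purely hypothetical.

```python
import math

def orme_min_sample(max_levels, n_tasks, n_alternatives):
    """Orme's approximation N > 500c / (t * a).
    Returns the raw bound and the smallest integer strictly above it."""
    bound = 500 * max_levels / (n_tasks * n_alternatives)
    return bound, math.floor(bound) + 1

bound, n_min = orme_min_sample(max_levels=4, n_tasks=12, n_alternatives=2)
print(f"Bound: N > {bound:.1f}, so N >= {n_min}")

# Assumption being queried: the minimum applies to each block separately.
n_blocks = 3  # hypothetical number of blocks, for illustration only
print("Total if applied per block:", n_min * n_blocks)
```

Note the parentheses around t * a: dividing by 12 and then multiplying by 2 would give a very different (and wrong) number.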
3) My other research question involves looking at income group differences: 40% of the sample will be in income group Level 1 and 20% will be in each of the 3 other income group levels. I am confused about whether exploring this affects the necessary sample size. To explore preference differences between the income groups, do I need to treat each income group as a different version of the DCE (i.e. 84 participants for each income group?), and would at least 84 then need to be in each income group within each block of the DCE?
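To show what question 3 would imply for my budget, here is a rough Python sketch. It assumes (and this is exactly what I'm unsure about) that the minimum of 84 would have to hold within every income group, so the smallest group share drives the total. The function name is illustrative only.

```python
def total_for_min_per_group(min_per_group, shares_pct):
    """Smallest total N such that every group, given its percentage share,
    contains at least min_per_group respondents (integer arithmetic)."""
    smallest = min(shares_pct)
    total = -(-min_per_group * 100 // smallest)  # ceiling division
    return total, [total * pct // 100 for pct in shares_pct]

# Income group shares from my sample: 40% in Level 1, 20% in each of the other 3.
shares_pct = [40, 20, 20, 20]
total, per_group = total_for_min_per_group(min_per_group=84, shares_pct=shares_pct)

print("Total N needed:", total)
print("Respondents per income group:", per_group)
```

If the requirement also applied within each block, this total would be multiplied again by the number of blocks, which is why I want to be sure before committing the budget.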
Thanks very much in advance.