First of all, thank you for answering all my previous questions. I greatly appreciate it.
As I shared in my previous posting http://choice-metrics.com/forum/viewtopic.php?f=2&t=1003, I am planning to conduct a survey based on a choice experiment with a dual response format. The survey will have two phases, a pilot followed by a main survey, and it would be very helpful to get a brief review of the overall plan from experts. Since I am a beginner in this area, I'd appreciate it if you could point out anything that could go wrong or be done better. Here is the draft plan.
Step 1: Design of a pilot survey
- Sample size of 6 to 10 (approx.)
- Two segments (diesel truck operators vs. natural gas truck operators). 50:50 quota sampling.
- The number of choice tasks per respondent = 6
- Efficient design with non-informative priors
- #rows = 18, #blocks = 3
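To double-check the blocking arithmetic in Step 1, here is a small sketch. It assumes each respondent is assigned exactly one block, and the function names are mine, not from any design software.

```python
# Values assumed from the draft plan: 18 design rows split into 3 blocks.
ROWS = 18
BLOCKS = 3

def tasks_per_block(rows, blocks):
    """Number of choice tasks each respondent sees when assigned one block."""
    if rows % blocks != 0:
        raise ValueError("rows must divide evenly into blocks")
    return rows // blocks

def pilot_summary(sample_size, rows=ROWS, blocks=BLOCKS):
    """Return (tasks per respondent, respondents per block, total choice observations)."""
    tasks = tasks_per_block(rows, blocks)
    return tasks, sample_size / blocks, sample_size * tasks

# Lower and upper end of the planned pilot sample size.
for n in (6, 10):
    tasks, per_block, obs = pilot_summary(n)
    print(f"n={n}: {tasks} tasks each, {per_block:.1f} respondents per block, {obs} observations")
```

With 6 respondents, each block would be answered by only 2 people, which under the 50:50 quota could mean one respondent per segment per block.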
Step 2: Specification & estimation for the pilot survey
- MNL model
- Maximum likelihood estimation
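To make Step 2 concrete for myself, here is a minimal sketch of MNL estimation by maximum likelihood in plain Python. Everything in it is an assumption for illustration: the data are synthetic, the two attributes are hypothetical, and the plain gradient ascent stands in for whatever optimizer an estimation package actually uses. It also does not compute standard errors.

```python
import math
import random

def choice_probs(beta, task):
    """MNL choice probabilities for one task (a list of attribute vectors, one per alternative)."""
    utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in task]
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(beta, tasks, choices):
    """Sum of log probabilities of the chosen alternatives."""
    return sum(math.log(choice_probs(beta, t)[c]) for t, c in zip(tasks, choices))

def gradient(beta, tasks, choices):
    """Gradient of the log-likelihood: chosen attributes minus expected attributes."""
    grad = [0.0] * len(beta)
    for task, chosen in zip(tasks, choices):
        probs = choice_probs(beta, task)
        for k in range(len(beta)):
            expected = sum(p * alt[k] for p, alt in zip(probs, task))
            grad[k] += task[chosen][k] - expected
    return grad

def fit_mnl(tasks, choices, n_params, step=0.5, iters=500):
    """Maximize the (concave) MNL log-likelihood by plain gradient ascent."""
    beta = [0.0] * n_params
    for _ in range(iters):
        grad = gradient(beta, tasks, choices)
        beta = [b + step * g / len(tasks) for b, g in zip(beta, grad)]
    return beta

# Synthetic data only: 200 tasks, 3 alternatives, 2 hypothetical attributes.
rng = random.Random(1)
true_beta = [-1.0, 0.5]
tasks = [[[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)] for _ in range(200)]
choices = [rng.choices(range(3), weights=choice_probs(true_beta, t))[0] for t in tasks]

beta_hat = fit_mnl(tasks, choices, n_params=2)
print("estimates:", [round(b, 3) for b in beta_hat])
print("log-likelihood:", round(log_likelihood(beta_hat, tasks, choices), 2))
```

The synthetic sample here has 200 choice observations, which is far more than the 36 to 60 a pilot of 6 to 10 respondents would give, so pilot estimates would be much noisier than this example suggests.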
Step 3: Design of a main survey
- Sample size of 60 to 100 (approx.)
(*Note: The required sample size could be estimated from the results of the pilot survey)
- The same two segments; 50:50 quota sampling.
- The number of choice tasks per respondent = 7 (e.g., 6 tasks for calibration + 1 holdout task for validation, tentatively)
(*Note: I'll also consider the 80/20 rule suggested in another posting)
- Efficient design with non-zero priors obtained from Step 2
Q. Is there any specific design you'd like to recommend? (e.g., a mixed MNL, an MNL model with Bayesian priors, etc.)
- For the priors, significant estimates from the pilot can be used directly, while non-significant ones may need to be discarded. In the latter case, very small priors can be used instead.
Q. Would you correct me if I am wrong?
- #rows = 18, #blocks = 3
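The rule for choosing priors in Step 3 could be written down roughly as follows. This is only my reading of it: the 1.96 threshold, the value 0.01 for "very small", keeping the sign of the pilot estimate, and the attribute names are all assumptions, and the numbers are made up for illustration.

```python
import math

def priors_from_pilot(estimates, std_errors, t_crit=1.96, small=0.01):
    """Keep significant pilot estimates; replace the rest by a small value of the same sign."""
    priors = {}
    for name, estimate in estimates.items():
        t_ratio = estimate / std_errors[name]
        if abs(t_ratio) >= t_crit:
            priors[name] = estimate  # significant: use the pilot estimate directly
        else:
            priors[name] = math.copysign(small, estimate)  # not significant: near-zero prior
    return priors

# Hypothetical pilot output, for illustration only (not real results).
pilot_estimates = {"cost": -0.80, "range": 0.05}
pilot_std_errors = {"cost": 0.20, "range": 0.10}
print(priors_from_pilot(pilot_estimates, pilot_std_errors))
```

Keeping the sign for non-significant parameters is a design choice on my part; whether that is preferable to a prior of exactly zero is part of what I am asking.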
Step 4: Specification & estimation for the main survey
- Potentially, various specifications can be attempted, including:
- 1) MNL
- 2) Mixed MNL (e.g., a random parameter model, a nested logit model allowing for correlations across battery and hydrogen truck options, a panel model allowing for correlations over a series of tasks per respondent), and
- 3) Hybrid choice model (*In my survey questionnaire, Likert statements will also be included to identify some latent variables)
- Estimation: Although I'm still studying this part, it looks like maximum simulated likelihood or hierarchical Bayesian estimation can be used for models more complex than MNL.
Q. Do you have any suggestions on this?
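As far as I understand it, the core of maximum simulated likelihood for a panel random parameter model is to average, over random draws of the coefficients, the probability of a respondent's whole sequence of choices. Below is a minimal sketch of that one step, assuming independent normally distributed coefficients and pseudo-random draws; the attribute values and parameters are hypothetical, and a real estimator would maximize the sum of the logs of these simulated probabilities over all respondents.

```python
import math
import random

def mnl_prob(beta, task, chosen):
    """MNL probability of the chosen alternative in one task."""
    utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in task]
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    return weights[chosen] / sum(weights)

def simulated_panel_prob(means, sds, tasks, choices, n_draws=500, seed=0):
    """Average over coefficient draws of the probability of one respondent's choice sequence."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # One draw of the coefficients, held fixed across this respondent's tasks.
        beta = [rng.gauss(m, s) for m, s in zip(means, sds)]
        sequence_prob = 1.0
        for task, chosen in zip(tasks, choices):
            sequence_prob *= mnl_prob(beta, task, chosen)
        total += sequence_prob
    return total / n_draws

# One hypothetical respondent: 2 tasks, 2 alternatives, 2 attributes.
tasks = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 1.0], [1.0, 0.0]]]
choices = [0, 1]
p = simulated_panel_prob([-1.0, 0.5], [0.5, 0.2], tasks, choices)
print("simulated probability of the observed sequence:", round(p, 4))
```

Holding each draw fixed across a respondent's tasks is what introduces the correlation over the series of tasks; setting all standard deviations to zero reduces the calculation to the plain MNL product.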
Thank you again. If there is anything else you'd suggest I consider, please let me know. Any input will be greatly appreciated.
Best regards,
yb