Estimating Bayesian priors from a pilot study


Estimating Bayesian priors from a pilot study

Postby TamsinD » Thu Apr 25, 2024 7:12 pm

Hello Choice Metrics team,
I am new to choice modelling and I am attempting a labelled experiment with 2 transport alternatives and 7 attributes.
I have just collected a pilot sample of 200 respondents. Each participant was presented with 5 choice questions from an orthogonal design.
The output from the MNL model using the pilot data is shown below. Some attributes are not significant.
Estimated parameters with approximate standard errors from the BHHH matrix:
Parameter    Estimate     BHHH s.e.    BHHH t-ratio (0)
asc_opt1 0.000000 NA NA
asc_opt2 -0.002910 0.769784 0.003781
b1_d 0.428036 0.193494 2.212141
b1_s 0.000000 NA NA
b1_c -0.312258 0.198598 -1.572313
b1_n -0.299048 0.201146 -1.486726
b2_j -0.019784 0.009211 -2.147976
b3_wk -0.060960 0.013719 -4.443643
b4_wt 0.009319 0.024181 0.385376
b5_f -0.141534 0.049345 -2.868238
b7_j -0.033064 0.018807 -1.758096
b8_f -0.057674 0.049403 -1.167416
b9_p -0.105266 0.013544 -7.772286
b6_e 0.257088 0.070776 3.632405

Final LL: -624.1295

Is it still possible to use Bayesian priors based on the above results to create the choice sets for the full survey, given that many of the priors are very close to zero?
design
;alts = opt1, opt2
;block = 9
? efficient design
;eff = (mnl, d, mean)
;alg = swap
;rows = 36
;bdraws = sobol(200)

;model:
U(opt1) = b1.dummy[0.42|-0.31|-0.29] * x1[0,2,3,1]
+ b2[(n,-0.02,0.01)] * x2[30,40,50]
+ b3[(n,-0.06,0.01)] * x3[10,20]
+ b4[0] * x4[1,5,10]
+ b5[(n,-0.1,0.05)] * x5[2,3,4,5,6]
+ b6[(n,0.26,0.07)] * x7[0,1]
+ asc1[(n,-0.002,0.77)]
/
U(opt2) = b7[(n,-0.03,0.019)] * x2_2[30,35,40]
+ b8[(n,-0.06,0.05)] * x5_2[1,2,3,4,5]
+ b9[(n,-0.11,0.01)] * x6_2[0,5,10,15]
+ b6 * x7
$

Re: Estimating Bayesian priors from a pilot study

Postby Michiel Bliemer » Thu Apr 25, 2024 7:39 pm

Yes, that looks fine. Note that priors close to zero do not mean anything by themselves, as the parameter values depend on the units of the attribute. So if the attribute levels are 10000, 20000, 30000, then the parameters will have values very close to zero. It is the contribution to utility that matters, i.e. beta * X.
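
As an illustration with hypothetical numbers (not taken from the pilot estimates above): a coefficient that looks negligible can still make a sizeable contribution to utility when the attribute is measured in large units.

\[
\beta = -0.00005,\qquad X = 20000 \;\Rightarrow\; \beta X = -0.00005 \times 20000 = -1.0
\]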

For that many Bayesian priors you may want to increase the number of Sobol draws.

Michiel

Re: Estimating Bayesian priors from a pilot study

Postby TamsinD » Fri Apr 26, 2024 6:37 pm

Thank you very much for your quick feedback and your suggestion to increase the number of Sobol draws. It is very reassuring to know that it looks fine. I have a couple more questions.

Question 1: Wait time
From the pilot MNL, the bus wait time coefficient comes out positive, (n,0.01,0.02), which cannot be correct (though it is also not significant). What is the best way to handle the prior for wait time? For example, could I specify it as (u,-0.000001,0)?

Question 2: Different scenarios
I am planning to introduce an additional scenario (travelling when the weather is bad) to the choice sets when distributing the survey to the full response panel. Would it be OK to use these priors, or would it be worthwhile to conduct another pilot for the bad-weather scenario?

Re: Estimating Bayesian priors from a pilot study

Postby Michiel Bliemer » Sat Apr 27, 2024 11:14 pm

Q1: If the coefficient has an unexpected sign then I would manually adjust it. But (u,-0.00001,0) is essentially a local prior of 0 because the support is extremely narrow, so you may as well use 0 as the prior. You would need to adjust the lower bound of the uniform distribution, or you could take the prior from the travel time coefficient, which is expected to be closer to 0 than the waiting time coefficient but may be a sufficient approximation. Perhaps also check the experimental design to see if there is a reason that the waiting time coefficient came out positive. Are there perhaps correlations between your attributes whereby a high waiting time always appears together with a low in-vehicle travel time or another beneficial attribute level?
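
In Ngene syntax, borrowing the travel time prior for the waiting time attribute could look like the fragment below (a sketch only; the prior values and attribute levels are the ones that appear in the updated script later in this thread):

? waiting time, using the prior of the travel time coefficient instead of the wrongly signed pilot estimate
+ b4[(n,-0.02,0.01)] * x4[2,5,10]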

Q2: I would not do another pilot study merely for one more prior; you could simply set the prior for weather to zero.
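
If the bad-weather condition were included as an attribute of the design rather than only as a scenario framing, a zero prior can be written directly; a hypothetical fragment (the attribute name and levels are placeholders, not from your script):

? hypothetical bad-weather indicator with a zero prior
+ bweather[0] * weather[0,1]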

Michiel

Re: Estimating Bayesian priors from a pilot study

Postby TamsinD » Tue Apr 30, 2024 1:04 am

Thanks again for your help and quick feedback. I have applied the prior from the travel time coefficient to the wait time attribute. I also have a few more questions:
1. The mode constant is not significant and has a large standard error. Would you recommend changing it or leaving it as asc1[(n,-0.3,0.82)]?
2. I am using Survey Engine for the survey distribution. The maximum run time allowed for generating an Ngene design in Survey Engine is 60 seconds. For 1,000 Sobol draws, would I need longer than 60s to run it fully? And if so, is there a way to override the 60-second limit?
3. The design generation has a very large S estimate (291747551814.1842). Would it be useful to get the S estimate per parameter, and if so, how do I get this from Survey Engine? When I import the design into SurveyEngine the suggested minimum sample size is 720 (with 36 rows). How does it get to this value when it has such a large S estimate?
4. With the pilot study and the orthogonal design selected within Survey Engine, the attribute levels were not 'linear'. For example, journey time levels of (30,40,50) were input as (0,1,2) for the pilot. I then changed the journey times back to 30,40,50 when I ran the MNL model. Is this OK for the pilot?

Here's the latest Ngene code in case it is helpful:
design
;alts = opt1, opt2
? efficient design
;eff = (mnl, d, mean)
;alg = swap
;rows = 36
;bdraws = sobol(1000)

;model:
U(opt1) = b1.dummy[(n,0.73,0.19)|(n,0.3,0.17)|(n,-0.01,0.17)] * x1[0,1,2,3]
        + b2[(n,-0.02,0.01)]                                  * x2[30,40,50]
        + b3[(n,-0.06,0.02)]                                  * x3[10,20]
        + b4[(n,-0.02,0.01)]                                  * x4[2,5,10]
        + b5[(n,-0.14,0.04)]                                  * x5[2,3,4,5,6]
        + b6.dummy[(n,0.26,0.07)]                             * x7[0,1]
        + asc1[(n,-0.3,0.82)]
/
U(opt2) = b7[(n,-0.03,0.02)] * x2_2[30,35,40]
        + b8[(n,-0.06,0.07)] * x5_2[1,2,3,4,5]
        + b9[(n,-0.11,0.01)] * x6_2[0,5,10,15]
        + b6                 * x7
$

Re: Estimating Bayesian priors from a pilot study

Postby Michiel Bliemer » Tue Apr 30, 2024 11:35 am

1. I would probably leave the ASC with the large standard error, but you may want to use ;eff = (mnl,d,median) if you suspect that overly extreme draws would otherwise be taken (see the fragment after point 4).

2. There is no way to override the 60s limit in SurveyEngine; it is supposed to be a 'light' version of Ngene. When I run the script in the desktop version of Ngene I can see that after one minute the D-error is about 0.064, but it goes down further when running longer, to about 0.060 after roughly 10 minutes. Of course I encourage you to purchase the full version of Ngene :)

3. I do not know; you will have to ask SurveyEngine. When I run the script in the desktop version of Ngene I can see that the very large sample size estimate is caused by this prior: (n,-0.01,0.17), which is very close to zero. Ngene produces a lot more output, but SurveyEngine does not target power users and only shows a limited amount of information.

4. Yes, that would be fine.
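
Regarding point 1, that amounts to changing the evaluation statement in your script to the line below (the same syntax as your existing ;eff statement, with mean replaced by median):

? evaluate the Bayesian D-error at the median over the draws rather than the mean
;eff = (mnl, d, median)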

Michiel

Re: Estimating Bayesian priors from a pilot study

Postby TamsinD » Wed May 01, 2024 11:08 pm

Thank you so much for your help. I will use ;eff = (mnl,d,median) and will see if I can purchase Ngene through the University.
I have three more questions:
1. It seems that the minimum sample size estimate from Survey Engine is calculated as rows x 20 / scenarios. Would I be better off using the S estimate per parameter, ignoring the prior that is close to 0, (n,-0.01,0.17), to calculate the minimum sample size? The largest S estimate per parameter is 113, so would the minimum sample size be 113 and the actual sample size required be 3 x 113? And would I then multiply this by the number of block versions?
2. Is Rob.std.err. better to use than Std.err.?
3. b_fuel_car is not significant in the MNL output. Can I still use the MNL estimate and standard error, b8[(n,-0.06,0.07)] * x5_2[1,2,3,4,5], or should I divide it by 1.5, as suggested in some other posts, giving b8[(n,-0.03,0.04)] * x5_2[1,2,3,4,5]?
Thanks again.

Re: Estimating Bayesian priors from a pilot study

Postby Michiel Bliemer » Thu May 02, 2024 10:18 am

1. Yes, and you need to multiply by the number of blocks.
2. The robust standard error is often a better standard error when estimating models; in experimental design it is the asymptotic standard error that is used most. You can use either, although the robust one is often slightly larger.
3. You can keep the estimate if you believe that the contribution to utility is appropriate (you can look at the relative importance of the attributes). If you used an orthogonal or random design in your pilot study you may want to shrink the priors somewhat towards zero, as these choice tasks tend to have more (weakly) dominant alternatives.
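
As an illustration (using the fuel coefficient prior already quoted in this thread and the factor of 1.5 mentioned earlier; how much to shrink, and whether to also adjust the standard deviation, is a judgment call):

? pilot-based prior
+ b8[(n,-0.06,0.07)] * x5_2[1,2,3,4,5]
? prior mean shrunk towards zero by a factor of 1.5 (-0.06 / 1.5 = -0.04)
+ b8[(n,-0.04,0.07)] * x5_2[1,2,3,4,5]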

Michiel

Re: Estimating Bayesian priors from a pilot study

Postby TamsinD » Thu May 02, 2024 9:34 pm

Thank you so much for your continued patience, Michiel. It's all quite a steep learning curve!
Just to clarify:
1. If every survey participant answered one choice task question, I would need a sample size of 3 * 113 and 452 participants. If I added 6 blocks, I would need a sample size of 452 * 6. However, if I ask each participant 6 choice task questions and have 6 blocks, then I would need 452 participants.
2. Thanks for this, I will continue with the robust error.
3. As I used an orthogonal design for the pilot, do you think it would be worth shrinking all the priors closer to zero, or just the ones where the contribution does not seem appropriate?

Thanks again.

Re: Estimating Bayesian priors from a pilot study

Postby Michiel Bliemer » Fri May 03, 2024 9:44 am

1. I am a bit lost in all the numbers you provide, so I am just going to state how to calculate it. Ngene will provide you with a sample size estimate assuming that each respondent is given all choice tasks in your design, i.e. 36 in your case. If you block it into 9 parts, then you need to multiply the sample size estimate by 9. I am not exactly sure where the 3 comes from; maybe this is the number of scenarios. (A worked example follows after point 3.)

3. Yes, I would shrink the priors somewhat towards zero; it is safer to have priors closer to zero than priors that are too far away from zero, because a prior that is too large could mean that the attribute (or constant) starts dominating the choice for some draws from that prior's distribution.
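
Regarding point 1, a sketch of the calculation using the figures mentioned in this thread, treating 113 as the S estimate for the full 36-task design and using the 9 blocks from the original script (this pairing of numbers is an assumption, not something confirmed above):

\[
N \;\approx\; S \times \text{number of blocks} \;=\; 113 \times 9 \;=\; 1017 \text{ respondents},\qquad \text{with } 36 / 9 = 4 \text{ choice tasks per respondent.}
\]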

Good luck,
Michiel
