Question about dummy variables and number of scenarios

Posted: Wed Aug 07, 2019 6:25 pm
by shelly_bz
Dear board members
I am new to Ngene so any help will be highly appreciated.
I am trying to generate a Bayesian D-efficient design for a mode choice task with three alternatives. I have no priors.
I have three attributes:
time, which has 4 levels,
cost, which has 3 levels,
and passengers, which has six levels (justyou, man, woman, 2men, 2women, mix). Passengers is only relevant for one alternative, while time and cost are relevant for all three.

1. How should I define the "passengers" attribute in a Bayesian design? Does this code seem all right? I have no expectation regarding the sign.
+ b13[n,0,0.05]*justyou[0,1] +b14[n,0,0.05]*oneman[0,1]+b15[n,0,0.05]*onewoman[0,1]+b16[n,0,0.05]*2men[0,1]+b17[n,0,0.05]*2women[0,1]+b18[n,0,0.05]*mix[0,1]/

2. How many scenarios should I generate? I thought that 12 rows should be enough, as I have 7 betas for my attributes plus two ASCs, so nine parameters in total, but I want the number of rows to be divisible by 6, 4, and 3, hence 12. Am I doing this right?
3. I will be using a panel where each respondent faces six choice tasks. For that reason I will estimate an error component model. Do I have to count the error components along with the attributes when determining the number of scenarios?

Thank you in advance,
Shelly

Re: Question about dummy variables and number of scenarios

Posted: Thu Aug 08, 2019 10:49 am
by Michiel Bliemer
1. Please see Section 7.2.8 of the Ngene manual where dummy and effects coding is explained. In your case, this means:
b1[(n,0,0.5)|(n,0,0.5)|(n,0,0.5)|(n,0,0.5)|(n,0,0.5)]*passenger[1,2,3,4,5,0]
where 0 is the reference level; for example, you can choose justyou or mix as the reference, and you estimate 5 coefficients.
Note that I have used round brackets around the Bayesian priors; otherwise you get a mixed logit model with fixed priors, which is something entirely different.

2. The number of choice tasks S needs to satisfy S*(J-1)>=K, where J is the number of alternatives and K is the number of parameters to estimate (see page 57 of the manual). If K=9 and J=3, then S>=9/2, so in theory S>=5. So 12 choice tasks would work, and this also allows attribute level balance in your case since 12 is divisible by 3, 4, and 6. Often it is a good idea to have a bit more variation in your data, so I would choose 24 choice tasks and block the design in 2 versions using ;block = 2, which means you show 12 choice tasks to each respondent but you have two versions of your survey. If you only want to show 6 choice tasks per respondent, then you can block it further, e.g. ;block = 4.
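
As a rough illustration of the rule above, here is a minimal Python sketch (not Ngene syntax; the helper names are made up for this example, and math.lcm needs Python 3.9 or later). It computes the smallest S satisfying S*(J-1)>=K and lists the row counts that are also divisible by the numbers of levels 3, 4 and 6.

```python
import math


def min_choice_tasks(num_params: int, num_alts: int) -> int:
    """Smallest S such that S * (J - 1) >= K."""
    return math.ceil(num_params / (num_alts - 1))


def balanced_row_counts(level_counts, lower: int, upper: int):
    """Row counts in [lower, upper] divisible by every attribute's number of levels."""
    step = math.lcm(*level_counts)
    return [s for s in range(step, upper + 1, step) if s >= lower]


K, J = 9, 3                                   # values quoted in the post above
s_min = min_choice_tasks(K, J)                # theoretical minimum number of tasks
candidates = balanced_row_counts([3, 4, 6], s_min, 36)

# Tasks shown to each respondent when a 24-row design is split into blocks.
tasks_per_respondent = {blocks: 24 // blocks for blocks in (2, 4)}

print(s_min, candidates, tasks_per_respondent)
```

With K=9 and J=3 the minimum is 5, and 12, 24 and 36 are the candidate sizes up to 36. A 24-row design gives 12 tasks per respondent with 2 blocks and 6 tasks with 4 blocks.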

Michiel

Re: Question about dummy variables and number of scenarios

Posted: Thu Aug 08, 2019 5:23 pm
by shelly_bz
Dear Michiel.
This helps so much. Thank you!
One more question.
Say I want one of my levels to appear in 50% of scenarios ("Justyou"). Is there a way to do that?
Thank you again

Re: Question about dummy variables and number of scenarios

Posted: Fri Aug 09, 2019 9:31 am
by Michiel Bliemer
Yes that can be done. You will need to use the modified Federov algorithm, which does not aim to satisfy attribute level balance. Simply add:
;alg = mfederov

Further, you need to tell Ngene that the level for "justyou" should appear 50 per cent of the time. Assuming that reference level 0 is "justyou", you would get (assuming for example 24 choice tasks in the design):
;rows = 24
;model:
U(..) = ... + b1[(n,0,0.5)|(n,0,0.5)|(n,0,0.5)|(n,0,0.5)|(n,0,0.5)]*passenger[1,2,3,4,5,0](1-24,1-24,1-24,1-24,1-24,12)

where I set the number of appearances for each of the first five levels to anywhere between 1 and 24, while the last level ("justyou") should appear exactly 12 times. This may be a bit restrictive, so you could change the 12 to 10-14 or another range.
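
To make the notation above more concrete, here is a small Python sketch (not Ngene code; the function names and the example column are hypothetical). It shows what dummy coding with reference level 0 does to the passenger attribute, and how a column of a 24-row design could be checked against the frequency bounds (1-24,1-24,1-24,1-24,1-24,12).

```python
from collections import Counter

LEVELS = [1, 2, 3, 4, 5, 0]   # as in passenger[1,2,3,4,5,0]; the last level (0) is the reference
BOUNDS = {1: (1, 24), 2: (1, 24), 3: (1, 24), 4: (1, 24), 5: (1, 24), 0: (12, 12)}


def dummy_code(level, levels=LEVELS):
    """One 0/1 indicator per non-reference level; the reference level maps to all zeros."""
    return [1 if level == other else 0 for other in levels[:-1]]


def bound_violations(column, bounds=BOUNDS):
    """Return {level: count} for every level whose count falls outside its bounds."""
    counts = Counter(column)
    return {lvl: counts[lvl] for lvl, (low, high) in bounds.items()
            if not low <= counts[lvl] <= high}


# Hypothetical passenger column: "justyou" (0) in 12 of 24 rows.
column = [0] * 12 + [1] * 3 + [2] * 3 + [3] * 2 + [4] * 2 + [5] * 2

print(dummy_code(3), dummy_code(0))
print(bound_violations(column))
```

Five indicators for six levels is why five priors appear in the b1[...] specification. Relaxing the last bound to (10, 14) in BOUNDS corresponds to writing 10-14 instead of 12.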

Michiel

Re: Question about dummy variables and number of scenarios

Posted: Sun Aug 11, 2019 6:09 pm
by shelly_bz
Thank you so much. It helped a lot.
So I tried running my entire code:

Design
;alts=REG,PAV,SAV
;rows=24
;block =4
;alg = mfederov
;rdraws = Halton(2000)
;eff=(mnl,d)
;cond:
if(SAV.costsav = 11.00, PAV.costpav = [12.00,15.00]),
if(REG.timereg = 30.00, PAV.timepav = [24.00,30.00])

;model:


U (REG)=
b1[n,0,0.5]+
b3[n,-0.0001,0.5]*costreg[7.00,10.00,13.00]+
b6[n,-0.0001,0.5]*timereg[30.00,36.00,42.00,48.00]/

U (PAV)=
b2[n,0,0.5]+
b4[n,-0.0001,0.5]*costpav[9.00,12.00,15.00]+
b7[n,-0.0001,0.5]*timepav[24.00,30.00,36.00,42.00]/

U (SAV)=
b5[n,-0.0001,0.5]*costsav[5.00,8.00,11.00]+
b8[n,-0.0001,0.5]*timesav[21.00,30.00,39.00,48.00]+
b9[(n,0,0.5)|(n,0,0.5)|(n,0,0.5)|(n,0,0.5)|(n,0,0.5)]*passenger[1,2,3,4,5,0](1-24,1-24,1-24,1-24,1-24,10-14)

$

I got this error message:
"Error: The 'model' property contains a prior that has dummy or effects coding without an appropriate suffix. 'b9'"

I tried using passengers.dummy instead of just passengers, but it didn't work.
Am I missing anything?

Apart from that, does my code look all right?

Thank you again, I appreciate your assistance very much.
Shelly

Re: Question about dummy variables and number of scenarios

Posted: Sun Aug 11, 2019 9:02 pm
by Michiel Bliemer
A few things that need changing:

1. Replace b9 with b9.dummy
2. Replace conditional constraints with ;reject constraints since conditional constraints are incompatible with the mfederov algorithm:

;reject:
sav.costsav = 11.00 and pav.costpav = 9.00,
reg.timereg = 30 and pav.timepav = 36.00,
reg.timereg = 30 and pav.timepav = 42.00

Now your syntax will run.
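
As a sanity check on this translation, here is a short Python sketch (not Ngene code; the function names are made up). It reads the two original conditions as "if sav.costsav = 11 then pav.costpav must be 12 or 15" and "if reg.timereg = 30 then pav.timepav must be 24 or 30", then enumerates all combinations of the four attributes involved to confirm that the three ;reject lines exclude exactly the same combinations.

```python
from itertools import product

# Levels taken from the model specification in the thread.
COSTSAV = [5, 8, 11]
COSTPAV = [9, 12, 15]
TIMEREG = [30, 36, 42, 48]
TIMEPAV = [24, 30, 36, 42]


def allowed_by_cond(costsav, costpav, timereg, timepav):
    """Conditional form: if X takes a value, Y is restricted to a set of levels."""
    if costsav == 11 and costpav not in (12, 15):
        return False
    if timereg == 30 and timepav not in (24, 30):
        return False
    return True


def rejected(costsav, costpav, timereg, timepav):
    """Reject form: list the forbidden combinations explicitly."""
    return ((costsav == 11 and costpav == 9)
            or (timereg == 30 and timepav == 36)
            or (timereg == 30 and timepav == 42))


combos = list(product(COSTSAV, COSTPAV, TIMEREG, TIMEPAV))
# A combination is a mismatch if one formulation allows it and the other also rejects it.
mismatches = [c for c in combos if allowed_by_cond(*c) == rejected(*c)]
num_rejected = sum(rejected(*c) for c in combos)

print(len(combos), num_rejected, mismatches)
```

An empty mismatch list means the two formulations agree on all 144 combinations, 32 of which are excluded.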

Michiel

Re: Question about dummy variables and number of scenarios

Posted: Mon Aug 12, 2019 5:24 pm
by shelly_bz
Thank you so much. It did indeed work. However, I got a huge S estimate and B estimate.
Here are my output parameters:
D error 0.093845
A error 1.001498
B estimate 99.999879
S estimate 11913661.610246

Here is the final syntax I used. Is this caused by using too many Bayesian priors?


Design
;alts=REG,PAV,SAV
;rows=24
;block =4
;alg = mfederov
;rdraws = Halton(2000)
;eff=(mnl,d)


;reject:
sav.costsav = 11.00 and pav.costpav = 9.00,
reg.timereg = 30.00 and pav.timepav = 36.00,
reg.timereg = 30.00 and pav.timepav = 42.00


;model:


U (REG)=
b1[n,0,0.5]+
b3[n,-0.0001,0.5]*costreg[7.00,10.00,13.00]+
b6[n,-0.0001,0.5]*timereg[30.00,36.00,42.00,48.00]/

U (PAV)=
b2[n,0,0.5]+
b4[n,-0.0001,0.5]*costpav[9.00,12.00,15.00]+
b7[n,-0.0001,0.5]*timepav[24.00,30.00,36.00,42.00]/

U (SAV)=
b5[n,-0.0001,0.5]*costsav[5.00,8.00,11.00]+
b8[n,-0.0001,0.5]*timesav[21.00,30.00,39.00,48.00]+
b9.dummy[(n,0,0.5)|(n,0,0.5)|(n,0,0.5)|(n,0,0.5)|(n,0,0.5)] * passenger[1,2,3,4,5,0](1-24,1-24,1-24,1-24,1-24,10-14)

$



Thank you very much

Re: Question about dummy variables and number of scenarios

Posted: Mon Aug 12, 2019 6:10 pm
by Michiel Bliemer
You should ignore S estimates (and B estimates) because they are only meaningful when you use realistic priors. Priors very close to zero, such as -0.0001, produce very large S estimates, so you can simply ignore them here. Once you have estimated parameters from a pilot study and use them as priors, the S estimates will be meaningful.
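
A rough numerical sketch of why this happens, in Python. It assumes the S estimate for a parameter follows the usual t-ratio argument, N >= (1.96 * se / prior)^2, with se the standard error for a single respondent; please check the manual for the exact definition Ngene uses. The standard error of 0.2 and the prior of -0.05 are made-up values for illustration only.

```python
import math


def s_estimate(prior: float, se_one_respondent: float, t: float = 1.96) -> float:
    """Sample size at which the t-ratio of this parameter would reach t (assumed formula)."""
    if prior == 0:
        return math.inf          # a zero prior can never become statistically significant
    return (t * se_one_respondent / prior) ** 2


SE = 0.2                         # hypothetical standard error, not taken from the design above

near_zero = s_estimate(-0.0001, SE)   # prior as used for cost and time in the thread
realistic = s_estimate(-0.05, SE)     # hypothetical prior from a pilot study

print(round(near_zero), round(realistic, 2))
```

Under these assumptions the near-zero prior asks for millions of respondents, while the hypothetical pilot prior asks for fewer than a hundred. The standard error is the same in both cases; only the prior changes.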

Michiel

Re: Question about dummy variables and number of scenarios

Posted: Mon Aug 12, 2019 7:30 pm
by shelly_bz
Thank you for the clarification.
Something about the choice situations is not right:

Design
Choice situation  reg.costreg  reg.timereg  pav.costpav  pav.timepav  sav.costsav  sav.timesav  sav.passenger  Block
               1            7           48            9           42            5           48              5      1
               2           13           36           15           42           11           48              0      4
               3           13           30            9           24            5           48              0      3
               4           13           48           15           24           11           48              0      2
               5            7           36            9           42            5           48              0      1
               6            7           30            9           24            5           21              0      1
               7           13           48            9           42            5           21              0      2
               8           13           48           15           24           11           21              0      2
               9           13           30           15           24           11           48              5      1
              10            7           48           15           42           11           21              0      4
              11           13           48           15           42           11           21              4      4
              12            7           48           15           24           11           21              2      1
              13           13           30            9           24            5           48              2      3
              14            7           48           15           24            5           21              5      3
              15            7           36           15           42           11           21              1      3
              16            7           48           15           24            5           48              4      4
              17            7           48            9           24            5           48              0      4
              18           13           48           15           42            5           21              0      1
              19           13           48           15           42            5           48              1      2
              20            7           30            9           24            5           21              1      4
              21           13           30            9           24            5           21              4      3
              22            7           30           12           24           11           48              3      2
              23           13           48            9           42            5           48              3      3
              24            7           30           15           24            5           21              3      2

Some of the levels are not represented at all. For instance, reg.costreg originally has three levels (7,10,13) but only two are represented (7,13). The same goes for all attributes. Why?

Thank you for all your help.

Re: Question about dummy variables and number of scenarios

Posted: Mon Aug 12, 2019 8:24 pm
by Michiel Bliemer
The modified Federov algorithm does not guarantee attribute level balance (see my earlier comment). You are asking for a D-efficient design, and for an attribute with a single linear coefficient it is most efficient to use only the outer levels. If you want all levels to be used, you either need to use dummy/effects coding for that attribute or you need to impose attribute level constraints as you did for the passenger attribute.
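
The pattern can be checked directly on the design posted above. The following Python sketch (not Ngene code; the names are made up) copies the 24 rows and the declared levels from the thread and lists, per attribute, the declared levels that never appear.

```python
COLUMNS = ["costreg", "timereg", "costpav", "timepav", "costsav", "timesav", "passenger"]
DECLARED = {
    "costreg": [7, 10, 13], "timereg": [30, 36, 42, 48],
    "costpav": [9, 12, 15], "timepav": [24, 30, 36, 42],
    "costsav": [5, 8, 11], "timesav": [21, 30, 39, 48],
    "passenger": [1, 2, 3, 4, 5, 0],
}

# Rows as posted: choice situation, the seven attribute columns, block.
DESIGN = """
1 7 48 9 42 5 48 5 1
2 13 36 15 42 11 48 0 4
3 13 30 9 24 5 48 0 3
4 13 48 15 24 11 48 0 2
5 7 36 9 42 5 48 0 1
6 7 30 9 24 5 21 0 1
7 13 48 9 42 5 21 0 2
8 13 48 15 24 11 21 0 2
9 13 30 15 24 11 48 5 1
10 7 48 15 42 11 21 0 4
11 13 48 15 42 11 21 4 4
12 7 48 15 24 11 21 2 1
13 13 30 9 24 5 48 2 3
14 7 48 15 24 5 21 5 3
15 7 36 15 42 11 21 1 3
16 7 48 15 24 5 48 4 4
17 7 48 9 24 5 48 0 4
18 13 48 15 42 5 21 0 1
19 13 48 15 42 5 48 1 2
20 7 30 9 24 5 21 1 4
21 13 30 9 24 5 21 4 3
22 7 30 12 24 11 48 3 2
23 13 48 9 42 5 48 3 3
24 7 30 15 24 5 21 3 2
"""


def unused_levels(design_text):
    """Map each attribute to the declared levels that do not occur in the design."""
    rows = [list(map(int, line.split())) for line in design_text.strip().splitlines()]
    used = {name: {row[i + 1] for row in rows} for i, name in enumerate(COLUMNS)}
    return {name: [lvl for lvl in DECLARED[name] if lvl not in used[name]]
            for name in COLUMNS}


unused = unused_levels(DESIGN)
for name, missing in unused.items():
    print(name, missing)
```

In this design the middle levels of most cost and time attributes are unused, although pav.costpav still shows 12 once and reg.timereg shows 36 three times. The passenger attribute, which is dummy coded and has frequency bounds, uses all six levels.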

Michiel
(Please note that this forum is not a replacement for the manual; most answers can be found in the manual, which should be the first place to look.)