creating data from DCE survey


Postby Rajwane » Fri Jun 30, 2023 11:19 pm

Hello,
I have conducted a pilot DCE with 100 respondents in order to generate priors for the final DCE design and survey.
I have found research papers that discuss coding variables, but in my DCE only one attribute is categorical and the rest are numerical.
I have coded the data as described below, but I do not know whether this is the correct way to do it.
I have 3 alternatives and 8 choice sets, meaning that I should have 8 x 3 = 24 rows for each respondent.
I added a "Choice" and an "Alternative" variable (columns). The Alternative variable takes the values 1, 2, 3 (referring to alt1, alt2 and alt3), and the Choice variable takes the value 1 if the alternative is chosen by the respondent and 0 if not.
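
As an illustration of this layout, here is a minimal Python sketch that builds the 24 rows for one respondent. The column names (`respondent`, `choice_set`, `alternative`, `choice`), the respondent ID, and the list of chosen alternatives are all hypothetical, made up for the example; attribute columns would sit alongside them.

```python
# One row per respondent x choice set x alternative (long format).
N_CHOICE_SETS = 8
N_ALTERNATIVES = 3

def build_long_rows(respondent_id, chosen):
    """chosen[s] is the alternative (1, 2 or 3) picked in choice set s + 1."""
    if len(chosen) != N_CHOICE_SETS:
        raise ValueError("expected one chosen alternative per choice set")
    rows = []
    for choice_set, picked in enumerate(chosen, start=1):
        for alternative in range(1, N_ALTERNATIVES + 1):
            rows.append({
                "respondent": respondent_id,
                "choice_set": choice_set,
                "alternative": alternative,
                # 1 for the chosen alternative, 0 for the other two
                "choice": 1 if alternative == picked else 0,
            })
    return rows

# Illustrative answers only, not data from the pilot.
example_rows = build_long_rows(1, [1, 3, 2, 2, 1, 3, 3, 1])
```

Each choice set contributes three rows, exactly one of which has `choice` equal to 1.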

I then added the attributes (columns), using the attribute levels exactly as they appear in the DCE. Is that correct, or should I be coding them?

Is this the right way to transform the DCE survey results into a database? I tried to add an image but could not work out how, so I hope I was clear enough.

I also have alternative-specific attributes (for example Attributes 1 and 3). Does that change anything? Should I put zeros or keep the cells empty?

Thank you in advance!
Rajwane

Re: creating data from DCE survey

Postby Michiel Bliemer » Sat Jul 01, 2023 12:21 pm

The format of the database depends on the software that you are using for model estimation.

Nlogit uses the format that you describe, where each alternative is on a different row, and you can put 0 for attributes that do not appear in an alternative.
Biogeme and Apollo put all alternatives in the same row, e.g. price1, quality1, ..., price2, ..., so you need to give each attribute for each alternative a different name.
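
To make the difference between the two layouts concrete, the following Python sketch converts rows in the one-row-per-alternative format into the one-row-per-choice-set format. It uses only the standard library; the function name `long_to_wide`, the column names, and the attribute values are assumptions for illustration and are not part of Nlogit, Biogeme, or Apollo.

```python
def long_to_wide(rows, attributes):
    """Collapse long rows into one record per (respondent, choice_set)."""
    wide = {}
    for row in rows:
        key = (row["respondent"], row["choice_set"])
        record = wide.setdefault(
            key, {"respondent": row["respondent"], "choice_set": row["choice_set"]}
        )
        alt = row["alternative"]
        # Suffix each attribute with the alternative number: price1, price2, ...
        for name in attributes:
            record[f"{name}{alt}"] = row[name]
        # In the wide layout, choice holds the number of the chosen alternative.
        if row["choice"] == 1:
            record["choice"] = alt
    return list(wide.values())

# Made-up example: one choice set with three alternatives.
long_rows = [
    {"respondent": 1, "choice_set": 1, "alternative": 1, "choice": 0, "price": 80, "quality": 2},
    {"respondent": 1, "choice_set": 1, "alternative": 2, "choice": 1, "price": 60, "quality": 1},
    {"respondent": 1, "choice_set": 1, "alternative": 3, "choice": 0, "price": 0, "quality": 3},
]
wide_rows = long_to_wide(long_rows, ["price", "quality"])
```

Here the three long rows become a single record with columns `price1`, `quality1`, `price2`, and so on, plus a `choice` column holding 2. The `price` of 0 for alternative 3 stands in for an attribute that does not appear in that alternative.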

For numerical variables you can use their actual levels, e.g. for $80 you can use 80 and for 15 minutes travel time you could use 15 or 0.25 if you want to express in hours.

For categorical variables you need to code the categories, where dummy coding is easiest. Suppose that you have a variable called side_effects with levels None, Mild, Moderate, Severe. Then you need to choose one of these levels as the base level, say None. It does not matter which level you select as base level, but the coefficients of the other levels will all be interpreted as relative to the base level.

If you use Apollo or Biogeme, you can simply code these levels in your database as 0, 1, 2, 3 and then in your estimation script you would write in your utility function something like: U = ... + bmild * (side_effects == 1) + bmoderate * (side_effects == 2) + bsevere * (side_effects == 3), where you estimate parameters bmild, bmoderate, and bsevere, while the utility for the base level, none, is normalised to 0. So if bmild = -0.1 and bmoderate = -0.5, then mild side effects are slightly worse than no side effects, and moderate side effects are much worse than mild side effects (and no side effects).
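
The arithmetic behind this interpretation can be checked with a short Python sketch. This is plain Python, not Apollo or Biogeme syntax; the function name is hypothetical, and the value of bsevere is an assumption chosen only so the example runs (the post gives values for bmild and bmoderate only).

```python
# Level codes as stored in the database:
# 0 = none (base level), 1 = mild, 2 = moderate, 3 = severe.
def side_effects_utility(side_effects, b_mild, b_moderate, b_severe):
    """Utility contribution of the dummy-coded side_effects attribute."""
    # Each comparison is True (1) for exactly one level, so at most one
    # coefficient is switched on; the base level gets none of them.
    return (b_mild * (side_effects == 1)
            + b_moderate * (side_effects == 2)
            + b_severe * (side_effects == 3))

B_MILD, B_MODERATE = -0.1, -0.5  # values from the example in the post
B_SEVERE = -1.0                  # assumed for illustration only

contributions = {
    level: side_effects_utility(level, B_MILD, B_MODERATE, B_SEVERE)
    for level in range(4)
}
```

The base level contributes 0, mild contributes -0.1, and moderate contributes -0.5, so each coefficient is read as a difference from "none".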

In Nlogit, you would create 3 columns: MILD, MODERATE, SEVERE.
If the level is 0 (none), then you use values 0, 0, 0 for the 3 columns.
If the level is 1 (mild), then you use values 1, 0, 0 for the 3 columns.
If the level is 2 (moderate), then you use values 0, 1, 0 for the 3 columns.
If the level is 3 (severe), then you use values 0, 0, 1 for the 3 columns.
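
The mapping in the four rules above can be written as a small Python helper, for example to prepare the columns before importing the data. The function name `dummy_code` is hypothetical; the column names and level codes follow the post.

```python
DUMMY_COLUMNS = ("MILD", "MODERATE", "SEVERE")

def dummy_code(level):
    """Map a level code (0 = none, 1 = mild, 2 = moderate, 3 = severe)
    to the three dummy columns; the base level 0 gives all zeros."""
    if level not in (0, 1, 2, 3):
        raise ValueError(f"unknown side_effects level: {level!r}")
    return {
        name: int(level == code)
        for code, name in enumerate(DUMMY_COLUMNS, start=1)
    }

coded = [dummy_code(level) for level in (0, 1, 2, 3)]
```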

I hope this clarifies.

Michiel

Re: creating data from DCE survey and regression

Postby Rajwane » Tue Jul 18, 2023 7:12 am

Thank you for your reply.

Unfortunately, I do not have access to this software at the university, so I coded the variables myself in Excel and then tried to run the regression in Stata.

This is a sample of the data that I have coded:
[Image]

So for individual no. 126, I have 24 rows (8 choice sets x 3 alternatives).
Each choice set is represented by 3 rows, meaning one row for each alternative in the choice set.
The variable "Choice" takes the value 1 if the alternative is chosen by the respondent in the choice set.
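
Before estimation, the structure described here can be verified with a short check. The Python sketch below assumes hypothetical column names `Individual`, `ChoiceSet`, and `Alternative` (the sample image is not available, so the real names are unknown) alongside the `Choice` variable, and uses generated rows rather than the actual data.

```python
from collections import defaultdict

def check_long_format(rows, n_choice_sets=8, n_alternatives=3):
    """Return a list of problems; an empty list means the layout is consistent."""
    per_set = defaultdict(list)
    for row in rows:
        per_set[(row["Individual"], row["ChoiceSet"])].append(row)

    problems = []
    sets_per_person = defaultdict(int)
    for (person, choice_set), group in per_set.items():
        sets_per_person[person] += 1
        if len(group) != n_alternatives:
            problems.append(f"{person}/{choice_set}: {len(group)} rows")
        # Exactly one alternative must be marked as chosen per choice set.
        if sum(r["Choice"] for r in group) != 1:
            problems.append(f"{person}/{choice_set}: not exactly one Choice = 1")
    for person, count in sets_per_person.items():
        if count != n_choice_sets:
            problems.append(f"{person}: {count} choice sets")
    return problems

# Generated rows for one individual who always picks alternative 1.
rows = [
    {"Individual": 126, "ChoiceSet": s, "Alternative": a, "Choice": int(a == 1)}
    for s in range(1, 9)
    for a in range(1, 4)
]
problems = check_long_format(rows)
```

An empty `problems` list means every individual has 8 choice sets of 3 rows each, with exactly one chosen alternative per set.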

Then the regression in Stata would be the following: mlogit Choice Alternative2 Alternative3 Attribute1 Attribute2 Attribute3 Attribute4 Attribute5 Attribute6

Is this correct?

Thank you in advance!

Sincerely,
Rajwane

