Question about coding some attributes and levels

This forum is for posts that specifically focus on Ngene.

Postby susiezhao » Thu Mar 09, 2023 1:04 pm

Dear Mr. Michiel Bliemer:

I have some questions on the coding of attributes and levels that I would like to ask.

1. Do I have to use a Bayesian efficient design when I use an efficient design? I would like to know what a proper procedure looks like.
Can I set the prior parameter values (utility values) based on experience, make sure the D-error is below 0.3, and then use the generated design directly in the survey to collect valid data?
In other words, are the data valid if they come from a design based on assumed priors, without first testing a small sample to obtain actual parameter estimates?
To be more explicit:
If I use an efficient design with an MNL model, set the priors based on experience, generate the design, and run the survey directly without adjusting the priors to the actual situation, are the survey data obtained in this way wrong?
Or, if an efficient design is used, must a small sample be tested first, so that a more realistic design based on the estimated parameters can be generated for the main survey?

2. When writing code for an efficient design, do I have to use the actual values for all attributes other than dummy-coded ones? Or can all attribute levels be replaced by 1, 2, 3, 0?

3. If the levels of my travel time and travel cost attributes differ by travel distance, can I generate the required design in one step when writing the code, or do I have to generate it in two separate runs? I don't know how to write the code for this. For example:

Context: Commuting distance (km): 5, 10

Attribute                     Mode1         Mode2
Access & egress time (min)    3, 4, 5       7, 8, 9
Waiting time (min)            2, 4, 6       3, 5, 7
Travel time for 5 km (min)    20, 30, 40    30, 35, 40
Travel time for 10 km (min)   40, 50, 60    40, 45, 50
Travel cost for 5 km          6, 8, 10      2, 3, 4
Travel cost for 10 km         10, 12, 14    4, 5, 6

4. If my attribute levels are probabilities, say 10% off per ride, 20% off per ride, 30% off per ride, can I use the actual values 0.1, 0.2, 0.3 when writing the code? Or do I need to use the dummy values 1, 2, 0?

5. If my attribute levels are probabilities, say 50% less, 50% more, can I use the actual values -0.5, 0.5 when writing the code? Or do I need to use the dummy values 1, 0?

6. If my attribute levels are "0" and "greater than or equal to 1", can I use the actual values 0, 1 when writing the code? Or do I need to use the dummy values 1, 0?

7. If my attributes are uncertain, how do I write them when writing code using efficient design? For example:

                              Mode1                      Mode2
Travel time (min)             10, 12, 14, 16, 18         20, 22, 24, 26, 28
                              12, 14, 16, 18, 20         22, 24, 26, 28, 30
                              14, 16, 18, 20, 22         24, 26, 28, 30, 32
                              16, 18, 20, 22, 24         26, 28, 30, 32, 34
Corresponding probabilities   20%, 20%, 20%, 20%, 20%    20%, 20%, 20%, 20%, 20%
                              10%, 20%, 40%, 20%, 10%    10%, 20%, 40%, 20%, 10%
                              5%, 15%, 60%, 15%, 5%      5%, 15%, 60%, 15%, 5%
                              0%, 10%, 80%, 10%, 0%      0%, 10%, 80%, 10%, 0%

Can I code the levels using the middle value of each series, like [14,16,18,20]? Or do I have to use dummy values like 1, 2, 3, 0? Or do you have a better way of doing it?

I'm really sorry that I have a lot of questions. I asked people around me for answers to these questions, but I didn't get any. I would appreciate it if you could kindly answer them.
susiezhao
 
Posts: 9
Joined: Thu Mar 09, 2023 12:47 am

Re: Question about coding some attributes and levels

Postby Michiel Bliemer » Thu Mar 09, 2023 1:29 pm

Hi,

1. A typical approach is that you use an orthogonal design or an efficient design based on (near) zero priors for a pilot study. Then conduct a pilot study, estimate the parameters, and use the parameter estimates and standard errors as informative Bayesian priors to generate a Bayesian efficient design for the main study.

2. When using dummy coding, the actual categorical level is replaced with dummy variables. For example, "low", "medium", and "high" could be given levels 0, 1 and 2, or levels 1, 2 and 10; it does not matter. In the end, each level (except for the selected reference level) is replaced with a dummy variable that equals 0 or 1, no matter what levels you used. In other words, b1.dummy[...] * x1[1,2,0] is the same as b1.dummy[...] * x1[2,10,1]. As long as you know which level refers to which category, it is fine. Note that the last level is always the reference level in Ngene.
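
To illustrate the point about level labels, here is a minimal Python sketch (not Ngene syntax). The helper `dummy_code` and the sample column are hypothetical; the only thing taken from the post is that the last listed level acts as the reference. Both labellings produce identical dummy columns.

```python
def dummy_code(levels, column):
    """Dummy-code a column of level labels.

    The last entry of `levels` is treated as the reference level,
    so it maps to a row of zeros.
    """
    non_reference = levels[:-1]
    return [[1 if value == level else 0 for level in non_reference]
            for value in column]


# The same underlying categories under two different labellings.
categories = ["low", "medium", "high", "high", "low"]
labels_a = {"low": 1, "medium": 2, "high": 0}    # corresponds to x1[1,2,0]
labels_b = {"low": 2, "medium": 10, "high": 1}   # corresponds to x1[2,10,1]

coded_a = dummy_code([1, 2, 0], [labels_a[c] for c in categories])
coded_b = dummy_code([2, 10, 1], [labels_b[c] for c in categories])

print(coded_a)
print(coded_a == coded_b)
```

The labels only identify categories; what enters the utility function is the 0/1 pattern, which is the same in both cases.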

3. I would probably generate separate designs for different distance classes. But if you want to use the same design, you could create a pivot design with absolute or relative pivots like -20%, 0%, 20%.

4. Probability is a numerical value, therefore you do not need to use dummy coding and you could use 0.1, 0.2, etc.

5. You need absolute levels, not relative levels. So 50% more is meaningless in the utility function without knowing the base value. For example, if a respondent has an income of $10,000 and could earn 10% more income when selecting an alternative, then the utility function would contain 1.1*income. So you would have an interaction term between probabilities[0.5,1.5] and income to indicate 50% less and 50% more income.

6. If you create categories such as "=0", ">=1", then you need to use dummy variables.

7. That depends on what measures you want in your utility function. If you want to use the mean and standard deviation, you need to compute the mean and standard deviation for each of these series of travel times and associated probabilities. In the utility function you would use the means and standard deviations as levels, but what you show to the respondent would be the travel times and probabilities.
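
As a rough sketch of point 7, the mean and standard deviation of one travel time series can be computed under each probability set as below. This is plain Python, not Ngene code. The function name is hypothetical, and pairing the first Mode1 series with each of the four probability sets is an assumption made only for illustration.

```python
import math


def mean_and_std(times, probs):
    """Mean and standard deviation of a discrete travel time distribution."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must add up to one")
    mean = sum(p * t for t, p in zip(times, probs))
    variance = sum(p * (t - mean) ** 2 for t, p in zip(times, probs))
    return mean, math.sqrt(variance)


times = [10, 12, 14, 16, 18]            # first Mode1 series from the question
prob_sets = [
    [0.20, 0.20, 0.20, 0.20, 0.20],
    [0.10, 0.20, 0.40, 0.20, 0.10],
    [0.05, 0.15, 0.60, 0.15, 0.05],
    [0.00, 0.10, 0.80, 0.10, 0.00],
]

summary = [mean_and_std(times, probs) for probs in prob_sets]
for mean, std in summary:
    print(f"mean = {mean:.2f} min, std = {std:.2f} min")
```

Because all four probability sets are symmetric around the middle value, the mean stays the same and only the standard deviation changes, so these two numbers would be the levels used in the utility function while respondents see the full series.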

Given that you have so many questions, you may be interested in this course that will start next week, of which I am one of the instructors:
https://www.choicemodelling.academy/

Michiel
Michiel Bliemer
 
Posts: 1885
Joined: Tue Mar 31, 2009 4:13 pm

Re: Question about coding some attributes and levels

Postby susiezhao » Thu Mar 09, 2023 1:58 pm

I appreciate your quick response. And I would like to ask a few more questions.

2. Following up on your second answer: if my attribute is travel time and the levels are 50 minutes, 60 minutes and 70 minutes, should I use [50, 60, 70] when writing code for an efficient design? Would it be correct to use [0, 1, 2] instead?

5. On the fifth issue, I would like to be clearer. There is no base value for this indicator. It is "House price compared with your current house". The levels are "50% less" and "50% more". I don't know how to write the code.

7. On the seventh question, I am not sure which measure to use in the utility function either. My aim is to tell people that, due to traffic congestion, the travel time may be "10,12,14,16,18 minutes" or "16,18,20,22,24 minutes", and that the probabilities of the corresponding travel times may be "20%, 20%, 20%, 20%, 20%" or "10%, 20%, 40%, 20%, 10%" or "5%, 15%, 60%, 15%, 5%" or "0%, 10%, 80%, 10%, 0%". How do I code this?

I would appreciate it if you could continue to answer me. I will consider the courses you recommend.

Re: Question about coding some attributes and levels

Postby Michiel Bliemer » Thu Mar 09, 2023 2:34 pm

2. For an efficient design you need to use the actual levels of numerical attributes that you will use in model estimation. So you need to use 50, 60, 70. But you can use 0, 1, 2 for an orthogonal design.

5. You need to think about how you would estimate this model. You can only estimate the model if you know the value of the current house, so you need to ask for this value somewhere in the survey, as otherwise this attribute does not make sense. In the experimental design stage, you could use the average house value to interact with, or you could make a separate experiment for low house values and high house values.

7. You need to think about how you would estimate the model, and to be able to do that you need to decide on what measures you will include in the utility function. What I described is typically done for experiments in transport, with risky travel times that each occur with a given probability. If you want to use these travel times and probabilities directly in the utility function, you need to consider expected utility theory. Have a look in the literature, for example CARA and CRRA expressions for risky attribute levels. But Ngene cannot optimise for expected utility theory, only for models based on random utility theory.
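
For orientation only, the sketch below shows what using travel times and probabilities directly could look like with a CRRA-type power transformation. The functional form, the coefficient `beta` and the curvature `alpha` are assumptions for illustration and are not specified in this thread; the literature should be checked for the exact specification you intend to estimate.

```python
import math


def crra_value(time, alpha):
    """CRRA-type transformation of a travel time > 0.

    alpha = 0 returns the travel time itself (linear case);
    alpha = 1 is handled as the logarithmic limit.
    """
    if alpha == 1:
        return math.log(time)
    return time ** (1 - alpha) / (1 - alpha)


def expected_utility(times, probs, beta, alpha):
    """Probability-weighted sum of transformed travel times, scaled by beta."""
    return beta * sum(p * crra_value(t, alpha) for t, p in zip(times, probs))


times = [10, 12, 14, 16, 18]             # series from the question
probs = [0.10, 0.20, 0.40, 0.20, 0.10]   # one of the probability sets
beta = -0.1                              # hypothetical travel time coefficient

eu_linear = expected_utility(times, probs, beta, alpha=0.0)
eu_curved = expected_utility(times, probs, beta, alpha=0.5)
print(eu_linear, eu_curved)
```

With `alpha = 0` the expression collapses to `beta` times the expected travel time, which is the linear random utility case; any other `alpha` makes the utility nonlinear in the parameters, which is outside the models the post says Ngene can optimise for.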

Michiel

Re: Question about coding some attributes and levels

Postby Michiel Bliemer » Thu Mar 09, 2023 2:52 pm

Regarding your last question, look at this article in which we did something very similar for including travel time unreliability with a series of travel times and probabilities.
https://www.sciencedirect.com/science/article/pii/S0968090X21001625

Another relevant paper is this:
https://link.springer.com/article/10.1007/s11116-021-10206-3

Michiel

Re: Question about coding some attributes and levels

Postby susiezhao » Thu Mar 09, 2023 3:13 pm

I appreciate your quick response. And I would like to ask a few more questions.

5. I do ask people about their current house price in the survey. But everyone's house price is different, and I think they have their own house price in mind. So there is no common RP house price in the SP experiment. The experiment only states whether the house price in this neighbourhood is 50% higher or 50% lower than your current house price. I don't show the exact house price, but I think people know it, since the house price is different for everyone.

7. The uncertainty I refer to is like in this article.
https://www.sciencedirect.com/science/a ... 0X20306860
How should such uncertainty be coded when using efficient design? Or would an orthogonal design be more appropriate?

I would appreciate it if you could continue to answer me.

Re: Question about coding some attributes and levels

Postby susiezhao » Thu Mar 09, 2023 3:18 pm

I have one more question. Before I conduct my pilot study, what range of D-errors should I aim for so that the design I generate is considered acceptable?

Re: Question about coding some attributes and levels

Postby Michiel Bliemer » Thu Mar 09, 2023 3:27 pm

5. Yes exactly, people have their house price in mind when they make the decision. So it is important to know this value and you would typically ask for this value in the survey. Otherwise you cannot properly estimate the model. You need to distinguish between the design of the experiment, and the model you are going to estimate. Yes you can just show -50% or +50% in the survey, but when you estimate the model you will need to know the house value as otherwise you do not know what you are modelling. Is the house price increase $4,000 or $10,000? This depends on the house value. Which value would you use for model estimation? You can only calculate this value if you know the house value. It is possible to estimate values for -50% and +50%, but reviewers will likely tell you that this is not appropriate.

7. That paper uses cumulative prospect theory, not random utility theory. Ngene can only generate efficient designs for models based on random utility theory. You could consider setting all priors to zero and generating a design based on prob1[0.1,0.2,...] * traveltime1[10,12,...] + prob2 * traveltime2 etc., and use constraints to ensure that the probabilities add up to one. Such a design would likely work for estimating a model based on cumulative prospect theory, but it may not be efficient. I am not aware of any software that can optimise designs for estimating models based on cumulative prospect theory.
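
The constraint that probabilities add up to one can be pictured with a small Python filter over candidate level combinations. This is not Ngene syntax; the three outcomes and the candidate probability levels (in percent) are assumptions chosen to keep the example small.

```python
from itertools import product

# Assumed candidate probability levels, in whole percent to avoid
# floating-point comparison issues when checking the sum.
prob_levels = [20, 40, 60]
n_outcomes = 3   # assumed number of travel time outcomes per alternative

# Keep only the combinations whose probabilities add up to 100%.
valid_combinations = [
    combo for combo in product(prob_levels, repeat=n_outcomes)
    if sum(combo) == 100
]

for combo in valid_combinations:
    print(combo)
print(len(valid_combinations), "of", len(prob_levels) ** n_outcomes, "combinations kept")
```

Only a small share of the full factorial survives the constraint, which is one reason a design restricted in this way may work for estimation without being efficient.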

Michiel

Re: Question about coding some attributes and levels

Postby Michiel Bliemer » Thu Mar 09, 2023 3:29 pm

D-errors are case specific, so there is no magic number that is good or bad. Some studies have D-errors of 0.01, other studies have D-errors of 0.7, and both could be very efficient. D-errors depend on the model type, the priors, and the utility functions.
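
A small numerical illustration of why there is no universal threshold: the Python sketch below computes the D-error of a tiny MNL design with two generic parameters and zero priors, once with travel time in minutes and once in hours. The three choice sets are hypothetical (levels borrowed from this thread), and the function is a simplified per-respondent calculation, not what Ngene does internally.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for one choice set."""
    exp_u = [math.exp(u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


def d_error(design, priors):
    """D-error of an MNL design with two generic parameters (time, cost)."""
    i11 = i12 = i22 = 0.0   # elements of the 2x2 Fisher information matrix
    for choice_set in design:
        utilities = [priors[0] * t + priors[1] * c for t, c in choice_set]
        probs = mnl_probabilities(utilities)
        mean_t = sum(p * t for p, (t, _) in zip(probs, choice_set))
        mean_c = sum(p * c for p, (_, c) in zip(probs, choice_set))
        for p, (t, c) in zip(probs, choice_set):
            i11 += p * (t - mean_t) ** 2
            i12 += p * (t - mean_t) * (c - mean_c)
            i22 += p * (c - mean_c) ** 2
    det_info = i11 * i22 - i12 ** 2
    if det_info <= 0:
        raise ValueError("parameters are not identified by this design")
    # det(AVC)^(1/K) with K = 2 parameters and AVC the inverse information
    return (1.0 / det_info) ** 0.5


# Hypothetical design: each alternative is (travel time in minutes, cost).
design_minutes = [
    [(50, 10), (70, 6)],
    [(60, 6), (50, 14)],
    [(70, 10), (60, 14)],
]
design_hours = [[(t / 60, c) for t, c in cs] for cs in design_minutes]

d_minutes = d_error(design_minutes, priors=(0.0, 0.0))
d_hours = d_error(design_hours, priors=(0.0, 0.0))
print(round(d_minutes, 4), round(d_hours, 4))
```

The two designs are identical apart from the unit of travel time, yet the D-error differs by a factor of 60, so a value such as 0.3 says nothing on its own; D-errors are only comparable between designs for the same model, priors and attribute coding.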

Re: Question about coding some attributes and levels

Postby susiezhao » Thu Mar 09, 2023 3:38 pm

