Generating and Evaluating the Designs
Posted: Sun Jun 26, 2022 9:44 am
Dear Ngene team,
While I have been generating choice experiment designs using Ngene, I have many questions that I hope someone can help with. Thank you in advance for your time and help.
In my current design, there are three labelled alternatives (a battery electric truck, a hydrogen truck, and a diesel truck as a status quo alternative) and the following seven attributes, each with their specific attribute levels:
- Purchase costs
- Operating costs
- Driving range
- Emissions
- Shortest distance to an off-site fueling/charging station
- On-site infrastructure construction costs
- Refueling/charging time
Among these attributes, some have only a single level for each alternative. For example, "emissions" is 0% for the battery electric and hydrogen options and 100% for the diesel option, with no variation. When I include these single-level attributes in the utility specification, Ngene could not generate an efficient design; once I removed them, it worked. So, my first question is:
Q1. Typically, should an attribute with a single level be excluded from the utility specification?
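My current guess at why this happens (an illustrative Python sketch of my own reasoning, not Ngene output): within each alternative the single-level attribute never varies, so in the utility differences its column is an exact linear combination of the alternative-specific constants, making its coefficient inseparable from b1 and b7.

```python
# Utility columns per alternative: [ASC_BEV, ASC_HFCEV, emissions]
# Emissions never varies within an alternative: 0 (BEV), 0 (HFCEV), 100 (DSL).
bev, hfcev, dsl = [1, 0, 0], [0, 1, 0], [0, 0, 100]

# Only utility differences matter in a logit model, so difference against diesel:
rows = [[a - b for a, b in zip(bev, dsl)],
        [a - b for a, b in zip(hfcev, dsl)]]

# The emissions column equals -100 * (ASC_BEV column + ASC_HFCEV column),
# i.e. it is perfectly collinear with the constants:
for r in rows:
    assert r[2] == -100 * (r[0] + r[1])
print("emissions column is collinear with the ASCs")
```

If this reasoning is right, the single-level attribute carries no estimable information of its own, which would explain why the efficient design only works once it is dropped.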
My second question is about orthogonal designs. While I was able to generate efficient designs, Ngene could not generate an orthogonal design, returning the message "Warning: One or more attributes will not have level balance with the number of rows specified: bev.pcostz, bev.offsitez, hfcev.pcostz, hfcev.ocosth, hfcev.rangeh, hfcev.offsitez. [Orthog] No design found." Here is the code that I used.
Code:
Design
? orthogonal sequential
;alts = BEV, HFCEV, DSL
;rows = 40
;orth = seq2
;model:
U(BEV) = b1 +
b2*pcostz[105,110,115,125,150,175,200] +
b3*ocostb[50,70] +
b4*rangeb[100,200,300,500] +
b5*offsitez[10,20,60] +
b6*onsiteb[0,25,50,75,100] /
U(HFCEV)= b7 +
b2*pcostz +
b3*ocosth[90,115,130] +
b4*rangeh[300,500,700] +
b5*offsitez +
b6*onsiteh[0,25,50,75,100] /
U(DSL) = b2*pcostd[100] +
b3*ocostd[100] +
b4*ranged[700] +
b5*offsited[5]
$
Q2. I wonder whether the particular numbers of attribute levels in this experiment inherently cause level imbalance and make an orthogonal design impossible. Do you know the reason for the failure, and is there a way to correct this code?
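As a quick self-check (assuming level balance requires the number of rows to be divisible by each attribute's number of levels, which is my understanding rather than anything from the manual), the flagged attributes appear to be exactly those whose level counts do not divide 40:

```python
# Level counts taken from my design syntax above; 40 rows as specified.
rows = 40
levels = {
    "bev.pcostz": 7, "bev.ocostb": 2, "bev.rangeb": 4,
    "bev.offsitez": 3, "bev.onsiteb": 5,
    "hfcev.pcostz": 7, "hfcev.ocosth": 3, "hfcev.rangeh": 3,
    "hfcev.offsitez": 3, "hfcev.onsiteh": 5,
}

# An attribute can be level-balanced only if rows % levels == 0.
unbalanced = [name for name, k in levels.items() if rows % k != 0]
print(unbalanced)
# The 7- and 3-level attributes fail (40/7 and 40/3 are not integers),
# matching the attributes listed in Ngene's warning.
```

This makes me suspect the 7-level and 3-level attributes are the cause, since no row count can be divisible by 2, 3, 4, 5, and 7 simultaneously without becoming very large (420 rows).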
The next two questions are about efficient designs. I tried zero priors, fixed priors, and random priors. For the fixed and random priors, I set the values and distributions based on some assumptions, since there is very limited knowledge in the literature. Here is an example of the code that I used for an efficient design.
Code:
Design
? efficient design
;alts = BEV, HFCEV, DSL
;rows = 40
;block = 5
;eff = (mnl, d)
;rdraws = halton(200)
;alg = swap (stop = total(10100 iterations))
;model:
U(BEV) = b1[n, 0.4800, 0.1600] +
b2[n, -0.0300, 0.0100]*pcostz[105,110,115,125,150,175,200] +
b3[n, -0.0800, 0.0267]*ocostb[50,70] +
b4[n, 0.0120, 0.0040]*rangeb[100,200,300,500] +
b5[n, -0.0700, 0.0233]*offsitez[10,20,60] +
b6[n, -0.0700, 0.0233]*onsiteb[0,25,50,75,100] /
U(HFCEV)= b7[n, 1.4500, 0.5000] +
b2*pcostz +
b3*ocosth[90,115,130] +
b4*rangeh[300,500,700] +
b5*offsitez +
b6[n, -0.0700, 0.0233]*onsiteh[0,25,50,75,100] /
U(DSL) = b2*pcostd[100] +
b3*ocostd[100] +
b4*ranged[700] +
b5*offsited[5]
$
And here is a summary of the resulting statistical efficiencies across the different prior settings.
Type           | D-error  | A-error  | B-estimate | S-estimate
---------------+----------+----------+------------+-----------
Zero prior     | 3.20E-05 | 7.50E-05 | 100        | 0
Fixed prior    | 4.37E-04 | 1.24E-03 | 3.276      | 1.562
Random prior A | 4.36E-04 | 1.23E-03 | 2.955      | 1.552
Random prior B | 1.36E-04 | 9.15E-04 | 18.549     | 0.667
Q3. The D-errors and A-errors look quite small for all of these prior settings. Comparing random prior types A and B, how meaningful is an improvement in D-error from 0.000436 to 0.000136?
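For context, here is how I have been trying to interpret the difference (please correct me if this reasoning is wrong): since the D-error is det(AVC)^(1/K) and the AVC matrix shrinks proportionally to 1/N, I read the ratio of two D-errors as an approximate ratio of required sample sizes for equal overall precision.

```python
# D-error = det(AVC)^(1/K); AVC scales as 1/N, so det(AVC)^(1/K) also scales
# as 1/N. The D-error ratio then approximates a sample-size ratio.
d_a, d_b = 4.36e-4, 1.36e-4  # random prior A vs. random prior B (table above)
ratio = d_a / d_b
print(round(ratio, 1))  # 3.2
```

On this reading, design A would need roughly 3.2 times as many respondents as design B for comparable precision (under the assumed priors), but I am not sure whether that interpretation is sound in practice.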
Q4. In case the efficiencies are roughly comparable across prior settings: aside from the D-error, A-error, B-estimate, and S-estimate, is there anything else to consider when selecting a final design?
Meanwhile, I feel unsure about these prior assumptions, even though I discussed them with other researchers; the literature in this area is very limited. I may therefore need to run a pilot survey to obtain more reliable priors. So, the next two questions are about a pilot survey.
Q5. I expect around 50 to 100 responses for the full survey at best, given the low expected response rate in this study area. How many responses should I target for a pilot survey?
Q6. For a pilot survey, which design would be better to use: zero prior, fixed prior, random prior, or orthogonal?
Now, I have a few more questions regarding other settings in the design.
Q7. I used 40 rows and 5 blocks in this experiment (i.e., 8 choice tasks per respondent). When I tried fewer rows, the D-error tended to increase and the B-estimate tended to decrease (i.e., the status quo alternative became more dominant). Here is a summary of the results. Could you give me any suggestions on the number of rows and blocks?
Setting           | D-error  | A-error  | B-estimate | S-estimate
------------------+----------+----------+------------+-----------
5 blocks, 40 rows | 4.36E-04 | 1.23E-03 | 2.955      | 1.552
4 blocks, 32 rows | 5.21E-04 | 1.46E-03 | 1.338      | 2.008
3 blocks, 24 rows | 6.99E-04 | 1.81E-03 | 1.090      | 2.654
2 blocks, 16 rows | 9.64E-04 | 2.79E-03 | 0.764      | 3.997
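For reference, my intention with these settings was to hold the per-respondent burden constant while varying only the total design size, which a quick check confirms:

```python
# (blocks, rows) for each setting tried; tasks per respondent = rows / blocks.
settings = [(5, 40), (4, 32), (3, 24), (2, 16)]
print([rows // blocks for blocks, rows in settings])  # [8, 8, 8, 8]
```

So the comparison is between designs with the same number of tasks per respondent but different numbers of distinct choice situations.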
Q8. For all attributes (b2 to b6) except the alternative-specific constants (b1 and b7), I used generic parameters to keep the model simple (i.e., fewer parameters), given the relatively small expected sample size. Do you have any suggestions regarding the number of parameters for a sample of this size (e.g., 50-100)?
Q9. I also wonder whether it would be okay to use alternative-specific parameters for some of the attributes when estimating the model, in case I am able to obtain a larger number of responses (e.g., 300+).
These are all the questions I currently have; sorry if there are too many. I would greatly appreciate it if you could share any of your knowledge, experience, and thoughts regarding them!
Thank you again,
Youngeun