How to use unexpected pilot results? (EDITED)


Postby BaibaP » Tue Jul 21, 2015 7:30 am

Dear moderators and users,

I am turning to you with a problem of counterintuitive pilot results, namely unexpected signs of parameters.

The following points briefly describe the project.
1. The survey aims to find valuations of travel time reliability.
2. A D-efficient design for MNL was created for 2 utility functions using the model averaging property.
3. Both functions have 4 common attributes (mean waiting time, in-vehicle time, travel cost, maximum possible waiting time) and 2 different attributes describing reliability.
4. The attributes are not independent. Some of them are functions of other attributes that are not part of the utility functions. This was implemented by supplying Ngene with a list of all allowed choice alternatives.
5. Three designs were created, to be given to short/medium/long travel time respondents. The medium and long designs yielded small S-errors (<20 respondents), but the best design for the short travel time segment gave a high S-error (>1000 respondents). Data from all segments are pooled for the estimation.
6. A pilot of 40 respondents was carried out. No counterintuitive behaviour was observed: the respondents either traded off the attributes or used different lexicographic rules.

The problem is that the signs of some parameters are positive, which is unexpected (for mean waiting time or travel cost, or for several parameters, depending on the utility function used).
All-negative estimates can be obtained only if mean waiting time is excluded.
But this is not acceptable for the intended use of the results. Moreover, when conducting the survey it seemed that respondents evaluated mean waiting time negatively.

My question is: do you suspect there is a fundamental problem with my design?
Or could it be a matter of sample size, in which case I should keep interviewing additional pilot respondents until all (or at least the most important) parameters have the expected signs?

If necessary, I will be happy to provide further information.

Many thanks in advance and best regards,
Baiba

Re: How to use unexpected pilot results? (EDITED)

Postby BaibaP » Wed Aug 05, 2015 2:08 pm

Dear all,

A gentle reminder on this.

I would greatly appreciate any help or suggestions for this case.

Baiba

Re: How to use unexpected pilot results? (EDITED)

Postby Michiel Bliemer » Mon Aug 10, 2015 3:45 pm

It could be that mean travel time is correlated with maximum travel time. We often see that it is difficult to estimate both a mean travel time and a travel time unreliability coefficient because of their correlation (maximum travel time usually goes up with mean travel time). I would suggest keeping mean travel time in and including an attribute called something like buffer time, where buffer time = maximum travel time - mean travel time. This buffer time should be less correlated with mean travel time, and may allow you to estimate the coefficients with the expected signs.
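
As a quick check, you could compute the correlation of the proposed buffer time with mean time in your design. A minimal Python sketch (the data and column names below are made up for illustration):

import pandas as pd

# Hypothetical design-level data: each row is one alternative shown
# in the pilot (values and column names are illustrative only).
df = pd.DataFrame({
    "mean_time": [10, 15, 20, 10, 25, 15],
    "max_time":  [15, 25, 30, 20, 35, 20],
})

# Buffer time as suggested: maximum minus mean travel time.
df["buffer_time"] = df["max_time"] - df["mean_time"]

# Compare correlations: buffer time should be noticeably less
# correlated with mean time than the raw maximum is.
print(df[["mean_time", "max_time", "buffer_time"]].corr())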

Re: How to use unexpected pilot results? (EDITED)

Postby johnr » Mon Aug 10, 2015 5:39 pm

Hi BaibaP

The approach you describe has been used previously. One potential issue is with scale and possibly endogeneity: in effect, you have omitted different variables in the two data sets, if I am understanding you correctly.

Endogeneity

For each data set, different omitted variables sit in the error term, E. If there are interactions between the four common attributes and these omitted variables, then those interactions are potentially in V as well as in E. This may be problematic because you have omitted different variables from the different data sets, which means that a linear additive utility function U = V + E might not be appropriate. However, unless you can find some proxies for the omitted variables, you cannot test for this.
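
To spell this out in simplified notation (the notation is mine, for illustration): suppose the true utility in one data set contains an interaction between an included attribute x_1 and an omitted attribute x_2. Then

U = \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon, \qquad V = \beta_1 x_1, \qquad E = \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon .

Since x_1 enters E through the omitted interaction, \mathrm{Cov}(x_1, E) \neq 0 in general, and the split U = V + E with an error independent of V no longer holds.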

Scale

Given that you have different omitted variables, it is possible that you have different error variances - and covariances - present in the models. This may also be causing some issues.

john

Re: How to use unexpected pilot results? (EDITED)

Postby BaibaP » Tue Aug 11, 2015 10:15 pm

Many thanks to the authors of both responses!

The suggestion of Prof. Bliemer improved the situation greatly.
Now all the parameters are negative.

Regarding the points raised by johnr: yes, I have different omitted variables for each function (but both functions are estimated from the same responses).
I am not sure I understood correctly. The utility functions do not contain any interaction terms between omitted variables and common attributes. However, there might be interactions in the actual responses which are not modelled, in which case both the V of the common attributes and E are probably influenced by the exclusion of attributes.

Evidence that this is a problem may be that the estimates of the common attributes are very different in the two models. For example, beta_travel_cost/beta_in-vehicle_time = 0.56 for one function and 0.08 for the other. If I want to create a design with 2 utility functions and model averaging (the same as for the pilot) for the main survey, how would you suggest treating the common attributes? Should I take the pilot results directly as priors?

Many thanks in advance,
Baiba

Re: How to use unexpected pilot results? (EDITED)

Postby johnr » Wed Aug 19, 2015 9:43 am

Hi Baiba

Your problem is not an uncommon issue when pooling data, such as SP-RP data, where you may have different subsets of attributes. Firstly, you need to treat this as a traditional data pooling exercise where you account for both scale and preference differences in the choices. You can do this using an NL model approach, or you can do it using more advanced models; however, you need to take care in doing this, as there is quite a lot of misinformation in the literature regarding how to handle scale in these more advanced models. Handling scale is vital, particularly given that you have different subsets of attributes appearing in the error terms of your two data sets, which may be impacting on scale differently. In principle this shouldn't affect the mWTP estimates, which is what you mention below; however, there are a number of reasons why it might.

Firstly, your title says this is a pilot, so perhaps the sample size is insufficient to retrieve the population parameter estimates. Hence, it is possible that after adding a few more respondents you will see that the estimates are still bouncing around. Further, you can compute WTP confidence intervals, and whilst the values you quote seem very different, are they statistically different?
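
For instance, a delta-method confidence interval for a WTP ratio can be computed from the estimates and their covariance matrix. A minimal Python sketch (all numbers below are placeholders, not your estimates):

import numpy as np
from scipy import stats

# Placeholder estimates: beta_time, beta_cost and their 2x2 covariance
# block from the model output (values are illustrative only).
b_time, b_cost = -0.040, -0.010
cov = np.array([[4.0e-5, 1.0e-6],
                [1.0e-6, 9.0e-6]])

wtp = b_time / b_cost  # value of time in cost units

# Delta method: gradient of the ratio w.r.t. (b_time, b_cost).
grad = np.array([1.0 / b_cost, -b_time / b_cost**2])
se = np.sqrt(grad @ cov @ grad)

z = stats.norm.ppf(0.975)
print(f"WTP = {wtp:.2f}, 95% CI = [{wtp - z*se:.2f}, {wtp + z*se:.2f}]")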

Secondly, your comment is precisely what I meant. If there really are interactions between the existing attributes and the omitted variables, then these will enter the systematic and unobserved components of utility differently. For example, if you have price and quality in one data set, but only price in the second, and there truly are interaction effects between price and quality, then you can tease this interaction effect out for the first data set, but not for the second. For the second, the price*quality interaction exists only in the error term, while price is in both the V and E terms (the latter via the omitted interaction). Hence V and E are no longer orthogonal and you should not assume U = V + E. The problem is that this is an empirical issue and may or may not be what is going on. As you have omitted different variables from the different data sets, the only thing you can do is find some sort of instrument that will proxy for the omitted variables and use it either via some form of control function, or via a hybrid SEM approach (other approaches are possible too; however, these are the two simplest in my mind).

In terms of your question of whether you should use the priors directly, there is no single right answer. If you want to proceed with the same approach, then ask: how stable are the priors? That said, this might be the best information you have available. Further, as long as the priors are mostly in the right direction and proportional, then you should be okay.
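
One rough way to check the "proportional" condition is to normalise the common attributes in each model by a reference coefficient and compare the ratios. A small Python sketch (all coefficient values are hypothetical):

# Hypothetical pilot estimates for the common attributes in the two
# utility functions (placeholders, not BaibaP's actual results).
model_1 = {"mean_wait": -0.05, "invehicle": -0.03, "cost": -0.017, "max_wait": -0.02}
model_2 = {"mean_wait": -0.06, "invehicle": -0.04, "cost": -0.003, "max_wait": -0.03}

# Normalise each model by its in-vehicle time coefficient and compare:
# if the priors are roughly proportional, these ratios should be close.
for attr in model_1:
    r1 = model_1[attr] / model_1["invehicle"]
    r2 = model_2[attr] / model_2["invehicle"]
    print(f"{attr:10s}  model1: {r1:5.2f}  model2: {r2:5.2f}")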

John

Re: How to use unexpected pilot results? (EDITED)

Postby BaibaP » Tue Aug 25, 2015 5:51 pm

Hi John,

Thank you for the valuable suggestions.
With some modifications I was finally able to get approximately proportional estimates for the common attributes in both models.
Therefore I plan to proceed with using the pilot estimates directly as priors for the final study.
But I take note that the NL approach should be considered for the estimation of the parameters from the final survey data, to come up with a unified set of parameters from both functions (if I understand correctly).

However, there is an additional issue. The estimated cost parameter from the pilot is small: its maximum possible contribution to utility is about 10 times smaller than that of the other attributes.
Additionally, another (RP) study from the same area indicates that the ratio beta_cost/beta_travel_time is 5 times larger than the one obtained from this pilot.
I see 3 possibilities to deal with this:
1) increase the range of the cost attribute,
2) decrease the ranges of the other attributes,
3) assume a larger prior for the cost.

Option 1) is not good, because it would make the choice situations unrealistic.
Therefore the choice for me is between 2) and 3).
For the pilot, the cost prior was taken as suggested by that other study (large), and it felt reasonable, because approximately equal numbers of respondents regarded the cost difference as negligible/considerable/decisive.
Therefore I am inclined towards 3). But do you think it is acceptable to inflate the prior for a single parameter to ensure a balance of utility contributions (given that another study supports that prior change)?
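
For illustration, here is the kind of rough check of the contribution balance that I have in mind, in Python (all numbers are placeholders, not my actual estimates and ranges):

# Candidate priors and attribute ranges (all values hypothetical).
priors = {"mean_wait": -0.05, "invehicle": -0.03, "cost": -0.002}
ranges = {"mean_wait": 20.0, "invehicle": 30.0, "cost": 2.0}  # minutes, minutes, EUR

# Maximum possible contribution of each attribute to the utility
# difference; a heavily unbalanced column suggests rescaling the
# range (option 2) or the prior (option 3).
for attr, beta in priors.items():
    print(f"{attr:10s} max contribution = {abs(beta) * ranges[attr]:.2f}")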

Your suggestion for this question would be greatly appreciated!

Thank you and best regards,
Baiba

Re: How to use unexpected pilot results? (EDITED)

Postby johnr » Mon Sep 07, 2015 9:49 am

Hi Baiba

There are, as you suggest, a few possible reasons for the cost parameter being as it is. It may be sample size, or it may be hypothetical bias, or it may be a range effect. If you look at the data, you may find that many respondents always choose the alternative with the highest cost, meaning they didn't trade on it in the SP experiment (perhaps a form of lexicographic rule). If this is the case, increasing the range of the attribute might help. You may also wish to make use of some of the methods that have been used in the literature to mitigate hypothetical bias, such as certainty scales or cheap talk (amongst other methods). If it is a sample size issue, you may have to take your prior from the other source (you don't have to use the priors from your pilot - you can mix them from multiple sources - or even use a model averaging approach where you use priors from different sources and weight the AVC matrix over the different priors).
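
A simple way to screen for such non-trading on cost is to count, per respondent, how often the cheapest alternative was chosen - shares near 0 or 1 both indicate a respondent who is not trading on cost. A minimal Python sketch with made-up data:

import pandas as pd

# Long-format pilot responses: one row per respondent x task x alternative.
# Column names and values are hypothetical, for illustration only.
df = pd.DataFrame({
    "resp_id": [1, 1, 1, 1, 2, 2, 2, 2],
    "task":    [1, 1, 2, 2, 1, 1, 2, 2],
    "cost":    [2.0, 3.5, 1.5, 3.0, 2.0, 3.5, 1.5, 3.0],
    "chosen":  [1, 0, 1, 0, 0, 1, 0, 1],
})

# For each respondent and task, check whether the cheapest alternative
# was the one chosen.
cheapest = df.loc[df.groupby(["resp_id", "task"])["cost"].idxmin()]
share = cheapest.groupby("resp_id")["chosen"].mean()

# Shares near 0 or 1 suggest the respondent never traded on cost
# (a lexicographic rule), which can depress the cost estimate.
print(share)
print("possible non-traders:", ((share < 0.1) | (share > 0.9)).sum())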

John