
On pivot designs and conditions

Posted: Thu Nov 05, 2020 11:53 pm
by jbas
Hi all, this is my first post in this forum. Thanks for having me in this community, and congrats on this wonderful software.

I have a couple of questions about pivot designs and conditions that I tried (unsuccessfully) to solve by myself.
My design is actually pretty simple: it contains three alternatives (car, bike, walk) that have one attribute in common, travel time. I use the travel time by car as the reference for the bike and walk travel times, which seems to work. In addition, I add some conditions on another attribute that bike and walk have in common, which also seems to work. The code is as follows:

Code:
Design

;alts = Car, Bike, Walk, None
;alg = swap(stop=total(3500 iterations))
;rows = 24 
;block = 4
;eff=(mnl, d, mean)

;cond:
if(Bike.lts_bike = 1, Walk.lts_walk = [1,2]) ,
if(Bike.lts_bike = 2, Walk.lts_walk = [1,2,3]) ,
if(Bike.lts_bike = 3, Walk.lts_walk = [2,3,4]) ,
if(Bike.lts_bike = 4, Walk.lts_walk = [3,4])

;model:
U(Car)  = b1[(u,-0.025, -0.0080)] * tt_car.ref[4,6,8] +
              b2                                * tc_car[0.3] +
              b3[(u,-0.0088, -0.0036)]* pc_car[0,1,3] /

U(Bike)  = b1                              * tt_car.piv[0%,25%,50%] +
           b5                                  * lts_bike[1,2,3,4] /

U(Walk)  = b1                            * tt_car.piv[100%,125%,150%] +
           b7                                 * lts_walk[1,2,3,4]

$


However, I finally opted for a heterogeneous pivot design, i.e. three different designs for three different respondent segments (in this case, people who make short, medium, and long trips). When I run it, I get the message: Error: An attribute, 'walk.lts_walk', specified in the ';cond' property could not be found. The code is as follows:

Code:
Design

;alts(short) = Car, Bike, Walk, None
;alts(medium) = Car, Bike, Walk, None
;alts(long) = Car, Bike, Walk, None

;alg = swap(stop=total(3500 iterations))

;rows = 24 
;block = 4

;eff= fish(mnl, d,mean)
;fisher(fish)= design1(short[0.33], medium[0.43],long[0.24])


;cond:
if(Bike.lts_bike = 1, Walk.lts_walk = [1,2]) ,
if(Bike.lts_bike = 2, Walk.lts_walk = [1,2,3]) ,
if(Bike.lts_bike = 3, Walk.lts_walk = [2,3,4]) ,
if(Bike.lts_bike = 4, Walk.lts_walk = [3,4])

;model(short):

U(Car)  = b1[(u,-0.025, -0.0080)] * tt_car.ref[6] +
              b2                                * tc_car[0.3] +
              b3[(u,-0.0088, -0.0036)]* pc_car[0,1,3] /

U(Bike)  = b1                     * tt_car.piv[0%,25%,50%] +
               b5                     * lts_bike[1,2,3,4] /

U(Walk)  = b1                     * tt_car.piv[100%,125%,150%] +
                b7                     * lts_walk[1,2,3,4]

;model(medium):

U(Car)  = b1[(u,-0.025, -0.0080)] * tt_car.ref[14] +
              b2                                * tc_car[1] +
              b3[(u,-0.0088, -0.0036)]* pc_car[0,1,3] /

U(Bike)  = b1                     * tt_car.piv[25%,50%,75%] +
               b5                     * lts_bike[1,2,3,4] /

U(Walk)  = b1                     * tt_car.piv[200%,250%,300%] +
                b7                     * lts_walk[1,2,3,4]

;model(long):

U(Car)  = b1[(u,-0.025, -0.0080)] * tt_car.ref[17] +
              b2                                * tc_car[1.5] +
              b3[(u,-0.0088, -0.0036)]* pc_car[0,1,3] /

U(Bike)  = b1                     * tt_car.piv[50%,65%,75%] +
               b5                     * lts_bike[1,2,3,4] /

U(Walk)  = b1                     * tt_car.piv[300%,400%,500%] +
                b7                     * lts_walk[1,2,3,4]

$


At this point, I have three questions about this design:
1. Why are the conditional clauses not working in the second case?
2. I’ve noticed that, although I define several levels for the car travel time in the first case, the design always assigns the lowest value (this also happens in the heterogeneous case, so I fixed them there). Is it not possible to define levels for the reference?
3. This is a more general and conceptual question. I wonder about possible problems of data correlation in this type of design. Obviously, all travel times will be almost perfectly correlated, since two of them are calculated from the third. Although the coefficients estimated by the model will supposedly have the lowest possible standard errors, couldn’t highly correlated data be an impediment to the significance of the parameters?


Thanks for your help.

Re: On pivot designs and conditions

Posted: Fri Nov 06, 2020 11:09 am
by Michiel Bliemer
1. Conditional constraints do not work in conjunction with multiple designs using the fisher command, see page 214 of the manual. Algorithms that can handle such complexity currently do not exist. I suggest that you create a design for each of the model categories separately using separate syntax.

2. I am not sure I understand your question. A reference level is defined as a fixed level. If you require more reference levels, then you simply need to create multiple designs. I generally recommend creating a library of designs, using different reference levels, and simply picking the appropriate design for each respondent based on their reference levels. Note that in this case, you do not need to use .ref and .piv, but you can use the actual attribute levels that you show to the respondents. This makes creating designs much easier.

3. No, in many cases correlations help in reducing standard errors and therefore in obtaining more reliable parameter estimates. Only when correlations are very high (e.g. 0.95 or 0.99) may identifiability issues arise, and it may no longer be possible to estimate the parameters. But as long as there is no perfect correlation, correlations are fine and will always occur (especially in revealed preference data).
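
The library idea in point 2 could be handled in a survey instrument roughly as follows. This is only a minimal Python sketch, not Ngene functionality: the reference values 6, 14 and 17 minutes are the ones used for the short, medium and long segments in the question, while the design names and the nearest-reference rule are hypothetical.

```python
# Hypothetical library: one separately generated design per reference travel time.
# The keys are reference car travel times in minutes (taken from the thread);
# the design names are made up for illustration.
DESIGN_LIBRARY = {
    6: "design_short",
    14: "design_medium",
    17: "design_long",
}


def pick_design(reported_car_minutes):
    """Return the design whose reference level is closest to the reported value."""
    nearest_ref = min(DESIGN_LIBRARY, key=lambda ref: abs(ref - reported_car_minutes))
    return DESIGN_LIBRARY[nearest_ref]


print(pick_design(5))    # a respondent reporting a 5-minute car trip
print(pick_design(15))   # a respondent reporting a 15-minute car trip
```

With a finer grid of reference levels in the library, each respondent would see levels closer to their own trip.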

Michiel

Re: On pivot designs and conditions

Posted: Fri Nov 06, 2020 10:52 pm
by jbas
Thanks for your response, Michiel, it’s been of great help. Nevertheless, I still have a couple of further considerations:

1. I understand your suggestion, but I wonder whether three separate designs (one per category) are ‘equivalent’ to a single design that contains all three categories. I mean, when the three submodels are computed in one design, all attribute levels are considered jointly in order to maintain its properties. But if the three are generated separately, the result of one is not taken into account in the others. I hope I’m explaining myself.
2. I get your point. My question was: even though it can be done with different designs as you suggest, can the reference be one of several values instead of a fixed one, with the pivoting attribute simply pivoting on it? For instance, instead of tt_car.ref[6], could it be tt_car.ref[4,6,8], still with tt_car.piv[0%,25%,50%]? I would find this very convenient for presenting choice tasks in which I don’t want to always show the same travel time by car, but still want the bike and walk travel times linked to it.
3. This is an interesting discussion. In my opinion, in a pivot design the pivots and the reference (walk, bike, and car travel times in this case) will be almost perfectly correlated, since the former are calculated exactly from the latter. How can the correlation not exceed 0.95? Actually, if I check the design output this is precisely the case, as shown below:

Code:

Attribute      car.tt_car  car.tc_car  car.pc_car  bike.tt_car  bike.lts_bike  walk.tt_car  walk.lts_walk  Block
car.tt_car     1           1           0.730297    0.986928     0.912871       0.99591      0.923186       0.912871
car.tc_car     1           1           0.730297    0.986928     0.912871       0.99591      0.923186       0.912871
car.pc_car     0.730297    0.730297    1           0.702731     0.675          0.714683     0.716337       0.666667
bike.tt_car    0.986928    0.986928    0.702731    1            0.882919       0.984711     0.917192       0.900937
bike.lts_bike  0.912871    0.912871    0.675       0.882919     1              0.897352     0.960735       0.833333
walk.tt_car    0.99591     0.99591     0.714683    0.984711     0.897352       1            0.904087       0.909137
walk.lts_walk  0.923186    0.923186    0.716337    0.917192     0.960735       0.904087     1              0.84275
Block          0.912871    0.912871    0.666667    0.900937     0.833333       0.909137     0.84275        1



Thanks a lot for your time answering my questions and congratulations once again for this great software.

Re: On pivot designs and conditions

Posted: Sat Nov 07, 2020 11:51 am
by Michiel Bliemer
1. Yes you will lose some efficiency but this is likely not much. Either way, it is not possible to add conditional constraints when creating a heterogeneous design this way, so I am not sure what other option you have if you need constraints.

2. You cannot specify tt_car.ref[4,6,8] but you can do the following:

* specify tt_car_ref[4,6,8] (i.e., specifying it as a regular attribute)
* specify tt_car_piv[4,5,6,7.5,8,9,10,12] (i.e., as a regular attribute with pivots around the reference)
* add conditional constraints, e.g. if(tt_car_ref = 4,tt_car_piv=[4,5,6])

3. The whole point of experimental design is to make sure that attributes are NOT perfectly correlated. So with pivots, even though they are based on a reference alternative, there is still variation of 0-50% in the pivots around the reference value, which is more than enough to estimate all coefficients. Note that in revealed preference data correlations are often about 80-90%, e.g. travel time and travel cost are generally very highly correlated, but this still allows one to estimate coefficients for travel time and travel cost because there is sufficient variation.
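
For point 2, the level list and the conditional constraints follow mechanically from the reference levels and the pivot percentages. Below is a minimal Python sketch that derives them for the bike pivots [0%,25%,50%]; the helper names are illustrative, and the printed lines only mimic the syntax used above (a full script may also need alternative prefixes on the attribute names).

```python
# Reference levels and percentage pivots taken from the thread (bike travel time).
ref_levels = [4, 6, 8]
pivots = [0.00, 0.25, 0.50]


def fmt(x):
    """Format 5.0 as '5' and 7.5 as '7.5'."""
    return f"{x:g}"


# Allowed pivoted levels for each reference level.
allowed = {ref: [ref * (1 + p) for p in pivots] for ref in ref_levels}

# Union of all pivoted levels, used as the levels of the regular attribute.
all_levels = sorted({level for levels in allowed.values() for level in levels})
level_list = ",".join(fmt(v) for v in all_levels)
print(f"tt_car_piv[{level_list}]")

# One conditional constraint per reference level.
conditions = [
    f"if(tt_car_ref = {fmt(ref)}, tt_car_piv = [{','.join(fmt(v) for v in levels)}])"
    for ref, levels in allowed.items()
]
print(" ,\n".join(conditions))
```

The derived level list is the same set of values as in the second bullet above.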

Michiel

Re: On pivot designs and conditions

Posted: Mon Nov 09, 2020 9:18 pm
by jbas
Thanks for your advice on points 1 and 2. Regarding 3, not that I want to insist, but the following design, with three values for the reference and large variations for the pivots, gives correlations (H index) among them of 0.98 and 0.99. Couldn’t this really be a problem?

Code:
Design
;alts = Car, Bike, Walk, None
;alg = swap(stop=total(3500 iterations))

;rows = 24
;block = 4

;eff=(mnl, d,mean)

;cond:
if(Bike.lts_bike = 1, Walk.lts_walk = [1,2]) ,
if(Bike.lts_bike = 2, Walk.lts_walk = [1,2,3]) ,
if(Bike.lts_bike = 3, Walk.lts_walk = [2,3,4]) ,
if(Bike.lts_bike = 4, Walk.lts_walk = [3,4])

;model:

U(Car)  = b1[(u,-0.025, -0.0080)]    * tt_car.ref[4,6,8] +
              b2                                   * tc_car[0.3] +
              b3[(u,-0.0088, -0.0036)]   * pc_car[0,1,3] /

U(Bike)  = b1                   * tt_car.piv[0%,25%,50%] +
               b5                    * lts_bike[1,2,3,4] /

U(Walk)  = b1                   * tt_car.piv[100%,125%,150%] +
               b7                    * lts_walk[1,2,3,4]

$


Code:

Correlations (H Index)
Attribute      car.tt_car  car.tc_car  car.pc_car  bike.tt_car  bike.lts_bike  walk.tt_car  walk.lts_walk  Block
car.tt_car     1           1           0.730297    0.986928     0.912871       0.99591      0.923186       0.912871
car.tc_car     1           1           0.730297    0.986928     0.912871       0.99591      0.923186       0.912871
car.pc_car     0.730297    0.730297    1           0.716245     0.666667       0.72731      0.6742         0.666667
bike.tt_car    0.986928    0.986928    0.716245    1            0.8649         0.978341     0.917192       0.900937
bike.lts_bike  0.912871    0.912871    0.666667    0.8649       1              0.890618     0.960735       0.833333
walk.tt_car    0.99591     0.99591     0.72731     0.978341     0.890618       1            0.888763       0.909137
walk.lts_walk  0.923186    0.923186    0.6742      0.917192     0.960735       0.888763     1              0.84275
Block          0.912871    0.912871    0.666667    0.900937     0.833333       0.909137     0.84275        1



I’ll take advantage of this reply to also comment on the efficiency measures below. The S estimates (both Sp and Sb) look particularly high. Any insight on this?

Code:

MNL efficiency measures

                          Bayesian
            Fixed         Mean          Std dev.      Median        Minimum       Maximum
D error     0.090726      0.090742      0.000211      0.090733      0.090393      0.091144
A error     1.50337       1.503625      0.000571      1.50344       1.502847      1.505118
B estimate  99.429913     99.376303     0.341497      99.425986     98.70528      99.858712
S estimate  14284.367738  17182.704106  9088.735989   14178.886251  7143.189584   41285.791294

Prior              b1          b2         b3            b5         b7
Fixed prior value  -0.0165     0          -0.0062       0          0
Sp estimates       176.267155  Undefined  14284.367738  Undefined  Undefined
Sp t-ratios        0.147629    0          0.016399      0          0
Sb mean estimates  236.638434  Undefined  17182.704106  Undefined  Undefined
Sb mean t-ratios   0.147892    0          0.016441      0          0



Thanks again for your time.

Re: On pivot designs and conditions

Posted: Tue Nov 10, 2020 1:56 pm
by Michiel Bliemer
You should not compute correlations for attributes that have a fixed level; that does not make sense. You should only compare correlations between two non-fixed attributes. The highest Pearson product-moment correlation between two varying attributes is 0.85, which is fine. When correlations are too high it becomes impossible to estimate the parameters and the D-error will become very large, so you would immediately pick up on this.
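
To illustrate the first point: a Pearson correlation involving a fixed-level column is not defined, because that column has zero variance. Below is a minimal Python sketch with made-up columns (these are not the rows of the design above, and the function is a plain textbook implementation, not Ngene's).

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation; None when either column is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # zero variance: the correlation is not defined
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Made-up columns for illustration only.
tt_car = [4, 4, 4, 4, 4, 4]      # fixed reference level
tt_bike = [4, 5, 6, 4, 5, 6]     # pivots of 0%, 25%, 50% around 4
tt_walk = [8, 9, 10, 10, 8, 9]   # pivots of 100%, 125%, 150% around 4

print(pearson(tt_car, tt_bike))   # fixed column: undefined
print(pearson(tt_bike, tt_walk))  # two varying columns: well defined
```

Even though both pivot columns are computed from the same reference, their correlation depends on how the pivot levels are combined across rows, which is what the design controls.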

Note that .ref[4,6,8] does not work, as I indicated in my previous post: .ref needs to have a single value, so it is now fixed to the first value (4).

S-estimates only make sense if you use reliable priors. Your priors are very small and essentially indicate that none of your attributes actually have a large impact on choice. Make sure that your pilot study data coding (from where I assume you have obtained priors) is consistent with the coding you use in Ngene. If priors do not come from a sufficiently large pilot study then S-estimates should be ignored as they become unreliable.
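
The Sp estimates in the output above are consistent with the usual relation between a t-ratio and sample size: if a parameter has t-ratio t for a single replication of the design, roughly (1.96 / t)^2 replications are needed for it to be significant at the 95% level. Below is a small Python check against the posted numbers; the 1.96 threshold is an assumption, and the function name is illustrative.

```python
# Fixed-prior t-ratios and Sp estimates copied from the posted output.
t_ratios = {"b1": 0.147629, "b3": 0.016399}
reported_sp = {"b1": 176.267155, "b3": 14284.367738}

T_CRITICAL = 1.96  # two-sided 95% threshold (assumed)


def replications_needed(t_ratio, t_critical=T_CRITICAL):
    """Number of replications N such that t_ratio * sqrt(N) reaches t_critical."""
    return (t_critical / t_ratio) ** 2


for name, t in t_ratios.items():
    print(name, round(replications_needed(t), 1), reported_sp[name])
```

The computed values agree with the posted ones up to rounding of the t-ratios. A prior of 0 gives a t-ratio of 0, for which the expression is undefined, and priors close to zero give very large values, which is why the S-estimates look so high here.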

Michiel

Re: On pivot designs and conditions

Posted: Tue Nov 10, 2020 8:22 pm
by jbas
Great, thanks for your insights. This conversation helped me to understand much better the issue of correlation.

I’d like to ask one last thing. Following the rationale that one can build a design on a coding scheme instead of on the actual levels (where [-1,0,1] may mean $1, $2, $3), I coded the following design as a very initial approach to my project:

Code:
Design

;alts = Car, Bike, Walk, None
;alg = swap(stop=total(3500 iterations))

;rows = 24 
;block = 4

;eff=(mnl, d,mean)

;cond:
if(Bike.lts_bike = 1, Walk.lts_walk = [1,2]) ,
if(Bike.lts_bike = 2, Walk.lts_walk = [1,2,3]) ,
if(Bike.lts_bike = 3, Walk.lts_walk = [2,3,4]) ,
if(Bike.lts_bike = 4, Walk.lts_walk = [3,4])

;model:

U(Car)  = b1[(u,-0.025, -0.0080)]    * tt_car[-1,0,1] +
              b2                                   * tc_car[-1,0,1] /

U(Bike)  = b3                   * tt_bike[-1,0,1] +
                b5                        * lts_bike[1,2,3,4] /

U(Walk)  = b6                   * tt_walk[-1,0,1] +
                b8                   * lts_walk[1,2,3,4]

$


It was my intention to substitute, once the output was generated, tt_car[-1,0,1] with 4,6,8, as well as tt_bike[-1,0,1] with 0%, 25%, 50%, and so on. My question is (leaving aside that .ref needs a single value, which I didn’t know at that moment): would that be equivalent to a pivot design in which tt_car.ref was set to 4, with tt_car.piv[0%,25%,50%] (for tt_bike)? To what extent would the efficiency and correlations be affected by the fact that we are working with a coding scheme that ‘hides’ a pivot design?

Thanks.

Re: On pivot designs and conditions

Posted: Tue Nov 17, 2020 8:40 pm
by Michiel Bliemer
Your priors must make sense on the coding scheme you use in estimation. So if your priors were estimated using $1, $2, and $3 then you need to use levels 1, 2, and 3 in your design as well.
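
A small worked example of why the coding matters, as a Python sketch. It assumes the coded levels [-1,0,1] stand for 4, 6, 8 minutes and uses -0.0165 (the midpoint of the uniform prior on b1) as a prior per minute; the mapping and the variable names are illustrative.

```python
actual_levels = [4, 6, 8]    # minutes, as they would be used in estimation
coded_levels = [-1, 0, 1]    # design coding from the question
prior_actual = -0.0165       # assumed prior per minute (midpoint of the b1 range)

centre = actual_levels[1]                      # 6 minutes corresponds to code 0
step = actual_levels[1] - actual_levels[0]     # one coded unit equals 2 minutes
prior_coded = prior_actual * step              # slope per coded unit

for coded, actual in zip(coded_levels, actual_levels):
    u_actual = prior_actual * actual
    u_coded = prior_actual * centre + prior_coded * coded
    print(actual, round(u_actual, 4), round(u_coded, 4))
```

The slope per coded unit is twice the slope per minute in this case, so a prior estimated on minutes cannot be reused unchanged with [-1,0,1]. The constant term (prior times the centre level) would also have to be absorbed elsewhere, e.g. by an alternative-specific constant.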

So the best process would be:

1. Use the actual levels that you would use in your data set for estimation, i.e. 4,6,8 minutes and 1,2,3 dollars.
2. Generate a design using these actual levels.
3. Convert these levels to pivots for your survey instrument, i.e. assuming a reference level of 6 minutes and 2 dollars, travel time becomes [-33%, 0%, 33%] and travel cost becomes [-50%, 0%, 50%]. Or use absolute pivots.
4. Capture the reference levels for the respondent in your survey instrument, e.g. 8 minutes and 1 dollar.
5. Compute the pivot levels to show to this respondent, i.e. [5.36, 8, 10.64] minutes and [0.50, 1, 1.50] dollars.
6. Round the pivot values to reduce cognitive burden on respondent, so for travel time you may want to use [5, 8, 11] minutes.
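
Steps 3 to 6 can be sketched in a few lines of Python. The function names are hypothetical and the rounding rule (nearest whole unit) is just one possible choice.

```python
def pivot_levels(reference, pivots):
    """Apply relative pivots (e.g. -0.33 for -33%) to a respondent's reference level."""
    return [reference * (1 + p) for p in pivots]


def round_for_display(levels, step=1.0):
    """Round each level to the nearest multiple of `step`."""
    return [round(level / step) * step for level in levels]


# Step 3: pivots derived from design levels 4,6,8 minutes and 1,2,3 dollars.
time_pivots = [-0.33, 0.0, 0.33]
cost_pivots = [-0.50, 0.0, 0.50]

# Steps 4 and 5: respondent reports 8 minutes and 1 dollar.
time_levels = pivot_levels(8, time_pivots)
cost_levels = pivot_levels(1, cost_pivots)
print([round(v, 2) for v in time_levels])
print([round(v, 2) for v in cost_levels])

# Step 6: round travel time to whole minutes before showing it.
print(round_for_display(time_levels))
```

A different `step` (for example 0.5) could be used where whole units are too coarse, such as for small costs.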

Michiel

Re: On pivot designs and conditions

Posted: Thu Nov 19, 2020 7:50 pm
by jbas
Great. Thanks for your time, Michiel.