choice-metrics.com

by **rheinber** » Mon Dec 05, 2011 6:42 am

Hi Ngeners,
I just read Bliemer & Rose (Transp Res A, 2011), in which you compare estimates from SC surveys based on three experimental designs: an orthogonal design with 108 choice situations blocked into 18 x 6 choice tasks; a D-efficient design with 108 choice situations, also blocked into 18 x 6 choice tasks; and a second D-efficient design with 18 choice situations blocked into 3 x 6 choice tasks. While the paper nicely describes the differences between the models estimated from the orthogonal and the efficient designs, it is silent about the conceptual differences between the two efficient designs. I understand that, empirically, you found hardly any difference between the models estimated from one or the other efficient design. But do you have any theoretical reasons to recommend the smaller efficient design? A priori, it seems safer to base a CE on a larger design, as the risk of behaviorally dominant designs (i.e. dominance through attribute ignorance) is smaller in the larger design.

best,
Chris

by **johnr** » Tue Dec 06, 2011 8:33 pm

Hi Chris

An interesting question with an interesting answer. Each choice task contributes a certain amount of statistical information to the Hessian of the design, the negative inverse of which is the (co)variance matrix. In the MNL (or conditional logit) model, you can calculate the choice-task-specific contribution to the Hessian, with the final Hessian simply being the sum of these individual contributions. (This holds in most cases; there are exceptions, perhaps more than a few, which complicate things considerably, but stay with me anyway.) Other models work in a similar fashion, though the panel MMNL model is slightly more complicated. Now think of a very small design in which you have, say, only 20 possible choice tasks, from which you want to select only 5. Ideally, you would calculate the contribution each task makes to the overall efficiency of the design, rank the tasks, and then choose the best 5.
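As a rough illustration of the additive structure described above, here is a minimal pure-Python sketch for a two-parameter MNL model. The attribute levels, the prior values in `beta`, and all function names are assumptions made up for this example; they are not taken from the paper or from Ngene. Each task's information matrix is computed separately, the design's information matrix is their sum, and the D-error is the determinant of the inverse raised to the power 1/K (here K = 2).

```python
import math

def mnl_probs(task, beta):
    """MNL choice probabilities for one task (a list of attribute vectors)."""
    utils = [sum(b * x for b, x in zip(beta, alt)) for alt in task]
    top = max(utils)  # subtract the max for numerical stability
    expu = [math.exp(u - top) for u in utils]
    total = sum(expu)
    return [e / total for e in expu]

def task_information(task, beta):
    """Information matrix (negative expected Hessian) contributed by one task."""
    k = len(beta)
    p = mnl_probs(task, beta)
    xbar = [sum(pj * alt[i] for pj, alt in zip(p, task)) for i in range(k)]
    info = [[0.0] * k for _ in range(k)]
    for pj, alt in zip(p, task):
        dev = [alt[i] - xbar[i] for i in range(k)]
        for a in range(k):
            for b in range(k):
                info[a][b] += pj * dev[a] * dev[b]
    return info

def design_information(design, beta):
    """Design information = sum of the task-specific contributions."""
    k = len(beta)
    total = [[0.0] * k for _ in range(k)]
    for task in design:
        contrib = task_information(task, beta)
        for a in range(k):
            for b in range(k):
                total[a][b] += contrib[a][b]
    return total

def d_error_2x2(info):
    """D-error for K = 2: det(inverse information) ** (1 / 2)."""
    det = info[0][0] * info[1][1] - info[0][1] * info[1][0]
    return (1.0 / det) ** 0.5

# Hypothetical priors and a hypothetical 3-task design with two alternatives
# per task, each described by two attribute levels (say, time and cost).
beta = [-0.5, -0.2]
design = [
    [[1, 3], [3, 1]],
    [[1, 1], [3, 3]],
    [[2, 3], [3, 2]],
]

print("D-error, first 2 tasks:", d_error_2x2(design_information(design[:2], beta)))
print("D-error, all 3 tasks:  ", d_error_2x2(design_information(design, beta)))
```

The information is evaluated at fixed priors, so this reflects only the locally optimal (fixed-prior) case; the exceptions mentioned above are not covered.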

Now what would happen if you wanted to select 10 out of the 20 instead of 5? Theoretically, each new choice task adds less statistical information than the previous one, so the 10th choice task provides less benefit to the statistical efficiency of the design than the 9th. Adding choice tasks 6-10 should therefore, in theory, provide less information than the original choice tasks 1-5. In other words, there are diminishing returns to adding more choice tasks. At some limit, which is likely to be context specific, you will have only 'rubbish' choice tasks left that add nothing to the design efficiency.
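The diminishing-returns argument can be sketched with a simple greedy selection: at each step, add the candidate task that most increases the log-determinant of the accumulated information matrix, and record that gain. This greedy rule is a stand-in chosen for illustration, not the algorithm used in the paper or in Ngene. The candidate set, the priors in `beta`, and the small `ridge` term (added only so the determinant is defined before two tasks have been chosen) are all assumptions.

```python
import itertools
import math
import random

def task_information(task, beta):
    """Information matrix contributed by one MNL choice task (K = 2)."""
    utils = [sum(b * x for b, x in zip(beta, alt)) for alt in task]
    top = max(utils)
    expu = [math.exp(u - top) for u in utils]
    p = [e / sum(expu) for e in expu]
    xbar = [sum(pj * alt[i] for pj, alt in zip(p, task)) for i in range(2)]
    info = [[0.0, 0.0], [0.0, 0.0]]
    for pj, alt in zip(p, task):
        dev = [alt[0] - xbar[0], alt[1] - xbar[1]]
        for a in range(2):
            for b in range(2):
                info[a][b] += pj * dev[a] * dev[b]
    return info

def add(m1, m2):
    return [[m1[i][j] + m2[i][j] for j in range(2)] for i in range(2)]

def logdet2(m):
    return math.log(m[0][0] * m[1][1] - m[0][1] * m[1][0])

def greedy_select(candidates, beta, n_select, ridge=1e-3):
    """Pick tasks one at a time; return the log-det gain of each pick."""
    current = [[ridge, 0.0], [0.0, ridge]]  # assumption: keeps det > 0 at the start
    remaining = [task_information(t, beta) for t in candidates]
    gains = []
    for _ in range(n_select):
        base = logdet2(current)
        best = max(range(len(remaining)),
                   key=lambda i: logdet2(add(current, remaining[i])))
        current = add(current, remaining.pop(best))
        gains.append(logdet2(current) - base)
    return gains

# Hypothetical candidate set: 20 two-alternative tasks drawn from all pairs
# of profiles with two attributes at levels 1, 2, 3.
profiles = list(itertools.product([1, 2, 3], repeat=2))
all_tasks = list(itertools.combinations(profiles, 2))
candidates = random.Random(0).sample(all_tasks, 20)
beta = [-0.5, -0.2]  # hypothetical priors

gains = greedy_select(candidates, beta, 10)
for step, gain in enumerate(gains, start=1):
    print(f"task {step:2d}: log-det gain = {gain:.4f}")
```

With fixed priors, the printed gains should never increase from one step to the next, which is the pattern described above: later tasks add less than earlier ones.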

What the paper you cite found was that, in this particular example context, the 18 choice task design provided as much information as the 108 choice task design. That is, the first 18 choice tasks provided a huge amount of information, whereas the remaining 90 choice tasks added little or nothing. That is not to say that the 108 choice task design could not have beaten the pants off the 18, since we used efficient rather than optimal designs in this case. Unless you check every single choice task, you will never be certain that you have selected the best S choice tasks. Note that the size of the typical problem (the one in the paper was not large by any stretch of the imagination) and the exceptions make it impossible to calculate all the individual task-specific Hessians, and hence we have to rely on algorithms to locate a good design. In this case, there were literally hundreds of billions of different combinations that could exist, so we cannot be 100% certain that we selected the very best 18 or 108 choice tasks. We are confident, however, that we let the program run long enough to get very good efficient designs in both cases.
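To get a feel for why checking every possible design is out of reach, the number of ways to pick S tasks from a candidate set can be counted directly. The sketch below uses the small 20-task example from earlier in this reply; the 200-candidate case is a purely hypothetical size chosen for contrast and is not the candidate set from the paper.

```python
import math

def n_possible_designs(n_candidates, n_tasks):
    """Number of distinct ways to choose n_tasks from n_candidates (order ignored)."""
    return math.comb(n_candidates, n_tasks)

# (candidates, tasks to select); the last pair is a hypothetical larger problem.
for n_candidates, n_tasks in [(20, 5), (20, 10), (200, 18)]:
    count = n_possible_designs(n_candidates, n_tasks)
    print(f"choose {n_tasks} of {n_candidates}: {count:,} possible designs")
```

Even the toy problem has thousands of possible designs, and the count grows very quickly with the size of the candidate set, which is why search algorithms are used instead of full enumeration.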

Hope this answers your question.

John

size of efficient designs
