by johnr » Tue Jul 08, 2014 11:32 pm
Hi Jorien,

Thanks for your question. I don't think you need to worry about significant biases from the design. I always estimate ASCs for unlabelled experiments and in the vast majority of cases have found them to be statistically significant, independent of the design used (and I use a combination of orthogonal and efficient designs in my day-to-day work). Constants in SP experiments are by and large meaningless, particularly when unlabelled experiments are used. They reflect the average unobserved effect related to a set of hypothetical alternatives, where respondents observe multiple tasks when in reality they typically see only one in real markets. They are really only an artefact of the experiment and, although they may in part reflect preference for a labelled alternative, they include a whole bunch of other unknowns. Hence, you would typically calibrate them or, in the case of an unlabelled experiment, ignore them completely. Note that you still need to estimate them, as they are accounting for something.

Given the above, it is not clear to me what you mean by using a generic constant across the two alternatives. Personally, I would use the ASCs in estimation to account for differences in average error, but then ignore them in any post application of the results.
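To make that last point concrete, here is a minimal sketch of "estimate the ASC, then ignore it in application" for an unlabelled binary choice. This is not from the original post: the simulated data, the single 'cost' attribute and the coefficient values are all hypothetical, and the estimation is a plain maximum likelihood fit rather than any particular package's routine.

[code]
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 2000
# Hypothetical attribute levels for the two unlabelled alternatives
cost1 = rng.uniform(1, 10, n)
cost2 = rng.uniform(1, 10, n)

# Simulate choices from an assumed true model: beta_cost = -0.4 plus a small
# average unobserved effect (the 'ASC') attached to the first alternative.
true_beta, true_asc = -0.4, 0.3
dv_true = true_asc + true_beta * (cost1 - cost2)
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-dv_true))).astype(float)  # 1 if alt 1 chosen

def negll(theta):
    """Negative log-likelihood of a binary logit with an ASC on alternative 1."""
    asc, beta = theta
    p = 1 / (1 + np.exp(-(asc + beta * (cost1 - cost2))))
    p = np.clip(p, 1e-10, 1 - 1e-10)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

res = minimize(negll, x0=np.zeros(2), method="BFGS")
asc_hat, beta_hat = res.x
print(f"estimated ASC = {asc_hat:.3f}, beta_cost = {beta_hat:.3f}")

# Application/forecasting: keep only the attribute part and drop the ASC.
def prob_alt1(c1, c2, beta):
    return 1 / (1 + np.exp(-beta * (c1 - c2)))

print("P(alt 1) at equal costs, ASC ignored:", prob_alt1(5.0, 5.0, beta_hat))
[/code]

The ASC is in the likelihood so it soaks up the average error difference during estimation, but the applied probabilities use only the attributes.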
Re the probabilities, early work on efficient designs promoted things such as utility balance and minimum overlap (e.g. Huber and Zwerina 1996). Whilst these papers were pioneering in their day, we now know that these types of constraints only constrain efficiency, and of all the things you might want to avoid, utility balance is probably the biggest one when it comes to statistical efficiency. Indeed, a (D-)efficient design will attempt to produce 0.7/0.3 probabilities in binary logit models (see Kanninen 2002). You also don't want dominant alternatives - this implies infinite scale in that task, which, in econometric terms, can lead to what is called model separation. As per above, 0.7/0.3 seems to be a sweet spot.
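If it helps, here is a minimal sketch (mine, not part of the answer above; the attribute levels, prior values and the two candidate designs are all made up) of how the D-error of a binary design can be computed under prior parameters. The choice probabilities enter through the p(1-p) weight in the information matrix, which is how the trade-off between utility balance and attribute contrast shows up.

[code]
import numpy as np

def d_error(design, beta):
    """D-error of a binary choice design under prior beta.
    design: (S tasks, 2 alternatives, K attributes); beta: length-K priors."""
    S, _, K = design.shape
    info = np.zeros((K, K))
    probs = []
    for s in range(S):
        x_diff = design[s, 0] - design[s, 1]              # attribute differences
        p1 = 1 / (1 + np.exp(-x_diff @ beta))             # P(choose alt 1) in task s
        probs.append(p1)
        info += p1 * (1 - p1) * np.outer(x_diff, x_diff)  # Fisher information
    avc = np.linalg.inv(info)                             # asymptotic (co)variance
    return np.linalg.det(avc) ** (1 / K), np.round(probs, 2)

beta_prior = np.array([-0.4, 0.8])   # hypothetical priors: cost, quality

# Candidate A: tasks kept close to utility balance (probabilities near 0.5/0.5)
design_a = np.array([[[4.0, 1.6], [2.0, 0.5]],
                     [[2.0, 0.5], [4.0, 1.4]]])
# Candidate B: larger attribute contrasts (probabilities nearer 0.7/0.3)
design_b = np.array([[[5.0, 1.0], [2.0, 1.0]],
                     [[2.0, 1.5], [2.0, 0.5]]])

for name, d in [("A", design_a), ("B", design_b)]:
    de, p = d_error(d, beta_prior)
    print(f"design {name}: task probabilities {p}, D-error {de:.2f}")
[/code]

Under these (made-up) priors the near-balanced design comes out with a much larger D-error than the higher-contrast one, which is the point about utility balance constraining efficiency.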
Do designs cause bias? There are two answers to this - yes and no!

The argument for no: asymptotically, one should be able to retrieve the population parameter estimates irrespective of the design. Many famous people actually use random designs. You also have RP data, which we estimate the same models on but never seem to comment on, though the same issues must exist for such data. In this respect, the work of McFadden 1974, which everyone cites but few seem to have read, suggests that even in finite samples the model will retrieve the parameters (the paper is quite extensive and covers a lot of material!). So orthogonal versus efficient designs should not impact upon the estimates (after accounting for scale) - the impact should only be on the standard errors (this is mostly what Michiel and I found in our 2011 paper, where we compared several design types empirically).

The argument for yes: there is evidence for what are known as design artefacts - this is not specific to SP experiments, but relates to any survey really. In SP terms, it might be possible that a design will induce certain types of behaviour. In one example, I was interested in dating choices. I used a Street and Burgess design for a pilot amongst my (much younger) colleagues and found some funny results. One of the attributes was whether the potential partner was a single parent (or not). Given the design type, one alternative in the design was always a single parent and the second never was. It turns out that (at least amongst my colleagues) it would be acceptable to date an axe murdering, chain smoking, alcoholic, neo-Nazi psychopath, as long as they don't have kids. Because there was no attribute level overlap, in my small sample they chose only on this attribute, and the design allowed them to do this. Is this a problem? Again, yes and no. It might be that if I scaled the sample up to the population I would have found the same preferences, in which case any design should have found this same result. Or it might be that I had a biased pilot sample (young, single, and obviously not too discerning in who they are prepared to date), but over the population there would be preferences for other attributes that this design (or any design) would pick up, given that the estimates are population averages anyway.
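For the "no" argument, here is a minimal Monte Carlo sketch (again mine, with hypothetical designs, coefficients and sample sizes) that simulates choices from the same true parameters under two different designs and checks that both recover the parameters on average, with the main difference showing up in the spread (i.e. the standard errors) of the estimates.

[code]
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
true_beta = np.array([-0.4, 0.8])  # hypothetical cost and quality coefficients

# Each design: S tasks x 2 alternatives x K attributes (levels are illustrative).
design_a = np.array([[[4.0, 1.6], [2.0, 0.5]],
                     [[2.0, 0.5], [4.0, 1.4]]])   # near utility balance
design_b = np.array([[[5.0, 1.0], [2.0, 1.0]],
                     [[2.0, 1.5], [2.0, 0.5]]])   # larger attribute contrasts

def simulate_and_estimate(design, n_resp, reps):
    x_diff = design[:, 0] - design[:, 1]          # (S, K) attribute differences
    p1 = 1 / (1 + np.exp(-(x_diff @ true_beta)))  # true choice probabilities per task
    estimates = []
    for _ in range(reps):
        # each simulated respondent answers every task
        y = (rng.uniform(size=(n_resp, len(p1))) < p1).astype(float)

        def negll(beta):
            p = 1 / (1 + np.exp(-(x_diff @ beta)))
            p = np.clip(p, 1e-10, 1 - 1e-10)
            return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

        estimates.append(minimize(negll, np.zeros(2), method="BFGS").x)
    estimates = np.array(estimates)
    return estimates.mean(axis=0), estimates.std(axis=0)

for name, design in [("A (near balanced)", design_a), ("B (larger contrasts)", design_b)]:
    mean, sd = simulate_and_estimate(design, n_resp=200, reps=100)
    print(name, "mean estimates:", np.round(mean, 2), "empirical s.e.:", np.round(sd, 2))
[/code]

Both designs should land near the true values on average; what differs is how tightly the estimates cluster around them, which is the standard-error story rather than a bias story.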
Hope this helps.
John