Candidate set too large

Post by ndurn » Tue Jun 09, 2020 11:01 pm

Dear all,

We are working on a DCE to elicit preferences for a patient-reported outcome measure. Although we have reduced the profile measure as much as possible, we still end up with 13 attributes with 4 levels each. To reduce choice task complexity, we plan to use an overlapping design with 6 overlapping attributes and 7 non-overlapping attributes. The overlap would be explicit, i.e. we would still show the levels of the overlapping attributes.

We are trying to create the external candidate set with overlapping attributes, as Ngene requires. As we understand it, the standard approach for an attribute-level overlap design is to generate the full factorial (all possible choice pairs), keep a randomly selected fraction of it (e.g. 1%), and use this to improve the modelling properties of the design in Ngene. However, our profile measure has more than 67 million possible states, which gives 67 million x 67 million possible pairs. No commercial software we know of can handle that many observations (Stata, for example, has a limit of about 2 billion).
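To put numbers on the scale described above, here is a quick back-of-the-envelope calculation in Python. It uses only the figures given in this post (13 attributes, 4 levels each); the variable names are just for illustration.

```python
# Size of the problem: 13 attributes with 4 levels each.
n_attributes = 13
n_levels = 4

n_states = n_levels ** n_attributes                  # number of distinct profiles
n_ordered_pairs = n_states * n_states                # "67 million x 67 million"
n_unordered_pairs = n_states * (n_states - 1) // 2   # distinct pairs, order ignored

print(f"{n_states:,} states")
print(f"{n_ordered_pairs:,} ordered pairs")
print(f"{n_unordered_pairs:,} unordered pairs")
```

Either way of counting pairs gives a number in the order of 10^15, far beyond a 2 billion observation limit.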

We have thought of a possible solution. We could generate a dataset with all 67 million possible health states, randomly select approximately 60,000 of them, construct all possible pairwise tasks from that sample, remove dominant alternatives and non-overlapping alternatives, and use the result as our candidate set (about 1 billion possible pairs). The limitation of this approach is that it excludes a large random set of possible health state pairs beforehand.

My questions:
1. Have you experienced a similar issue in the past and used a workaround?
2. Do you see any issues with our idea to solve the problem?

Kind regards and thanks in advance for any help or advice you can provide,
Nick

Re: Candidate set too large

Post by Michiel Bliemer » Wed Jun 10, 2020 10:55 am

Hi Nick,

When the candidate set is very large (which happens quite often), you generally need to select a fraction of the full factorial without first generating the full factorial, which is prohibitive. There are many ways to do this. I generally write Matlab code that does it in a more intelligent way, adding choice tasks to the candidate set one by one using a quasi-random process while making sure that the constraints and the overlap are satisfied. Note that a candidate set of 5,000 to 10,000 choice tasks is usually more than enough to generate an efficient design; more variety is not needed, and including many more choice tasks will make the algorithm very slow. We have done many tests: there is some efficiency gain in using 10,000 rather than 1,000 candidates, but generally little gain in using more than 10,000.
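As a rough illustration of adding choice tasks one by one, here is a minimal Python sketch. It is not the Matlab code mentioned above: it uses plain pseudo-random rejection sampling rather than a quasi-random process, and the dominance rule (a higher level is always better) is an assumption made only for the example. All function names are hypothetical.

```python
import random

N_ATTR, N_LEVELS, N_OVERLAP = 13, 4, 6   # figures from this thread
TARGET = 5000                            # candidate set size in the suggested range

def dominates(a, b):
    """True if profile a is at least as good as b on every attribute and differs.
    ASSUMPTION: a higher level index is always better; adapt to your own coding."""
    return a != b and all(x >= y for x, y in zip(a, b))

def random_task(rng):
    """Draw one two-alternative task with exactly N_OVERLAP shared attributes."""
    alt1 = [rng.randrange(N_LEVELS) for _ in range(N_ATTR)]
    overlap = set(rng.sample(range(N_ATTR), N_OVERLAP))
    alt2 = []
    for k in range(N_ATTR):
        if k in overlap:
            alt2.append(alt1[k])  # overlapping attribute: same level in both
        else:
            # force a different level so the attribute really is non-overlapping
            others = [lev for lev in range(N_LEVELS) if lev != alt1[k]]
            alt2.append(rng.choice(others))
    return tuple(alt1), tuple(alt2)

def build_candidates(target=TARGET, seed=1):
    """Add tasks one by one, skipping dominated and duplicate tasks."""
    rng = random.Random(seed)
    seen, tasks = set(), []
    while len(tasks) < target:
        a, b = random_task(rng)
        if dominates(a, b) or dominates(b, a):
            continue
        key = frozenset((a, b))   # treat (a, b) and (b, a) as the same task
        if key in seen:
            continue
        seen.add(key)
        tasks.append((a, b))
    return tasks

candidates = build_candidates()
```

Because tasks are drawn directly, neither the 67 million profiles nor their pairs are ever enumerated; memory use grows only with the number of accepted tasks.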

One process could be:

Step 1: Generate 5,000 random choice tasks imposing any constraints other than overlap constraints. For example, in Ngene:

design
;alts = alt1, alt2
;rows = 5000
;fact
;require:
...
;reject:
...

Step 2: Copy the 5,000 randomly generated choice tasks to Excel.

Step 3: In Matlab, R, or Python, create a matrix with 5,000 rows and 13 columns, in which each row has 7 zeros (for non-overlapping attributes) and 6 ones (for overlapping attributes). Then randomly shuffle each row. Perhaps this can also be done in Excel, and you could use macros, but I think that this is much easier outside Excel.

Step 4: Copy the matrix with zeros and ones next to the 5,000 choice tasks in Excel. Then in Excel, create an explicit partial profile design by keeping the attribute levels in the choice tasks for each column with a zero, and fixing the attribute level for each column with a one (e.g. by simply copying the attribute value of the first alternative across to the other alternatives).
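Steps 3 and 4 could be sketched in Python along the following lines, doing the copy in code instead of Excel. This is only an illustration: the `tasks` list is a randomly generated stand-in for the 5,000 choice tasks exported from Ngene in Step 2, levels are coded 0 to 3 by assumption, and the function names are hypothetical.

```python
import random

N_ATTR, N_OVERLAP = 13, 6

def make_mask(n_rows, rng):
    """Step 3: each row has 7 zeros and 6 ones, shuffled independently."""
    base = [0] * (N_ATTR - N_OVERLAP) + [1] * N_OVERLAP
    mask = []
    for _ in range(n_rows):
        row = base[:]
        rng.shuffle(row)
        mask.append(row)
    return mask

def apply_overlap(tasks, mask):
    """Step 4: where the mask is 1, copy alt1's level to alt2; where 0, keep alt2."""
    out = []
    for (alt1, alt2), row in zip(tasks, mask):
        new_alt2 = [a if m == 1 else b for a, b, m in zip(alt1, alt2, row)]
        out.append((list(alt1), new_alt2))
    return out

rng = random.Random(42)
# Stand-in for the 5,000 tasks from Step 2 (ASSUMPTION: random levels 0..3).
tasks = [([rng.randrange(4) for _ in range(N_ATTR)],
          [rng.randrange(4) for _ in range(N_ATTR)]) for _ in range(5000)]
mask = make_mask(len(tasks), rng)
partial = apply_overlap(tasks, mask)
```

Note that an attribute with a zero in the mask can still happen to show the same level in both alternatives, so each task has at least 6 overlapping attributes rather than exactly 6. If exactly 6 are required, such tasks would need to be filtered out or redrawn.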

We aim to automate this process in the next version of Ngene.

Your suggestion is similar to mine, but I would not consider 67 million health states or 1 billion possible pairs. That is far too many (it takes a lot of memory and computation time), and it is not necessary.

Michiel

Re: Candidate set too large

Post by ndurn » Thu Jun 11, 2020 5:57 pm

Dear Michiel,

Thanks very much for your response!

It is very useful to know that your testing has shown little benefit in going above 10,000, and very handy to see the approach and steps you have taken to tackle this previously.

Kind regards,
Nick

