choice-metrics.com

by **acanakci** » Thu Jun 13, 2024 7:40 am

Hello,

I am working on a discrete choice experiment for my MSc studies and plan to use explicit or implicit partial profiles. My experiment involves 20 attributes, half of which are binary and the other half discrete valued.

In the manual section related to explicit partial profiles, it is mentioned that all combinations should be created externally, and a randomly selected subset should be fed into the software. For example, in the manual, 1,000 out of 13,608 combinations were randomly selected and provided to the software (p. 189/259).

Given the larger number of attributes and levels in my study, I estimate that there will be approximately 200,000 combinations. I have two main questions:

1. Determining the number of randomly selected combinations: Given my larger set of attributes, how should I determine the appropriate number of randomly chosen combinations to use?

2. Efficiency with a reduced candidate set: If I reduce the number of combinations to around 10,000, that is, if I randomly choose 10,000 out of 200,000, will Ngene still be able to provide an efficient design? Additionally, how is this "efficiency" measured?

I would greatly appreciate any guidance on these questions, as I am working under a tight deadline for my MSc studies.

Thank you very much in advance!

by **Michiel Bliemer** » Thu Jun 13, 2024 10:25 am

The modified Fedorov algorithm goes through your entire candidate set and tries swapping each candidate row with each row in the current design. If you use a large candidate set, this takes a very long time. The largest candidate set I have ever used had 10,000 rows, and I ran that script for a whole day. I tested various candidate set sizes, and using a set larger than 2,000 rows only marginally improves the final efficiency (as measured, for example, by the D-error). So whilst you could use a very large candidate set, it does not improve the efficiency much and it takes much longer to generate the design. The default in Ngene is therefore set to 2,000. I would recommend not using a set larger than 5,000 or 10,000.
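To make "efficiency as measured by the D-error" concrete, here is a minimal Python sketch that assumes a multinomial logit (MNL) model with fixed prior parameters. It builds the Fisher information matrix of a design for a single respondent and returns the determinant of its inverse, raised to the power 1/K, where K is the number of parameters. The function names and the list-of-lists data layout are illustrative assumptions, not Ngene's internals, and Ngene's own figures may be normalised differently. A lower D-error means a more efficient design.

```python
import math

def mnl_information(design, beta):
    """Fisher information of an MNL model for one respondent.

    design: list of choice tasks; each task is a list of alternatives;
            each alternative is a list of K attribute levels.
    beta:   list of K assumed prior parameter values.
    """
    k = len(beta)
    info = [[0.0] * k for _ in range(k)]
    for task in design:
        utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in task]
        top = max(utilities)  # subtract the maximum for numerical stability
        weights = [math.exp(u - top) for u in utilities]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Probability-weighted mean attribute vector of this task
        mean = [sum(p * alt[a] for p, alt in zip(probs, task)) for a in range(k)]
        for p, alt in zip(probs, task):
            dev = [alt[a] - mean[a] for a in range(k)]
            for r in range(k):
                for c in range(k):
                    info[r][c] += p * dev[r] * dev[c]
    return info

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def d_error(design, beta):
    """det(inverse information) ** (1/K); infinite if a parameter is not identified."""
    det = determinant(mnl_information(design, beta))
    if det <= 0:
        return math.inf
    return det ** (-1.0 / len(beta))
```

For example, `d_error([[[1, 0], [0, 0]], [[0, 1], [0, 0]]], [0.0, 0.0])` evaluates a toy design with two choice tasks, two alternatives and two attributes under zero priors. Repeating the same tasks halves the D-error, which is why D-errors are only comparable between designs with the same number of choice tasks.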

Michiel

by **acanakci** » Tue Jun 25, 2024 10:59 pm

Hello,
Thank you very much for your clear reply!

Before we purchase the software, we have a couple of questions:

About Sampling the Dataset: We have an externally created dataset with around 180,000 rows (choice sets). Based on your experience, how many of these choice sets would you recommend sampling and providing to the software for optimal performance and statistical reliability?

About Constant Variables in Blocks: Within a block of choice sets/questions, is it possible to keep certain attributes constant at the same levels across the choice sets while allowing other attributes to vary? For example, we would like attributes A, B, C, D, and E to remain constant within a block while attributes F and G vary, which would, in our view, further reduce the cognitive load on respondents.

Thank you for your assistance!

by **Michiel Bliemer** » Wed Jun 26, 2024 9:20 am

A candidate set of 180,000 rows would be too many to feed into the modified Fedorov algorithm; I would randomly select at most 10,000 rows from this large set. You may need to run the script for a long time, perhaps overnight or even a full day. If you use 5,000 rows as a candidate set, it will run twice as fast but may result in a slightly less efficient design.
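Drawing the random subset can be done in a few lines of Python before the file is handed to Ngene. This is only a sketch: the function name, the fixed seed, and the assumption that the candidate set is already held in memory as a list of rows are illustrative choices.

```python
import random

def sample_candidate_rows(rows, n_keep, seed=12345):
    """Return n_keep rows drawn without replacement, kept in their original order."""
    if n_keep >= len(rows):
        return list(rows)
    rng = random.Random(seed)  # fixed seed so the same subset can be reproduced
    keep = sorted(rng.sample(range(len(rows)), n_keep))
    return [rows[i] for i in keep]
```

Calling `sample_candidate_rows(all_rows, 10000)` on the 180,000-row set would give the reduced candidate set; changing `seed` gives a different random subset, which can be useful for checking how sensitive the final D-error is to the draw.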

Blocking does not affect design efficiency, and for efficient designs it is done after the design has been generated. You could create the blocks manually by grouping choice tasks in which some of the attributes are fixed, but it may not always be possible to do it this way. I am also not sure that you want to keep certain attributes fixed for some respondents: you probably want all respondents to trade off on all attributes within a block, with a different subset of attributes overlapping in each choice task. I understand why you may want to do this, but it is not something that Ngene can do for you automatically. If you want to do this in Ngene, you could generate a separate design for each block, including in the utility function only the attributes that vary (the fixed ones are irrelevant), and then combine all the blocks to make up the complete design.
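The last step, combining separately generated block designs into one complete design, could look roughly like the Python sketch below. It assumes each block's design has already been exported from Ngene and read into a list of dictionaries holding only the varying attributes; the column names (`alt1.F` and so on) and the function name are hypothetical.

```python
def combine_blocks(block_designs, fixed_levels):
    """Merge per-block designs into one design with block and task columns.

    block_designs: {block_id: [task, ...]}, each task a dict of varying attribute levels
    fixed_levels:  {block_id: dict of fixed attribute -> level for that block}
    """
    combined = []
    for block_id, tasks in block_designs.items():
        for task_no, task in enumerate(tasks, start=1):
            row = {"block": block_id, "task": task_no}
            row.update(fixed_levels[block_id])  # attributes held constant in this block
            row.update(task)                    # attributes that vary between tasks
            combined.append(row)
    return combined
```

Each respondent would then be shown the rows of a single block.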

Michiel

by **acanakci** » Thu Jun 27, 2024 5:54 pm

Hello,

Thank you very much for your great help!

I have been working on creating an explicit partial profiles design as described on page 190 of the NGene manual, aiming to generate a dataset with 13,608 rows. However, I have encountered difficulties in achieving this exact number of rows with my Python script. This makes me worry about the accuracy of my code for the data I will provide to NGene.

I have assumed the levels for attributes as follows:

Prices: 3 levels (100, 150, 200)
Stars: 3 levels (1, 3, 5)
Distances: 3 levels (500, 1000, 1500)
Wifi: 2 levels (0, 1)
Breakfast: 2 levels (0, 1)
Pool: 2 levels (0, 1)

Could you please provide a detailed explanation of how the 13,608-row data for the explicit partial profiles design is formed?

Thank you!

by **Michiel Bliemer** » Fri Jun 28, 2024 8:07 am

This is the Matlab script I wrote to generate the 13,608 rows.
Essentially, I go through all possible combinations, and if a combination has exactly 3 overlapping attributes (not more, not less) then I store it in the candidate set. Note that I have not yet removed dominant alternatives from this set; this happens automatically within Ngene when informative priors are specified.

Code:

    function A = createset()
    % Candidate set: every pair of profiles (alternative 1, alternative 2)
    % in which exactly 3 of the 6 attributes have the same level.
    A = zeros(13608,12);  % preallocate; 13,608 pairs satisfy the condition
    i = 1;
    for price1 = [100,150,200]
      for stars1 = [1,3,5]
        for dist1 = [500,1000,1500]
          for wifi1 = [0,1]
            for breakfast1 = [0,1]
              for pool1 = [0,1]
                for price2 = [100,150,200]
                  for stars2 = [1,3,5]
                    for dist2 = [500,1000,1500]
                      for wifi2 = [0,1]
                        for breakfast2 = [0,1]
                          for pool2 = [0,1]
                            % count the attributes that overlap
                            count = 0;
                            if price1 == price2, count = count + 1; end
                            if stars1 == stars2, count = count + 1; end
                            if dist1 == dist2, count = count + 1; end
                            if wifi1 == wifi2, count = count + 1; end
                            if breakfast1 == breakfast2, count = count + 1; end
                            if pool1 == pool2, count = count + 1; end
                            % keep the pair only if exactly 3 overlap
                            if count == 3
                              A(i,:) = [price1, stars1, dist1, wifi1, breakfast1, pool1, price2, stars2, dist2, wifi2, breakfast2, pool2];
                              i = i + 1;
                            end
                          end
                        end
                      end
                    end
                  end
                end
              end
            end
          end
        end
      end
    end
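Since the question was asked about a Python script, here is a rough Python equivalent of the Matlab loop, offered as a sketch rather than a tested replacement. It enumerates all ordered pairs of profiles with `itertools.product` and keeps those with exactly 3 overlapping attributes. The names `LEVELS` and `create_candidate_set` are illustrative; the attribute levels are the ones listed in the question.

```python
from itertools import product

# Attribute levels, in the same order as the Matlab script
LEVELS = [
    (100, 150, 200),    # price
    (1, 3, 5),          # stars
    (500, 1000, 1500),  # distance
    (0, 1),             # wifi
    (0, 1),             # breakfast
    (0, 1),             # pool
]

def create_candidate_set(levels=LEVELS, n_overlap=3):
    """Return all (alternative 1 + alternative 2) rows with exactly n_overlap equal attributes."""
    rows = []
    for alt1 in product(*levels):
        for alt2 in product(*levels):
            overlap = sum(a == b for a, b in zip(alt1, alt2))
            if overlap == n_overlap:
                rows.append(alt1 + alt2)  # 12 columns: 6 for each alternative
    return rows

candidate_set = create_candidate_set()
print(len(candidate_set))  # 13608
```

A quick way to check a script of this kind is to count by hand: choosing which 3 attributes overlap and multiplying the numbers of equal and unequal level pairs gives 1,728 + 7,776 + 3,888 + 216 = 13,608 rows. Note that the pairs are ordered, so (A, B) and (B, A) are both included; a script that treats them as the same choice set will produce a smaller number.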

Michiel

Explicit Partial Profiles with High Number of Attributes
