choice-metrics.com

by **dtobin02** » Sat Sep 16, 2023 4:09 am

Hello,

I am a beginner NGENE user designing a choice experiment to understand farmers' preferences for different types of tree-planting (agroforestry) schemes. We are currently running a pilot (n = 40) with a first design, and I hope to use the information we obtain to create a more efficient design that takes advantage of our prior knowledge and allows us to estimate mixed multinomial logit models (heterogeneity is extremely important in this area). My question is: how can I best design for mixed multinomial logit analysis and use information from my pilot without overcomplicating my code to the extent that NGENE takes too long to run? Below is more information that may allow a helpful expert to give me some guidance:

First, here is the code that was used to design the pilot:

Design
;alts = alt1, alt2, alt3
;rows = 24
;block = 6
;eff = (mnl, d)
;model:
U(alt1) = b0[0.0001] + treemix.effects[0.0002 | 0.0001 | -0.0001] * treemix[0, 1, 2, 3] + downpay[-0.00002] * downpay[0, 10, 20, 30] + pattern.effects[0.0002 | -0.0002] * pattern[0, 1, 2] + pes.effects[0.0002 | 0.0001 | 0.00005] * pes[0, 1, 2, 3] + il1[0.0001] * pattern.effects[0] * treemix.effects[0] + il2[-0.0001] * pattern.effects[0] * treemix.effects[1] /
U(alt2) = b0 + treemix.effects * treemix + downpay * downpay + pattern.effects * pattern + pes.effects * pes
+ il1 * pattern.effects[0] * treemix.effects[0] + il2 * pattern.effects[0] * treemix.effects[1]
$

This code does not include the panel structure for MMNL, which I hope to add, but it has very small fixed priors that are meant to encode "sign" priors. I had 17 guided conversations with farmers using orthogonal cards, and several conversations with field staff who had worked in this area for a long time, to come up with these "sign" priors, which boiled down to "farmers like timber trees, like fruit trees but less than timber, and dislike medicinal trees". As you can see, we have priors about combinations of pattern and species that will be desirable or undesirable (timber on boundaries is extra good and fruit on boundaries is bad). We will also be presenting farmers with two choices per choice task and a "take none" / "no choice" option. Each farmer will see 4 cards.

My plan is to use the results from the 40-person pilot (which I will analyze with a multinomial logit model in NLOGIT) to improve the priors and create a design that allows us to detect heterogeneity across farmer types (because farmers at different distances from national park boundaries and farmers with different farm sizes are thought to have different preferences).

Here is my current code:

Design
;alts = alt1, alt2, alt3
;rows = 24
;block = 6
;eff = (rppanel, d)
;model:
U(alt1) = b0[0.0001] + treemix.effects[(n, 0.0002, 0.0001) | (n, 0.0001, 0.00005) | (n, -0.0001, 0.00005)] * treemix[0, 1, 2, 3] + downpay[n, -0.00002, 0.00001] * downpay[0, 10, 20, 30]
+ pattern.effects[0.0002 | -0.0002] * pattern[0, 1, 2] + pes.effects[0.0002 | 0.0001 | 0.00005] * pes[0, 1, 2, 3]
+ il1[0.0001] * pattern.effects[0] * treemix.effects[0] + il2[-0.0001] * pattern.effects[0] * treemix.effects[1] /
U(alt2) = b0 + treemix.effects * treemix + downpay * downpay + pattern.effects * pattern + pes.effects * pes
+ il1 * pattern.effects[0] * treemix.effects[0] + il2 * pattern.effects[0] * treemix.effects[1]
$

I am not sure if it is a problem, but this code already takes several hours to output any designs, and I have only specified Bayesian priors for treemix and downpay. Previously, I specified a Bayesian prior (a normally distributed prior rather than a fixed prior) only for downpayment, and the code ran much faster. My questions are:

1. Can I and should I input Bayesian priors for the other effect levels and the interaction effects? Or is it sufficient to only add a Bayesian prior to downpayment?
2. Is "rppanel" the correct efficiency parameter for making designs to detect heterogeneous preferences?
3. Based on my study goals, do you all have any other suggestions? (I have not modified defaults for things like rdraw etc.).

I am using a rapid virtual machine with 40 GB of RAM, but I am constrained by the fact that the virtual machine restarts every 24 hours (at 6 a.m.). I can also run the code on my own computer, but it has only 4 GB of RAM (which I was told may be the reason it was taking so long to get results).

Thank you,
Danny

by **Michiel Bliemer** » Sat Sep 16, 2023 7:51 am

For your pilot study, it would suffice to use a script like below, where local uninformative near-zero priors are used to only indicate the preference order of the attribute levels such that Ngene can avoid dominant alternatives (I added an asterisk * to alt1 and alt2). Note that the LAST level is considered the reference level in Ngene, so keep this in mind when determining your effects coded priors.

Design
;alts = alt1*, alt2*, optout
;rows = 24
;block = 6
;eff = (mnl, d)
;model:
U(alt1) = b0[0.0001]
+ treemix.effects[0.0002 | 0.0001 | -0.0001] * treemix[0, 1, 2, 3]
+ downpay[-0.00002] * downpay[0, 10, 20, 30]
+ pattern.effects[0.0002 | -0.0002] * pattern[0, 1, 2]
+ pes.effects[0.0002 | 0.0001 | 0.00005] * pes[0, 1, 2, 3]
+ il1[0.0001] * pattern.effects[0] * treemix.effects[0]
+ il2[-0.0001] * pattern.effects[0] * treemix.effects[1]
/
U(alt2) = b0
+ treemix.effects * treemix
+ downpay * downpay
+ pattern.effects * pattern
+ pes.effects * pes
+ il1 * pattern.effects[0] * treemix.effects[0]
+ il2 * pattern.effects[0] * treemix.effects[1]
$

Once you have done your pilot study, you can use pilot parameter estimates as Bayesian priors, namely (n,mean,se), similar to the script that you had, see below.

Design
;alts = alt1*, alt2*, optout
;rows = 24
;block = 6
;eff = (mnl, d, mean)
;bdraws = gauss(3)
;model:
U(alt1) = b0[0.0001]
+ treemix.effects[(n, 0.0002, 0.0001) | (n, 0.0001, 0.00005) | (n, -0.0001, 0.00005)] * treemix[0, 1, 2, 3]
+ downpay[(n, -0.00002, 0.00001)] * downpay[0, 10, 20, 30]
+ pattern.effects[0.0002 | -0.0002] * pattern[0, 1, 2]
+ pes.effects[0.0002 | 0.0001 | 0.00005] * pes[0, 1, 2, 3]
+ il1[0.0001] * pattern.effects[0] * treemix.effects[0]
+ il2[-0.0001] * pattern.effects[0] * treemix.effects[1]
/
U(alt2) = b0
+ treemix.effects * treemix
+ downpay * downpay
+ pattern.effects * pattern
+ pes.effects * pes
+ il1 * pattern.effects[0] * treemix.effects[0]
+ il2 * pattern.effects[0] * treemix.effects[1]
$

A design that is optimised for MNL can also be used to estimate a (panel) mixed logit model. Optimising for rppanel is only possible for extremely small designs; it will be very time consuming in your case and is not recommended. We found that there is usually not much benefit in optimising a design specifically for mixed logit because it is generally difficult to obtain priors for the randomly distributed coefficients and the gain in efficiency is marginal. See also: https://www.sciencedirect.com/science/article/pii/S0191261509001398.

Michiel

by **dtobin02** » Sat Sep 16, 2023 10:41 am

Thank you so much, Michiel! This perfectly answers my questions!

by **dtobin02** » Fri Oct 20, 2023 10:47 am

Dear Michiel -- Thank you again for your help. We have completed our pilot for the first 40 and I have adapted the code you sent by adding the coefficients and standard errors from the MNL model (in NLOGIT).

Based on your response to another recent post, I plan to shrink the non-significant results by a factor of 1.5 and otherwise fill the priors with the syntax (n, estimate, standard deviation). Similar to the other post, I have a question about how a researcher should respond to priors that are signed the opposite direction.

I ask this question because we have a negative sign on all the PES coefficients but they are being compared to a payment that pays the farmer 0 rupees in the first year (and the sum over three years is equal across all payment schemes). It seems odd that farmers would want to be paid later (0 rupees for surviving trees in year 1 versus 100 rupees per surviving tree in year 1) which makes me think that the card was not well-explained in the pilot. In situations such as these, is it okay to basically re-sign and re-arrange the priors to reflect what would be a consistent time-preference for money? Or is there a more defensible way to handle this type of situation?

There are other results that are somewhat surprising (which is good, and seems like a point of research) and signed in the opposite direction of our initial priors, but in those cases I would not change the prior from the coefficients because it is more plausible that the farmers just have different preferences than we expected. I am just wondering whether, if a payment term is incorrectly signed after a pilot (or something raises a similar red flag), there is precedent for a researcher to reject that prior in favor of their initial prior, or to use the same prior with the opposite sign.

On another topic, I wanted to ask a further quick clarification on the code you suggested: what is the advantage of bdraws = gauss(3) over defaults? I think you have answered elsewhere that it is better when optimizing for mean in the eff parameter but I would appreciate some further clarification.

Thanks,
Danny

Design
;alts = alt1*, alt2*, optout
;rows = 24
;block = 6
;eff = (mnl, d, mean)
;bdraws = gauss(3)
;model:
U(alt1) = b0[0.0001]
+ treemix.effects[(n, 0.41245, 0.43507) | (n, 0.54994, 0.43941) | (n, -0.72356, 0.41280)] * treemix[0, 1, 2, 3]
+ downpay[(n, -0.00016, 0.01074)] * downpay[0, 10, 20, 30]
+ pattern.effects[(n, 1.60886, 0.48142) | (n, 0.82775, 0.34373)] * pattern[0, 1, 2]
+ pes.effects[(n, -0.92631, 0.39369) | (n, -0.44007, 0.37511) |
(n, -0.10571, 0.34345) ] * pes[0, 1, 2, 3]
+ il1[(n, -0.52139, 0.63552)] * pattern.effects[0] * treemix.effects[0]
+ il2[(n, -0.30354, 0.61717)] * pattern.effects[0] * treemix.effects[1]
/
U(alt2) = b0
+ treemix.effects * treemix
+ downpay * downpay
+ pattern.effects * pattern
+ pes.effects * pes
+ il1 * pattern.effects[0] * treemix.effects[0]
+ il2 * pattern.effects[0] * treemix.effects[1]
$

by **Michiel Bliemer** » Sat Oct 21, 2023 7:56 pm

Looking at your parameter estimates, and with the last level in Ngene being the base level (PES=3), the utility for PES=3 is 0.92+0.44+0.10=1.46.
The difference in utility between the worst level, PES=0 (-0.92), and the best level, PES=3 (1.46), is 2.38. This is quite large and I would be reluctant to simply reverse the sign without further investigation.
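
The arithmetic above can be checked with a short sketch. It assumes the pilot estimates posted for pes.effects and the convention stated earlier in the thread that the last level is the effects-coded base; the variable names are illustrative only.

```python
# Pilot MNL estimates for pes.effects, levels 0..2 (taken from the posted script).
pes_estimates = [-0.92631, -0.44007, -0.10571]

# Under effects coding, the base level (PES=3) is minus the sum of the others.
pes_base = -sum(pes_estimates)

# Utility of every level, including the base, and the spread between
# the best and the worst level.
utilities = pes_estimates + [pes_base]
utility_range = max(utilities) - min(utilities)

print(f"PES=3 utility: {pes_base:.2f}")
print(f"best minus worst: {utility_range:.2f}")
```

With the unrounded estimates this gives about 1.47 and 2.40, in line with the figures of 1.46 and 2.38 obtained above from truncated values.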

Did you apply effects coding correctly in model estimation?
Also, I notice that you have no parameter estimate for the constant b0, but you MUST estimate this coefficient if you have an opt-out alternative as otherwise all your parameter estimates are biased (possibly causing the issue).

Regarding your other question, usually Gaussian quadrature is best since it is the most accurate method. However, you have 11 Bayesian priors (excluding the constant, but the constant will also need a Bayesian prior), and 3^11 is very large. So in that case I recommend that you use ;bdraws = sobol(2000) or something. Sobol is better than the default Halton if you have a large number of parameters.
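
To see why full Gaussian quadrature becomes impractical here, the number of evaluation points can be worked out directly. The sketch below assumes that gauss(3) uses 3 abscissas for each Bayesian prior, which is what the 3^11 figure above implies; the variable names are illustrative.

```python
# Assumption: gauss(k) evaluates k abscissas per Bayesian prior, so the
# number of evaluation points is k ** (number of Bayesian priors).
abscissas = 3
priors_without_constant = 11
priors_with_constant = priors_without_constant + 1

points_without_constant = abscissas ** priors_without_constant
points_with_constant = abscissas ** priors_with_constant
sobol_draws = 2000

print(points_without_constant, points_with_constant)
print(points_without_constant / sobol_draws)
```

That is 177,147 points for 11 priors and 531,441 once the constant also gets a Bayesian prior, against 2,000 Sobol draws.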

Michiel

by **dtobin02** » Mon Oct 23, 2023 9:55 pm

Hi Michiel,

Thank you very much for your reply. I am also reluctant to reverse the coefficients but what these priors are saying is that "holding payment total fixed" farmers prefer being paid later rather than earlier for surviving saplings. This is quite hard to understand given that, generally, people prefer money sooner, the value of money generally decreases over time (inflation), and the maximum number of surviving saplings will be greatest in the first year (subsequent years could have equal numbers of surviving trees but not greater because replanting is not allowed).

We are talking to our field team, but the reason for this finding could be that the PES amounts are (100, 200, 300), (200, 200, 200), (300, 200, 100), (0, 200, 400) for years 1-3, so they might be seeing 400 and picking that one even though the total amount is a constant sum of 600. What do you think? Is it justifiable to reverse priors or set them close to zero? It is possible we should remove the 0, 200, 400 option if it is causing a result we do not believe.

Separately, I am super surprised about the interpretation of effects coding. Based on the color example (page 123 of ngenemanual130) I thought that effects coding was basically just like creating a series of dummy variables, but the left out level was uniquely set for each set of effects (rather than 0). I am surprised that you can add the parameter estimates to get the left out level. I would have thought that the interpretation is just PES == 3 gives 0.92 more utility than PES == 0, 0.10 more than PES == 2, etc. but these were essentially separate effects pegged to the same baseline left out level. Perhaps I am misunderstanding your suggestion. If I am not, what does this mean for calculating WTP? Can I no longer divide the effects coefficient by the monetary attribute coefficient to get an interpretation of "a farmer would be willing to pay X rupees to shift from PES scheme Y (left out) to PES scheme Z?"

On the other matters,
1) I have added the constant term (from the intercept from the MNL model)
2) I changed to Sobol draws (thank you for your explanation)

Thank you so much for your generous time,
Danny

Revised Code:

Design
;alts = alt1*, alt2*, optout
;rows = 24
;block = 6
;eff = (mnl, d, mean)
;bdraws = sobol(2000)
;model:
U(alt1) = b0[(n, 1.13864, 0.46010)]
+ treemix.effects[(n, 0.2749667, 0.43507) | (n, 0.3666267, 0.43941) | (n, -0.72356, 0.41280)] * treemix[0, 1, 2, 3]
+ downpay[(n, -0.0002, 0.00008)] * downpay[0, 10, 20, 30]
+ pattern.effects[(n, 1.60886, 0.48142) | (n, 0.82775, 0.34373)] * pattern[0, 1, 2]
+ pes.effects[(n, -0.61754, 0.39369) | (n, -0.29338, 0.37511) |
(n, -0.07047, 0.34345) ] * pes[0, 1, 2, 3]
+ il1[(n, -0.34759, 0.63552)] * pattern.effects[0] * treemix.effects[0]
+ il2[(n, -0.20236, 0.61717)] * pattern.effects[0] * treemix.effects[1]
/
U(alt2) = b0
+ treemix.effects * treemix
+ downpay * downpay
+ pattern.effects * pattern
+ pes.effects * pes
+ il1 * pattern.effects[0] * treemix.effects[0]
+ il2 * pattern.effects[0] * treemix.effects[1]
$

by **Michiel Bliemer** » Tue Oct 24, 2023 6:50 am

I cannot comment on whether what was shown to respondents was interpreted correctly. If you feel that the signs do not make sense, then you may want to consider using (near) zero priors as that is always a safe choice.

Effects coding is defined in the literature, so please have a look. In the Ngene manual it is explained on the page that you refer to. In the table you see -1, -1 for the base level, and in equation (7.19) you see that the base level is the (negative) sum of the coefficients of the other levels, which is how I calculated the utility of PES==3. Dummy coding is much easier to interpret than effects coding, so perhaps you should use dummy coding as it leads to exactly the same behavioural model but with easier to interpret coefficients. Again, please confirm that you estimated the coefficients correctly since you should have used -1, -1, -1 as the variable values of the base levels, not just -1, 0, 0 or something. Using -1, -1, -1 ensures that the average utility across all levels is equal to zero, which is a hallmark of effects coding. In other words, the levels are normalised around zero, as you can see in the figure on page 123 in the Ngene manual.
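
The zero-average property can be illustrated with a small sketch. It uses the prior means for pes.effects from the revised script purely as example numbers, and compares the -1, -1, -1 base coding with a miscoded -1, 0, 0 base; the dictionary and function names are illustrative.

```python
# Effects coding for a 4-level attribute whose last level (3) is the base.
correct_coding = {
    0: (1, 0, 0),
    1: (0, 1, 0),
    2: (0, 0, 1),
    3: (-1, -1, -1),
}

# A miscoded base level of the kind warned about above.
wrong_coding = dict(correct_coding)
wrong_coding[3] = (-1, 0, 0)

# Prior means for pes.effects from the revised script (example values only).
coefficients = (-0.61754, -0.29338, -0.07047)


def level_utilities(coding):
    """Utility of each level: coded variable values times coefficients."""
    return {
        level: sum(x * b for x, b in zip(row, coefficients))
        for level, row in coding.items()
    }


def mean_utility(coding):
    """Average utility across all levels of the attribute."""
    utilities = level_utilities(coding)
    return sum(utilities.values()) / len(utilities)


print(f"correct coding: {mean_utility(correct_coding):.6f}")
print(f"wrong coding:   {mean_utility(wrong_coding):.6f}")
```

Only the -1, -1, -1 coding averages to zero (up to floating-point rounding); the miscoded base shifts the average away from zero.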

Calculating WTP for categorical variables (that are dummy or effects coded) is done differently from numerical variables. Namely you need to take differences between coefficients of levels and divide that by the coefficient for price or cost. With 3 levels (0,1,2) you would calculate multiple WTP values, namely one for changing from level 0 to level 1, from level 1 to level 2, and from level 0 to level 2. WTP is no longer a single value. I again refer to the literature about calculating WTP with categorical variables.
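
As a rough sketch of the calculation described above, the code below computes one WTP value per pair of levels. It rests on several assumptions that are not from the thread: the sign convention WTP = -(difference in utility) / (cost coefficient), treating downpay as the cost attribute, and using the treemix and downpay prior means from the revised script as stand-in numbers rather than final estimates.

```python
from itertools import combinations

# Effects-coded means for treemix levels 0..2 (stand-in numbers).
treemix = [0.2749667, 0.3666267, -0.72356]
# Base level 3 is minus the sum of the other levels under effects coding.
treemix.append(-sum(treemix))

# Cost coefficient (assumed here to be the downpay coefficient).
beta_cost = -0.0002


def wtp(from_level, to_level):
    """WTP for moving between two levels (assumed sign convention)."""
    return -(treemix[to_level] - treemix[from_level]) / beta_cost


# One WTP value per unordered pair of levels.
pairs = list(combinations(range(len(treemix)), 2))
for a, b in pairs:
    print(f"level {a} -> level {b}: {wtp(a, b):.1f}")
```

Because only differences between levels enter the calculation, the same WTP values result whether the attribute is dummy or effects coded.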

Michiel

by **dtobin02** » Sat Oct 28, 2023 3:30 am

Thanks Michiel! I was misunderstanding effects coding. I will be sure to review the literature, but I think I am all clear now and I have a path forward with our implementation. Thank you again for all of your help!

by **dtobin02** » Fri Nov 03, 2023 11:21 pm

Hi Michiel -- you were correct that I had not been running the effects analysis correctly. I have since converted the reference level of all categorical variables to take on the values of (-1, 0) and the other levels to take on the values of (1, -1). Is this correct?

My NLOGIT MNL code would be unchanged (I think) from how I ran it with dummies, but the data is different.

Two quick questions as we move to final design:
(1) How do I translate effects-coded coefficients into priors for dummy variables if I want to switch my NGENE design to treat categorical variables as dummies instead of effects-coded variables? (Would they be unchanged, or do I need to do some conversion?)
(2) Based on my reading, I understand that effects coding has a theoretical advantage (is more appropriate) for variables that have a utility ordering, like PES timing, in which farmers get payment sooner or later on a scale that is not necessarily evenly spaced or corresponding to constant increases in utility / disutility. Based on this, I think it would make sense to run my design where PES timing is effects-coded but tree species mix and pattern (where there is no implicit ordering besides my priors) are dummies. Am I understanding the utility theory implications correctly?

Thank you again so much for your generous time and help! This will hopefully provide the data for my job market paper so I am really trying to get this right...(and had never used NGENE before this project)

by **Michiel Bliemer** » Sat Nov 04, 2023 9:13 am

With effects coding, the reference level will have values -1 for all variables, while for other levels it will be 1 for the associated variable and 0 for the other variables.

Let me give an example. Consider treemix in your study, which has 4 levels:

treemix.effects[(n, 0.2749667, 0.43507) | (n, 0.3666267, 0.43941) | (n, -0.72356, 0.41280)] * treemix[0, 1, 2, 3]

This means that in your utility function you will have 3 variables, namely:
treemix0
treemix1
treemix2
(treemix==3 is your base level since it is the last level specified in Ngene)

In your data, this means that you need to code the following:
Level 0: treemix0=1, treemix1=0, treemix2=0
Level 1: treemix0=0, treemix1=1, treemix2=0
Level 2: treemix0=0, treemix1=0, treemix2=1
Level 3: treemix0=-1, treemix1=-1, treemix2=-1
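
The coding rule in the list above can be written as a small helper for preparing estimation data. This is a minimal sketch that assumes levels are numbered from 0 and that the last level is the base, as in the Ngene script; the function name is hypothetical.

```python
def effects_code(level, n_levels):
    """Return the n_levels - 1 effects-coded variables for one observation.

    Assumes levels are numbered 0..n_levels-1 and the last level is the base.
    """
    if not 0 <= level < n_levels:
        raise ValueError(f"level {level} is outside 0..{n_levels - 1}")
    base = n_levels - 1
    if level == base:
        # The base level takes -1 on every variable.
        return [-1] * (n_levels - 1)
    # Any other level takes 1 on its own variable and 0 elsewhere.
    return [1 if j == level else 0 for j in range(n_levels - 1)]


for treemix_level in range(4):
    print(treemix_level, effects_code(treemix_level, 4))
```

The same helper applies to the 3-level pattern attribute, where the base level is coded -1, -1.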

If you prefer to use Level 0 as the base level, you will need to use treemix[1,2,3,0] in your Ngene script.

Based on the means of your priors, you have the following utilities:
Level 0: 0.2749667
Level 1: 0.3666267
Level 2: -0.72356
Level 3: -(0.2749667)-(0.3666267)-(-0.72356) = 0.081967

If you want to convert this into dummy coding with Level 3 the base level, then you simply need to normalise Level 3 to zero, which means subtracting 0.081967 from all utilities:
Level 0: 0.1930
Level 1: 0.2847
Level 2: -0.8055
Level 3: 0

So this would translate in Ngene to:
treemix.dummy[0.1930 | 0.2847 | -0.8055] * treemix[0, 1, 2, 3]
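
The conversion can be sketched as follows. The base-level utility is minus the sum of the effects-coded means, and each dummy coefficient is that level's utility minus the base-level utility, so the base becomes zero and the differences between levels are unchanged. The function name is hypothetical, and the last level is assumed to be the base.

```python
def effects_to_dummy(effects_means):
    """Convert effects-coded means to dummy-coded means (last level = base)."""
    # Utility of the base level implied by effects coding.
    base_utility = -sum(effects_means)
    # Shift every level so that the base level sits at zero.
    return [mean - base_utility for mean in effects_means]


# Prior means for treemix.effects, levels 0..2, from the script above.
treemix_effects = [0.2749667, 0.3666267, -0.72356]
treemix_dummy = effects_to_dummy(treemix_effects)

# Print in the "a | b | c" layout used for Ngene priors.
print(" | ".join(f"{b:.4f}" for b in treemix_dummy))
```

Only the means are converted here; standard errors for Bayesian priors on the dummy coefficients would need to come from re-estimating the model with dummy coding.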

For more information about dummy and effects coding, I refer to this article:
https://www.sciencedirect.com/science/article/pii/S1755534516300781

Michiel


Designing for MMNL with priors
