Hello NGENE Users,
I am working with a team on a stated preference survey featuring salmon recovery. We have 200+ choice situation responses from a pilot, and are refining the design in an effort to improve the coefficient estimates from the pending full survey mailing. We expect to have several hundred choice situation responses at the end of the study.
We have code running in NGENE, but want to verify it with expert NGENE users so that we can make the most of the pilot study results. We are also getting an error when trying to run some NGENE features.
To summarize our conceptual design, we have just two attributes, SalmonStatus and Cost. Each choice situation has three options: a Status Quo (SQ) option and two generic alternatives, 1 and 2. The Status Quo always has a Cost of $0 and a SalmonStatus level of NoRecovery (which means a significant possibility of extinction). Neither of these SQ levels ever occurs in alts 1 or 2. Besides the SQ level, SalmonStatus can take six additional levels, each of which includes the expected number of salmon "returns" to the spawning area per year:
1) SlowBasic (Recovery expected in 50 yrs, returns at 40k, and low extinction risk)
2) SlowHigh (Recovery expected in 50 yrs, returns at 70k, and low extinction risk)
3) MediumBasic (Recovery expected in 25 yrs, returns at 40k, and low extinction risk)
4) MediumHigh (Recovery expected in 25 yrs, returns at 70k, and low extinction risk)
5) FastBasic (Recovery expected in 15 yrs, returns at 40k, and low extinction risk)
6) FastHigh (Recovery expected in 15 yrs, returns at 70k, and low extinction risk)
Since a recovery timeline (50, 25, or 15 yrs) only makes sense in the case of recovery (Basic or High), we believe recovery time cannot be identified as a separate variable from salmon abundance. That is, relative to the SQ, you cannot increase abundance without also incurring a waiting time. Thus we arrived at the bundled categorical variable described above. Preference for abundance, however, will be estimable, e.g. from the coefficient difference between 1) and 2), and time preference will be estimable, e.g. from the coefficient difference between 1) and 5).
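To make the coefficient-difference idea concrete, here is a minimal Python sketch (not NGENE syntax) that computes those contrasts from the pilot prior means used in the design syntax. It assumes the six dummy priors map to levels 1) through 6) in order, with NoRecovery as the base level; the names `PRIOR_MEANS` and `contrast` are just illustrative.

```python
# Pilot prior means for the six bundled SalmonStatus levels, each relative to
# the NoRecovery base level. ASSUMPTION: the priors in the design syntax are
# listed in the same order as levels 1) to 6).
PRIOR_MEANS = {
    "SlowBasic": 0.90,
    "SlowHigh": 0.95,
    "MediumBasic": 1.00,
    "MediumHigh": 1.05,
    "FastBasic": 1.10,
    "FastHigh": 1.20,
}


def contrast(level_a, level_b):
    """Utility difference between two bundled levels (level_a minus level_b)."""
    return PRIOR_MEANS[level_a] - PRIOR_MEANS[level_b]


# Abundance effect (70k vs 40k returns), holding the recovery timeline fixed.
abundance = {
    speed: contrast(speed + "High", speed + "Basic")
    for speed in ("Slow", "Medium", "Fast")
}

# Timing effect (15 yrs vs 50 yrs), holding abundance fixed.
timing = {
    size: contrast("Fast" + size, "Slow" + size)
    for size in ("Basic", "High")
}

for speed, diff in abundance.items():
    print(f"High vs Basic at {speed} timeline: {diff:.2f}")
for size, diff in timing.items():
    print(f"Fast vs Slow at {size} abundance: {diff:.2f}")
```

With these pilot means, the High-vs-Basic difference is roughly 0.05 for the Slow and Medium timelines and roughly 0.10 for Fast, so the bundled coding lets the abundance effect vary by timeline instead of forcing it to be constant.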
I have some code below, including priors from the pilot. This code does run and does return a design, but I have several questions:
• In our pilot MNL discrete choice model results we have a "combined" ASC for choosing either alt1 or alt2, with a resulting coefficient of about -0.05. I believe I can put this either in the utility specifications for alt1 and alt2 as -0.05, or in the utility specification for SQ as +0.05, which is how I have it here. Is that correct? I think I have to include some sort of U(SQ) in the model, or NGENE will treat SQ as a no-choice alternative rather than a SQ alternative.
• The design from the code below returns a lot of dominating options within a block. For example, the same level of SalmonStatus will show up at a different cost in one of the three choice situations. Or, within the same block, SalmonStatus 1) will show up at the same or higher price than SalmonStatus 2). Is there a way to prevent that from happening in NGENE? If it must be manually adjusted, we can go that route too, but are there any guidelines on how much loss of D-efficiency is "safe" as a result of that manipulation? I can also write several conditions to prevent dominance, but I would prefer to enforce such conditions only WITHIN blocks, not ACROSS blocks. Is there a coding option for this?
• In an effort to reduce dominance in the design, if I try to include the "*" behind alt1 and alt2 on the second line of code, I get the following error:
A valid initial random design could not be generated after approximately 10 seconds. In this time, of the 375099 attempts made, there were 0 row repetitions, 17221 alternative repetitions, and 357878 cases of dominance. There are a number of possible causes for this, including the specification of too many constraints, not having enough attributes or attribute levels for the number of rows required, and the use of too many scenario attributes. A design may yet be found, and the search will continue for 10 minutes. Alternatively, you can stop the run and alter the syntax.
The program then eventually aborts. I suspect this error when using the "*" is related to having just two attributes. However, I am hoping to generate a highly efficient design using these attributes and something similar to the rows and block parameters in the code. It seems like this is what NGENE is for, so I am wondering if my code just needs to be changed in some way. We do have flexibility to change the number of rows in the design, i.e. to increase or decrease the number of survey versions. We would like to keep three questions per block, however.
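As a sanity check on the ASC question in the first bullet, here is a minimal Python sketch (not NGENE syntax) comparing MNL choice probabilities when the constant is placed on alt1 and alt2 as -0.05 versus on SQ as +0.05. The example choice situation is hypothetical, the prior means are taken from the design syntax, and the function names are illustrative only. It says nothing about how NGENE itself treats an alternative with no utility function.

```python
import math

# Prior means relative to the NoRecovery base level (ASSUMPTION: listed in
# level order), and the Cost prior from the design syntax.
PRIOR_MEANS = {
    "SlowBasic": 0.90, "SlowHigh": 0.95,
    "MediumBasic": 1.00, "MediumHigh": 1.05,
    "FastBasic": 1.10, "FastHigh": 1.20,
}
COST_COEF = -0.004


def mnl_probabilities(utilities):
    """Multinomial logit probabilities from a dict of alternative utilities."""
    max_u = max(utilities.values())  # subtract the max for numerical stability
    exp_u = {alt: math.exp(u - max_u) for alt, u in utilities.items()}
    total = sum(exp_u.values())
    return {alt: value / total for alt, value in exp_u.items()}


def choice_probabilities(alt1, alt2, asc_recovery, asc_sq):
    """alt1 and alt2 are (status name, cost); SQ is NoRecovery at $0."""
    utilities = {
        "alt1": asc_recovery + PRIOR_MEANS[alt1[0]] + COST_COEF * alt1[1],
        "alt2": asc_recovery + PRIOR_MEANS[alt2[0]] + COST_COEF * alt2[1],
        "SQ": asc_sq,
    }
    return mnl_probabilities(utilities)


# Hypothetical choice situation, for illustration only.
alt1 = ("SlowHigh", 80)
alt2 = ("FastBasic", 250)

asc_on_alts = choice_probabilities(alt1, alt2, asc_recovery=-0.05, asc_sq=0.0)
asc_on_sq = choice_probabilities(alt1, alt2, asc_recovery=0.0, asc_sq=0.05)

for alt in asc_on_alts:
    print(alt, round(asc_on_alts[alt], 6), round(asc_on_sq[alt], 6))
```

The two specifications differ only by adding the same constant (0.05) to every alternative's utility, so the MNL probabilities coincide in this sketch.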
Many thanks in advance for your thoughts and suggestions!
- Matt
Design
;alts = alt1, alt2, SQ
;rows = 18
;block = 6
? Level descriptions for Status are
? 1) SlowBasic; 2) SlowHigh;
? 3) MediumBasic; 4) MediumHigh;
? 5) FastBasic; 6) FastHigh;
? 7) NoRecovery
? Other variable is Cost
;eff = (mnl, d)
;model:
U(alt1) = b1.dummy[(n,0.9,0.25)|(n,0.95,0.25)|(n,1.0,0.25)|(n,1.05,0.25)|(n,1.1,0.25)|(n,1.2,0.25)] * Status[1,2,3,4,5,6,7] + b2[-0.004] * Cost[40,80,150,250,350] /
U(alt2) = b1 * Status + b2 * Cost /
U(SQ) = b0[0.05]
$
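For the manual-adjustment route mentioned in the dominance question, here is a rough Python sketch (not NGENE syntax) of a post-hoc check that flags dominated profiles among everything shown within one block. It assumes respondents weakly prefer faster recovery, higher returns, and lower cost; the `STATUS` table, the function names, and the example block are all hypothetical.

```python
from itertools import combinations

# Status level -> (years to recovery, yearly returns in thousands),
# following the level descriptions in the post.
STATUS = {
    1: (50, 40), 2: (50, 70),
    3: (25, 40), 4: (25, 70),
    5: (15, 40), 6: (15, 70),
}


def dominates(a, b):
    """True if profile a is no worse than b on every dimension and strictly
    better on at least one. A profile is (status level, cost).
    ASSUMPTION: fewer years, more returns, and lower cost are all preferred."""
    (years_a, returns_a), cost_a = STATUS[a[0]], a[1]
    (years_b, returns_b), cost_b = STATUS[b[0]], b[1]
    no_worse = years_a <= years_b and returns_a >= returns_b and cost_a <= cost_b
    strictly_better = years_a < years_b or returns_a > returns_b or cost_a < cost_b
    return no_worse and strictly_better


def dominance_in_block(block):
    """Return (dominating, dominated) pairs among all alt1/alt2 profiles that
    appear anywhere in one block (a list of choice situations)."""
    profiles = [profile for situation in block for profile in situation]
    found = []
    for a, b in combinations(profiles, 2):
        if dominates(a, b):
            found.append((a, b))
        elif dominates(b, a):
            found.append((b, a))
    return found


# Hypothetical block of three choice situations; each situation holds the
# (Status level, Cost) profile for alt1 and for alt2.
example_block = [
    ((1, 150), (4, 250)),
    ((2, 80), (5, 350)),
    ((3, 40), (6, 250)),
]

for better, worse in dominance_in_block(example_block):
    print(f"{better} dominates {worse}")
```

Running this over each block of a generated design separately would check dominance within blocks only, without imposing anything across blocks.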