Coding for categorical vs continuous attributes

Post by melanie » Thu Feb 04, 2021 12:53 pm

Hi there,

We have developed the code below but have some uncertainties that we hope you can clarify:

Design
;alts = alt1, alt2, alt3
;rows = 36
;eff = (mnl,d)
;block = 3, minmax, noimprov(10 secs)
;alg = mfederov(stop=total(200000 iterations))
;bdraws = gauss(2)
;rep = 100
;model:
U(alt1) = b1.dummy[(n, 0.5, 0.5)] * LBP[0,1] + b2.dummy[(n, 1.5, 1.1)] *Comm[0,1,2,3] +b3.dummy[(n, 1.5,1.1)] *Opioid [0,1,2,3] + b4[(n, 1.5, 0.5)]* OpioidPainRed[1,2] + b5[(n, 5.5, 1.5)]* OpioidAEs[7,4] /
U(alt2) = b1.dummy[(n, 0.5, 0.5)] * LBP + b2.dummy[(n, 1.5, 1.1)]* Comm + b6.dummy[(n, 1.5, 1.1)]* NSAID[0,1] + b7[(n, 2, 1)]* NSAIDPain[1,3] + b8[(n, 3, 1)]* NSAIDAEs[4,2]
$

1. We are still trying to determine whether some of our attributes are continuous or categorical. In theory they are continuous, but the survey presents only a specific number of options, so could they all be dummy coded as categorical? However, Ngene does not seem to accept more than 2 levels for categorical attributes: it will not run with 'dummy' attributes that have 3 or 4 levels each. It will run if I reduce these attributes to two levels, but I don't think that is an option for our design. For the continuous attributes, without 'dummy' before them, the code runs but comes back with an 'undefined' D-error?

2. If we use dummy coding for categorical variables, then the prior values should not be the mean and SD; is this correct? Should I estimate the range for these priors instead?

3. The two attributes LBP and Comm are actually constants, so they should be the same across alternatives, but I cannot work out how to specify this in the code. How should I handle these attributes? Should we leave LBP and Comm out of the alternatives, since they are technically not part of the choice set but are still randomly varied across questions and influence respondents' answers to the choice set?

4. Interpreting the S-estimate: I am still trying to work this out, but the S-estimates we are getting so far seem far too low. For example, does an S-estimate of 3.5 mean 4 participants in each group? Obviously there must be an error in our code, as this is definitely incorrect.

Thank you in advance for your help.

Regards,
Melanie

Re: Coding for categorical vs continuous attributes

Post by Michiel Bliemer » Mon Feb 08, 2021 5:47 pm

To answer your questions:

1. Dummy coding in Ngene works fine with 3 or more levels. For example:

b1.dummy[0.5|0.2] * X1[1,2,3]

Note that you need to specify L-1 priors if you have L dummy coded levels, where level L (in my case 3) is the reference level.

2. Priors for parameters are set in the same way for continuous and categorical variables. You estimate the parameters and take the mean as the prior, and the standard error as the stdev in a normally distributed Bayesian prior.

3. If levels need to be the same across alternatives, you can specify constraints:

;require:
alt1.LBP = alt2.LBP

Alternatively, you can use the following shortcut (see also Scenarios in the Ngene manual):
U(alt1) = b1.dummy[..] * LBP[0,1] ... /
U(alt2) = b1 * LBP[LBP]

4. An S-estimate of 3.5 means that you need between 3 and 4 full replications of the design of 36 rows. Since you block the design in 3, it means you need about 10 respondents to obtain statistically significant parameter estimates. This is a fairly low number, mainly because your priors are very large; a value of 1.5 is quite large, and a value of 5.5 is extremely large. Please use priors that come from a pilot study. Do not try to guess these priors, as your design could then become very inefficient and your S-estimates would not be meaningful.
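The arithmetic behind point 4 can be sketched in a few lines of Python. This is only an illustration of the reasoning in this reply, not Ngene output; the function name `respondents_needed` is hypothetical, and it assumes each respondent completes exactly one block.

```python
import math

def respondents_needed(s_estimate, blocks):
    """Design replications (S-estimate) times blocks per replication.

    Assumption: one respondent answers one block, so one full
    replication of the design needs `blocks` respondents.
    """
    return s_estimate * blocks

rows, blocks, s_estimate = 36, 3, 3.5      # values taken from the thread
rows_per_respondent = rows // blocks        # choice tasks each respondent sees
n = respondents_needed(s_estimate, blocks)
print(rows_per_respondent, n, math.ceil(n))  # 12 10.5 11
```

So an S-estimate of 3.5 with 3 blocks corresponds to roughly 10 to 11 respondents, each answering 12 choice tasks.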

I assume that alt3 is the opt-out alternative (no choice) without any attributes. If so, you need to add a constant to alt1 and alt2, i.e. U(alt1) = b0[..] + b1 etc. Note that you also MUST estimate the prior for b0 in a pilot study in order to create an efficient design. If you do not have estimates from a pilot study, it is best to set all priors equal to zero.

You can remove ;rep = 100 because that is only for panel mixed logit models.

If you want to generate a Bayesian efficient design, please use ;eff = (mnl,d,mean) but keep the number of Bayesian priors limited to 10 or so, while other priors are fixed, as otherwise the computation time will explode.
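On point 1 above, the L-1 rule for dummy coding can be made concrete with a minimal Python sketch. This is not Ngene code; the function name `dummy_code` is hypothetical, and the last level is treated as the reference level, as described in point 1.

```python
def dummy_code(value, levels):
    """Return L-1 indicators for `value`; the last level is the reference (all zeros)."""
    if value not in levels:
        raise ValueError(f"{value!r} is not one of {levels}")
    # One indicator per non-reference level, so L levels need L-1 parameters (and priors).
    return [1 if value == level else 0 for level in levels[:-1]]

levels = [1, 2, 3]
for v in levels:
    print(v, dummy_code(v, levels))
# 1 [1, 0]
# 2 [0, 1]
# 3 [0, 0]
```

A 3-level attribute therefore contributes two parameters to the utility function, which is why two priors are listed in the example under point 1.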

Michiel

Re: Coding for categorical vs continuous attributes

Post by melanie » Wed Feb 10, 2021 5:19 pm

Thanks so much Michiel,

Your responses are very helpful. The only thing is that I still seem to be making an error in the code, as I get either the error message 'The model property contains a prior that has an unknown random distribution type' or 'an incorrectly configured Bayesian component'.

This is my code:

Design
;alts = alt1, alt2, alt3
;require:
alt1.LBP = alt2.LBP
alt1.Comm = alt2.Comm
;rows = 36
;eff = (mnl,d,mean)
;block = 3, minmax, noimprov(10 secs)
;alg = mfederov(stop=total(200000 iterations))
;bdraws = gauss(2)
;model:
U(alt1) = b0 + b1.dummy[n,(0.5,0.2)] * LBP[0,1] + b2.dummy[n,(0.5,0.2),(0.5,0.2),(0.5,0.2)] *Comm[0,1,2,3] +b3.dummy[n,(0.5,0.2),(0.5,0.2),(0.5,0.2)] *Opioid [0,1,2,3] + b4.dummy[n,(0.5,0.2)]* OpioidPainRed[1,2] + b5.dummy[n,(2,1)]* OpioidAEs[7,4] /
U(alt2) = b0 + b1.dummy[n,(0.5,0.2)] * LBP + b2.dummy[n,(0.5,0.2),(0.5,0.2),(0.5,0.2)]* Comm + b6.dummy[n,(0.5,0.2)]* NSAID[0,1] + b7.dummy[n,(2,1)]* NSAIDPain[1,3] + b8.dummy[n,(2,1)]* NSAIDAEs[4,2]
$

Re: Coding for categorical vs continuous attributes

Post by Michiel Bliemer » Sat Feb 13, 2021 3:38 pm

The specification of your dummy coded Bayesian priors is wrong. You need to use | to separate the priors, and each Bayesian prior is specified as (n,mean,sigma), with the brackets around the n as well.

For example:

b2.dummy[(n,0.5,0.2)|(n,0.5,0.2)|(n,0.5,0.2)] *Comm[0,1,2,3]
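For attributes with many levels, it can be easier to generate such a term than to type it. The following is a minimal Python sketch that only builds the text in the format shown above; the helper name `bayesian_dummy_priors` is hypothetical, and the prior values are the placeholder ones from this thread, not recommendations.

```python
def bayesian_dummy_priors(priors):
    """Format (mean, sigma) pairs as normal Bayesian priors separated by '|'."""
    return "|".join(f"(n,{mean},{sigma})" for mean, sigma in priors)

levels = [0, 1, 2, 3]
priors = [(0.5, 0.2)] * (len(levels) - 1)   # L-1 priors for L dummy coded levels
level_list = ",".join(str(level) for level in levels)
term = f"b2.dummy[{bayesian_dummy_priors(priors)}] * Comm[{level_list}]"
print(term)
# b2.dummy[(n,0.5,0.2)|(n,0.5,0.2)|(n,0.5,0.2)] * Comm[0,1,2,3]
```

The same helper applied to a 2-level attribute yields a single prior, e.g. (n,0.5,0.2), with no | separator.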

Michiel

