Model specification dummy vs. linear coded attributes

Post by JvB » Mon Feb 13, 2023 12:39 am

Hi Michiel,

I have a question on model specification:

I have a choice experiment with three attributes: A1 (b_Beitr) with numerical levels only, and A2 (b_HoeheEEE) and A3 (b_ZeitEEE), each with three numerical levels plus one nominal level.

In the first approach I dummy coded A2 and A3 entirely because of the one nominal level; see the code below:

Daten_Choice_Pivot <- read.csv("DatenApollo.csv", header = TRUE)

Daten_Choice_Pivot$Beitr_1[which(Daten_Choice_Pivot$Beitr_1==1)] <- 1.6
Daten_Choice_Pivot$Beitr_1[which(Daten_Choice_Pivot$Beitr_1==2)] <- 1.8
Daten_Choice_Pivot$Beitr_1[which(Daten_Choice_Pivot$Beitr_1==3)] <- 3.3
Daten_Choice_Pivot$Beitr_1[which(Daten_Choice_Pivot$Beitr_1==4)] <- 4.8

Daten_Choice_Pivot$Beitr_2[which(Daten_Choice_Pivot$Beitr_2==1)] <- 1.6
Daten_Choice_Pivot$Beitr_2[which(Daten_Choice_Pivot$Beitr_2==2)] <- 1.8
Daten_Choice_Pivot$Beitr_2[which(Daten_Choice_Pivot$Beitr_2==3)] <- 3.3
Daten_Choice_Pivot$Beitr_2[which(Daten_Choice_Pivot$Beitr_2==4)] <- 4.8

Daten_Choice_Pivot$Beitr_3[which(Daten_Choice_Pivot$Beitr_3==1)] <- 1.6
Daten_Choice_Pivot$Beitr_3[which(Daten_Choice_Pivot$Beitr_3==2)] <- 1.8
Daten_Choice_Pivot$Beitr_3[which(Daten_Choice_Pivot$Beitr_3==3)] <- 3.3
Daten_Choice_Pivot$Beitr_3[which(Daten_Choice_Pivot$Beitr_3==4)] <- 4.8

database = Daten_Choice_Pivot
# ################################################################# #
#### DEFINE MODEL PARAMETERS ####
# ################################################################# #
### Vector of parameters, including any kept fixed during estimation
apollo_beta = c(b0 = 0, b_Beitr = 0,
b_HoeheEEE300 = 0,
b_HoeheEEE600 = 0,
b_HoeheEEE900 = 0,
b_HoeheEEEunb = 0,
b_ZeitEEE12 = 0,
b_ZeitEEE42 = 0,
b_ZeitEEE72 = 0,
b_ZeitEEEunb = 0)
### Vector with parameter names (in quotes) to be kept fixed at
# their starting values during estimation.
# Use apollo_fixed = c() if none
apollo_fixed = c("b_HoeheEEEunb", "b_ZeitEEEunb")
# ################################################################# #
#### GROUP AND VALIDATE INPUTS ####
# ################################################################# #
apollo_inputs = apollo_validateInputs()
# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
# ################################################################# #
apollo_probabilities=function(apollo_beta, apollo_inputs,
functionality="estimate"){

### Function initialisation: do not change the following three commands
### Attach and detach inputs, and create empty list of probabilities
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
P = list()

### List of MNL utilities: must use the same names as in mnl_settings
V = list()

V[["SQ"]] = b0 + b_Beitr*Beitr_1 +
  b_HoeheEEE300*(HoeheEEE_1==1) + b_HoeheEEE600*(HoeheEEE_1==2) + b_HoeheEEE900*(HoeheEEE_1==3) + b_HoeheEEEunb*(HoeheEEE_1==4) +
  b_ZeitEEE12*(ZeitEEE_1==1) + b_ZeitEEE42*(ZeitEEE_1==2) + b_ZeitEEE72*(ZeitEEE_1==3) + b_ZeitEEEunb*(ZeitEEE_1==4)
V[["RefA"]] = b_Beitr*Beitr_2 +
  b_HoeheEEE300*(HoeheEEE_2==1) + b_HoeheEEE600*(HoeheEEE_2==2) + b_HoeheEEE900*(HoeheEEE_2==3) + b_HoeheEEEunb*(HoeheEEE_2==4) +
  b_ZeitEEE12*(ZeitEEE_2==1) + b_ZeitEEE42*(ZeitEEE_2==2) + b_ZeitEEE72*(ZeitEEE_2==3) + b_ZeitEEEunb*(ZeitEEE_2==4)
V[["RefB"]] = b_Beitr*Beitr_3 +
  b_HoeheEEE300*(HoeheEEE_3==1) + b_HoeheEEE600*(HoeheEEE_3==2) + b_HoeheEEE900*(HoeheEEE_3==3) + b_HoeheEEEunb*(HoeheEEE_3==4) +
  b_ZeitEEE12*(ZeitEEE_3==1) + b_ZeitEEE42*(ZeitEEE_3==2) + b_ZeitEEE72*(ZeitEEE_3==3) + b_ZeitEEEunb*(ZeitEEE_3==4)
________________________
In my second approach I split A2 and A3 each into a linearly coded part (covering the three numerical levels) and a dummy coded part (1 if the attribute takes the nominal level, 0 otherwise); see the code below:

Daten_Choice_Pivot <- read.csv("DatenApollo.csv", header = TRUE)

#Beitrag
Daten_Choice_Pivot$Beitr_1[which(Daten_Choice_Pivot$Beitr_1==1)] <- 1.6
Daten_Choice_Pivot$Beitr_1[which(Daten_Choice_Pivot$Beitr_1==2)] <- 1.8
Daten_Choice_Pivot$Beitr_1[which(Daten_Choice_Pivot$Beitr_1==3)] <- 3.3
Daten_Choice_Pivot$Beitr_1[which(Daten_Choice_Pivot$Beitr_1==4)] <- 4.8

Daten_Choice_Pivot$Beitr_2[which(Daten_Choice_Pivot$Beitr_2==1)] <- 1.6
Daten_Choice_Pivot$Beitr_2[which(Daten_Choice_Pivot$Beitr_2==2)] <- 1.8
Daten_Choice_Pivot$Beitr_2[which(Daten_Choice_Pivot$Beitr_2==3)] <- 3.3
Daten_Choice_Pivot$Beitr_2[which(Daten_Choice_Pivot$Beitr_2==4)] <- 4.8

Daten_Choice_Pivot$Beitr_3[which(Daten_Choice_Pivot$Beitr_3==1)] <- 1.6
Daten_Choice_Pivot$Beitr_3[which(Daten_Choice_Pivot$Beitr_3==2)] <- 1.8
Daten_Choice_Pivot$Beitr_3[which(Daten_Choice_Pivot$Beitr_3==3)] <- 3.3
Daten_Choice_Pivot$Beitr_3[which(Daten_Choice_Pivot$Beitr_3==4)] <- 4.8

# HoeheEEE
# Nested ifelse instead of sequential overwriting: recoding 1 -> 3 and then
# 3 -> 9 line by line would also turn the original level 1 into 9.
Daten_Choice_Pivot$HoeheEEE_1 <- ifelse(Daten_Choice_Pivot$HoeheEEE_1==1, 3,
ifelse(Daten_Choice_Pivot$HoeheEEE_1==2, 6,
ifelse(Daten_Choice_Pivot$HoeheEEE_1==3, 9, 4)))

#Daten_Choice_Pivot$HoeheEEE_2[which(Daten_Choice_Pivot$HoeheEEE_2==1)] <- 3
#Daten_Choice_Pivot$HoeheEEE_2[which(Daten_Choice_Pivot$HoeheEEE_2==2)] <- 6
#Daten_Choice_Pivot$HoeheEEE_2[which(Daten_Choice_Pivot$HoeheEEE_2==3)] <- 9

Daten_Choice_Pivot$HoeheEEE_2 <- ifelse(Daten_Choice_Pivot$HoeheEEE_2==1, 3,
ifelse(Daten_Choice_Pivot$HoeheEEE_2==2, 6,
ifelse(Daten_Choice_Pivot$HoeheEEE_2==3, 9, 4)))

#Daten_Choice_Pivot$HoeheEEE_3[which(Daten_Choice_Pivot$HoeheEEE_3==1)] <- 3
#Daten_Choice_Pivot$HoeheEEE_3[which(Daten_Choice_Pivot$HoeheEEE_3==2)] <- 6
#Daten_Choice_Pivot$HoeheEEE_3[which(Daten_Choice_Pivot$HoeheEEE_3==3)] <- 9


Daten_Choice_Pivot$HoeheEEE_3 <- ifelse(Daten_Choice_Pivot$HoeheEEE_3==1, 3,
ifelse(Daten_Choice_Pivot$HoeheEEE_3==2, 6,
ifelse(Daten_Choice_Pivot$HoeheEEE_3==3, 9, 4)))

#ZeitEEE
Daten_Choice_Pivot$ZeitEEE_1[which(Daten_Choice_Pivot$ZeitEEE_1==1)] <- 1.2
Daten_Choice_Pivot$ZeitEEE_1[which(Daten_Choice_Pivot$ZeitEEE_1==2)] <- 4.2
Daten_Choice_Pivot$ZeitEEE_1[which(Daten_Choice_Pivot$ZeitEEE_1==3)] <- 7.2

Daten_Choice_Pivot$ZeitEEE_2[which(Daten_Choice_Pivot$ZeitEEE_2==1)] <- 1.2
Daten_Choice_Pivot$ZeitEEE_2[which(Daten_Choice_Pivot$ZeitEEE_2==2)] <- 4.2
Daten_Choice_Pivot$ZeitEEE_2[which(Daten_Choice_Pivot$ZeitEEE_2==3)] <- 7.2

Daten_Choice_Pivot$ZeitEEE_3[which(Daten_Choice_Pivot$ZeitEEE_3==1)] <- 1.2
Daten_Choice_Pivot$ZeitEEE_3[which(Daten_Choice_Pivot$ZeitEEE_3==2)] <- 4.2
Daten_Choice_Pivot$ZeitEEE_3[which(Daten_Choice_Pivot$ZeitEEE_3==3)] <- 7.2

database = Daten_Choice_Pivot
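
A note on the recoding step: sequential which() assignments are fragile whenever a new value coincides with a level code that is recoded later. A lookup vector avoids this. The following is only a minimal sketch, assuming the raw columns hold integer level codes 1 to 4; `recode_levels`, `hoehe_values` and the toy vector `raw` are hypothetical names for illustration and not part of Apollo:

```r
# Hypothetical helper: map integer level codes to values via a lookup vector.
recode_levels <- function(codes, values) {
  # values[k] is the recoded value for level code k; codes outside
  # 1..length(values) become NA instead of being silently kept.
  values[match(codes, seq_along(values))]
}

# Assumed mapping: levels 1-3 are numerical, level 4 is the nominal level
# and keeps the code 4 so that the dummy (== 4) still works.
hoehe_values <- c(3, 6, 9, 4)
raw <- c(1, 2, 3, 4, 1)   # toy data, not taken from DatenApollo.csv

recoded <- recode_levels(raw, hoehe_values)

# For comparison: sequential overwriting collapses level 1 into 9,
# because the value 3 written in the first step is hit again by the third.
sequential <- raw
sequential[which(sequential == 1)] <- 3
sequential[which(sequential == 2)] <- 6
sequential[which(sequential == 3)] <- 9

print(recoded)
print(sequential)
```

The same helper could be applied to each of the `_1`, `_2` and `_3` columns, so that all alternatives are guaranteed to use one mapping.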

# ################################################################# #
#### DEFINE MODEL PARAMETERS ####
# ################################################################# #
### Vector of parameters, including any kept fixed during estimation
apollo_beta = c(b0 = 0,
b_Beitr = 0,
b_HoeheEEE = 0,
b_HoeheEEEunb = 0, # dummy
b_ZeitEEE = 0,
b_ZeitEEEunb = 0) # dummy
### Vector with parameter names (in quotes) to be kept fixed at
# their starting values during estimation.
# Use apollo_fixed = c() if none
apollo_fixed = c() #"b_HoeheEEEunb", "b_ZeitEEEunb")
# ################################################################# #
#### GROUP AND VALIDATE INPUTS ####
# ################################################################# #
apollo_inputs = apollo_validateInputs()
# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
# ################################################################# #
apollo_probabilities=function(apollo_beta, apollo_inputs,
functionality="estimate"){

### Function initialisation: do not change the following three commands
### Attach and detach inputs, and create empty list of probabilities
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
P = list()


### List of MNL utilities: must use the same names as in mnl_settings
V = list()

# The linear terms are switched off at the nominal level (code 4) in every
# alternative, so b_HoeheEEEunb and b_ZeitEEEunb mean the same in SQ, RefA and RefB.
V[["SQ"]] = b0 + b_Beitr*Beitr_1 +
  b_HoeheEEE*(HoeheEEE_1!=4)*HoeheEEE_1 + b_HoeheEEEunb*(HoeheEEE_1==4) +
  b_ZeitEEE*(ZeitEEE_1!=4)*ZeitEEE_1 + b_ZeitEEEunb*(ZeitEEE_1==4)
V[["RefA"]] = b_Beitr*Beitr_2 +
  b_HoeheEEE*(HoeheEEE_2!=4)*HoeheEEE_2 + b_HoeheEEEunb*(HoeheEEE_2==4) +
  b_ZeitEEE*(ZeitEEE_2!=4)*ZeitEEE_2 + b_ZeitEEEunb*(ZeitEEE_2==4)
V[["RefB"]] = b_Beitr*Beitr_3 +
  b_HoeheEEE*(HoeheEEE_3!=4)*HoeheEEE_3 + b_HoeheEEEunb*(HoeheEEE_3==4) +
  b_ZeitEEE*(ZeitEEE_3!=4)*ZeitEEE_3 + b_ZeitEEEunb*(ZeitEEE_3==4)

___________________________________________________
Do you see any problems with the second approach?

I am thankful for any insights.

Thank you very much in advance.
Best,
J.

Re: Model specification dummy vs. linear coded attributes

Post by Michiel Bliemer » Wed Feb 15, 2023 3:29 pm

I am not very proficient in Apollo or R, so I am not able to understand or comment on your code (you may want to post on the Apollo forum). Going just by what you are writing: if an attribute has one categorical level (level 1) and three numerical levels (levels 2, 3, 4), I would probably split it into a dummy coded variable with 2 levels and a numerical variable with 3 levels, so that the utility function becomes something like:

U = ... + b1 * (A2==1) + b2 * (A2<>1) * A2 + ...

In other words, you have a parameter for level 1 (versus the reference of not being level 1), and a parameter for the numerical levels that should only be included if A2 is not level 1.

This is just an idea and I have not tried this, but I think it makes sense.
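
In R, the split in the formula above could be sketched roughly as follows. This is only an illustration under stated assumptions: the level codes follow the formula (1 = categorical, 2 to 4 = numerical), the parameter values `b1` and `b2` are made up, and none of the variable names come from Apollo:

```r
# Level codes for one attribute: 1 = categorical level, 2..4 = numerical levels.
A2 <- c(1, 2, 3, 4)

# Hypothetical parameter values, chosen only for illustration.
b1 <- -0.5   # effect of the categorical level (vs. not being level 1)
b2 <-  0.2   # slope for the numerical levels

# Dummy part: 1 only for the categorical level.
is_cat <- as.numeric(A2 == 1)

# Numerical part: the value enters only when the level is not categorical,
# so the categorical level contributes 0 to the linear term.
num_part <- (A2 != 1) * A2

# Contribution of this attribute to utility, one entry per level.
contribution <- b1 * is_cat + b2 * num_part
print(data.frame(A2, is_cat, num_part, contribution))
```

With this coding the categorical level is described by `b1` alone, while the numerical levels are described by `b2` alone, which is what keeps the two parameters separately interpretable.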

Re: Model specification dummy vs. linear coded attributes

Post by JvB » Wed Feb 15, 2023 7:10 pm

Hi Michiel,

Thank you for your response. What you are writing is exactly what my code does. I will also post it on the Apollo forum to have the code itself checked.
But thank you for commenting on the general idea. Very helpful!

Best,
J.