Non-linearity for continuous variables

Post by connie » Sun Dec 08, 2019 2:14 am

Hi Professors and Experts:

I have six attributes, all of which I treat as continuous variables. I have two questions:

1. In my study, the ideal setup is a forced choice without an opt-out. I tried to generate such a design, but Ngene shows 'undefined'. The code runs well if I add an opt-out. It also runs well if I remove the 'require' condition. Any suggestions for this issue? Thank you!

Code:
Design
;alts=alt1,alt2
;rows=12
;eff=(mnl,d)
;alg=mfederov
;require:
alt1.A+alt1.B+alt1.C+alt1.D+alt1.E=alt2.A+alt2.B+alt2.C+alt2.D+alt2.E,
alt1.A+alt1.B+alt1.C+alt1.D+alt1.E<=20,
alt2.A+alt2.B+alt2.C+alt2.D+alt2.E<=20
;model:
U(alt1)=b01[0]+
        b1[0]*A[4,6,8,10](1-4,1-4,1-4,1-4)+
        b2[0]*B[0,2,4,6](1-5,1-4,1-4,1-4)+
        b3[0]*C[0,2,4,6](1-4,1-4,1-4,1-4)+
        b4[0]*D[0,1,2,3]+
        b5[0]*E[6,7,8,9](1-5,1-5,1-5,1-5)+
        b6[0]*F[9,12,15,18,21](1-4,1-4,1-4,1-4,1-4)
/
U(alt2)=b1*A+
        b2*B+
        b3*C+
        b4*D+
        b5*E+
        b6*F
$


2. To relax the linearity assumption, I also tried a non-linear specification that treats the first five attributes as categorical rather than continuous. The code is similar to the one above, except that I add 'dummy' to the first five attributes. The issue is that the MNL covariance matrix Ngene generates looks weird. The matrix looks fine if I treat the attributes as continuous. I am not sure whether I need to take this matrix seriously.

My sample will have only 200 respondents, so I do not want to make the design more complex, as that may require a larger sample. The non-linear specification will increase the number of parameters and the number of choice sets each individual faces. My question is: may I treat the attributes as continuous in the Ngene design and then, in the analysis, test for non-linearity by dummy coding those continuous variables? I have also seen suggestions to use a nonlinear transformation such as log or square. Any suggestions in my case? Thank you.

The code:
Design
;alts=alt1, alt2, none
;rows=12
;eff=(mnl,d)
;alg=mfederov
;require:
alt1.A+alt1.B+alt1.C+alt1.D+alt1.E=alt2.A+alt2.B+alt2.C+alt2.D+alt2.E,
alt1.A+alt1.B+alt1.C+alt1.D+alt1.E<=20,
alt2.A+alt2.B+alt2.C+alt2.D+alt2.E<=20
;model:
U(alt1)=b01[0]+
        b1.dummy[0|0|0]*A[4,6,8,10]+
        b2.dummy[0|0|0]*B[0,2,4,6]+
        b3.dummy[0|0|0]*C[0,2,4,6]+
        b4.dummy[0|0|0]*D[0,1,2,3]+
        b5.dummy[0|0|0]*E[6,7,8,9]+
        b6[0]*F[9,12,15,18,21](1-3,1-3,1-3,1-3,1-3)
/
U(alt2)=b1*A+
        b2*B+
        b3*C+
        b4*D+
        b5*E+
        b6*F
$


The following is part of MNL covariance matrix generated from Ngene.

MNL covariance matrix
Prior b01 b1(d0) b1(d1) b1(d2) b2(d0) b2(d1) b2(d2) b3(d0)
b01 49.358466 -16.903859 -12.545522 -6.151441 -17.13067 -10.738078 -6.438699 -16.422725
b1(d0) -16.903859 7.886921 5.818682 3.75669 5.86381 3.27982 1.211025 4.976761
b1(d1) -12.545522 5.818682 5.639421 3.175691 4.137761 2.328256 1.020922 3.894463
b1(d2) -6.151441 3.75669 3.175691 4.322462 2.448045 1.478313 0.118615 1.076033
b2(d0) -17.13067 5.86381 4.137761 2.448045 8.248761 5.385899 3.624088 5.542397
b2(d1) -10.738078 3.27982 2.328256 1.478313 5.385899 5.162557 3.337512 3.47904
b2(d2) -6.438699 1.211025 1.020922 0.118615 3.624088 3.337512 3.669265 2.208844
b3(d0) -16.422725 4.976761 3.894463 1.076033 5.542397 3.47904 2.208844 8.052228
b3(d1) -12.291671 3.540414 2.755003 0.634705 3.773113 2.292762 1.406605 5.961626
b3(d2) -6.473694 1.091626 1.107914 -0.453802 1.74347 1.464988 0.916929 4.575568
b4(d0) -8.154999 2.918653 1.511069 0.79502 2.613869 1.552907 0.817293 2.239433
b4(d1) -5.786584 1.699097 0.804938 0.519681 1.614269 1.082255 0.669 1.805383
b4(d2) -4.527428 1.814926 1.241008 1.767892 1.63062 1.187745 0.37446 0.458844
b5(d0) -8.191402 2.673152 2.152325 0.765444 2.674721 1.540421 0.819318 3.32214
b5(d1) -6.508215 2.096138 1.918387 0.981714 2.224329 1.089844 0.80548 2.038644
b5(d2) -4.693104 1.674288 1.681206 1.265461 1.677781 0.769887 0.372343 1.369694
b6 -0.294602 0.070244 0.02958 -0.038041 0.033454 -0.019679 -0.010246 0.029593
b02 49.218445 -16.704556 -12.5894 -6.034597 -17.200416 -10.722401 -6.470932 -16.6643

Re: Non-linearity for continuous variables

Post by Michiel Bliemer » Mon Dec 09, 2019 11:01 am

I will answer your questions separately.

Question 1:

The first constraint (alt1.A + ... = alt2.A + ...) is very restrictive and seems to result in data where some of the parameters are no longer identifiable. If you remove this constraint, Ngene can find a design with a normal D-error. So you may be over-constraining your design, leading to loss of identifiability. Adding an opt-out alternative increases the degrees of freedom, which seems to have a positive effect on parameter identifiability. Could you perhaps relax this constraint a little, for example as follows (the ;reject constraint replaces the first constraint and allows the sums to deviate by 1):

Design
;alts=alt1,alt2
;rows=12
;eff=(mnl,d)
;alg=mfederov
;require:
alt1.A+alt1.B+alt1.C+alt1.D+alt1.E<=20,
alt2.A+alt2.B+alt2.C+alt2.D+alt2.E<=20
;reject:
alt1.A+alt1.B+alt1.C+alt1.D+alt1.E-alt2.A-alt2.B-alt2.C-alt2.D-alt2.E > 1,
alt1.A+alt1.B+alt1.C+alt1.D+alt1.E-alt2.A-alt2.B-alt2.C-alt2.D-alt2.E < -1
;model:
U(alt1)=b01[0]+
b1[0]*A[4,6,8,10](1-4,1-4,1-4,1-4)+
b2[0]*B[0,2,4,6](1-5,1-4,1-4,1-4)+
b3[0]*C[0,2,4,6](1-4,1-4,1-4,1-4)+
b4[0]*D[0,1,2,3]+
b5[0]*E[6,7,8,9](1-5,1-5,1-5,1-5)+
b6[0]*F[9,12,15,18,21](1-4,1-4,1-4,1-4,1-4)
/
U(alt2)=b1*A+
b2*B+
b3*C+
b4*D+
b5*E+
b6*F
$
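
A possible way to see why the equality constraint hurts identifiability in a forced choice: with generic linear parameters, only the attribute differences between alt1 and alt2 enter the MNL model, and the equality forces the differences in A to E to sum to zero in every row, so those five columns are perfectly collinear. The sketch below is not Ngene code. It is a plain Python check, with hypothetical helper names (`profiles`, `rank`, `difference_rank`), that enumerates the level combinations from the syntax above and compares the rank of the difference vectors under the equality constraint and under the relaxed one. F is left out because it is not part of the constraints.

```python
from fractions import Fraction
from itertools import product

# Attribute levels for A..E, copied from the syntax above.
LEVELS = [
    [4, 6, 8, 10],  # A
    [0, 2, 4, 6],   # B
    [0, 2, 4, 6],   # C
    [0, 1, 2, 3],   # D
    [6, 7, 8, 9],   # E
]

def profiles(max_sum=20):
    """All level combinations that satisfy A+B+C+D+E <= max_sum."""
    return [p for p in product(*LEVELS) if sum(p) <= max_sum]

def rank(vectors, ncols):
    """Rank of a collection of integer vectors via incremental elimination."""
    basis = {}  # pivot column -> reduced row whose first nonzero entry is that column
    for v in vectors:
        row = [Fraction(x) for x in v]
        for pivot in sorted(basis):  # increasing pivot order keeps earlier zeros intact
            if row[pivot] != 0:
                b = basis[pivot]
                factor = row[pivot] / b[pivot]
                row = [r - factor * bi for r, bi in zip(row, b)]
        pivot = next((i for i, r in enumerate(row) if r != 0), None)
        if pivot is not None:
            basis[pivot] = row
            if len(basis) == ncols:
                break
    return len(basis)

def difference_rank(max_gap):
    """Rank of alt1 - alt2 differences when the two sums may differ by at most max_gap."""
    feasible = profiles()
    diffs = {
        tuple(a - b for a, b in zip(p, q))
        for p in feasible
        for q in feasible
        if abs(sum(p) - sum(q)) <= max_gap
    }
    return rank(diffs, len(LEVELS))

rank_equal = difference_rank(0)    # original equality in ;require
rank_relaxed = difference_rank(1)  # relaxed version using ;reject
print("rank with equal sums:", rank_equal)
print("rank with sums within 1:", rank_relaxed)
```

With five attribute columns, a rank below 5 means at least one of b1 to b5 cannot be separated from the others, which would be consistent with Ngene reporting an undefined D-error. Allowing the sums to differ by 1 leaves room for full rank.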

Michiel

Re: Non-linearity for continuous variables

Post by Michiel Bliemer » Mon Dec 09, 2019 11:12 am

For your second and third questions:

When I run the syntax, the covariance matrix looks fine, see below. Are you using the latest Ngene version?

MNL covariance matrix
Prior b01 b1(d0) b1(d1) b1(d2) b2(d0) b2(d1) b2(d2) b3(d0) b3(d1) b3(d2) b4(d0) b4(d1) b4(d2) b5(d0) b5(d1) b5(d2) b6
b01 1.005503 -0.137401 -0.182289 -0.418416 -0.623447 -0.036743 -0.666462 0.162787 -0.160279 0.404008 -0.018584 0.003189 -0.400062 0.455493 0.141752 0.353889 -0.014986
b1(d0) -0.137401 2.31982 1.811634 1.414274 0.123781 -0.39391 -0.722625 0.73525 0.440804 0.23899 -0.080897 -0.37702 -0.018649 -0.074048 -0.175562 -0.080294 -0.112401
b1(d1) -0.182289 1.811634 3.14582 1.712895 -0.676996 -0.333901 -0.556547 0.762606 0.6119 0.388166 -0.206044 -0.45223 -0.14943 0.18904 0.343958 0.590131 -0.130047
b1(d2) -0.418416 1.414274 1.712895 3.052344 -0.474349 -0.695832 -0.252601 0.11309 0.514879 0.302222 -0.505095 -0.325243 0.060219 -0.572062 -0.821151 -0.65811 -0.028876
b2(d0) -0.623447 0.123781 -0.676996 -0.474349 2.938243 1.139831 1.086003 -0.184084 -0.516048 -1.543351 0.148171 0.012524 -0.183034 -0.121522 -0.599929 -0.778138 -0.012079
b2(d1) -0.036743 -0.39391 -0.333901 -0.695832 1.139831 1.86714 0.570607 -0.632649 -0.959085 -1.176108 0.273256 0.285185 0.071637 0.356199 -0.121689 -0.417673 0.003222
b2(d2) -0.666462 -0.722625 -0.556547 -0.252601 1.086003 0.570607 3.92962 0.519714 0.174503 -0.384859 -0.285482 -1.194462 -0.779093 -0.292581 0.543295 0.140303 0.003285
b3(d0) 0.162787 0.73525 0.762606 0.11309 -0.184084 -0.632649 0.519714 2.813836 1.301649 1.486364 -0.280942 -1.174196 -1.572426 -0.137606 0.260395 0.277615 -0.101158
b3(d1) -0.160279 0.440804 0.6119 0.514879 -0.516048 -0.959085 0.174503 1.301649 2.435999 1.601777 0.094249 -0.616878 -0.098163 -0.449239 -0.020247 0.07002 -0.068881
b3(d2) 0.404008 0.23899 0.388166 0.302222 -1.543351 -1.176108 -0.384859 1.486364 1.601777 3.037074 -0.380926 -0.382554 -0.364897 -0.273636 0.024165 0.500634 -0.052319
b4(d0) -0.018584 -0.080897 -0.206044 -0.505095 0.148171 0.273256 -0.285482 -0.280942 0.094249 -0.380926 1.668559 0.769803 0.868145 0.105147 0.090575 -0.390362 -0.035877
b4(d1) 0.003189 -0.37702 -0.45223 -0.325243 0.012524 0.285185 -1.194462 -1.174196 -0.616878 -0.382554 0.769803 2.116934 1.306551 -0.264805 -0.497964 -0.443565 0.028245
b4(d2) -0.400062 -0.018649 -0.14943 0.060219 -0.183034 0.071637 -0.779093 -1.572426 -0.098163 -0.364897 0.868145 1.306551 2.74754 0.008177 -0.249021 -0.300586 0.008211
b5(d0) 0.455493 -0.074048 0.18904 -0.572062 -0.121522 0.356199 -0.292581 -0.137606 -0.449239 -0.273636 0.105147 -0.264805 0.008177 2.100568 1.299851 1.194848 -0.074342
b5(d1) 0.141752 -0.175562 0.343958 -0.821151 -0.599929 -0.121689 0.543295 0.260395 -0.020247 0.024165 0.090575 -0.497964 -0.249021 1.299851 2.308166 1.534157 -0.070958
b5(d2) 0.353889 -0.080294 0.590131 -0.65811 -0.778138 -0.417673 0.140303 0.277615 0.07002 0.500634 -0.390362 -0.443565 -0.300586 1.194848 1.534157 2.567197 -0.069598
b6 -0.014986 -0.112401 -0.130047 -0.028876 -0.012079 0.003222 0.003285 -0.101158 -0.068881 -0.052319 -0.035877 0.028245 0.008211 -0.074342 -0.070958 -0.069598 0.015851

The design does not get more complex with dummy coding: you are still using the same number of rows, and the respondent does not know how many parameters you are estimating. If you want to test for nonlinearities, I suggest that you design for dummy coding (so that you will be able to estimate all dummy-coded parameters) and then test in estimation whether the effects are actually linear or not. You may want to increase the number of rows, since 12 rows is not a lot for estimating so many parameters.
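
One common way to run such a test in estimation is a likelihood-ratio test between the dummy-coded model and the linear model, which is nested in it. The following is only a rough sketch under stated assumptions: the two log-likelihood values are made-up placeholders, not results from this study, and `chi2_sf` and `lr_test` are small hand-written helpers so that no statistics package is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with df degrees of freedom.

    Uses the power series of the lower incomplete gamma function.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = math.exp(a * math.log(z) - z - math.lgamma(a)) * total
    return max(0.0, 1.0 - lower)

def lr_test(ll_restricted, ll_unrestricted, df):
    """Likelihood-ratio statistic and p-value for two nested models."""
    stat = 2.0 * (ll_unrestricted - ll_restricted)
    return stat, chi2_sf(stat, df)

# Hypothetical final log-likelihoods (placeholders, NOT estimates from real data).
ll_linear = -1650.0  # A..E linear: 5 attribute parameters
ll_dummy = -1640.0   # A..E dummy coded: 5 x 3 = 15 attribute parameters

df = 15 - 5          # extra parameters in the dummy-coded model
stat, p_value = lr_test(ll_linear, ll_dummy, df)
print("LR statistic:", stat, "df:", df, "p-value:", round(p_value, 4))
```

A small p-value would suggest that the dummy-coded model fits significantly better, so linearity is rejected for at least one attribute. A large p-value would suggest the linear specification is adequate. The same comparison can be done one attribute at a time with df = 2.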

In case you include the opt-out alternative, it is best that you remove the b01 constant in alt1 (I assume that alt1 and alt2 are generic alternatives and should not have a different constant) and put a constant in the none alternative.

Michiel

Re: Non-linearity for continuous variables

Post by connie » Mon Dec 09, 2019 11:23 pm

Thank you so much, Professor Michiel. I really appreciate your valuable suggestions. The design improves a lot with your recommended code. I think the Ngene version I used may be an old one; I will verify it. - Connie

Re: Non-linearity for continuous variables

Post by connie » Sat Dec 14, 2019 1:19 am

Hi Professor Michiel:

I used your recommended code for two designs, one under the linearity assumption and the other under the non-linearity assumption. Under the non-linearity assumption, the off-diagonal elements of the variance-covariance matrix are not zero. I simulated data and the results seem acceptable. Are the values in the variance-covariance matrix of my nonlinear design acceptable? If I can estimate the model on simulated data, does that indicate I can use the design even though the variance-covariance matrix is not perfect? Or should I fall back on the linearity assumption? The design under the linearity assumption has no such issue. Thank you!

Code:
Design   
;alts=alt1,alt2   
;rows=24   
;block=2   
;eff=(mnl,d)   
;alg=mfederov   
;require:   
alt1.A+alt1.B+alt1.C+alt1.D+alt1.E<=20,   
alt2.A+alt2.B+alt2.C+alt2.D+alt2.E<=20   
;reject:   
alt1.A+alt1.B+alt1.C+alt1.D+alt1.E-alt2.A-alt2.B-alt2.C-alt2.D-alt2.E > 1,   
alt1.A+alt1.B+alt1.C+alt1.D+alt1.E-alt2.A-alt2.B-alt2.C-alt2.D-alt2.E <-1   
;model:   
U(alt1)=b01[0]+   
b1.dummy[0|0|0]*A[4,6,8,10](1-8,1-8,1-8,1-8)+   
b2.dummy[0|0|0]*B[0,2,4,6](1-8,1-8,1-8,1-8)+   
b3.dummy[0|0|0]*C[0,2,4,6](1-8,1-8,1-8,1-8)+   
b4.dummy[0|0|0]*D[0,1,2,3]+   
b5.dummy[0|0|0]*E[6,7,8,9]+   
b6[0]*F[9,12,15,18,21](1-6,2-8,2-8,2-8,1-6)   
/   
U(alt2)=b1*A+   
b2*B+   
b3*C+   
b4*D+   
b5*E+   
b6*F   
$   


The covariance matrix is:
Code:
Prior   b01   b1(d0)   b1(d1)   b1(d2)   b2(d0)   b2(d1)   b2(d2)   b3(d0)   b3(d1)   b3(d2)
b01   0.17486   0.00594   -0.000109   -0.020336   -0.006022   0.011035   -0.01045   -0.0173   -0.03471   -0.026596
b1(d0)   0.00594   9.131044   6.155713   3.7576   8.140076   5.266413   2.593433   8.199093   5.552758   2.742321
b1(d1)   -0.000109   6.155713   4.567272   2.625978   5.396647   3.514717   1.749044   5.51826   3.653804   1.753659
b1(d2)   -0.020336   3.7576   2.625978   2.037126   3.224438   2.064057   1.052096   3.114543   2.137326   1.023019
b2(d0)   -0.006022   8.140076   5.396647   3.224438   8.588638   5.552304   3.118597   7.909196   5.3524   2.662488
b2(d1)   0.011035   5.266413   3.514717   2.064057   5.552304   4.045916   2.164869   4.939926   3.282136   1.523878
b2(d2)   -0.01045   2.593433   1.749044   1.052096   3.118597   2.164869   1.716783   2.466757   1.535371   0.663734
b3(d0)   -0.0173   8.199093   5.51826   3.114543   7.909196   4.939926   2.466757   9.013802   6.285131   3.585743
b3(d1)   -0.03471   5.552758   3.653804   2.137326   5.3524   3.282136   1.535371   6.285131   4.790035   2.759414
b3(d2)   -0.026596   2.742321   1.753659   1.023019   2.662488   1.523878   0.663734   3.585743   2.759414   2.110419
b4(d0)   0.02146   4.097119   2.725231   1.580658   3.767765   2.461229   1.142759   3.971384   2.774871   1.469062
b4(d1)   0.022221   2.941517   1.963028   1.137737   2.744267   1.851965   0.805569   2.687998   1.800199   0.815395
b4(d2)   -0.00876   1.085001   0.787457   0.448025   1.126756   0.709384   0.398517   1.127183   0.824217   0.384157
b5(d0)   0.04151   4.664463   3.112576   1.933486   4.387701   2.826319   1.333323   4.513996   3.083517   1.51593
b5(d1)   0.002814   3.058435   2.029786   1.306751   2.990632   1.838379   0.945704   3.11776   2.194599   1.081582
b5(d2)   0.01275   1.730548   1.098872   0.829918   1.62087   1.097478   0.502728   1.663159   1.213587   0.561931
b6   0.002597   0.014193   0.012003   -0.000539   0.000545   0.003745   -0.01071   0.01917   0.01427   0.00908


The simulated results are:
Code:
y   Coef.        Std. Err.   z       P>z
A   -0.0064484   0.0642858   -0.1    0.92
B   -0.0150374   0.0632658   -0.24   0.812
C   -0.0251328   0.0648365   -0.39   0.698
D   -0.0229512   0.0724524   -0.32   0.751
E    0.0402893   0.0753023    0.54   0.593
F   -0.0081052   0.0077393   -1.05   0.295


Re: Non-linearity for continuous variables

Post by Michiel Bliemer » Sun Dec 15, 2019 8:49 am

Why do you think that the off-diagonal elements in the covariance matrix need to be zero? These are never zero for discrete choice models (because the logit model is a nonlinear model unlike a linear regression model).
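
A toy illustration of non-zero covariances, separate from the designs above: one three-level attribute is dummy coded with the last level as base, two alternatives are compared in three choice sets, and all priors are zero. The function names and the three choice sets are invented for this example, and the covariance matrix is computed as the inverse of the MNL Fisher information for a single respondent.

```python
import math

def dummy(level, levels):
    """Dummy-code a level, using the last entry of `levels` as the base."""
    return [1.0 if level == l else 0.0 for l in levels[:-1]]

def mnl_information(choice_sets, beta):
    """MNL Fisher information: sum over sets and alternatives of P * dev * dev'."""
    k = len(beta)
    info = [[0.0] * k for _ in range(k)]
    for alts in choice_sets:
        utils = [sum(b * x for b, x in zip(beta, alt)) for alt in alts]
        denom = sum(math.exp(u) for u in utils)
        probs = [math.exp(u) / denom for u in utils]
        # Probability-weighted mean of each coded attribute within the set.
        mean = [sum(p * alt[i] for p, alt in zip(probs, alts)) for i in range(k)]
        for p, alt in zip(probs, alts):
            dev = [alt[i] - mean[i] for i in range(k)]
            for i in range(k):
                for j in range(k):
                    info[i][j] += p * dev[i] * dev[j]
    return info

def invert_2x2(m):
    """Inverse of a 2x2 matrix (enough for this two-parameter example)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

levels = [0, 2, 4]                 # hypothetical three-level attribute
pairs = [(0, 2), (2, 4), (0, 4)]   # level in alt1 vs level in alt2, per choice set
choice_sets = [[dummy(a, levels), dummy(b, levels)] for a, b in pairs]

info = mnl_information(choice_sets, beta=[0.0, 0.0])
avc = invert_2x2(info)
for row in avc:
    print([round(v, 4) for v in row])
```

Even in this balanced example the off-diagonal element is positive rather than zero, because both dummies are measured against the same base level. This is in line with the positive covariances between dummies of the same attribute in the matrices posted above.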

Michiel

