a possible fatal error in Ngene unlabeled orthogonal design

Post by J.wang » Fri Aug 10, 2018 9:25 pm

Dear Ngene team:

I am a user from Eindhoven University of Technology. I would like to report a possible fatal error in the Ngene unlabeled orthogonal design. I hope your team can look into it and reply.

The situation is as follows:
1. I used the Ngene “seq” command to generate an unlabeled design with 12 four-level and 3 two-level attributes (please see attachments 1, 2, 2-1). I then replaced the design codes with my actual levels.
Code:
Design
;alts = alt1,alt2,
;rows = 64
;orth=seq
;block =8
;model:
U(alt1) = b0
+ b1 * CS_Price[-3,-1,1,3]
+ b2 * T_Book[-3,-1,1,3]
+ b3 * T_Home_to_CS[-3,-1,1,3]
+ b4 * T_CS_to_Destination[-3,-1,1,3]
+ b5 * D_Home_to_Workplace[-3,-1,1,3]
+ b6 * Location[-3,-1,1,3]
+ b7 * T_Home_to_PT[-3,-1,1,3]
+ b8 * PC_Parking_Number[-3,-1,1,3]
+ b9 * Density_green[-1,1]
+ b10* Housing_type[-3,-1,1,3]
+ b11* Housing_owership[-1,1]
+ b12* Living_size[-3,-1,1,3]
+ b13* Housing_cost[-3,-1,1,3]
+ b14* Housing_built_time[-3,-1,1,3]
+ b15* Safe[-1,1]/

U(alt2) = 
+ b1 * CS_Price
+ b2 * T_Book
+ b3 * T_Home_to_CS
+ b4 * T_CS_to_Destination
+ b5 * D_Home_to_Workplace
+ b6 * Location
+ b7 * T_Home_to_PT
+ b8 * PC_Parking_Number
+ b9 * Density_green
+ b10* Housing_type
+ b11* Housing_owership
+ b12* Living_size
+ b13* Housing_cost
+ b14* Housing_built_time
+ b15* Safe $


2. Based on this design, we collected 360 responses in the Netherlands. We then ran an MNL analysis in Nlogit, but Nlogit always reported the error “Hessian is not negative semidefinite” (with too many fixed parameters in the results table). The analysis in R also failed (please see attachments 3, 4, 4.1, 4.2, 4.3).

3. When we checked every step, our technical support found that the levels in the Ngene design show a very weird pattern of combinations. Among the four-level attributes, the combinations are always like 3,3,3,-3… or 1,-1,1,1,1…; there is no (-)3 combined with (-)1. This can also be seen from the after-coding correlation check and the cross-tabulation check (see attachments 5, 5.1). What is more, among all the attributes we found that only the levels of “Housing type” and “Housing size” combine with every level of the other attributes (attachment 6). This is consistent with the after-coding correlation and cross-tabulation (attachment 5.1), in which only these two attributes have a perfect zero correlation after coding. Consequently, when we ran the MNL estimation in Nlogit, adding all levels of these two attributes did not cause the fixed-parameter problem, but adding the levels of the other attributes did.
So based on that, we suspect that the combinations in your “seq” design, like 3,3,3,-3… or 1,-1,1,1,1…, may lead to a fatal error.
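
A cross-tabulation check of the kind described in point 3 can be sketched in Python as follows. The eight-row design below is made up for illustration and is not the actual Ngene output (which was only available as an attachment); the helper names `crosstab`, `missing_pairs` and `correlation` are likewise hypothetical. The sketch shows that two four-level columns can have zero correlation while many level pairs never occur together.

```python
from collections import Counter
from itertools import product

LEVELS = [-3, -1, 1, 3]

# Hypothetical two-attribute design (NOT the actual Ngene output):
# levels +/-3 only ever meet +/-3, and +/-1 only ever meet +/-1.
design = [(3, 3), (3, -3), (-3, 3), (-3, -3),
          (1, 1), (1, -1), (-1, 1), (-1, -1)]

def crosstab(rows):
    """Count how often each (level of A, level of B) pair occurs."""
    counts = Counter(rows)
    return {pair: counts.get(pair, 0) for pair in product(LEVELS, LEVELS)}

def missing_pairs(rows):
    """Return the level pairs that never occur together."""
    return [pair for pair, n in crosstab(rows).items() if n == 0]

def correlation(rows):
    """Pearson correlation between the two linearly coded columns."""
    n = len(rows)
    mean_a = sum(a for a, _ in rows) / n
    mean_b = sum(b for _, b in rows) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in rows)
    var_a = sum((a - mean_a) ** 2 for a, _ in rows)
    var_b = sum((b - mean_b) ** 2 for _, b in rows)
    return cov / (var_a * var_b) ** 0.5

print("correlation:", correlation(design))
print("pairs that never occur:", missing_pairs(design))
```

Running the same two functions on each pair of columns of a real design would show which attributes combine with every level of the others and which do not.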

Please check your algorithm so that we can find out whether the “seq” design is correct.
J.wang
 
Posts: 6
Joined: Fri Aug 10, 2018 1:20 am

Re: a possible fatal error in Ngene unlabeled orthogonal des

Post by Michiel Bliemer » Sat Aug 11, 2018 10:11 pm

I am not entirely sure that I understand the problem, since your attachments are not visible on the forum. Could you perhaps send them to contact@choice-metrics.com? I think I know what is happening and outline it below, but it may be something else that we need to investigate further.

Ngene determines either an orthogonal or a near-orthogonal array in which all correlations between the main effects are zero. In a near-orthogonal array, pairs of attribute levels may not occur equally often, and some pairs may not occur at all. The sequential (near-)orthogonal design that Ngene generates can be used to estimate the model that you have specified without any problem. You specified a model with 15 parameters in which all attributes are assumed continuous (not dummy or effects coded), and this model is identifiable.

I suspect that the issue is that you are trying to estimate a model with dummy/effects coding rather than the linearly coded model that you specified in the Ngene syntax. I believe that orthogonal arrays allow estimating models with dummy/effects coding, but near-orthogonal designs may not, since it is possible that certain attribute level combinations do not appear in the design. You can verify whether you can estimate the model with dummy coding by adding .dummy[0|0|0] to each parameter in your syntax and then clicking on "MNL properties" in the design window in Ngene. I tried this and Ngene reports an "Undefined" D-error for the sequential near-orthogonal design, which means that the model cannot be estimated if all attributes are dummy coded. Creating a larger design typically overcomes this issue, e.g. using ;rows = 72 or larger would generate a (near-)orthogonal design that allows estimating dummy coded variables. Alternatively, one could generate an efficient design (with 64 rows or fewer), which of course also allows estimating dummy variables (since efficient designs are optimised for estimation).
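
Why zero correlation does not guarantee that dummy coded parameters are identifiable can be illustrated with a small rank check. This is a simplified sketch and not Ngene's algorithm: the eight-row, two-attribute design is hypothetical, the helper names `dummy` and `rank` are made up, and the check looks only at the rank of the coded attribute columns rather than at the full MNL information matrix.

```python
from fractions import Fraction

LEVELS = [-3, -1, 1, 3]

# Hypothetical design: zero correlation between A and B under linear
# coding, but +/-3 never appears together with +/-1.
design = [(3, 3), (3, -3), (-3, 3), (-3, -3),
          (1, 1), (1, -1), (-1, 1), (-1, -1)]

def dummy(level, base=-3):
    """Dummy-code one 4-level value into 3 indicator columns."""
    return [1 if level == lvl else 0 for lvl in LEVELS if lvl != base]

def rank(matrix):
    """Matrix rank by exact Gaussian elimination on fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column, it adds no rank
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

# One column per attribute under linear coding, three per attribute
# under dummy coding (base level -3).
linear = [[a, b] for a, b in design]
dummies = [dummy(a) + dummy(b) for a, b in design]

print("linear coding: rank", rank(linear), "of", len(linear[0]), "columns")
print("dummy coding:  rank", rank(dummies), "of", len(dummies[0]), "columns")
```

In this toy design the linearly coded columns have full rank, whereas the six dummy columns do not: the indicator "A is -1 or 1" always equals the indicator "B is -1 or 1", so one dummy column is a linear combination of the others.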

In most cases, a near-orthogonal design is just as good for model estimation as an orthogonal array and does not lead to any issues. But I think that in your case it turned out to be problematic, which is very unfortunate; we have not heard of such issues before.

We will consider putting warnings in the output window when a near-orthogonal design was generated instead of an orthogonal design. Further, it is good practice to always investigate the MNL properties of the design to verify that the model can be estimated with the experimental design (where it is important that the utility functions are formulated with appropriate coding schemes).

Michiel
Michiel Bliemer
 
Posts: 1885
Joined: Tue Mar 31, 2009 4:13 pm

Re: a possible fatal error in Ngene unlabeled orthogonal des

Post by J.wang » Sat Aug 11, 2018 11:48 pm

Dear Michiel:
Thanks for your reply.
I have already e-mailed you very detailed materials about my problem. Please check the e-mail.
I hope we can work this out as soon as possible.

Re: a possible fatal error in Ngene unlabeled orthogonal des

Post by Michiel Bliemer » Sun Aug 12, 2018 12:35 am

I played a bit with the utility functions to see which dummy coded variables can be estimated. You can add many but not all. The syntax below generates a design that can be estimated with the indicated dummy variables. You may want to try different combinations of dummy variables based on which attributes are more important to dummy code than others; I did not try many combinations and my selection may not be the best. You can evaluate your own design using the ;eval command and look at the D-error of the design (if it is not Undefined, the model parameters can be estimated). For attributes that you cannot dummy code, you can assume a ranking/ordering (e.g. level 1=worst, level 4=best) and estimate a single parameter that represents a linear effect.

Code:
    Design
    ;alts = alt1,alt2
    ;rows = 64
    ;orth = sim
    ;block =8
    ;model:
    U(alt1) =
      b1.dummy[0|0|0] * CS_Price[-3,-1,1,3]
    + b2.dummy[0|0|0] * T_Book[-3,-1,1,3]
    + b3 * T_Home_to_CS[-3,-1,1,3]
    + b4.dummy[0|0|0] * T_CS_to_Destination[-3,-1,1,3]
    + b5 * D_Home_to_Workplace[-3,-1,1,3]
    + b6.dummy[0|0|0] * Location[-3,-1,1,3]
    + b7 * T_Home_to_PT[-3,-1,1,3]
    + b8.dummy[0|0|0] * PC_Parking_Number[-3,-1,1,3]
    + b9.dummy[0] * Density_green[-1,1]
    + b10.dummy[0|0|0] * Housing_type[-3,-1,1,3]
    + b11.dummy[0]* Housing_owership[-1,1]
    + b12.dummy[0|0|0]* Living_size[-3,-1,1,3]
    + b13 * Housing_cost[-3,-1,1,3]
    + b14.dummy[0|0|0]* Housing_built_time[-3,-1,1,3]
    + b15.dummy[0]* Safe[-1,1]/

    U(alt2) =
      b1 * CS_Price
    + b2 * T_Book
    + b3 * T_Home_to_CS
    + b4 * T_CS_to_Destination
    + b5 * D_Home_to_Workplace
    + b6 * Location
    + b7 * T_Home_to_PT
    + b8 * PC_Parking_Number
    + b9 * Density_green
    + b10* Housing_type
    + b11* Housing_owership
    + b12* Living_size
    + b13* Housing_cost
    + b14* Housing_built_time
    + b15* Safe $
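
The D-error check mentioned above (the model is estimable when the D-error is not Undefined) can be sketched in Python for a two-alternative unlabeled MNL design. This is only an illustration under stated assumptions, not Ngene's internal computation: all priors are assumed to be zero, the D-error is taken as the determinant of the asymptotic variance-covariance matrix raised to the power 1/K, the coded attribute values are assumed to be integers, and the function names and the two toy designs are hypothetical.

```python
from fractions import Fraction

def determinant(matrix):
    """Determinant by exact Gaussian elimination on fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[col])]
    return det

def d_error_zero_priors(choice_sets):
    """D-error of a two-alternative MNL design with all priors zero.

    choice_sets: list of (x1, x2), the integer-coded attribute vectors
    of the two alternatives. Returns None when the information matrix
    is singular, i.e. when the D-error is undefined.
    """
    k = len(choice_sets[0][0])
    info = [[Fraction(0)] * k for _ in range(k)]
    for x1, x2 in choice_sets:
        d = [a - b for a, b in zip(x1, x2)]
        for i in range(k):
            for j in range(k):
                # With zero priors both choice probabilities are 1/2,
                # so each choice set contributes d d' / 4.
                info[i][j] += Fraction(d[i] * d[j], 4)
    det = determinant(info)
    if det == 0:
        return None
    # det(AVC) = 1 / det(info), so D-error = det(info) ** (-1 / K).
    return float(det) ** (-1.0 / k)

# Toy design in which the two attribute differences vary independently.
estimable = [((1, 1), (-1, -1)), ((1, -1), (-1, 1))]
# Toy design in which both attributes always differ in the same way.
confounded = [((1, 1), (-1, -1)), ((-1, -1), (1, 1))]

print("estimable design: ", d_error_zero_priors(estimable))
print("confounded design:", d_error_zero_priors(confounded))
```

A `None` result plays the role of an Undefined D-error: the information matrix is singular, so at least one parameter cannot be separated from the others with that design and coding.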


Michiel

Re: a possible fatal error in Ngene unlabeled orthogonal des

Post by J.wang » Sun Aug 12, 2018 2:03 am

Thanks for your reply. Based on what you wrote, do you mean that I need to declare in the design which attributes I want to estimate later with dummy or effects coding? Otherwise, I cannot estimate the dummy coded parameters?

My doubts now are as follows:
1. Based on your technical explanation, I think my current design can at least be used to estimate linear effects, right? If it can, then we are fine. If it cannot, do you mean that Ngene is wrong?

2. About coding: as far as I know, researchers in our group never apply coding in the design step. We always do the coding in the analysis step, which has also worked quite well. So I have some doubts about your technical guidance.

3. You still have not explained the weird combination issue (3,3,3,-3… or 1,-1,1,1,1… all the time) and whether or not it may lead to a fatal error in the data analysis.

4. I have to state my main question seriously: is my design wrong, and who caused the error? We have already used your design to collect 400 responses, which cost about €1500. If we made the mistake at the beginning, we will of course bear the cost of the data collection. If the fault lies with your software, we need you to give us an explanation. So my main concern now is not how to create a design, because we have already created another one with SAS, but to find out whether my design is wrong and who caused the error. I have to ask you to answer my questions directly.

Re: a possible fatal error in Ngene unlabeled orthogonal des

Post by J.wang » Sun Aug 12, 2018 6:28 am

Dear Michiel: I checked again with your syntax.

1. I noticed that you changed "seq" to "sim" in your updated syntax. I tested that before: with "sim" the levels are combined randomly, but identical alternatives may appear within one choice set. We therefore used "seq", which has the problem of the specific combinations. With the "seq" design, even if we increase the number of rows to 256, the combinations are always like "3,3,3,-3… or 1,-1,1,1,1", which is not reasonable.

2. As you said, if a combination of two levels never appears in the design, then we cannot estimate the dummy coded parameters for them? In fact, we found that only the levels of “Housing type” and “Housing size” combine with every level, so is that why they can be used in the effects coded estimation? That would also mean that most of my attributes cannot be used for effects coding analysis, right?

Re: a possible fatal error in Ngene unlabeled orthogonal des

Post by Michiel Bliemer » Sun Aug 12, 2018 10:18 am

1. Yes, you are right, I should have changed sim back to seq (I was playing around with some settings). Setting it to seq indeed results in a more restrictive design. Using ;rows = 72 works and ;rows = 128 also works, but ;rows = 256 does not work, likely because 256 = 4*64 (where 4 refers to the number of attribute levels), such that the near-orthogonal design with 256 rows will have the same sequence characteristics as the 64 row design.

Code:
        Design
        ;alts = alt1,alt2
        ;rows = 72
        ;orth = seq
        ;block =8
        ;model:
        U(alt1) =
          b1.dummy[0|0|0] * CS_Price[-3,-1,1,3]
        + b2.dummy[0|0|0] * T_Book[-3,-1,1,3]
        + b3.dummy[0|0|0] * T_Home_to_CS[-3,-1,1,3]
        + b4.dummy[0|0|0] * T_CS_to_Destination[-3,-1,1,3]
        + b5.dummy[0|0|0] * D_Home_to_Workplace[-3,-1,1,3]
        + b6.dummy[0|0|0] * Location[-3,-1,1,3]
        + b7.dummy[0|0|0] * T_Home_to_PT[-3,-1,1,3]
        + b8.dummy[0|0|0] * PC_Parking_Number[-3,-1,1,3]
        + b9.dummy[0] * Density_green[-1,1]
        + b10.dummy[0|0|0] * Housing_type[-3,-1,1,3]
        + b11.dummy[0]* Housing_owership[-1,1]
        + b12.dummy[0|0|0] * Living_size[-3,-1,1,3]
        + b13.dummy[0|0|0] * Housing_cost[-3,-1,1,3]
        + b14.dummy[0|0|0]* Housing_built_time[-3,-1,1,3]
        + b15.dummy[0]* Safe[-1,1]/

        U(alt2) =
          b1 * CS_Price
        + b2 * T_Book
        + b3 * T_Home_to_CS
        + b4 * T_CS_to_Destination
        + b5 * D_Home_to_Workplace
        + b6 * Location
        + b7 * T_Home_to_PT
        + b8 * PC_Parking_Number
        + b9 * Density_green
        + b10* Housing_type
        + b11* Housing_owership
        + b12* Living_size
        + b13* Housing_cost
        + b14* Housing_built_time
        + b15* Safe $


2. Looking at the covariance matrix reported by Ngene for my sequential orthogonal design with 64 rows, you will have no issues dummy coding the attributes corresponding to b9, b10, b11, b12, and b15, but you will likely not be able to estimate dummy codes for the other attributes since they are not identifiable.

Michiel

Re: a possible fatal error in Ngene unlabeled orthogonal des

Post by J.wang » Sun Aug 12, 2018 4:41 pm

Dear Michiel:

Don’t you think that the “seq” syntax, which results in combinations like “3,3,-3,-3,3 etc.” or “1,1,1,-1,-1,-1 etc.”, is unreasonable?
These combinations directly cause the dummy/effects coding estimation to fail.

Re: a possible fatal error in Ngene unlabeled orthogonal des

Post by J.wang » Mon Aug 13, 2018 1:10 am

You gave this reason: "because 256 = 4*64 (where 4 refers to the number of attribute levels) such that the near-orthogonal design with 256 rows will have the same sequence characteristics as the 64 row design".
But why does 128 = 2*64 not have the same characteristics as the 64 row design?

So again: don’t you think that the “seq” syntax, which results in combinations like “3,3,-3,-3,3 etc.” or “1,1,1,-1,-1,-1 etc.”, is unreasonable?
These combinations directly cause the dummy/effects coding estimation to fail.

Re: a possible fatal error in Ngene unlabeled orthogonal des

Post by Michiel Bliemer » Mon Aug 13, 2018 9:47 am

It is the creation of 4-level attributes in a near-orthogonal design that is the issue here; there is no issue with 2-level attributes. That is why it goes wrong with 4*64 and not with 2*64: with 2*64 there is no such repetition.

Both the 'seq' and the 'sim' command may create these repetitions. They are an artefact of the near-orthogonal design, which has no correlations but does not guarantee that each attribute level combination appears equally often. For linearly coded attributes, for which orthogonal designs are typically used, it is the 'no correlation' part that is generally considered most important, and the constructed design is fine for estimating the model that you specified (and the combinations that you refer to are not unreasonable). However, it is a different story when you apply nonlinear dummy coding. It is the combination of our near-orthogonal design (which assumes a linearly coded model) with your estimation of a nonlinearly coded model that turns out to be problematic in this case. So I agree that in this case the near-orthogonal design is not suitable for estimating a model with dummy coding.

We will contact you directly to further discuss your specific issue, but in order to avoid such situations in the future we will consider not generating near-orthogonal designs in Ngene when the user specifies a model with nonlinear coding, or at least give a warning when the MNL properties show that the parameters are not identifiable.

Michiel

