R2 of an Unlabeled Experiment
Posted: Fri Oct 09, 2020 1:00 pm
Hi everyone!
I understand that when calibrating a model using logistic regression with data from a binary unlabeled choice experiment, I should not include an intercept in the model specification, since there should be no alternative-specific constant. I usually use glm in R to fit my models, and then I use the deviance and the null.deviance reported by the glm object to calculate the pseudo R2 (1 - deviance/null.deviance). I noticed that the model without an intercept gives a much higher R2 than the same model with the intercept. This seems to happen because the null.deviance reported by glm is higher for the model without the intercept. So I was wondering:
1) How does glm calculate the null deviance for a model without an intercept? What is the "null model" in that case, since it doesn't seem to be the intercept-only model?
2) Should I fit an intercept-only model in a separate glm call, and use its deviance as the null deviance to compare against the deviance of the model I am calibrating? Or does that not make sense, since the intercept wouldn't have any meaning here?
I'm not very familiar with Stata, but when I tried it today and forced the model to have no intercept, the software didn't report any R2 (which it normally does). Unfortunately I don't have access to NLOGIT, so I can't check whether it reports an R2 for models without constants.
In the end, my question is whether it's possible to calculate an R2 for models derived from unlabeled experiments and, if so, how it's done.
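To make the question concrete, here is my understanding of the two "null deviances" involved, sketched in Python rather than R (the simulated data are made up; the key assumption, which I'd like someone to confirm, is that glm's null model for a no-intercept specification fixes every coefficient at zero, so each fitted probability is the inverse logit of 0, i.e. 0.5):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical binary choices, deliberately imbalanced (share of 1s ~ 0.7).
y = rng.binomial(1, 0.7, size=200)
n = y.size

def binomial_deviance(y, p):
    # Deviance of a Bernoulli model with fitted probabilities p
    # (up to the saturated-model term, which is zero for binary y).
    return -2 * np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Null model WITH an intercept: fitted probability is the sample share.
dev_null_intercept = binomial_deviance(y, np.full(n, y.mean()))

# Null model WITHOUT an intercept (all coefficients zero): the linear
# predictor is 0, so the fitted probability is logit^{-1}(0) = 0.5.
dev_null_no_intercept = binomial_deviance(y, np.full(n, 0.5))

print(dev_null_intercept)      # smaller, unless the shares are exactly 50/50
print(dev_null_no_intercept)   # equals 2 * n * log(2)
```

If that's right, the no-intercept null deviance is always 2·n·log(2), which is at least as large as the intercept-only null deviance (and strictly larger whenever the choice shares aren't exactly 50/50), which would mechanically inflate the pseudo R2 I was computing.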
I sincerely apologize if this type of question isn't allowed on the forum, but I'm genuinely stuck and I can't think of a better place on the internet to find an answer.
Thanks in advance to anyone who can help me with this one!