Specifying Reference Category for Outcome Variable in Multinomial Logit Using SKlearn's LogisticRegression

I am trying to fit a multinomial logit model using the LogisticRegression class from sklearn.

My outcome (y) has 4 levels. I need to specify one of these levels as the reference category (or baseline). Does LogisticRegression provide a way of specifying this reference category?

1 answer

  • answered 2022-04-28 16:43 njwfish

    For multiclass problems, sklearn's LogisticRegression uses either a one-vs-rest scheme or a softmax (multinomial) parameterization, depending on whether you specify multinomial (multi_class="multinomial"). In either case it does not fit the model relative to a reference category; instead it estimates a separate vector of coefficients for each output class. If you use the multinomial specification, you can select the coefficients (and intercept) of the class you would like as the reference and subtract them from those of every class. Softmax probabilities are unchanged when the same vector is subtracted from all classes, so the reference class ends up with all-zero coefficients and the predictions stay the same, which should recover a solution equivalent to the one you seem to want.

    See the docs for how to specify multinomial: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
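
A minimal sketch of the subtraction described above, assuming synthetic data from make_classification and a hypothetical choice of class 0 as the reference (neither comes from the question). It relies on the lbfgs solver fitting the softmax model for multiclass targets, which is the default behaviour in recent scikit-learn versions; older versions may need multi_class="multinomial" passed explicitly.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic 4-class data, purely for illustration (an assumption, not the asker's data).
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Fit the softmax (multinomial) model: one coefficient vector per class.
clf = LogisticRegression(solver="lbfgs", max_iter=1000).fit(X, y)

# Hypothetical reference category: the class labelled 0.
ref = 0
ref_idx = int(np.flatnonzero(clf.classes_ == ref)[0])

# Subtract the reference class's parameters from every class.
# The reference row becomes all zeros; other rows are contrasts against it.
coef_ref = clf.coef_ - clf.coef_[ref_idx]
intercept_ref = clf.intercept_ - clf.intercept_[ref_idx]

# Check that the re-expressed parameters give the same softmax probabilities.
logits = X @ coef_ref.T + intercept_ref
logits -= logits.max(axis=1, keepdims=True)  # numerical stability
probs = np.exp(logits)
probs /= probs.sum(axis=1, keepdims=True)
same_probs = np.allclose(probs, clf.predict_proba(X))
```

Each non-reference row of coef_ref can then be read as log-odds of that class versus the reference. Note that LogisticRegression applies an L2 penalty by default (C=1.0), so these values will generally differ from an unpenalized reference-coded fit; a much larger C would weaken the penalty.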
