Python predict_proba

I have a question about a classification problem in machine learning, using the log_loss function from scikit-learn.

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
classifier = RandomForestClassifier()
classifier.fit(Xtrain, ytrain)
soft = classifier.predict_proba(Xtest)[:,1]
log_loss = log_loss(ytest, soft)

I would like to compute the log loss, but an error appears:

'numpy.float64' object is not callable

I think this problem may come from the fact that there are some zeros in the vector soft, but I don't know how to solve it.

s = 0
for x in soft : 
    if x == 0 : 
        s+=1
print(s)
>> 17729

Thanks in advance

1 answer

  • answered 2018-11-14 12:47 Bonlenfum

    Your issue here is not really with the inputs to log_loss, but with your variable naming. Everything in Python is an object, functions included, so in the line:

    log_loss = log_loss(ytest, soft)
    

    you assigned the result, a number (of type numpy.float64), to the name log_loss. Your variable now shadows the function, so any subsequent attempt to call log_loss as if it were still a function fails.

    from sklearn.metrics import log_loss
    print(log_loss)
    >>> <function log_loss at 0x7f9f692db1b8>
    
    log_loss = log_loss(ytest, soft)
    print(log_loss)
    >>> 0.11895972559889094
    log_loss = log_loss(ytest, soft)
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-40-b423b2324b92> in <module>()
    ----> 1 log_loss = log_loss(ytest, soft)
    
    TypeError: 'numpy.float64' object is not callable
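
The same behaviour can be reproduced without scikit-learn. The following is only a minimal sketch: the log_loss defined here is a hypothetical stand-in that returns a placeholder number, not the real metric, and the names saved, shadowed_is_callable, error_message and restored_is_callable are made up for illustration.

```python
# Hypothetical stand-in for sklearn.metrics.log_loss, so the sketch runs
# without scikit-learn. Only the name binding matters here.
def log_loss(y_true, y_prob):
    return 0.25  # placeholder value, not a real loss


saved = log_loss                  # keep a second reference to the function
log_loss = log_loss([1], [0.9])   # the name log_loss now refers to a float

shadowed_is_callable = callable(log_loss)  # the float cannot be called

try:
    log_loss([1], [0.9])          # same mistake as in the question
    error_message = None
except TypeError as exc:
    error_message = str(exc)      # "... object is not callable"

log_loss = saved                  # rebinding the name restores the function
restored_is_callable = callable(log_loss)

print(shadowed_is_callable, error_message, restored_is_callable)
```

In a notebook session with the real library, re-running the import statement rebinds the name to the function in the same way as the last assignment above.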
    

    The simplest resolution is not to call your variable log_loss. More generally, you might find that some level of namespacing helps, e.g. instead of

    from sklearn.metrics import log_loss
    ...
    loss = log_loss(ytest, soft)
    

    you could use

    from sklearn import metrics
    ...
    loss = metrics.log_loss(ytest, soft)
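
To see why the namespaced form is more robust, here is a small sketch that uses the standard-library math module as a stand-in for sklearn.metrics (an assumption made only so the example runs anywhere). The variable names sqrt and sqrt_again are illustrative.

```python
import math  # stand-in for `from sklearn import metrics`

# The result variable reuses the function's short name on purpose.
# Because the function is always reached through the module,
# this assignment does not hide it.
sqrt = math.sqrt(16.0)

# A second call still works: math.sqrt is looked up on the module,
# not through the local name sqrt (which is now a float).
sqrt_again = math.sqrt(sqrt)

print(sqrt, sqrt_again, callable(math.sqrt))
```

The module-qualified call only breaks if the module name itself (here math, in the answer metrics) gets reassigned, which is a much less likely accident than reusing a function's name for its result.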