Normalized Mutual Information in TensorFlow

Is it possible to implement normalized mutual information (NMI) in TensorFlow, and if so, would the result be differentiable? Suppose I have predictions P and labels Y in two separate tensors. Is there an easy way to compute normalized mutual information between them?

I want to do something similar to this:

https://course.ccs.neu.edu/cs6140sp15/7_locality_cluster/Assignment-6/NMI.pdf
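
To make the question concrete, here is a minimal NumPy sketch of the quantity I have in mind. It is only an illustration under some assumptions of mine: P is an (n, k) matrix of soft assignments whose rows sum to 1, Y is an (n, c) one-hot label matrix, and normalized mutual information is taken in the common form NMI = 2 * I(Y; C) / (H(Y) + H(C)). The function name `soft_nmi` and the `eps` smoothing term are placeholders I made up, not part of any library.

```python
import numpy as np

def soft_nmi(P, Y, eps=1e-12):
    """Soft NMI between predictions P (n, k) and one-hot labels Y (n, c).

    Hypothetical helper: assumes rows of P sum to 1 and Y is one-hot.
    """
    P = np.asarray(P, dtype=np.float64)
    Y = np.asarray(Y, dtype=np.float64)
    n = P.shape[0]

    # Soft contingency table, normalized into a joint distribution
    # over (class, cluster) pairs. Shape: (c, k).
    joint = Y.T @ P / n

    # Marginal distributions over classes and over clusters.
    p_y = joint.sum(axis=1)
    p_c = joint.sum(axis=0)

    # Entropies of the two marginals (eps avoids log(0)).
    h_y = -np.sum(p_y * np.log(p_y + eps))
    h_c = -np.sum(p_c * np.log(p_c + eps))

    # Mutual information: sum of joint * log(joint / (p_y * p_c)).
    log_ratio = (np.log(joint + eps)
                 - np.log(p_y[:, None] + eps)
                 - np.log(p_c[None, :] + eps))
    mi = np.sum(joint * log_ratio)

    # eps in the denominator guards the degenerate single-class case.
    return 2.0 * mi / (h_y + h_c + eps)


# Two balanced classes, four samples.
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
perfect = Y.astype(float)        # predictions identical to the labels
uninformative = np.full((4, 2), 0.5)  # every sample split evenly

print(soft_nmi(perfect, Y))
print(soft_nmi(uninformative, Y))
```

With predictions identical to the labels the value should come out close to 1, and with uniform predictions close to 0. Every step is a matrix product, a sum, or a logarithm, so I would expect the same computation to be expressible with `tf.matmul`, `tf.reduce_sum`, and `tf.math.log`, but I have not verified how the gradients behave.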