Clustering Histograms with (Py)Spark for Data Reduction

I want to cluster probability distributions that are given as histograms. I have a dataset with more than 10 million observations. Each observation has 5 different histograms (> 100 features). The goal of the clustering is data reduction: I want to create a codebook of prototypes with which I can represent the distributions of the initial dataset.
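To make the goal concrete, here is a toy sketch of what I mean by a codebook. It is plain Python rather than Spark, and it assumes squared Euclidean distance on normalized histograms with ordinary k-means (Lloyd's algorithm) purely as a placeholder; neither is a choice I have settled on. All function names (`normalize`, `nearest`, `build_codebook`) and the toy data are made up for illustration.

```python
import random

def normalize(hist):
    """Scale raw bin counts so they sum to 1 (a probability distribution)."""
    total = sum(hist)
    return [count / total for count in hist]

def squared_distance(a, b):
    """Squared Euclidean distance (placeholder metric, an assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(hist, codebook):
    """Index of the prototype closest to hist."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(hist, codebook[i]))

def build_codebook(hists, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k prototype histograms."""
    rng = random.Random(seed)
    codebook = [list(h) for h in rng.sample(hists, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for h in hists:
            groups[nearest(h, codebook)].append(h)
        for i, members in enumerate(groups):
            if members:  # keep the old prototype if a cluster is empty
                codebook[i] = [sum(col) / len(members)
                               for col in zip(*members)]
    return codebook

# Toy data: four 4-bin histograms forming two obvious groups.
raw = [[9, 1, 0, 0], [8, 2, 0, 0], [0, 0, 1, 9], [0, 0, 2, 8]]
hists = [normalize(h) for h in raw]

codebook = build_codebook(hists, k=2)
# Data reduction: each histogram is replaced by the index of its prototype.
codes = [nearest(h, codebook) for h in hists]
```

Each prototype is the bin-wise mean of normalized histograms, so it is itself a valid distribution. In the real setting, the codes plus the small codebook would replace the full set of histograms.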

I am not certain what the best method for this is. Ideas are:

How would you rate the ideas? Are they feasible? Am I overlooking a clearly faster or simpler solution? Any hints would be greatly appreciated!