weighted NMF in sklearn
Is there an option in sklearn's NMF to optimize a weighted cost function with some weight matrix W?
In the documentation I don't see an option for this: http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html
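As far as the documented API shows, sklearn's NMF exposes no weight option. If an elementwise weight matrix is needed, the standard weighted multiplicative updates can be hand-rolled; the sketch below is not part of sklearn, and the names and placeholder data are illustrative:

```python
import numpy as np

def weighted_nmf(V, M, rank, n_iter=200, eps=1e-9):
    """Minimise ||sqrt(M) * (V - W @ H)||_F^2 with multiplicative updates.

    V: nonnegative data matrix; M: nonnegative weight matrix, same shape as V.
    Hand-rolled sketch of the classic weighted-NMF update rules.
    """
    rng = np.random.RandomState(0)
    n, m = V.shape
    W = rng.rand(n, rank)
    H = rng.rand(rank, m)
    for _ in range(n_iter):
        # Weighted multiplicative updates; eps avoids division by zero.
        H *= (W.T @ (M * V)) / (W.T @ (M * (W @ H)) + eps)
        W *= ((M * V) @ H.T) / ((M * (W @ H)) @ H.T + eps)
    return W, H
```

Setting M to all ones recovers plain (unweighted) NMF with the Frobenius loss.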
See also questions close to this topic

Create a dictionary using comprehension with the contents of a tuple as the key and contents from a function as the values
I have a function that will return an integer value based on a string input. For example:
ID = GetIDFromName('name')
Because I have inputs from multiple devices, I was hoping to make a dictionary whose keys are the items of a tuple, using the same function to provide the corresponding values via a dictionary comprehension. For example:
myDict = {'name1': 0, 'name2': 1}
I've tried using:
names = ('name1', 'name2')
myDict = {names: GetIDFromName(name) for name in names}
But this gives me
{('name1', 'name2'): 1}
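The comprehension above uses the whole tuple `names` as the key; the loop variable `name` should be the key instead. A minimal sketch, with a stand-in for the real `GetIDFromName`:

```python
def GetIDFromName(name):
    # Stand-in for the real lookup function described in the question.
    return {'name1': 0, 'name2': 1}[name]

names = ('name1', 'name2')
# Key on the loop variable, not the tuple itself.
myDict = {name: GetIDFromName(name) for name in names}
# myDict == {'name1': 0, 'name2': 1}
```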

Fast way to replace elements by zeros corresponding to zeros of another array
Suppose we have two numpy arrays
a = [[1, 2, 0], [2, 0, 0], [3, 1, 0]]
b = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
The goal is to zero out the elements of b at the indices where a is 0. That is, we want to get an array
[ [1, 2, 0], [4, 0, 0], [7, 8, 0] ]
What is a fast way to achieve this?
I thought about generating a mask from a first and then replacing the values of b using this mask, but got lost on how to do this.
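A boolean mask does this in one step; a sketch with NumPy:

```python
import numpy as np

a = np.array([[1, 2, 0], [2, 0, 0], [3, 1, 0]])
b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Keep b where a is nonzero, write 0 where a is zero.
result = np.where(a == 0, 0, b)
# result == [[1, 2, 0], [4, 0, 0], [7, 8, 0]]
```

An in-place alternative is `b[a == 0] = 0`, which modifies b directly instead of allocating a new array.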

Use one .map call to change values for multiple Series/columns in dataframe using one dictionary
If I have a dataframe like so:
id  latitude  longitude
a   99        48
b   97        44
c   96        52
and I have a dictionary mapping the ids to new latitude+longitude values
new_lat_lon = { 'a':(99, 58), 'b':(...), 'c':(...) }
is there a quick and dirty way to use .map to change the latitude/longitude columns at once?
e.g.
df[['latitude', 'longitude']] = df['id'].map(new_lat_lon)
this doesn't work of course, but if there's a way I'd like to know. I am aware that I can simply separate the dictionary into two separate ones, but I am interested if there is a more compact solution. If I need to modify the dictionary a bit (e.g. change the tuples to lists or something) that's cool too, as long as it's one dictionary. Thank you!
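One way that keeps a single dictionary is to map the ids to tuples and then expand the resulting Series of tuples into two columns. A sketch, assuming the tuples are (latitude, longitude); the values for 'b' and 'c' below are placeholders since the question elides them:

```python
import pandas as pd

df = pd.DataFrame({'id': ['a', 'b', 'c'],
                   'latitude': [99, 97, 96],
                   'longitude': [48, 44, 52]})
# Placeholder values for 'b' and 'c'; only 'a' is given in the question.
new_lat_lon = {'a': (99, 58), 'b': (90, 40), 'c': (91, 50)}

# .map returns a Series of tuples; .tolist() expands them into two columns.
df[['latitude', 'longitude']] = pd.DataFrame(
    df['id'].map(new_lat_lon).tolist(), index=df.index)
```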

Can't import sklearn
I have both numpy and scipy installed. I have also installed sklearn but can't import it; please look at the attached image.

Reverse support vector machine: calculating the predictions
I was wondering, given the regression coefficients of an SVM regression model, whether one could calculate the predictions 'by hand' made by that model. More precisely, suppose:
svc = SVR(kernel='rbf', epsilon=0.3, gamma=0.7, C=64)
svc.fit(X_train, y_train)
then you can obtain the predictions very easily by using
y_pred = svc.predict(X_test)
I was wondering how one obtains this result by calculating it directly, starting with the decision function

f(x) = sum_i alpha_i * K(x_i, x) + b

where K is the RBF kernel function, b is the intercept and the alpha_i are the dual coefficients. Because I work with the RBF kernel, I started like this:
def RBF(x, z, gamma, axis=None):
    return np.exp(-gamma * np.linalg.norm(x - z, axis=axis) ** 2)

A = np.zeros(len(svc.support_))
for i in range(len(svc.support_)):
    A[i] = RBF(X_train[i], X_test[0], 0.7)
Then I calculated
np.sum(svc._dual_coef_*A)+svc.intercept_
However, the result of this calculation isn't the same as the first element of
y_pred
. I suspect my reasoning isn't entirely correct and/or my code isn't what it should be, so apologies if this isn't the right board to ask. I've been staring blindly at this problem for the past 2 hours, so any help would be greatly appreciated!
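For reference, a self-contained sketch that reproduces svc.predict by hand: the kernel must be evaluated against the support vectors (svc.support_vectors_), not the raw training rows, and the RBF exponent carries a minus sign. The toy data below is a placeholder:

```python
import numpy as np
from sklearn.svm import SVR

# Placeholder data standing in for the question's X_train / y_train / X_test.
rng = np.random.RandomState(0)
X_train = rng.rand(20, 3)
y_train = rng.rand(20)
X_test = rng.rand(5, 3)

gamma = 0.7
svc = SVR(kernel='rbf', epsilon=0.3, gamma=gamma, C=64)
svc.fit(X_train, y_train)

def manual_predict(x):
    # f(x) = sum_i alpha_i * K(sv_i, x) + b, with K the RBF kernel.
    k = np.exp(-gamma * np.linalg.norm(svc.support_vectors_ - x, axis=1) ** 2)
    return (svc.dual_coef_ @ k + svc.intercept_).item()
```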
label encode multiple categorical values in a row
Each row of my y_train consists of several ingredients separated by commas. It is basically a multiclass classification problem. My y_train looks like this:
df['ingredients_str'].head()
0    romaine lettuce,black olives,grape tomatoes
1    plain flour,ground pepper,salt,tomatoes
2    eggs,pepper,salt,mayonaise,cooking oil
3    water,vegetable oil,wheat,salt
4    black pepper,shallots,cornflour,cayenne
Name: ingredients_str, dtype: object
I tried sklearn's LabelEncoder to encode the categorical variables.
from sklearn import preprocessing

le = preprocessing.LabelEncoder()
le.fit_transform(df['ingredients_str'])
0    28560
1    26783
2    10595
3    38379
4     2798
Name: encoding, dtype: int64
How can I properly encode that column?
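LabelEncoder treats each full comma-separated string as a single label, which is why every distinct recipe gets its own integer. If each ingredient should be its own label, MultiLabelBinarizer is one option; a sketch with a small stand-in frame:

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Stand-in for the question's dataframe (first two rows).
df = pd.DataFrame({'ingredients_str': [
    'romaine lettuce,black olives,grape tomatoes',
    'plain flour,ground pepper,salt,tomatoes',
]})

# Split each row into a list of ingredients, then one-hot encode per ingredient.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(df['ingredients_str'].str.split(','))
# y has one column per distinct ingredient; mlb.classes_ names the columns.
```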

How to get frequencies of topics of NMF in sklearn
I am now using NMF to generate topics. My code is shown below. However, I do not know how to get the frequency of each topic. Can anyone help me? Thank you!
def fit_tfidf(documents):
    tfidf = TfidfVectorizer(input='content', stop_words='english',
                            use_idf=True, ngram_range=NGRAM_RANGE,
                            lowercase=True, max_features=MAX_FEATURES, min_df=1)
    tfidf_matrix = tfidf.fit_transform(documents.values).toarray()
    tfidf_feature_names = np.array(tfidf.get_feature_names())
    tfidf_reverse_lookup = {word: idx for idx, word in enumerate(tfidf_feature_names)}
    return tfidf_matrix, tfidf_reverse_lookup, tfidf_feature_names

def vectorization(documents):
    if VECTORIZER == 'tfidf':
        vec_matrix, vec_reverse_lookup, vec_feature_names = fit_tfidf(documents)
    if VECTORIZER == 'bow':
        vec_matrix, vec_reverse_lookup, vec_feature_names = fit_bow(documents)
    return vec_matrix, vec_reverse_lookup, vec_feature_names

def nmf_model(vec_matrix, vec_reverse_lookup, vec_feature_names, NUM_TOPICS):
    topic_words = []
    nmf = NMF(n_components=NUM_TOPICS, random_state=3).fit(vec_matrix)
    for topic in nmf.components_:
        # [::-1] sorts descending so the highest-weighted words come first.
        word_idx = np.argsort(topic)[::-1][0:N_TOPIC_WORDS]
        topic_words.append([vec_feature_names[i] for i in word_idx])
    return topic_words
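The document-topic weights W = nmf.transform(vec_matrix) give each document's loading on each topic, so one common proxy for topic frequency is either the column sums of W or a count of argmax assignments. A sketch on a random nonnegative matrix standing in for the tf-idf matrix:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.RandomState(0)
vec_matrix = rng.rand(50, 20)  # placeholder for the real tf-idf matrix

nmf = NMF(n_components=5, random_state=3, max_iter=500)
W = nmf.fit_transform(vec_matrix)  # shape (n_documents, n_topics)

# Option 1: total weight per topic across all documents.
topic_weight = W.sum(axis=0)
# Option 2: how many documents have each topic as their dominant one.
topic_counts = np.bincount(W.argmax(axis=1), minlength=5)
```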

NMF extract basis
I'm dealing with a dataset of 1603 genes X 16 samples by performing NMF analysis with the NMF R package.
As you can see in the basismap plot (link below), my genes are clustered into 3 metagene groups.
How can I extract the annotation row info? (i.e. the basis associated with each gene)
I tried with
extractFeatures()
. But I obtain only some of the genes that belong to a metagene (I suppose the most highly associated ones).
visualization for output of topic modelling
For topic modelling I use the method called NMF (non-negative matrix factorisation). Now I want to visualise the results. Can someone suggest visualisation techniques for topic modelling?
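One simple option is a bar chart of the top-weighted terms per NMF component; a sketch with matplotlib, where the function name and data are illustrative:

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

def plot_topics(components, feature_names, n_words=5):
    """One horizontal bar chart per NMF topic, showing its top-weighted terms."""
    fig, axes = plt.subplots(1, len(components),
                             figsize=(4 * len(components), 3))
    for ax, topic in zip(np.atleast_1d(axes), components):
        idx = np.argsort(topic)[::-1][:n_words]  # highest weights first
        ax.barh([feature_names[i] for i in idx][::-1], topic[idx][::-1])
    fig.tight_layout()
    return fig
```

Calling `plot_topics(nmf.components_, vec_feature_names)` and then `fig.savefig('topics.png')` writes the chart to disk; interactive tools such as pyLDAvis (built for LDA but sometimes adapted to NMF) are another direction worth exploring.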