How to track weights and gradients in a Keras custom training loop
I have defined the following custom model and training loop in Keras:
class CustomModel(keras.Model):
    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # Forward pass
            loss = self.compiled_loss(y, y_pred, regularization_losses=self.losses)
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        self.compiled_metrics.update_state(y, y_pred)
        return {m.name: m.result() for m in self.metrics}
And I am using the following code to train the model on a simple toy data set:
inputs = keras.layers.Input(shape=(1,))
hidden = keras.layers.Dense(1, activation='tanh')(inputs)
outputs = keras.layers.Dense(1)(hidden)

x = np.arange(0, 2*np.pi, 2*np.pi/100)
y = np.sin(x)

nnmodel = CustomModel(inputs, outputs)
nnmodel.compile(optimizer=keras.optimizers.SGD(learning_rate=0.1), loss="mse", metrics=["mae"])
nnmodel.fit(x, y, batch_size=100, epochs=2000)
I want to be able to see the values of the gradients and trainable_vars variables in the train_step function on each training step, and I am not sure how to do this.
I tried setting a breakpoint inside the train_step function in my Python IDE, expecting it to stop there for each epoch of training after I call model.fit(), but it never triggered. I also tried printing the values to the log after each epoch, but I am not sure how to achieve that either.
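For context, the kind of inspection I am after could be sketched like this (DebugModel and last_grads are hypothetical names of my own; run_eagerly=True is what lets plain-Python hooks, prints, and IDE breakpoints fire inside train_step):

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

class DebugModel(keras.Model):
    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)
            loss = self.compute_loss(y=y, y_pred=y_pred)
        gradients = tape.gradient(loss, self.trainable_variables)
        # With run_eagerly=True this is plain Python, so we can stash
        # (or print, or break on) the current gradient values.
        self.last_grads = [g.numpy() for g in gradients]
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))
        return {"loss": loss}

inputs = keras.layers.Input(shape=(1,))
outputs = keras.layers.Dense(1)(inputs)
model = DebugModel(inputs, outputs)
model.compile(optimizer="sgd", loss="mse", run_eagerly=True)
model.fit(np.array([[0.0], [1.0]]), np.array([[1.0], [0.0]]), epochs=1, verbose=0)
print(len(model.last_grads))  # one gradient array per trainable variable
```

Without run_eagerly=True the step is compiled into a graph, which would explain why a breakpoint there never fires.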
See also questions close to this topic

After converting code to exe: comparison between 'import x' and 'from x import y'
I have a simple question someone made me think of: is it better to use 'from x import y', so that after converting the code to an .exe it imports fewer things and performs better?
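For what it's worth, both forms load and execute the entire module either way; only the name bound in your namespace differs. A quick check, using the standard json module as a stand-in:

```python
import sys

from json import dumps  # binds one name, but still executes all of json

# The complete module object is cached in sys.modules either way, so
# 'from x import y' does not make the frozen .exe do less import work.
assert "json" in sys.modules
assert dumps is sys.modules["json"].dumps
print("from-import still loaded the whole module")
```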

Conversion of curl command to python requests
I am trying to convert a curl command to a Python request (using urllib). This is for the Prometheus push gateway: https://github.com/prometheus/pushgateway
Data can be posted to the link using curl in the following manner: some_metric{label="val1"} 42
Example: CountRec{studentname="John", subject="English", subjectcode="DU12345678999", system="pythonETL"} 40
But I am not getting how to post this, along with the value 40, using Python.
Following is the code I tried:
url = 'http://host:port/metrics/job/jobname/instance/host:port'
myRestRequestObj = urllib.request.Request(url)
myRestRequestObj.add_header('ContentType', 'application/json')
myRestRequestObj.get_method = lambda: 'PUT'
myStringJson = {'CountRec{"studentname":"John", "subject"="English","subjectcode"="DU12345678999","system"="pythonETL"} 40}'
data = urllib.parse.urlencode(myStringJson)
data1 = data.encode('ascii')
res = urllib.request.urlopen(myRestRequestObj, data1)
rest_txt = res.read().decode('utf8')
return ast.literal_eval(rest_txt)
But I am facing an error saying it is not a valid non-string sequence or mapping object.
Please help me solve this :) Thanks in advance!

I'm trying to solve some problems on Codeforces, but I always get "Wrong answer on test 1". What should I do?
This is the link to the problem I'm trying to get accepted: https://codeforces.com/problemset/problem/4/A I use Python 3.8 with VS Code as my text editor, and this is the code I wrote:
w = int(input("Insert kg: "))
if w % 2 == 0:
    if w == 2:
        print("NO")
    else:
        print("YES")
else:
    print("NO")
It works locally, but when I submit the file it says "Running test 1..." and then "Wrong answer on test 1". I don't know how to fix this!
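One hedged guess worth checking: Codeforces compares the program's stdout verbatim, so the prompt string passed to input() ("Insert kg: ") is itself printed and counted as part of the output. A sketch of the same logic with the check pulled into a helper (can_split is my own name):

```python
def can_split(w: int) -> str:
    # Two positive even parts exist iff w is even and greater than 2.
    return "YES" if w > 2 and w % 2 == 0 else "NO"

# On the judge, read with no prompt text at all:
# w = int(input())
# print(can_split(w))
print(can_split(8))  # YES
print(can_split(2))  # NO
```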

Tensorflow xception broadcast input array error
I'm using tensorflow-gpu 2.1, and am doing image classification on 850x550 images (3 channels).
The model (preliminary) looks like this (using the Sequential API):
input_tensor_def = Input(shape=(850, 550, 3))
model = Sequential()
xception = Xception(include_top=False, weights=None, input_tensor=input_tensor_def)
model.add(xception)
model.add(GlobalAvgPool2D())
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(2, activation='softmax'))
Using the functional API, it looks like this:
model_core = Xception(weights=None, include_top=False, input_tensor=input_tensor_def)
model_head = model_core.output
model_head = GlobalAvgPool2D()(model_head)
model_head = Flatten()(model_head)
model_head = Dense(512, activation='relu')(model_head)
model_head = Dense(2, activation='softmax')(model_head)
model = Model(inputs=model_core.input, outputs=model_head)
I'm getting the following error:
ValueError: could not broadcast input array from shape (850,550,3) into shape (850,550,3,3)
I'm really confused about why it's trying to treat the height dimension as the batch index.

Increase the size of a np.array
I ran a Conv1D on an X matrix of shape (2000, 20, 28): batch size 2000, 20 time steps and 28 features. I would now like to move to a Conv2D CNN and increase the dimensionality of my matrix to (2000, 20, 28, 10), with 10 elements for each of which I can build a (2000, 20, 28) X matrix. Similarly, I want a y array of shape (2000, 10), i.e. 10 of the (2000,) y arrays that I used for the LSTM and Conv1D networks.
The code I used to create the 20 timesteps from input dataX, dataY, was
def LSTM_create_dataset(dataX, dataY, seq_length, step):
    Xs, ys = [], []
    for i in range(0, len(dataX) - seq_length, step):
        v = dataX.iloc[i:(i + seq_length)].values
        Xs.append(v)
        ys.append(dataY.iloc[i + seq_length])
    return np.array(Xs), np.array(ys)
I use this function within the loop I prepared to create the data for my Conv2D NN:
for ric in rics:
    dataX, dataY = get_model_data(dbInput, dbList, ric, horiz, drop_rows, triggerUp1, triggerLoss, triggerUp2=0)
    dataX = get_model_cleanXset(dataX, trigger)  # clean X matrix for insufficient data
    Xs, ys = LSTM_create_dataset(dataX, dataY, seq_length, step)  # slide over seq_length for a 3D matrix
    Xconv.append(Xs)
    yconv.append(ys)
I obtain a (10, 2000, 20, 28) Xconv matrix instead of the targeted (2000, 20, 28, 10) output matrix X, and a (10, 2000) y matrix instead of the targeted (2000, 10). I know that I can easily fix yconv with yconv = np.reshape(yconv, (2000, 10)), but the same reshape for Xconv, Xconv = np.reshape(Xconv, (2000, 20, 28, 10)), seems hazardous, even erroneous, as I cannot visualize the output. How could I do it safely (or could you confirm my first attempt)? Thanks a lot in advance.
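If it helps frame the question: a plain reshape of (10, 2000, 20, 28) into (2000, 20, 28, 10) reinterprets the flat buffer and mixes samples from different rics, whereas moving the leading axis keeps each sample intact. A sketch with zero-filled stand-in data:

```python
import numpy as np

# Stand-ins for the stacked per-ric arrays described above.
Xconv = np.zeros((10, 2000, 20, 28))
yconv = np.zeros((10, 2000))

# Move the ric axis to the end instead of reshaping the raw buffer.
X = np.moveaxis(Xconv, 0, -1)  # -> (2000, 20, 28, 10)
y = yconv.T                    # -> (2000, 10)
assert X.shape == (2000, 20, 28, 10)
assert y.shape == (2000, 10)
```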
Issue with Shape in Python Neural Network
I have the following dataframe: https://raw.githubusercontent.com/markamcgown/Projects/master/df_model.csv
At "> 11 history = model.fit" in the last block of code below, I get the error "ValueError: Input 0 of layer sequential_8 is incompatible with the layer: : expected min_ndim=3, found ndim=2. Full shape received: [None, 26]"
Why is it expecting a minimum of 3 dimensions and how can I automate my code below to always have the right shape?
import keras
import pandas as pd
from tensorflow.keras.models import Sequential
from sklearn.model_selection import train_test_split
from keras.layers import Conv2D, MaxPooling2D, Conv1D, MaxPooling1D
from tensorflow.keras.layers import LSTM, Dense, Dropout, Bidirectional
from keras.layers import Dense, Dropout, Flatten, Reshape, GlobalAveragePooling1D

# Import raw data file with accelerometer data
path = r'C:\Users\<your_local_directory>\df_model.csv'
df_model = pd.read_csv(path)
df_model

y_column = 'Y_COLUMN'
x = df_model.drop(y_column, inplace=False, axis=1).values
y = df_model[y_column].values
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1, random_state=42)

def create_model(num_features, num_classes, dropout=0.3, loss="mean_absolute_error", optimizer="rmsprop"):
    model = Sequential()
    model.add(Conv1D(100, 10, activation='relu', input_shape=(None, num_features)))
    model.add(Conv1D(100, 10, activation='relu'))
    model.add(MaxPooling1D(2))
    model.add(Conv1D(160, 10, activation='relu'))
    model.add(Conv1D(160, 10, activation='relu'))
    model.add(LSTM(160, return_sequences=True))
    model.add(LSTM(160, return_sequences=True))
    model.add(GlobalAveragePooling1D())
    model.add(Dropout(dropout))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss=loss, metrics=["mean_absolute_error"], optimizer=optimizer)
    return model

DROPOUT = 0.4
LOSS = "huber_loss"
OPTIMIZER = "adam"
num_time_periods, num_features = x_train.shape[0], x_train.shape[1]
model = create_model(num_features, num_classes=len(set(df_model[y_column])), loss=LOSS, dropout=DROPOUT, optimizer=OPTIMIZER)

callbacks_list = [
    keras.callbacks.ModelCheckpoint(filepath='best_model.{epoch:02d}{val_loss:.2f}.h5', monitor='val_loss', save_best_only=True),
    keras.callbacks.EarlyStopping(monitor='accuracy', patience=1)
]
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Hyperparameters
BATCH_SIZE = 400
EPOCHS = 1

# Enable validation to use ModelCheckpoint and EarlyStopping callbacks.
history = model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=EPOCHS, callbacks=callbacks_list, validation_split=0.2, verbose=1)

plt.figure(figsize=(15, 4))
plt.plot(history.history['accuracy'], "g", label="Training Accuracy")
#plt.plot(history.history['val_accuracy'], "g", label="Accuracy of validation data")
plt.plot(history.history['loss'], "r", label="Training Loss")
#plt.plot(history.history['val_loss'], "r", label="Loss of validation data")
plt.title('Model Performance')
plt.ylabel('Accuracy & Loss')
plt.xlabel('Epoch')
plt.ylim(0)
plt.legend()
plt.show()
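For what it's worth, the error seems consistent with Conv1D expecting three-dimensional input (batch, timesteps, features), while a tabular frame yields only (batch, features). A sketch of adding the missing axis (the sizes are stand-ins for the 26-column frame; with a length-1 time axis, kernel sizes like 10 would still need rethinking, e.g. by building sliding windows instead):

```python
import numpy as np

x_train = np.random.rand(100, 26)             # stand-in for the tabular rows
x_train_3d = np.expand_dims(x_train, axis=1)  # -> (100, 1, 26): adds a time axis
print(x_train_3d.shape)
```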

How to calculate the number of MACs of a Conv, FC and depthwise-separable conv layer?
Does someone know how to calculate the number of MACs of AlexNet, and the number of MACs of a depthwise-separable conv layer, for MobileNet for example?
I don't have problems with the number of weights, actually, but I don't understand how to do it for the MACs.
Weights of AlexNet:
conv1: 11*11*3*96 + 96 = 34944
conv2: 5*5*96*256 + 256 = 614656
conv3: 3*3*256*384 + 384 = 885120
conv4: 3*3*384*384 + 384 = 1327488
conv5: 3*3*384*256 + 256 = 884992
fc1: 6*6*256*4096 + 4096 = 37752832
fc2: 4096*4096 + 4096 = 16781312
fc3: 4096*1000 + 1000 = 4097000
The total is 62378344 parameters.
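The MAC counts follow the same structure as the weight counts, just multiplied by the output feature-map area: for a conv layer, MACs = K*K*C_in*C_out*H_out*W_out; for an FC layer, simply in*out; and a depthwise-separable conv splits into a depthwise term plus a pointwise term. A sketch (the 55x55 output size for conv1 is the usual AlexNet figure):

```python
def conv_macs(k, c_in, c_out, h_out, w_out):
    # one k*k*c_in dot product per output channel per output position
    return k * k * c_in * c_out * h_out * w_out

def dw_separable_macs(k, c_in, c_out, h_out, w_out):
    depthwise = k * k * c_in * h_out * w_out  # one k*k filter per input channel
    pointwise = c_in * c_out * h_out * w_out  # 1x1 conv mixing the channels
    return depthwise + pointwise

def fc_macs(n_in, n_out):
    return n_in * n_out

print(conv_macs(11, 3, 96, 55, 55))  # AlexNet conv1: 105415200
```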

roc_auc_score score for PyTorch model using probability estimates
I'm currently doing something like this to get the AUC score for a binary classifier. Because I'm using y_predicted (0 or 1) instead of the probabilities, I'm finding that the AUC scores are equal to the detector accuracy.
outputs = model(inputs)
y_score, y_predicted = torch.max(outputs.data, 1)
y_true = targets
auc = roc_auc_score(y_true, y_predicted)
- Is it typical to find that the AUC score equals accuracy when constructing a ROC curve based on predicted labels (0 or 1) instead of a probability score?
- How can I use probability scores instead of the predicted labels? The roc_auc_score function expects probability scores, where [0, 0.5) corresponds to class 0 and [0.5, 1] corresponds to class 1. For example:
>>> import numpy as np
>>> from sklearn.metrics import roc_auc_score
>>> y_true = np.array([0, 0, 1, 1])
>>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> roc_auc_score(y_true, y_scores)
0.75
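In the PyTorch snippet above, the analogous step would be something like torch.softmax(outputs, dim=1)[:, 1] in place of torch.max. A numpy stand-in for the logits shows the idea (the logit values below are made up for illustration):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Stand-in for `outputs.data`: one row of two class logits per sample.
logits = np.array([[2.0, -1.0], [0.5, 0.2], [-1.0, 1.5], [-2.0, 2.5]])

# Softmax over the class axis, then keep the positive-class column.
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y_score = probs[:, 1]

y_true = np.array([0, 0, 1, 1])
print(roc_auc_score(y_true, y_score))  # ranks all positives above negatives
```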

Why does keras use so much memory?
I was recently trying to improve the memory efficiency of one of my models, and in testing I happened across this fact: any time you build anything in Keras, it uses up at least 3.1GB of memory.
For instance, I tried this piece of code here:
from time import sleep
from keras import Input
from keras.layers import Dense

def make_neural_network():
    input = Input(shape=(1,))
    output = Dense(1)(input)
    return output

sleep(3)
make_neural_network()
sleep(6)
Then I looked at my task manager: for the first 3 seconds I saw nothing, until the program started building the neural network, at which point my GPU memory spiked to 3.1GB for 6 seconds.
I was wondering why it takes up so much memory. Any tips to reduce it, if possible, would be greatly appreciated.
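One likely explanation, worth verifying for your setup: by default TensorFlow reserves nearly all free GPU memory as soon as the first op runs, regardless of model size. Enabling memory growth before any model is built makes it allocate incrementally instead:

```python
import tensorflow as tf

# Must run before any GPU op executes; afterwards the memory is already mapped.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)
```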

CuDNNGRU vs GRU on google Colab gpu
After changing the runtime option to GPU, I tried both CuDNNGRU and GRU. It is said that CuDNNGRU should be (much) faster than GRU in a GPU environment; however, I experienced the opposite. In my case, with GRU, %%time gives an estimate of about 7 hours, while with CuDNNGRU %%time estimates over 8 hours, or even 11 hours in another attempt. Am I using the Colab GPU incorrectly?
tf.compat.v1.keras.layers.CuDNNGRU
tf.keras.layers.GRU

Unable to add fourth convolutional layer
I'm pretty new to machine learning, and while following a tutorial on convolutional neural networks I wanted to experiment on my own with increasing accuracy. However, when I tried to add another convolutional and pooling layer to my model, it displayed an error message. This is before I added the layers:
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(62))
And this is after:
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(62))
This is the error message it gave me:
ValueError: Negative dimension size caused by subtracting 3 from 1 for '{{node conv2d_36/Conv2D}} = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], explicit_paddings=[], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](max_pooling2d_26/MaxPool, conv2d_36/Conv2D/ReadVariableOp)' with input shapes: [?,1,1,64], [3,3,64,64].
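Tracing the spatial size suggests why: with 'valid' padding a 3x3 conv shrinks each dimension by 2, and a 2x2 pool floor-halves it, so the [?,1,1,64] in the error is the 1x1 map left just before the final conv. A small bookkeeping sketch:

```python
def after_conv3x3(s):
    return s - 2   # 'valid' padding removes kernel_size - 1

def after_pool2x2(s):
    return s // 2  # floor division, matching MaxPooling2D's default

s = 28
for step in (after_conv3x3, after_pool2x2, after_conv3x3, after_pool2x2, after_conv3x3):
    s = step(s)
print(s)  # 3: the original net's last conv sees a 3x3 map

s = after_pool2x2(s)  # the added pool: 3 // 2 = 1
print(s)  # 1: a further 3x3 'valid' conv cannot fit, hence the error
```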

Is there any way of classifying images as good-resolution or bad-resolution in deep learning?
I want to classify my scanned images as good or bad based on their resolution (my scanned images are mostly pages of books and text-document images). Some images are blurred and some are of very low pixel count (if we enlarge the image, it breaks down into distinct pixel blocks). So is there any deep-learning method that returns some value based on the resolution of an image? (I'm looking for something like super-resolution on images.)
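Not a deep-learning method as such, but one classical signal often used as a baseline (commonly via OpenCV's cv2.Laplacian) is the variance of the image Laplacian: sharp text pages have strong edges and a high variance, blurred scans a low one. A numpy-only sketch with synthetic images:

```python
import numpy as np

def laplacian_variance(img):
    # 4-neighbour discrete Laplacian of a 2-D grayscale array
    lap = (-4 * img[1:-1, 1:-1]
           + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return lap.var()

sharp = np.zeros((32, 32)); sharp[:, 16:] = 1.0        # hard vertical edge
blurry = np.linspace(0, 1, 32)[None, :].repeat(32, 0)  # smooth gradient, no edges
print(laplacian_variance(sharp) > laplacian_variance(blurry))  # True
```

A threshold on this score (tuned on a few labelled scans) can serve as a quick good/bad classifier before reaching for a learned model.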