Do gradients flow through operations performed on TensorFlow variables across session.run calls? Persistent graphs?
My understanding is that TensorFlow variables don't do this. Is there a way to maintain a partially computed graph persistently across session.run calls?
partial_run stores a partially computed graph, but it can only be used once and is not persistent. Variables, on the other hand, are persistent, but as far as I'm aware they do not store the graph of operations that led up to them.
Just to make my question more clear: if I have a matrix of TensorFlow variables and perform some operations on that matrix (say, using assign or scatter_update), would the operations that led up to the new matrix be stored in the computation graph and allow gradients to flow through?
I'm aware this would make TensorFlow far more dynamic than it probably is.
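For what it's worth, here is a minimal check I would run (sketched against the TF 1.x API through the tf.compat.v1 layer, so treat the exact imports as an assumption about the setup): ask tf.gradients for a path from a later read of the variable back to the op that assigned its value.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 2.x with the v1 compat layer
tf.disable_v2_behavior()

v = tf.Variable([1.0, 2.0])
assigned = tf.assign(v, v * 3.0)   # in-place update of the variable's state
z = tf.reduce_sum(v * v)           # a later graph that merely reads v

# A later read of v only sees its stored value; there is no graph edge from
# the assign op's output to this read, so tf.gradients finds no path.
grads = tf.gradients(z, [assigned])
print(grads)  # [None]: the ops that produced the value are not differentiated through
```

The None result is consistent with variables being pure state: the assignment's provenance is not part of any later read's graph.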
See also questions close to this topic

Detecting small objects with MobileNet and Faster R-CNN
I'm working on object detection of various sorts of animals using the Tensorflow Object Detection API. In the past I successfully applied MobileNet v1 to various settings and I used to be happy with the results.
Now, I have encountered a problem with a new species that is about 1/3 smaller than the animals I dealt with before. Visually, the animals look the same up to scale, meaning that the bounding boxes to be predicted are in the range of 5-15% of the image size rather than 20-30% as before.
I have the feeling there should be some hyperparameter I need to tweak to get things working again, but I struggle to find the right one in the pipeline config. I already experimented with tuning min_scale and max_scale of the anchor_generator towards smaller values, but with no success.
Interestingly, using Faster R-CNN works right away on the exact same data.
Any ideas what could be tried?
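For reference, here is a hedged sketch of the pipeline.config fragment being discussed (the field names come from the Object Detection API protos; the values are illustrative guesses, not tested settings). Besides the anchor scales, the input resolution matters for small boxes, since a 5% box at low resolution can fall below the feature-map stride:

```
model {
  ssd {
    image_resizer {
      fixed_shape_resizer {
        # a larger input keeps ~5% boxes above the feature-map stride
        height: 640
        width: 640
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        # shift the anchor range toward the smaller boxes to be predicted
        min_scale: 0.1
        max_scale: 0.8
      }
    }
  }
}
```

Faster R-CNN working out of the box is plausible under this reading: its region proposal network operates on a higher-resolution feature map than the coarser SSD heads.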

How can I install tensorflow in order to use tflearn modules?
I'm setting up a script and I need to use TensorFlow. How can I install it so that I can import it into my scripts? When I try:
import tensorflow as tf
I obtain:
ModuleNotFoundError: No module named 'tensorflow'

Memory issue with data splitting CNN
I'm facing a memory error when trying to run code that splits data into train, validation and test sets. I have about 50,000 images to use in a CNN; with fewer images it works, but I would like to keep all 50,000. Is there any way of doing that? I have read that it can be done with batches, but I don't know how to implement that. I'm kind of new to this, thank you so much.
N_CLASSES = 2

# Generates training, validation and testing files
def gen_data():
    X_train = []
    Y_train = []
    X_valid = []
    Y_valid = []
    X_test = []
    Y_test = []
    count = 0
    for i in species:
        # Samples location
        train_samples = join(datapath, 'traind/' + i)
        valid_samples = join(datapath, 'valid/' + i)
        test_samples = join(datapath, 'test/' + i)
        # Samples files
        train_files = listdir(train_samples)
        valid_files = listdir(valid_samples)
        test_files = listdir(test_samples)
        # Sorting according to the number
        train_files.sort(key=lambda f: int(''.join(filter(str.isdigit, f))))
        valid_files.sort(key=lambda f: int(''.join(filter(str.isdigit, f))))
        test_files.sort(key=lambda f: int(''.join(filter(str.isdigit, f))))
        for j in train_files:
            im = join(train_samples, j)
            img = cv2.imread(im, 1)
            img = cv2.resize(img, (416, 416))
            X_train.append(img)
            Y_train += [count]
        for k in test_files:
            im = join(test_samples, k)
            img = cv2.imread(im, 1)
            img = cv2.resize(img, (416, 416))
            X_test.append(img)
            Y_test += [count]
        for l in valid_files:
            im = join(valid_samples, l)
            img = cv2.imread(im, 1)
            img = cv2.resize(img, (416, 416))
            X_valid.append(img)
            Y_valid += [count]
        count += 1
    X_train = np.asarray(X_train).astype('float32')
    X_train /= 255
    Y_train = np.asarray(Y_train)
    X_valid = np.asarray(X_valid).astype('float32')
    X_valid /= 255
    Y_valid = np.asarray(Y_valid)
    X_test = np.asarray(X_test).astype('float32')
    X_test /= 255
    Y_test = np.asarray(Y_test)
    return X_train, Y_train, X_valid, Y_valid, X_test, Y_test

if __name__ == '__main__':
    # makedirs(final_1)
    # makedirs(final_2)
    # for i in species:
    #     makedirs(join(final_1, i))
    #     makedirs(join(final_2, i))
    # create_validation()
    # create_test()
    x_train, y_train, x_valid, y_valid, x_test, y_test = gen_data()
    y_train = np_utils.to_categorical(y_train, N_CLASSES)
    y_valid = np_utils.to_categorical(y_valid, N_CLASSES)
    y_test = np_utils.to_categorical(y_test, N_CLASSES)
    np.save('X_train.npy', x_train)
    np.save('Y_train.npy', y_train)
    np.save('X_valid.npy', x_valid)
    np.save('Y_valid.npy', y_valid)
    np.save('X_test.npy', x_test)
    np.save('Y_test.npy', y_test)
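The batching idea mentioned above can be sketched like this (with a stand-in loader instead of cv2 so the sketch is self-contained; in practice the loader would be the cv2.imread/cv2.resize pair from the code). Instead of materializing all 50,000 images at once, a generator yields only one batch at a time from the file list:

```python
import numpy as np

def load_image(path):
    # stand-in for cv2.imread + cv2.resize; replace with the real loader
    return np.zeros((416, 416, 3), dtype='float32')

def batch_generator(paths, labels, batch_size):
    """Yield (X, Y) batches, loading only batch_size images at a time."""
    for start in range(0, len(paths), batch_size):
        batch_paths = paths[start:start + batch_size]
        X = np.asarray([load_image(p) for p in batch_paths], dtype='float32') / 255
        Y = np.asarray(labels[start:start + batch_size])
        yield X, Y

# Example: 10 fake samples in batches of 4 produce batches of size 4, 4, 2
paths = ['img%d.png' % i for i in range(10)]
labels = [i % 2 for i in range(10)]
sizes = [x.shape[0] for x, _ in batch_generator(paths, labels, 4)]
```

A generator like this can be handed to Keras via fit_generator (or wrapped in a keras.utils.Sequence), so the full dataset never sits in memory at once.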

How to pass arguments to an already loaded TensorFlow graph (in memory)
I have an object detection model trained using the SSD MobileNet architecture. I am running inference in real time on this model using my webcam. The output is a bounding box overlaid on the image from the webcam.
I am accessing my web cam as follows:
import cv2

cap = cv2.VideoCapture(0)
Function to run inference in real time on the video feed:
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        while True:
            ret, image_np = cap.read()
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Each box represents a part of the image where a particular object was detected.
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            # Each score represents the level of confidence for each of the objects.
            # The score is shown on the result image, together with the class label.
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            # Actual detection.
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)
            # print(boxes)
            for i, box in enumerate(np.squeeze(boxes)):
                if np.squeeze(scores)[i] > 0.98:
                    print("ymin={}, xmin={}, ymax={}, xmax={}".format(
                        box[0] * height, box[1] * width, box[2] * height, box[3] * width))
                    break
            cv2.imshow('object detection', cv2.resize(image_np, (300, 300)))
            if cv2.waitKey(25) & 0xFF == ord('q'):
                cv2.destroyAllWindows()
                break
The moment an object is detected, my terminal shows its normalised coordinates.
This is perfect for a video feed because:
- the model is already loaded in memory;
- whenever a new object comes in front of the webcam, the loaded model predicts that object and outputs its coordinates.
I want the same functionality for images, i.e. I want:
- the model already loaded in memory;
- whenever a new argument arrives giving an image location, the loaded model predicts the objects in it and outputs their coordinates.
How should I do that by modifying the above code? I do not want a separate server to perform this task (as is done in TensorFlow Serving). How do I do it locally on my machine?
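The usual pattern for this is a long-lived process that pays the model-loading cost once and then loops over incoming image paths (read from stdin, a queue, or a watched directory). Below is a minimal sketch with placeholder load/predict functions standing in for the graph loading and the sess.run call; the names are invented for illustration, not from any API:

```python
def load_model():
    # stand-in for loading detection_graph and opening the tf.Session once
    return {'name': 'detector'}

def predict(model, image_path):
    # stand-in for: read the image, expand dims, sess.run([...], feed_dict=...)
    # here it returns a fixed fake box so the control flow is runnable
    return [(0.1, 0.2, 0.5, 0.6)]

def serve_paths(paths):
    """Load the model once, then run inference for every path that arrives."""
    model = load_model()          # heavy: done a single time
    results = {}
    for path in paths:            # cheap: one inference per image
        results[path] = predict(model, path)
    return results

boxes = serve_paths(['cat.jpg', 'dog.jpg'])
```

In the real version, the loop body would be the inside of the existing while True block with cap.read() replaced by cv2.imread(path); feeding paths through input() keeps everything local with no server.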

How to pass epoch and batch size when using label powerset in keras
I have a multi-label problem and, with some research, I was able to use Label Powerset in conjunction with ML algorithms. Now I want to use Label Powerset with a neural network, and according to the official website this is possible. But I am not able to understand how to modify my existing code to use it.
I want to know how we can pass epochs, batch_size, or any other parameter normally passed in the model's fit function.
Since I have a multi-label problem I have used sklearn's MultiLabelBinarizer, so each of my target rows looks like this: [1,0,0,1,0,0,0,0,0,0,0,0].
Lastly, could someone explain what KERAS_PARAMS and Keras() are in the line below:
def create_model_multiclass(input_dim, output_dim):
    # create model
    model = Sequential()
    model.add(Dense(8, input_dim=input_dim, activation='relu'))
    model.add(Dense(output_dim, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

clf = LabelPowerset(classifier=Keras(create_model_multiclass, True, KERAS_PARAMS), require_dense=[True, True])
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
Below is my existing neural network code
cnn_model = Sequential()
cnn_model.add(Dropout(0.5))
cnn_model.add(Conv1D(25, 7, activation='relu'))
cnn_model.add(MaxPool1D(2))
cnn_model.add(Dropout(0.2))
cnn_model.add(Conv1D(25, 7, activation='relu'))
cnn_model.add(MaxPool1D(2))
cnn_model.add(Flatten())
cnn_model.add(Dense(25, activation='relu'))
cnn_model.add(Dense(12, activation='softmax'))
cnn_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])
history = cnn_model.fit(X_train, y_train, validation_data=(X_test, y_test), batch_size=32, epochs=180, verbose=1)
plot_history(history)
predictions = cnn_model.predict(X_test)
I want my output rows to look exactly like [1,0,0,1,0,0,0,0,0,0,0,0], as later I will use my MultiLabelBinarizer to inverse-transform them.
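Regarding KERAS_PARAMS and Keras(): in scikit-multilearn, Keras is the wrapper class from skmultilearn.ext that makes a Keras model look like a scikit-learn classifier, and KERAS_PARAMS is, as far as I can tell (treat this as an assumption), just a dict of keyword arguments that the wrapper forwards to model.fit. That would make epochs and batch_size plain dict entries, forwarded through the ordinary **kwargs pattern, which this stub demonstrates without needing Keras installed:

```python
# assumption: the wrapper forwards these as keyword arguments to model.fit
KERAS_PARAMS = dict(epochs=180, batch_size=32, verbose=1)

class FakeModel:
    """Stand-in for a compiled Keras model, showing how the kwargs arrive in fit."""
    def fit(self, X, y, **kwargs):
        self.fit_kwargs = kwargs   # epochs/batch_size land here
        return self

model = FakeModel()
model.fit([[0, 1]], [[1, 0]], **KERAS_PARAMS)
```

With the real wrapper this would look like Keras(create_model_multiclass, True, KERAS_PARAMS), where the second argument flags the multi-class case.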

Loss functions reaching global minima
In deep learning, can we have train accuracy far less than 100% at the global minimum of the loss function?
I have coded a neural network in Python to classify cats and non-cats. I chose a 2-layer network. It gave a train accuracy of 100% and a test accuracy of 70%.
When I increased the number of layers to 4, the loss function got stuck at 0.6440, leading to a train accuracy of 65% and a test accuracy of 34% for many random initializations.
We expected the train accuracy of the 4-layer model to be 100%, but we are getting stuck at 65%. We think the loss function is reaching a global minimum, since for many random initializations we stagnate at a loss value of 0.6440. So, even though the loss function reaches the global minimum, why does the train accuracy not reach 100%? Hence our question: in deep learning, can we have train accuracy below 100% at the global minimum of the loss function?
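The answer hinges on separability: at the global minimum of cross-entropy on non-separable data, accuracy can be far below 100%. A tiny self-contained example (pure Python, searching a grid of candidate probabilities; note that the plateau of 0.6440 is suspiciously close to log 2 ≈ 0.693, the loss of a maximally uncertain predictor):

```python
import math

# Two training points with the SAME input but OPPOSITE labels: no model can
# fit both, so the data is non-separable by construction. The model outputs a
# single probability p for this input; the mean binary cross-entropy is
# (-log(p) - log(1 - p)) / 2, and we search a grid for its minimum.
def mean_loss(p):
    return (-math.log(p) - math.log(1 - p)) / 2

loss_at_min, p_at_min = min((mean_loss(i / 100), i / 100) for i in range(1, 100))
# The global minimum sits at p = 0.5 with loss log(2) ~ 0.693, yet only one of
# the two labels can ever be predicted correctly: train accuracy is stuck at 50%.
```

So yes: the global minimum of the loss only implies 100% train accuracy when the model class can actually separate the training labels.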

Does TensorFlow replicate nodes to compute the gradient?
I need to compute the gradient of a complicated loss function in Tensorflow.
Let's assume for simplicity that I have
J = tf.reduce_sum(A * b - target)
optimize = minimize(J)
Here b and target are constant tensors, and A is a square matrix (5000x5000) whose entries are the mean of 10 outputs of a small neural network (1 layer, 50 neurons).
The only parameters here are the neural network's.
For the forward computation of J, I have no problems. But when trying to optimize the parameters with sess.run([optimize]) I run into memory issues, namely this error:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[10,5000,5000] and type double on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu
My question is: why does this happen? Does TensorFlow duplicate computation nodes to compute the gradient? That could somehow explain this issue.
Extra info: my machine has 16GB of RAM.
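Part of the answer is plain arithmetic: the backward pass materializes intermediate gradient tensors, and the shape in the error message corresponds to one 5000x5000 slice per averaged network output. One such float64 buffer is already 2 GB, so a handful of them alive at once plausibly exhausts 16 GB:

```python
# Size of one gradient buffer with the shape from the OOM message.
elements = 10 * 5000 * 5000        # 250 million entries: one 5000x5000 slice
                                   # per averaged network output
bytes_per_elem = 8                 # dtype double
tensor_gb = elements * bytes_per_elem / 1e9
# ~2.0 GB for a single buffer; forward activations plus a few same-shaped
# temporaries during the backward pass approach 16 GB quickly.
```

TensorFlow does not replicate the forward nodes, but it does add a mirror-image set of gradient ops whose intermediates have these shapes, which has much the same memory effect.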

Obtaining code for analytic derivative on Matlab
I have one big analytical function, myfunc.m, for which I need to obtain the derivative as code, d_myfunc_dx.m. The problem constraints are that I need to produce code that I can refactor, so I can meet some quality requirements, and that the function itself has a few particularities, which are reproduced by the minimal example below.

function [out] = myfunc(x, a, max_iterations)
    count = 0;
    theta = 0;
    delta = 1;
    while (count < max_iterations && abs(delta) > eps)
        count = count + 1;
        delta = theta - cos(theta*x);
        theta = theta - delta;
    end
    if (theta < a)
        out = theta + x
    else
        out = x
    end
end
Ideally, the derivative function should look like this:

function [d_out_dx] = myfunc(x, a, max_iterations)
    count = 0;
    theta = 0;
    delta = 1;
    while (count < max_iterations && abs(delta) > eps)
        count = count + 1;
        delta = theta - cos(theta*x);
        theta = theta - delta;
    end
    % assuming convergence: theta = cos(theta*x):
    d_theta_dx = -theta*sin(theta*x);
    if (theta > a)
        d_out_dx = d_theta_dx + 1
    else
        d_out_dx = 1
    end
end
That is, it's readable, so I can refactor it, and it contains the branches that were present in the original function. The edge case theta == a is ignored, which is actually desirable. But I'm not expecting any tool to assume the convergence of the iterative method and perform an implicit derivative, so I'd be satisfied if I got something like:
function [d_out_dx] = myfunc(x, a, max_iterations)
    count = 0;
    theta = 0;
    delta = 1;
    d_theta_dx = 0;
    while (count < max_iterations && abs(delta) > eps)
        count = count + 1;
        delta = theta - cos(theta*x);
        d_delta_dx = d_theta_dx + theta*sin(theta*x)
        theta = theta - delta;
        d_theta_dx = d_theta_dx - d_delta_dx
    end
    if (theta > a)
        d_out_dx = d_theta_dx + 1
    else
        d_out_dx = 1
    end
end
It would be nice if I could find a free tool for this job.
I'm also required to have the code checked independently, so I'll know if anything went wrong; the reliability of the tool is not a concern. Because of the standards employed during refactoring, I won't worry about the safety of the generated code (i.e. protection against division by zero).
What I've tried so far:
- Hand-coding: Because the functions I need to work with are very large and the process is error-prone, I simply can't get correct code. Plus, I need to differentiate with respect to several variables, which would make the process far too time-consuming.
- ADiMat: Introduces functions such as adimat_opdiff_mult(t_theta, theta, t_x, x), which means I can't comply with my coding standards, and refactoring the output would be almost as much work as hand-writing the derivatives.
- ADiGator: It cannot parse conditionals, so I'd need to remove the branches from the target function and re-obtain the derivative for each case. It creates lots of intermediate variables whose names aren't helpful, but at least the result is readable.
- CasADi: As far as I've experimented with it, it creates text rather than code, and within this text the variables internal to the function lose their names and are replaced by @1, @2, @3 and so on. It would be a nightmare to recode by hand and difficult to refactor reliably with a custom script.
CasADi: define symbolic expression from elements of symbolic matrix expression
Related to this question, is it possible to define a symbolic expression in CasADi (using the Python wrapper) which depends on only part of a symbolic matrix expression X = MX.sym('X', 5)? For example, I'd like to write an expression like y = X[1:2] * X[2:3] and calculate the derivatives of y with respect to X[1:2] and X[2:3].
Implementation Details on Seq2Seq Model DL4J
I am trying to implement a Seq2Seq predictor model in DL4J. What I ultimately want is to use a time series of INPUT_SIZE data points to predict the following time series of OUTPUT_SIZE data points using this type of model. Each data point has numFeatures features. Now, DL4J has some example code explaining how to implement a very basic Seq2Seq model, but I don't understand it very well and am having difficulty extending that implementation to my case. The code I have to "fill in" is just the setup of the ComputationGraph, as once this is done properly I am confident I can do the rest.

ComputationGraphConfiguration configuration = new NeuralNetConfiguration.Builder()
    .weightInit(WeightInit.XAVIER)
    .updater(new Adam(0.25))
    .seed(42)
    .graphBuilder()
    .addInputs(/* FILL IN */)
    .setInputTypes(InputType.recurrent(numFeatures), /* More recurrent? */)
    .addLayer("encoder",
        new LSTM.Builder().nIn(numFeatures).nOut(hiddenLayerWidth)
            .activation(Activation.SIGMOID).build(),
        /* FILL IN */)
    .addVertex("lastTimeStep", new LastTimeStepVertex(/* Name of last input? */), "encoder")
    .addVertex("duplicateTimeStep", new DuplicateToTimeSeriesVertex("sumOut"), "lastTimeStep")
    .addLayer("decoder",
        new LSTM.Builder().nIn(numFeatures + hiddenLayerWidth).nOut(hiddenLayerWidth)
            .activation(Activation.SIGMOID).build(),
        /* FILL IN */, "duplicateTimeStep")
    .addLayer("output",
        new RnnOutputLayer.Builder().nIn(hiddenLayerWidth).nOut(numFeatures)
            .activation(Activation.SIGMOID)
            .lossFunction(LossFunctions.LossFunction.MSE).build(),
        "decoder")
    .setOutputs("output")
    .build();
ComputationGraph net = new ComputationGraph(configuration);
net.init();
net.setListeners(new ScoreIterationListener(1));
So far, I have tried a few things, like making there be INPUT_SIZE inputs, but I keep getting various errors at different stages when I try to fit this model with some data. And that doesn't even get to how I make the appropriate outputs. I have tried reading up on the ComputationGraph documentation here, to no avail. Any advice as to how to fill in this code with the appropriate inputs/outputs would be much appreciated. I am somewhat new to DL4J and am still wrapping my head around how Seq2Seq models work, so I apologize in advance if this is a slightly stupid question. The example Seq2Seq model can be found here, starting at line 88.
How to attach a tensor to a particular point in the computation graph in PyTorch?
As stated in the question, I need to attach a tensor to a particular point in the computation graph in PyTorch.
What I'm trying to do is this: while getting the outputs from all mini-batches, accumulate them in a list and, when one epoch finishes, calculate the mean. Then I need to calculate the loss according to that mean, so backpropagation must consider all of these operations.
I am able to do that when the training data is small (without detaching and storing). However, this is not possible when it gets bigger. If I don't detach the output tensors each time, I run out of GPU memory, and if I detach, I lose track of the output tensors in the computation graph. It looks like this is not possible no matter how many GPUs I have, since PyTorch only uses the first 4 GPUs for storing the output tensors if I don't detach them before saving them into a list, even if I assign more than 4 GPUs.
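The mean's linearity offers one way out of this memory bind. Because the loss here depends on the outputs only through their mean, the gradient of the loss equals the average of per-batch gradient contributions, so a detached first pass (to get the mean) plus a second pass that re-runs one mini-batch at a time reproduces the full-graph gradient. A plain-number sketch of that equivalence, no PyTorch API involved:

```python
# Toy setup: each "mini-batch output" is o_i = w * x_i and the epoch loss is
# L = mean(o)^2. Everything is plain floats, standing in for tensors.
w = 2.0
xs = [1.0, 2.0, 3.0]
N = len(xs)

# Pass 1: forward only (what .detach() would give), just to get the epoch mean.
m = sum(w * x for x in xs) / N

# Reference: the full-graph gradient dL/dw = 2 * m * mean(x).
full_grad = 2 * m * (sum(xs) / N)

# Pass 2: re-run one mini-batch at a time and accumulate its share of the
# gradient, so only one batch's graph needs to be alive at any moment.
acc_grad = 0.0
for x in xs:
    d_o_dw = x                      # this batch's d(output)/d(w)
    acc_grad += 2 * m * d_o_dw / N  # chain rule through the (linear) mean
```

In PyTorch terms, pass 2 would call backward() once per re-run mini-batch with the gradient of the loss with respect to the mean scaled by 1/N, letting .grad accumulate across batches.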
Any help is really appreciated.
Thanks.

How does weight update work in PyTorch's dynamic computation graph?
How does the weight update work in PyTorch code with a dynamic computation graph when weights are shared (= reused multiple times)?
import random
import torch

class DynamicNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we construct three nn.Linear instances that we will
        use in the forward pass.
        """
        super(DynamicNet, self).__init__()
        self.input_linear = torch.nn.Linear(D_in, H)
        self.middle_linear = torch.nn.Linear(H, H)
        self.output_linear = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        For the forward pass of the model, we randomly choose either 0, 1, 2, or 3
        and reuse the middle_linear Module that many times to compute hidden layer
        representations.

        Since each forward pass builds a dynamic computation graph, we can use
        normal Python control-flow operators like loops or conditional statements
        when defining the forward pass of the model.

        Here we also see that it is perfectly safe to reuse the same Module many
        times when defining a computational graph. This is a big improvement from
        Lua Torch, where each Module could be used only once.
        """
        h_relu = self.input_linear(x).clamp(min=0)
        for _ in range(random.randint(0, 3)):
            h_relu = self.middle_linear(h_relu).clamp(min=0)
        y_pred = self.output_linear(h_relu)
        return y_pred
I want to know what happens to the middle_linear weights at each backward pass, given that the layer is used multiple times in a single step.
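Each reuse of middle_linear adds its own term to the layer's .grad, and autograd sums these contributions before the optimizer applies a single update to the shared parameter. A hand-computed sketch of that accumulation for a weight used twice (plain floats, no torch required):

```python
# Scalar model reusing one weight twice: y = w * (w * x).
w, x = 3.0, 2.0
h = w * x               # first use of the shared weight
y = w * h               # second use of the same weight

# Autograd-style accumulation: one gradient contribution per use, summed into
# the single .grad slot the shared parameter owns.
grad_second_use = h     # dy/dw with h held fixed
grad_first_use = w * x  # dy/dh * dh/dw
w_grad = grad_second_use + grad_first_use

# Analytic check: y = w**2 * x, so dy/dw = 2 * w * x.
```

In the module case, middle_linear.weight.grad after backward() is this same kind of sum over however many iterations the random loop ran in that step; the optimizer then updates the weight once using the accumulated total.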