Do gradients flow through operations performed on TensorFlow variables across session.run calls? Persistent graphs?
My understanding is that TensorFlow variables don't do this. Is there a way to maintain a partially computed graph persistently across session.run calls?
partial_run stores a partially computed graph, but it can only be used once and is not persistent. Variables, on the other hand, are persistent, but as far as I'm aware they do not store the graph of operations that led up to them.
Just to make my question more clear: if I have a matrix of TensorFlow variables and perform some operations on that matrix (say, using assign or scatter_update), would the operations that led up to the new matrix be stored in the computation graph and allow gradients to flow through?
I'm aware this would make TensorFlow far more dynamic than it probably is.
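For what it's worth, here is a minimal check I would run (sketched against the TF 1.x API through the tf.compat.v1 layer, so treat the exact imports as an assumption about the setup): ask tf.gradients for a path from a later read of the variable back to the op that assigned its value.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 2.x with the v1 compat layer
tf.disable_v2_behavior()

v = tf.Variable([1.0, 2.0])
assigned = tf.assign(v, v * 3.0)   # in-place update of the variable's state
z = tf.reduce_sum(v * v)           # a later graph that merely reads v

# A later read of v only sees its stored value; there is no graph edge from
# the assign op's output to this read, so tf.gradients finds no path.
grads = tf.gradients(z, [assigned])
print(grads)  # [None]: the ops that produced the value are not differentiated through
```

The None result is consistent with variables being pure state: the assignment's provenance is not part of any later read's graph.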
See also questions close to this topic

Detecting small objects with MobileNet and Faster R-CNN
I'm working on object detection of various sorts of animals using the Tensorflow Object Detection API. In the past I successfully applied MobileNet v1 to various settings and I used to be happy with the results.
Now, I have encountered a problem with a new species that is about 1/3 smaller than the animals I dealt with before. Visually, the animals look the same up to scale, meaning that the bounding boxes to be predicted are in the range of 5-15% of the image size rather than 20-30% as before.
I have the feeling there should be some hyperparameter I need to tweak to get things working again, but I struggle to find the right one in the pipeline config. I already experimented with tuning min_scale and max_scale of the anchor_generator towards smaller values, but with no success.
Interestingly, using Faster R-CNN works right away on the exact same data.
Any ideas what could be tried?
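For reference, here is a hedged sketch of the pipeline.config fragment being discussed (the field names come from the Object Detection API protos; the values are illustrative guesses, not tested settings). Besides the anchor scales, the input resolution matters for small boxes, since a 5% box at low resolution can fall below the feature-map stride:

```
model {
  ssd {
    image_resizer {
      fixed_shape_resizer {
        # a larger input keeps ~5% boxes above the feature-map stride
        height: 640
        width: 640
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        # shift the anchor range toward the smaller boxes to be predicted
        min_scale: 0.1
        max_scale: 0.8
      }
    }
  }
}
```

Faster R-CNN working out of the box is plausible under this reading: its region proposal network operates on a higher-resolution feature map than the coarser SSD heads.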

How can I install tensorflow in order to use tflearn modules?
I'm setting up a script and I need to use TensorFlow. How can I install it so that I can import it into my scripts? When I try:
import tensorflow as tf
I obtain:
ModuleNotFoundError: No module named 'tensorflow'

Memory issue with data splitting CNN
I'm facing a memory error when trying to run code that splits data into train, validation and test sets. I have about 50,000 images to use in a CNN; with fewer images it works, but I would like to keep all 50,000. Is there any way of doing that? I have read that it can be done with batches, but I don't know how to implement that. I'm kind of new to this, thank you so much.
N_CLASSES = 2

# Generates training, validation and testing files
def gen_data():
    X_train = []
    Y_train = []
    X_valid = []
    Y_valid = []
    X_test = []
    Y_test = []
    count = 0
    for i in species:
        # Samples location
        train_samples = join(datapath, 'traind/' + i)
        valid_samples = join(datapath, 'valid/' + i)
        test_samples = join(datapath, 'test/' + i)
        # Samples files
        train_files = listdir(train_samples)
        valid_files = listdir(valid_samples)
        test_files = listdir(test_samples)
        # Sorting according to the number
        train_files.sort(key=lambda f: int(''.join(filter(str.isdigit, f))))
        valid_files.sort(key=lambda f: int(''.join(filter(str.isdigit, f))))
        test_files.sort(key=lambda f: int(''.join(filter(str.isdigit, f))))
        for j in train_files:
            im = join(train_samples, j)
            img = cv2.imread(im, 1)
            img = cv2.resize(img, (416, 416))
            X_train.append(img)
            Y_train += [count]
        for k in test_files:
            im = join(test_samples, k)
            img = cv2.imread(im, 1)
            img = cv2.resize(img, (416, 416))
            X_test.append(img)
            Y_test += [count]
        for l in valid_files:
            im = join(valid_samples, l)
            img = cv2.imread(im, 1)
            img = cv2.resize(img, (416, 416))
            X_valid.append(img)
            Y_valid += [count]
        count += 1
    X_train = np.asarray(X_train).astype('float32')
    X_train /= 255
    Y_train = np.asarray(Y_train)
    X_valid = np.asarray(X_valid).astype('float32')
    X_valid /= 255
    Y_valid = np.asarray(Y_valid)
    X_test = np.asarray(X_test).astype('float32')
    X_test /= 255
    Y_test = np.asarray(Y_test)
    return X_train, Y_train, X_valid, Y_valid, X_test, Y_test

if __name__ == '__main__':
    # makedirs(final_1)
    # makedirs(final_2)
    # for i in species:
    #     makedirs(join(final_1, i))
    #     makedirs(join(final_2, i))
    # create_validation()
    # create_test()
    x_train, y_train, x_valid, y_valid, x_test, y_test = gen_data()
    y_train = np_utils.to_categorical(y_train, N_CLASSES)
    y_valid = np_utils.to_categorical(y_valid, N_CLASSES)
    y_test = np_utils.to_categorical(y_test, N_CLASSES)
    np.save('X_train.npy', x_train)
    np.save('Y_train.npy', y_train)
    np.save('X_valid.npy', x_valid)
    np.save('Y_valid.npy', y_valid)
    np.save('X_test.npy', x_test)
    np.save('Y_test.npy', y_test)
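The batching idea mentioned above can be sketched like this (with a stand-in loader instead of cv2 so the sketch is self-contained; in practice the loader would be the cv2.imread/cv2.resize pair from the code). Instead of materializing all 50,000 images at once, a generator yields only one batch at a time from the file list:

```python
import numpy as np

def load_image(path):
    # stand-in for cv2.imread + cv2.resize; replace with the real loader
    return np.zeros((416, 416, 3), dtype='float32')

def batch_generator(paths, labels, batch_size):
    """Yield (X, Y) batches, loading only batch_size images at a time."""
    for start in range(0, len(paths), batch_size):
        batch_paths = paths[start:start + batch_size]
        X = np.asarray([load_image(p) for p in batch_paths], dtype='float32') / 255
        Y = np.asarray(labels[start:start + batch_size])
        yield X, Y

# Example: 10 fake samples in batches of 4 produce batches of size 4, 4, 2
paths = ['img%d.png' % i for i in range(10)]
labels = [i % 2 for i in range(10)]
sizes = [x.shape[0] for x, _ in batch_generator(paths, labels, 4)]
```

A generator like this can be handed to Keras via fit_generator (or wrapped in a keras.utils.Sequence), so the full dataset never sits in memory at once.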

How to pass arguments to an already loaded TensorFlow graph (in memory)
I have an object detection model trained using the SSD MobileNet architecture. I am running inference in real time on this model using my webcam. The output is a bounding box overlaid on the image from the webcam.
I am accessing my web cam as follows:
import cv2

cap = cv2.VideoCapture(0)
Function to run inference in real time on the video feed:
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        while True:
            ret, image_np = cap.read()
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Each box represents a part of the image where a particular object was detected.
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            # Each score represents the level of confidence for each of the objects.
            # The score is shown on the result image, together with the class label.
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            # Actual detection.
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)
            # print(boxes)
            for i, box in enumerate(np.squeeze(boxes)):
                if np.squeeze(scores)[i] > 0.98:
                    print("ymin={}, xmin={}, ymax={}, xmax={}".format(
                        box[0] * height, box[1] * width, box[2] * height, box[3] * width))
                    break
            cv2.imshow('object detection', cv2.resize(image_np, (300, 300)))
            if cv2.waitKey(25) & 0xFF == ord('q'):
                cv2.destroyAllWindows()
                break
The moment an object is detected, my terminal shows its normalised coordinates.
This is perfect for a video feed because:
- the model is already loaded in memory;
- whenever a new object comes in front of the webcam, the loaded model predicts that object and outputs its coordinates.
I want the same functionality for images, i.e. I want:
- the model already loaded in memory;
- whenever a new argument arrives giving an image location, the loaded model predicts the objects in it and outputs their coordinates.
How should I do that by modifying the above code? I do not want a separate server to perform this task (as is done in TensorFlow Serving). How do I do it locally on my machine?
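The usual pattern for this is a long-lived process that pays the model-loading cost once and then loops over incoming image paths (read from stdin, a queue, or a watched directory). Below is a minimal sketch with placeholder load/predict functions standing in for the graph loading and the sess.run call; the names are invented for illustration, not from any API:

```python
def load_model():
    # stand-in for loading detection_graph and opening the tf.Session once
    return {'name': 'detector'}

def predict(model, image_path):
    # stand-in for: read the image, expand dims, sess.run([...], feed_dict=...)
    # here it returns a fixed fake box so the control flow is runnable
    return [(0.1, 0.2, 0.5, 0.6)]

def serve_paths(paths):
    """Load the model once, then run inference for every path that arrives."""
    model = load_model()          # heavy: done a single time
    results = {}
    for path in paths:            # cheap: one inference per image
        results[path] = predict(model, path)
    return results

boxes = serve_paths(['cat.jpg', 'dog.jpg'])
```

In the real version, the loop body would be the inside of the existing while True block with cap.read() replaced by cv2.imread(path); feeding paths through input() keeps everything local with no server.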

How to pass epoch and batch size when using label powerset in keras
I have a multi-label problem and, with some research, I was able to use Label Powerset in conjunction with ML algorithms. Now I want to use Label Powerset with a neural network, and according to the official website this is possible. But I am not able to understand how to modify my existing code to use it.
I want to know how we can pass epochs, batch_size, or any other parameter normally passed in the model's fit function.
Since I have a multi-label problem I have used sklearn's MultiLabelBinarizer, so each of my target rows looks like this: [1,0,0,1,0,0,0,0,0,0,0,0].
Lastly, could someone explain what KERAS_PARAMS and Keras() are in the line below:
def create_model_multiclass(input_dim, output_dim):
    # create model
    model = Sequential()
    model.add(Dense(8, input_dim=input_dim, activation='relu'))
    model.add(Dense(output_dim, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

clf = LabelPowerset(classifier=Keras(create_model_multiclass, True, KERAS_PARAMS), require_dense=[True, True])
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
Below is my existing neural network code
cnn_model = Sequential()
cnn_model.add(Dropout(0.5))
cnn_model.add(Conv1D(25, 7, activation='relu'))
cnn_model.add(MaxPool1D(2))
cnn_model.add(Dropout(0.2))
cnn_model.add(Conv1D(25, 7, activation='relu'))
cnn_model.add(MaxPool1D(2))
cnn_model.add(Flatten())
cnn_model.add(Dense(25, activation='relu'))
cnn_model.add(Dense(12, activation='softmax'))
cnn_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])
history = cnn_model.fit(X_train, y_train, validation_data=(X_test, y_test), batch_size=32, epochs=180, verbose=1)
plot_history(history)
predictions = cnn_model.predict(X_test)
I want my output rows to look exactly like [1,0,0,1,0,0,0,0,0,0,0,0], as later I will use my MultiLabelBinarizer to inverse-transform them.
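Regarding KERAS_PARAMS and Keras(): in scikit-multilearn, Keras is the wrapper class from skmultilearn.ext that makes a Keras model look like a scikit-learn classifier, and KERAS_PARAMS is, as far as I can tell (treat this as an assumption), just a dict of keyword arguments that the wrapper forwards to model.fit. That would make epochs and batch_size plain dict entries, forwarded through the ordinary **kwargs pattern, which this stub demonstrates without needing Keras installed:

```python
# assumption: the wrapper forwards these as keyword arguments to model.fit
KERAS_PARAMS = dict(epochs=180, batch_size=32, verbose=1)

class FakeModel:
    """Stand-in for a compiled Keras model, showing how the kwargs arrive in fit."""
    def fit(self, X, y, **kwargs):
        self.fit_kwargs = kwargs   # epochs/batch_size land here
        return self

model = FakeModel()
model.fit([[0, 1]], [[1, 0]], **KERAS_PARAMS)
```

With the real wrapper this would look like Keras(create_model_multiclass, True, KERAS_PARAMS), where the second argument flags the multi-class case.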

Loss functions reaching global minima
In deep learning, can we have train accuracy far less than 100% at the global minimum of the loss function?
I have coded a neural network in Python to classify cats and non-cats. I chose a 2-layer network. It gave a train accuracy of 100% and a test accuracy of 70%.
When I increased the number of layers to 4, the loss function got stuck at 0.6440, leading to a train accuracy of 65% and a test accuracy of 34% for many random initializations.
We expected the train accuracy of the 4-layer model to be 100%, but we are getting stuck at 65%. We think the loss function is reaching a global minimum, since for many random initializations we stagnate at a loss value of 0.6440. So, even though the loss function reaches the global minimum, why does the train accuracy not reach 100%? Hence our question: in deep learning, can we have train accuracy below 100% at the global minimum of the loss function?
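The answer hinges on separability: at the global minimum of cross-entropy on non-separable data, accuracy can be far below 100%. A tiny self-contained example (pure Python, searching a grid of candidate probabilities; note that the plateau of 0.6440 is suspiciously close to log 2 ≈ 0.693, the loss of a maximally uncertain predictor):

```python
import math

# Two training points with the SAME input but OPPOSITE labels: no model can
# fit both, so the data is non-separable by construction. The model outputs a
# single probability p for this input; the mean binary cross-entropy is
# (-log(p) - log(1 - p)) / 2, and we search a grid for its minimum.
def mean_loss(p):
    return (-math.log(p) - math.log(1 - p)) / 2

loss_at_min, p_at_min = min((mean_loss(i / 100), i / 100) for i in range(1, 100))
# The global minimum sits at p = 0.5 with loss log(2) ~ 0.693, yet only one of
# the two labels can ever be predicted correctly: train accuracy is stuck at 50%.
```

So yes: the global minimum of the loss only implies 100% train accuracy when the model class can actually separate the training labels.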

Does TensorFlow replicate nodes to compute the gradient?
I need to compute the gradient of a complicated loss function in Tensorflow.
Let's assume for simplicity that I have
J = tf.reduce_sum(A * b - target)
optimize = minimize(J)
Here b and target are constant tensors, and A is a square matrix (5000x5000) whose entries are the mean of 10 outputs of a small neural network (1 layer, 50 neurons).
The only parameters here are the neural network's.
For the forward computation of J, I have no problems. But when trying to optimize the parameters with sess.run([optimize]) I run into memory issues, namely this error:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[10,5000,5000] and type double on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu
My question is: why does this happen? Does TensorFlow duplicate computation nodes to compute the gradient? That could somehow explain this issue.
Extra info: my machine has 16GB of RAM.
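Part of the answer is plain arithmetic: the backward pass materializes intermediate gradient tensors, and the shape in the error message corresponds to one 5000x5000 slice per averaged network output. One such float64 buffer is already 2 GB, so a handful of them alive at once plausibly exhausts 16 GB:

```python
# Size of one gradient buffer with the shape from the OOM message.
elements = 10 * 5000 * 5000        # 250 million entries: one 5000x5000 slice
                                   # per averaged network output
bytes_per_elem = 8                 # dtype double
tensor_gb = elements * bytes_per_elem / 1e9
# ~2.0 GB for a single buffer; forward activations plus a few same-shaped
# temporaries during the backward pass approach 16 GB quickly.
```

TensorFlow does not replicate the forward nodes, but it does add a mirror-image set of gradient ops whose intermediates have these shapes, which has much the same memory effect.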

Obtaining code for analytic derivative on Matlab
I have one big analytical function, myfunc.m, for which I need to obtain the derivative as code, d_myfunc_dx.m. The problem constraints are that I need to produce code that I can refactor, so I can meet some quality requirements, and that the function itself has a few particularities, which are reproduced by the minimal example below.

function [out] = myfunc(x, a, max_iterations)
    count = 0;
    theta = 0;
    delta = 1;
    while (count < max_iterations && abs(delta) > eps)
        count = count + 1;
        delta = theta - cos(theta*x);
        theta = theta - delta;
    end
    if (theta < a)
        out = theta + x
    else
        out = x
    end
end
Ideally, the derivative function should look like this:

function [d_out_dx] = myfunc(x, a, max_iterations)
    count = 0;
    theta = 0;
    delta = 1;
    while (count < max_iterations && abs(delta) > eps)
        count = count + 1;
        delta = theta - cos(theta*x);
        theta = theta - delta;
    end
    % assuming convergence: theta = cos(theta*x):
    d_theta_dx = -theta*sin(theta*x);
    if (theta > a)
        d_out_dx = d_theta_dx + 1
    else
        d_out_dx = 1
    end
end
That is, it's readable, so I can refactor it, and it contains the branches that were present in the original function. The edge case theta == a is ignored, which is actually desirable. But I'm not expecting any tool to assume the convergence of the iterative method and perform an implicit derivative, so I'd be satisfied if I got something like:
function [d_out_dx] = myfunc(x, a, max_iterations)
    count = 0;
    theta = 0;
    delta = 1;
    d_theta_dx = 0;
    while (count < max_iterations && abs(delta) > eps)
        count = count + 1;
        delta = theta - cos(theta*x);
        d_delta_dx = d_theta_dx + theta*sin(theta*x)
        theta = theta - delta;
        d_theta_dx = d_theta_dx - d_delta_dx
    end
    if (theta > a)
        d_out_dx = d_theta_dx + 1
    else
        d_out_dx = 1
    end
end
It would be nice if I could find a free tool for this job.
I'm also required to have the code checked independently, so I'll know if anything went wrong; the reliability of the tool is not a concern. Because of the standards employed during refactoring, I won't worry about the safety of the generated code (i.e. protection against division by zero).
What I've tried so far:
- Hand-coding: Because the functions I need to work with are very large and the process is error-prone, I simply can't get correct code. Plus, I need to differentiate with respect to several variables, which would make the process far too time-consuming.
- ADiMat: Introduces functions such as adimat_opdiff_mult(t_theta, theta, t_x, x), which means I can't comply with my coding standards, and refactoring the output would be almost as much work as hand-writing the derivatives.
- ADiGator: It cannot parse conditionals, so I'd need to remove the branches from the target function and re-obtain the derivative for each case. It creates lots of intermediate variables whose names aren't helpful, but at least the result is readable.
- CasADi: As far as I've experimented with it, it creates text rather than code, and within this text the variables internal to the function lose their names and are replaced by @1, @2, @3 and so on. It would be a nightmare to recode by hand and difficult to refactor reliably with a custom script.
CasADi: define symbolic expression from elements of symbolic matrix expression
Related to this question, is it possible to define a symbolic expression in CasADi (using the Python wrapper) which depends on only part of a symbolic matrix expression X = MX.sym('X', 5)? For example, I'd like to write an expression like y = X[1:2] * X[2:3] and calculate the derivatives of y with respect to X[1:2] and X[2:3].
Implementation Details on Seq2Seq Model DL4J
I am trying to implement a Seq2Seq predictor model in DL4J. What I ultimately want is to use a time series of INPUT_SIZE data points to predict the following time series of OUTPUT_SIZE data points using this type of model. Each data point has numFeatures features. Now, DL4J has some example code explaining how to implement a very basic Seq2Seq model, but I don't understand it very well and am having difficulty extending that implementation to my case. The code I have to "fill in" is just the setup of the ComputationGraph, as once this is done properly I am confident I can do the rest.

ComputationGraphConfiguration configuration = new NeuralNetConfiguration.Builder()
    .weightInit(WeightInit.XAVIER)
    .updater(new Adam(0.25))
    .seed(42)
    .graphBuilder()
    .addInputs(/* FILL IN */)
    .setInputTypes(InputType.recurrent(numFeatures), /* More recurrent? */)
    .addLayer("encoder",
        new LSTM.Builder().nIn(numFeatures).nOut(hiddenLayerWidth)
            .activation(Activation.SIGMOID).build(),
        /* FILL IN */)
    .addVertex("lastTimeStep", new LastTimeStepVertex(/* Name of last input? */), "encoder")
    .addVertex("duplicateTimeStep", new DuplicateToTimeSeriesVertex("sumOut"), "lastTimeStep")
    .addLayer("decoder",
        new LSTM.Builder().nIn(numFeatures + hiddenLayerWidth).nOut(hiddenLayerWidth)
            .activation(Activation.SIGMOID).build(),
        /* FILL IN */, "duplicateTimeStep")
    .addLayer("output",
        new RnnOutputLayer.Builder().nIn(hiddenLayerWidth).nOut(numFeatures)
            .activation(Activation.SIGMOID)
            .lossFunction(LossFunctions.LossFunction.MSE).build(),
        "decoder")
    .setOutputs("output")
    .build();
ComputationGraph net = new ComputationGraph(configuration);
net.init();
net.setListeners(new ScoreIterationListener(1));
So far, I have tried a few things, like making there be INPUT_SIZE inputs, but I keep getting various errors at different stages when I try to fit this model with some data. And that doesn't even get to how I make the appropriate outputs. I have tried reading up on the ComputationGraph documentation here, to no avail. Any advice as to how to fill in this code with the appropriate inputs/outputs would be much appreciated. I am somewhat new to DL4J and am still wrapping my head around how Seq2Seq models work, so I apologize in advance if this is a slightly stupid question. The example Seq2Seq model can be found here, starting at line 88.
How to attach a tensor to a particular point in the computation graph in PyTorch?
As stated in the question, I need to attach a tensor to a particular point in the computation graph in PyTorch.
What I'm trying to do is this: while getting the outputs from all mini-batches, accumulate them in a list and, when one epoch finishes, calculate the mean. Then I need to calculate the loss according to that mean, so backpropagation must consider all of these operations.
I am able to do that when the training data is small (without detaching and storing). However, this is not possible when it gets bigger. If I don't detach the output tensors each time, I run out of GPU memory, and if I detach, I lose track of the output tensors in the computation graph. It looks like this is not possible no matter how many GPUs I have, since PyTorch only uses the first 4 GPUs for storing the output tensors if I don't detach them before saving them into a list, even if I assign more than 4 GPUs.
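The mean's linearity offers one way out of this memory bind. Because the loss here depends on the outputs only through their mean, the gradient of the loss equals the average of per-batch gradient contributions, so a detached first pass (to get the mean) plus a second pass that re-runs one mini-batch at a time reproduces the full-graph gradient. A plain-number sketch of that equivalence, no PyTorch API involved:

```python
# Toy setup: each "mini-batch output" is o_i = w * x_i and the epoch loss is
# L = mean(o)^2. Everything is plain floats, standing in for tensors.
w = 2.0
xs = [1.0, 2.0, 3.0]
N = len(xs)

# Pass 1: forward only (what .detach() would give), just to get the epoch mean.
m = sum(w * x for x in xs) / N

# Reference: the full-graph gradient dL/dw = 2 * m * mean(x).
full_grad = 2 * m * (sum(xs) / N)

# Pass 2: re-run one mini-batch at a time and accumulate its share of the
# gradient, so only one batch's graph needs to be alive at any moment.
acc_grad = 0.0
for x in xs:
    d_o_dw = x                      # this batch's d(output)/d(w)
    acc_grad += 2 * m * d_o_dw / N  # chain rule through the (linear) mean
```

In PyTorch terms, pass 2 would call backward() once per re-run mini-batch with the gradient of the loss with respect to the mean scaled by 1/N, letting .grad accumulate across batches.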
Any help is really appreciated.
Thanks.

How does weight update work in PyTorch's dynamic computation graph?
How does the weight update work in PyTorch code with a dynamic computation graph when weights are shared (= reused multiple times)?
import random
import torch

class DynamicNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we construct three nn.Linear instances that we will
        use in the forward pass.
        """
        super(DynamicNet, self).__init__()
        self.input_linear = torch.nn.Linear(D_in, H)
        self.middle_linear = torch.nn.Linear(H, H)
        self.output_linear = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        For the forward pass of the model, we randomly choose either 0, 1, 2, or 3
        and reuse the middle_linear Module that many times to compute hidden layer
        representations.

        Since each forward pass builds a dynamic computation graph, we can use
        normal Python control-flow operators like loops or conditional statements
        when defining the forward pass of the model.

        Here we also see that it is perfectly safe to reuse the same Module many
        times when defining a computational graph. This is a big improvement from
        Lua Torch, where each Module could be used only once.
        """
        h_relu = self.input_linear(x).clamp(min=0)
        for _ in range(random.randint(0, 3)):
            h_relu = self.middle_linear(h_relu).clamp(min=0)
        y_pred = self.output_linear(h_relu)
        return y_pred
I want to know what happens to the middle_linear weights at each backward pass, given that the layer is used multiple times in a single step.
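Each reuse of middle_linear adds its own term to the layer's .grad, and autograd sums these contributions before the optimizer applies a single update to the shared parameter. A hand-computed sketch of that accumulation for a weight used twice (plain floats, no torch required):

```python
# Scalar model reusing one weight twice: y = w * (w * x).
w, x = 3.0, 2.0
h = w * x               # first use of the shared weight
y = w * h               # second use of the same weight

# Autograd-style accumulation: one gradient contribution per use, summed into
# the single .grad slot the shared parameter owns.
grad_second_use = h     # dy/dw with h held fixed
grad_first_use = w * x  # dy/dh * dh/dw
w_grad = grad_second_use + grad_first_use

# Analytic check: y = w**2 * x, so dy/dw = 2 * w * x.
```

In the module case, middle_linear.weight.grad after backward() is this same kind of sum over however many iterations the random loop ran in that step; the optimizer then updates the weight once using the accumulated total.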