Is there any difference between matmul and usual multiplication of tensors?
I am confused about the difference between multiplying two tensors with `*` and with matmul. Below is my code:

```python
import torch
torch.manual_seed(7)
features = torch.randn((2, 5))
weights = torch.randn_like(features)
```

Here, I want to multiply weights and features. One way to do it is as follows:

```python
print(torch.sum(features * weights))
```
Output:

```
tensor(2.6123)
```

Another way to do it is using matmul:

```python
print(torch.mm(features, weights.view((5, 2))))
```

but here the output is:

```
tensor([[2.8089, 4.6439],
        [2.3988, 1.9238]])
```
What I don't understand is why matmul and the usual multiplication give different outputs when both multiply the same tensors. Am I doing anything wrong here?

Edit: When I use features of shape (1, 5), the `*` and matmul outputs are the same, but they are not the same when the shape is (2, 5).
1 answer

When you use `*`, the multiplication is elementwise; when you use `torch.mm`, it is matrix multiplication.

Example:

```python
a = torch.rand(2, 5)
b = torch.rand(2, 5)
result = a * b
```

`result` will be shaped the same as `a` or `b`, i.e. `(2, 5)`, whereas the operation

```python
result = torch.mm(a, b)
```

will give a size-mismatch error, as this is proper matrix multiplication (as we study in linear algebra) and `a.shape[1] != b.shape[0]`. When you apply the `view` operation inside `torch.mm`, you are reshaping to make the dimensions match.

In the special case where the shape in some particular dimension is 1, the operation becomes a dot product, and hence `sum(a * b)` is the same as `mm(a, b.view(5, 1))`.
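The special case can be checked directly with the (1, 5) shape from the question's edit. This is a small sketch of my own, not part of the original answer:

```python
import torch

torch.manual_seed(7)
a = torch.randn((1, 5))
b = torch.randn((1, 5))

# Elementwise multiply then sum: the dot product of the two row vectors.
s_elementwise = torch.sum(a * b)

# Proper matrix multiplication of (1, 5) by (5, 1): also a dot product.
s_mm = torch.mm(a, b.view(5, 1)).squeeze()

print(torch.isclose(s_elementwise, s_mm))  # tensor(True)
```

With a (2, 5) shape, `mm` instead produces a full (2, 2) matrix of row-by-column dot products, which is why the two results differ there.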
See also questions close to this topic

Pandas: combine two Series by all unique combinations

Let's say I have the following series:

```
0    A
1    B
2    C
dtype: object

0    1
1    2
2    3
3    4
dtype: int64
```

How can I merge them to create a dataframe with every possible combination of values, like this:

```
   letter number
0       A      1
1       A      2
2       A      3
3       A      4
4       B      1
5       B      2
6       B      3
7       B      4
8       C      1
9       C      2
10      C      3
11      C      4
```
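One common way to build this Cartesian product is to merge on a constant temporary key. A sketch (the names `letters` and `numbers` are mine, not from the question):

```python
import pandas as pd

letters = pd.Series(["A", "B", "C"], name="letter")
numbers = pd.Series([1, 2, 3, 4], name="number")

# Merge on a constant temporary key to get every (letter, number) pair,
# then drop the key column again.
combos = (
    letters.to_frame().assign(key=1)
    .merge(numbers.to_frame().assign(key=1), on="key")
    .drop(columns="key")
)
print(combos)
```

On pandas 1.2+ the temporary key is unnecessary: `letters.to_frame().merge(numbers.to_frame(), how="cross")` gives the same result.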

Reading a json file in python throws error
I have a json file named data.json. I am trying to read it by following different methods.
with open('data.json', 'r') as f: data_dict = json.load(f) with open("data.json") as data: data_json = data.read() z = json.loads(data_json) data= json.loads(open('data.json').read())
In all of these three ways, i get the same error:
> 3 z = json.loads(data) ~\Anaconda3\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw) 346 parse_int is None and parse_float is None and 347 parse_constant is None and object_pairs_hook is None and not kw): > 348 return _default_decoder.decode(s) 349 if cls is None: 350 cls = JSONDecoder ~\Anaconda3\lib\json\decoder.py in decode(self, s, _w) 335 336 """ > 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end()) 338 end = _w(s, end).end() 339 if end != len(s): ~\Anaconda3\lib\json\decoder.py in raw_decode(self, s, idx) 353 obj, end = self.scan_once(s, idx) 354 except StopIteration as err: > 355 raise JSONDecodeError("Expecting value", s, err.value) from None 356 return obj, end JSONDecodeError: Expecting value: line 11 column 13 (char 171)
My json file is like a list of dictionaries.
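Since all three reading methods fail identically, the traceback points at the file's content rather than the reading code: `json.loads` raises "Expecting value" when it hits text that is not valid JSON, here at line 11, column 13 of data.json. A minimal reproduction with a deliberately broken document (my own example, not the asker's file):

```python
import json

# A list of dictionaries with one invalid entry: the value after "b" is missing.
bad = '[{"a": 1}, {"b": }]'

try:
    json.loads(bad)
except json.JSONDecodeError as e:
    # e.lineno / e.colno locate the first character that is not valid JSON.
    print(e.msg, e.lineno, e.colno)  # Expecting value 1 18
```

Wrapping the real `json.load(f)` in the same `try/except` and printing `e.lineno`/`e.colno` shows exactly which entry is malformed; common culprits are trailing commas, single quotes, or `None`/`True` written Python-style.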

How to create a web UI that captures login details?
I have a web-based tool (like PowToons) with only 4 licenses, but 12 users; only 4 of them can use the tool simultaneously. I want to create a web UI that tracks each user's login details for this tool "X", along with the following details:

- No. of hours the user needs to use "X"
- Team Manager's name

I want to know how I can go about this idea.

Scipy Shift function losing precision
```python
import numpy as np
from scipy.ndimage.interpolation import shift

a = np.array([0., 1.])
shift_left = shift(a, -1, cval=np.NaN)
shift_right = shift(a, 1, cval=np.NaN)
print(shift_left)
print(shift_right)
```

Here are the results from the code above:

```
[  1.  nan]
[ nan   8.32667268e-17]
```

Here is what I would expect the results to be:

```
[  1.  nan]
[ nan   0.]
```

Is there a reason for this loss of precision? Does anyone know what could be causing this issue and how I could fix it? It seems to happen when I shift arrays that contain the value 0, although it could be happening in other cases for all I know.
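The stray `8.3e-17` looks like interpolation error: by default `scipy.ndimage.shift` reconstructs values with a third-order spline (`order=3`), so shifted samples are interpolated rather than copied. Passing `order=0` should avoid it; alternatively, an integer shift can be done spline-free with `np.roll` plus an explicit fill. This helper is my own sketch, not part of scipy:

```python
import numpy as np

def shift_fill(a, n, cval=np.nan):
    # Shift by whole samples without interpolation: roll the array,
    # then overwrite the wrapped-around entries with the fill value.
    out = np.roll(np.asarray(a, dtype=float), n)
    if n > 0:
        out[:n] = cval
    elif n < 0:
        out[n:] = cval
    return out

a = np.array([0., 1.])
print(shift_fill(a, 1))   # [nan  0.]
print(shift_fill(a, -1))  # [ 1. nan]
```

Because nothing is interpolated, the zero stays exactly zero.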

Distance between point and parametric line (origin, orientation) in n-dimensional space

Question

Given a parametric line (of arbitrary dimension) passing through `C` with direction vector `D`, how do I calculate, in Python (with the `numpy` package if required), the minimal/perpendicular distance between this (infinitely reaching) line and an arbitrary data point `R`?

For example, if I have a line that passes through `x=2, y=2` (i.e. `(2, 2)`) with direction vector `[1, 1]` (i.e. a 45-degree line passing through the origin), and a point `R = [1, 0]`, I would expect a result along the lines of `0.707` as the distance (i.e. `1/sqrt(2)`).

I know the line can be expressed as:

```
L = C + t*D
```

and that I need to find the point that corresponds to the perpendicular projection of `R` onto `L`. Now I need to figure out how to take `C`, `D`, and a set of points (i.e. multiple `R` values) and plug them into numpy/Python.
Work So Far
I found a similar question on Stack Overflow, but it seems to be about computing the distance between a point and a line segment rather than a line. Also, I don't know what "calculus tricks" the author is referring to.

I found another question that was similar, but again, I don't know whether it refers to the distance between a point and a line segment, or a point and a line.

Lastly, there's an example which appears to be the closest to what I need (i.e. the line is defined in terms of a single point and a direction vector), but it doesn't appear to have a functioning code example.

Thank you.
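The projection approach described above can be vectorized directly: normalize `D`, project each offset `R - C` onto the line, and take the norm of what remains. The function name and signature are my own sketch:

```python
import numpy as np

def line_point_distance(C, D, R):
    # Perpendicular distance from point(s) R to the line L = C + t*D.
    C = np.asarray(C, dtype=float)
    d = np.asarray(D, dtype=float)
    d = d / np.linalg.norm(d)           # unit direction vector
    R = np.atleast_2d(R).astype(float)  # shape (n_points, dim)
    v = R - C                           # offsets from C to each point
    t = v @ d                           # signed projection lengths along the line
    perp = v - np.outer(t, d)           # remove the along-line component
    return np.linalg.norm(perp, axis=1)

print(line_point_distance(C=[2, 2], D=[1, 1], R=[1, 0]))  # ≈ [0.70710678]
```

Passing a stack of points, e.g. `R=[[1, 0], [2, 2]]`, returns one distance per row, which covers the "multiple `R` values" case; it works unchanged in any dimension.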

Multiple labels with TensorFlow dnnRegression doesn't work
I have a simple DNNRegressor estimator. When I want to predict one label it works as expected, but when I want to predict multiple labels it doesn't. I know about `label_dimension=3`; still, it doesn't work for me, and I'm not sure what I am doing wrong.

I have tried deleting the training folder and playing with the shapes of the labels; still, nothing works. I know that the target needs to be a Series object, but I couldn't make it work with multiple labels.

```python
def get_input_fn(data_set, data_target, num_epochs=None, n_batch=128, shuffle=False):
    return tf.estimator.inputs.pandas_input_fn(
        x=data_set,
        y=data_target,
        batch_size=n_batch,
        num_epochs=num_epochs,
        shuffle=shuffle
    )

_LABEL = ['i1', 'i2', 'i3']

train_set = pd.DataFrame(data=d1, columns=k1, dtype=float)
train_target = pd.DataFrame(data=d2, columns=k1, dtype=float)
#tt = pd.Series(data=train_target.iloc[:], index=_LABEL)
#train_target = tt

features = list(train_set.keys())
feature_cols = [tf.feature_column.numeric_column(k) for k in features]

estimator = tf.estimator.DNNRegressor(
    feature_columns=feature_cols,
    hidden_units=[1024, 512, 256],
    label_dimension=3,
    optimizer=tf.train.ProximalAdagradOptimizer(
        learning_rate=0.1,
        l1_regularization_strength=0.001
    ),
    model_dir="train"
)

estimator.train(
    input_fn=get_input_fn(data_set=train_set, data_target=train_target,
                          num_epochs=None, n_batch=128, shuffle=False),
    steps=1000
)
```

I would just like to get predictions for multiple labels: i1, i2, and i3.

Where do I add -DCUDA_HOST_COMPILER=/usr/bin/gcc-5? (PyTorch)

I am installing PyTorch from source and am running into issues where the build uses my current gcc compiler and not gcc-5.

I have tried to google this problem for hours, and searched Stack Overflow and everything. I have tried:

- exporting the CC and CXX variables, to no avail
- linking the variable

I am using Arch Linux.

dll in pytorch failed
I'm using Windows 10 and PyTorch 1. I've used PyTorch successfully; however, after I loaded matplotlib, the following error occurred for `torch._C`:

```
ImportError: DLL load failed: The specified module could not be found.
```

This error persists even though I uninstalled first matplotlib, then PyTorch, and reinstalled PyTorch. I have even created a new environment and installed PyTorch into it. None of the posts I have seen helped me resolve the problem. It would appear that something has happened to the global environment.

PPO implementation in PyTorch. Problem with use of backward() & optimizer
I am trying to learn PyTorch by reimplementing a working TF PPO model.
Using OpenAI's Spinning Up tutorials, I simplified and reorganized their PPO implementation in TF.

I then tried implementing the same exact model using PyTorch, though I'm having trouble with the backprop. I've tested each forward pass, and walked through each step of the PPO algorithm with both the TF and PyTorch implementations, and they yield the exact same values. But once I start updating the weights, this is where the two diverge.

The full notebooks are hosted on Google Colab and can be found here:

These two should be exactly equivalent.
Code Snippet
Here is the relevant snippet from both:
TF
```python
# Actor
with tf.variable_scope('pi'):
    # mlp is a function that simply generates a dense forward network
    logits = mlp(obs_ph, hidden_sizes=hidden_sizes+[act_dim],
                 activation=tf.nn.relu, output_activation=None)
    pi = tf.squeeze(tf.multinomial(logits, num_samples=1), axis=1)
    logp = tf.reduce_sum(tf.one_hot(act_ph, depth=act_dim) * tf.nn.log_softmax(logits), axis=1)
    logp_pi = tf.reduce_sum(tf.one_hot(pi, depth=act_dim) * tf.nn.log_softmax(logits), axis=1)

# PPO objectives
ratio = tf.exp(logp - logp_old_ph)
clipped_adv = tf.where(adv_ph > 0, (1 + clip_ratio) * adv_ph, (1 - clip_ratio) * adv_ph)
pi_loss = -tf.reduce_mean(tf.minimum(ratio * adv_ph, clipped_adv))
train_pi = tf.train.AdamOptimizer(learning_rate=pi_lr).minimize(pi_loss)

# Critic
with tf.variable_scope('v'):
    v = tf.squeeze(mlp(obs_ph, hidden_sizes=hidden_sizes+[1],
                       activation=tf.tanh, output_activation=None), axis=1)
    v_loss = tf.reduce_mean((ret_ph - v) ** 2)
    train_v = tf.train.AdamOptimizer(learning_rate=v_lr).minimize(v_loss)

def update(feed_ph, sess, train_pi_iters=80, train_v_iters=8):
    inputs = {k: v for k, v in zip(feed_ph, buf.get())}
    # Policy gradient step
    for i in range(train_pi_iters):
        _ = sess.run(train_pi, feed_dict=inputs)
    # Value function learning
    for i in range(train_v_iters):
        _ = sess.run(train_v, feed_dict=inputs)
```
PyTorch
```python
actor = nn.Sequential(
    nn.Linear(self.obs_dim, h_dim),   # input
    nn.ReLU(),
    nn.Linear(h_dim, h_dim),          # hidden 1
    nn.ReLU(),
    nn.Linear(h_dim, h_dim),          # hidden 2
    nn.ReLU(),
    nn.Linear(h_dim, self.act_dim)    # output
)

critic = nn.Sequential(
    nn.Linear(self.obs_dim, h_dim),   # input
    nn.Tanh(),
    nn.Linear(h_dim, h_dim),          # hidden 1
    nn.Tanh(),
    nn.Linear(h_dim, h_dim),          # hidden 2
    nn.Tanh(),
    nn.Linear(h_dim, 1)               # output
)

a_op = t.optim.Adam(self.actor.parameters(), lr=pi_lr)
c_op = t.optim.Adam(self.critic.parameters(), lr=v_lr)

def update(self):
    buf = {k: t.from_numpy(v).float()
           for k, v in zip(['obs', 'acts', 'advs', 'rets', 'logps'], self.buf.get())}

    # Policy gradient step
    for i in range(train_pi_iters):
        logits = actor(buf['obs'])
        logps = t.sum(one_hot(buf['acts'].long(), act_dim) * log_softmax(logits), 1)
        ratio = t.exp(logps - buf['logps'])
        clipped_adv = t.where(buf['advs'] > 0,
                              (1 + clip_ratio) * buf['advs'],
                              (1 - clip_ratio) * buf['advs'])
        pi_loss = t.mean(t.min(ratio * buf['advs'], clipped_adv)) * -1
        a_op.zero_grad()
        pi_loss.backward()
        a_op.step()

    # Value function learning
    for i in range(self.train_v_iters):
        v = t.squeeze(critic(buf['obs']), dim=1)
        v_loss = t.mean((buf['rets'] - v) ** 2)
        c_op.zero_grad()
        v_loss.backward()
        c_op.step()
```
I've walked through each operation, and everything is exact. But once the zero_grad → backward → step loop is called, the two models begin to diverge.
How can I replicate the TF code here?
Thank you so so much!

Concat tensors in PyTorch
I have a tensor called `data` of shape `[128, 4, 150, 150]`, where 128 is the batch size, 4 is the number of channels, and the last two dimensions are height and width. I have another tensor called `fake` of shape `[128, 1, 150, 150]`.

I want to drop the last channel from the 2nd dimension of `data`, so that its shape becomes `[128, 3, 150, 150]`, and then concatenate it with `fake`, giving a concatenated tensor of shape `[128, 4, 150, 150]`.

Basically, in other words, I want to concatenate the first 3 channels of `data` with `fake` to give a 4-channel tensor.

I am using PyTorch and came across the functions `torch.cat()` and `torch.stack()`.

Here is a sample code I've written:

```python
fake_combined = []
for j in range(batch_size):
    fake_combined.append(torch.stack((data[j][0].to(device),
                                      data[j][1].to(device),
                                      data[j][2].to(device),
                                      fake[j][0].to(device))))
fake_combined = torch.tensor(fake_combined, dtype=torch.float32)
fake_combined = fake_combined.to(device)
```

But I am getting an error on the line:

```python
fake_combined = torch.tensor(fake_combined, dtype=torch.float32)
```

The error is:

```
ValueError: only one element tensors can be converted to Python scalars
```

Also, if I print the shape of `fake_combined`, I get `[128,]` instead of `[128, 4, 150, 150]`. And when I print the shape of `fake_combined[0]`, I get `[4, 150, 150]`, which is as expected.

So my question is: why am I not able to convert the list to a tensor using `torch.tensor()`? Am I missing something? Is there a better way to do what I intend to do?

Any help will be appreciated! Thanks!

Backpropagation in Attention Model
I am trying to figure out how to do backpropagation through the scaled dot-product attention model. Scaled dot-product attention takes Q (queries), K (keys), and V (values) as inputs and performs the following operation:

```
Attention(Q, K, V) = softmax(Q·Kᵀ / √dk)·V
```

Here √dk is the scaling factor and is a constant.

Q, K, and V are tensors. For now I am assuming that Q = K = V, so I differentiate the formula softmax(Q·Qᵀ)·Q with respect to Q. I think the answer would be:

```
softmax(Q·Qᵀ) + Q·derivativeOfSoftmax(Q·Qᵀ)·(2·Qᵀ)
```

since I think the derivative of Q·Qᵀ with respect to Q is 2·Q·Qᵀ.

Is this the right approach considering the rules of tensor calculus? If not, kindly tell me how to proceed.

One can refer to the concept of scaled dot-product attention in this paper: https://arxiv.org/pdf/1706.03762.pdf
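One way to sanity-check a hand-derived gradient like this is against autograd's, e.g. with `torch.autograd.gradcheck`. A sketch for the Q = K = V special case (my own example, not part of the question):

```python
import torch

def attention(Q):
    # Scaled dot-product self-attention with Q = K = V.
    dk = Q.shape[-1]
    scores = Q @ Q.T / dk ** 0.5
    return torch.softmax(scores, dim=-1) @ Q

# gradcheck compares autograd's analytic Jacobian against finite
# differences, so it needs double-precision inputs.
Q = torch.randn(4, 3, dtype=torch.double, requires_grad=True)
print(torch.autograd.gradcheck(attention, (Q,)))  # True
```

Any closed-form derivative can then be compared element-by-element against `torch.autograd.grad` of a scalar function of the attention output.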

How to add images to a tensorflow.js model and train the model on the given image labels

We are using TensorFlow.js to create and train the model. We use the tf.fromPixels() function to convert an image into a tensor. We want to create a custom model with the below properties:

- AddImage(HTML_Image_Element, 'Label'): add an image element with a custom label
- Train() / fit(): train this custom model with the associated labels
- Predict(): predict the images with their associated labels; it will return the predicted response with the attached label of every image

For better understanding, let's take an example: say we have three images for prediction, i.e. img1, img2, and img3, with three labels 'A', 'B', and 'C' respectively. We want to create and train our model with these images and respective labels like below: when the user wants to predict 'img1' it shows the prediction 'A', and similarly 'B' for 'img2' and 'C' for 'img3'.

Please suggest to me how we can create and train this model.
This is the webpage we used to create a model with images and their associated labels:

```html
<apex:page id="PageId" showheader="false">
  <head>
    <title>Image Classifier with TensorFlowJS</title>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@0.11.2"></script>
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
  </head>
  <div id="output_field"></div>
  <img id="imgshow" src="{!$Resource.cat}" crossorigin="anonymous" width="400" height="300" />
  <script>
    async function learnlinear() {
      // img data set
      const imageHTML = document.getElementById('imgshow');
      console.log('imageHTML::' + imageHTML.src);

      // convert to tensor
      const tensorImg = tf.fromPixels(imageHTML);
      tensorImg.data().then(async function (stuffTensImg) {
        console.log('stuffTensImg::' + stuffTensImg.toString());
      });

      const model = tf.sequential();
      model.add(tf.layers.conv2d({
        kernelSize: 5,
        filters: 20,
        strides: 1,
        activation: 'relu',
        inputShape: [imageHTML.height, imageHTML.width, 3],
      }));
      model.add(tf.layers.maxPooling2d({
        poolSize: [2, 2],
        strides: [2, 2],
      }));
      model.add(tf.layers.flatten());
      model.add(tf.layers.dropout(0.2));
      // Two output values x and y
      model.add(tf.layers.dense({
        units: 2,
        activation: 'tanh',
      }));
      // Use ADAM optimizer with learning rate of 0.0005 and MSE loss
      model.compile({
        optimizer: tf.train.adam(0.0005),
        loss: 'meanSquaredError',
      });
      await model.fit(tensorImg, {epochs: 500});
      model.predict(tensorImg).print();
    }
    learnlinear();
  </script>
</apex:page>
```
We got the following error while running the code snippet:

```
tfjs@0.11.2:1 Uncaught (in promise) Error: Error when checking input: expected conv2d_Conv2D1_input to have 4 dimension(s), but got an array with shape 300,400,3
    at new t (tfjs@0.11.2:1)
    at standardizeInputData (tfjs@0.11.2:1)
    at t.standardizeUserData (tfjs@0.11.2:1)
    at t. (tfjs@0.11.2:1)
    at n (tfjs@0.11.2:1)
    at Object.next (tfjs@0.11.2:1)
    at tfjs@0.11.2:1
    at new Promise ()
    at __awaiter$15 (tfjs@0.11.2:1)
    at t.fit (tfjs@0.11.2:1)
```