Is there any difference between matmul and usual multiplication of tensors?

I am confused about the difference between multiplying two tensors using * and using matmul. Below is my code:

import torch
torch.manual_seed(7)
features = torch.randn((2, 5))
weights = torch.randn_like(features)

Here, I want to multiply weights and features. One way to do it is as follows:

print(torch.sum(features * weights))

Output:

tensor(-2.6123)

Another way to do it is using matmul:

print(torch.mm(features,weights.view((5,2))))

but here the output is

tensor([[ 2.8089,  4.6439],
        [-2.3988, -1.9238]])

What I don't understand is why matmul and the usual multiplication give different outputs when both should be the same. Am I doing anything wrong here?

Edit: When I use features of shape (1, 5), both * and matmul give the same output, but they differ when the shape is (2, 5).

1 answer

  • answered 2018-11-08 06:51 Umang Gupta

    When you use *, the multiplication is elementwise; when you use torch.mm, it is matrix multiplication.

    Example:

    import torch

    a = torch.rand(2, 5)
    b = torch.rand(2, 5)
    result = a * b  # elementwise product
    

    result will have the same shape as a and b, i.e. (2, 5), whereas the operation

    result = torch.mm(a,b)
    

    will raise a size mismatch error, because this is proper matrix multiplication (as studied in linear algebra) and a.shape[1] != b.shape[0]. When you apply the view operation in your torch.mm call, you are only reshaping the tensor so the dimensions line up; view does not transpose the data, so the result is a genuine matrix product, not a rearranged elementwise product.
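A minimal sketch of the shape behavior described above (the random values here are illustrative, not the seeded tensors from the question):

```python
import torch

a = torch.rand(2, 5)
b = torch.rand(2, 5)

# Elementwise product: shapes must match, result keeps the shape (2, 5).
elementwise = a * b
print(elementwise.shape)  # torch.Size([2, 5])

# Matrix multiplication needs the inner dimensions to agree:
# (2, 5) @ (5, 2) -> (2, 2). view() only reinterprets the same 10
# values as shape (5, 2); it does NOT transpose, so this result is
# generally unrelated to the elementwise product above.
print(torch.mm(a, b.view(5, 2)).shape)  # torch.Size([2, 2])

# mm with mismatched inner dimensions raises a RuntimeError.
try:
    torch.mm(a, b)
except RuntimeError as e:
    print("size mismatch:", e)
```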

    In the special case where the shape in some particular dimension is 1, the matrix multiplication reduces to a dot product, and hence sum(a * b) is the same as mm(a, b.view(5, 1)) when a and b have shape (1, 5).
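This special case, which matches the (1, 5) observation in the question's edit, can be checked directly with the same seed as the question:

```python
import torch

torch.manual_seed(7)
features = torch.randn((1, 5))
weights = torch.randn_like(features)

# Elementwise multiply then sum is a dot product over the 5 entries.
s = torch.sum(features * weights)

# (1, 5) @ (5, 1) -> (1, 1): the same dot product as a 1x1 matrix,
# because view(5, 1) keeps the 5 values in the same order.
m = torch.mm(features, weights.view(5, 1))

print(torch.allclose(s, m.squeeze()))  # True
```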