gluon model shape inconsistency

I am trying out the Gluon model zoo.

import mxnet as mx
from mxnet.gluon.model_zoo import vision
import cv2
import numpy as np

ctx = mx.gpu(6) # successful
net = vision.alexnet(pretrained=True, ctx=ctx)

# Prepare the input image.
# You may ignore this part; it just preprocesses an image for the net,
# loading it as shape (batch=1, channel=3, width, height).
im = cv2.imread('img.jpg') # w,h = 4032,3024. rgb color image
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB).astype(float)/255
im = mx.image.color_normalize(im, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) 
im = np.transpose(im, (2,0,1)) # (4032,3024,3) -> (3,4032,3024)
im = im[None,:] # (3,4032,3024) -> (1,3,4032,3024). this means batchsize=1
im = mx.nd.array(im, ctx=ctx)

# run 
r = net(im)

When I run this, an error occurs.

MXNetError: Shape inconsistent, Provided = [4096,9216], inferred shape=(4096,2976000)

Do I have to resize the image to a specific size? According to the manual, Gluon only requires a minimum width and height. Do I also have to consider a maximum size, or fix the input size?

2 answers

  • answered 2018-11-08 19:25 Sergei

    You need to fix the input size: the network expects 224 x 224 images, since AlexNet was trained on 224 x 224 crops according to the original paper. Usually you achieve this by resizing the smaller axis (width or height) to 256 and then taking a 224 x 224 center crop (see the sketch below).

    The thing is that when you use a neural network for prediction, you need to prepare your input data in exactly the same way as the training data. If you don't, in the simplest case you get a shape mismatch error like this one. In the more subtle case, when the shapes do match but the image is preprocessed very differently from what the model was trained on, the result will almost certainly be wrong.
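
    For example, a minimal preprocessing sketch along these lines (reusing net, ctx and img.jpg from the question; this is how I would write it with gluon.data.vision transforms, not necessarily the only way) could look like:

        import mxnet as mx
        from mxnet.gluon.data.vision import transforms

        im = mx.image.imread('img.jpg')               # HWC, RGB, uint8
        im = mx.image.resize_short(im, 256)           # shorter side -> 256, keeps aspect ratio
        im, _ = mx.image.center_crop(im, (224, 224))  # 224 x 224 center crop

        normalize = transforms.Compose([
            transforms.ToTensor(),                    # HWC uint8 -> CHW float32 in [0, 1]
            transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                 std=(0.229, 0.224, 0.225)),
        ])
        im = normalize(im).expand_dims(axis=0)        # (1, 3, 224, 224)
        r = net(im.as_in_context(ctx))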

  • answered 2018-11-09 05:55 plhn

    When I resized the input image to under 254 x 254, inference succeeded.

    Maybe MXNet's pretrained AlexNet does not handle large input images (see the quick check below).

    @Sergei’s comment helped. Thank you.
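
    A rough sanity check of this (my own sketch; the sizes are just examples): the first fully connected layer of Gluon's pretrained AlexNet expects 256 * 6 * 6 = 9216 flattened features (the 9216 in the error above), and because of the pooling arithmetic that only happens for inputs roughly between 223 and 254 pixels per side.

        # feed dummy inputs of a few sizes through the pretrained net
        for size in (222, 224, 254, 256):
            x = mx.nd.zeros((1, 3, size, size), ctx=ctx)
            try:
                print(size, net(x).shape)          # succeeds for ~223-254
            except mx.base.MXNetError:
                print(size, 'shape inconsistent')  # too small or too large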