Passing a TensorDataset or DataLoader to skorch
I want to apply cross-validation in PyTorch using skorch, so I prepared my model and my TensorDataset, which returns (image, caption, caption_length). Since the dataset already bundles X and Y together, I cannot pass Y separately when calling
net.fit(dataset)
But when I tried that, I got this error:
ValueError: Stratified CV requires explicitly passing a suitable y
Here's part of my code:
start = time.time()
net = NeuralNetClassifier(
    decoder,
    criterion=nn.CrossEntropyLoss,
    max_epochs=args.epochs,
    lr=args.lr,
    optimizer=optim.SGD,
    device='cuda',  # uncomment this to train with CUDA
)
net.fit(dataset, y=None)
end = time.time()
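One way around the error, if I understand skorch's defaults correctly: NeuralNetClassifier splits off a stratified validation set by default, and that split requires an explicit y. A minimal sketch (the three-tuple layout mirrors the question; the stand-in data is an assumption, not the asker's dataset):

```python
import numpy as np

# Stand-in for the TensorDataset: each item is (image, caption, caption_length).
dataset = [(np.zeros((3, 3)), 1, 5), (np.ones((3, 3)), 0, 4)]

# Pull the targets out so skorch has something to stratify on.
y = np.array([sample[1] for sample in dataset])
# net.fit(dataset, y=y)  # skorch can now perform the stratified split
print(y)  # → [1 0]
```

Alternatively, if I recall the skorch API correctly, the stratified split can be disabled so y stays inside the dataset, e.g. `train_split=skorch.dataset.ValidSplit(5, stratified=False)` (older skorch versions call this `CVSplit`).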
See also questions close to this topic
How to take rows that are considered outliers in Python

How can I detect and cluster windows event log values without regex? (Python)
I want to write a Python script for an anonymized Windows event log. Before the anonymization process, I want to write a script that classifies the log fields (values), because I don't want the system to work only with a predefined set of known keys. I want the system to be robust and identify the values even if the keys were changed because the data was processed in some intermediate system. For example, the system will receive the Windows event logs after they are ingested into a SIEM like QRadar and relayed to my system.
Regex queries are good enough for values like IPs, emails, etc. But what about "Account Name," "Account Domain," and many other values that are too varied to detect with a regular expression?
I thought about creating a dataset of those values and training a machine learning model to classify them, but I suspect there is a more efficient way to do it.
Any papers, tools, methods, and ideas will help me a lot!

How to implement early stopping when a neural network attains a certain validation accuracy with Pytorch?
I'd like to stop the training of my model when I hit a given validation accuracy similarly to this question here but instead using PyTorch. Is this possible and if so how is it done?
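Since the question asks how it's done: a threshold check after each validation pass is enough, because PyTorch itself imposes no training-loop structure. A framework-agnostic sketch (`validate` here is a hypothetical stand-in for a PyTorch evaluation pass returning validation accuracy):

```python
def train_until(target_acc, max_epochs, validate):
    """Run training epochs until validation accuracy reaches target_acc."""
    history = []
    for epoch in range(max_epochs):
        # ... one epoch of optimizer steps would go here ...
        acc = validate(epoch)          # evaluate on the validation set
        history.append(acc)
        if acc >= target_acc:          # stop early once the target is hit
            break
    return history

# Toy validator whose accuracy rises by 0.125 per epoch.
hist = train_until(0.9, 10, validate=lambda e: 0.5 + 0.125 * e)
print(len(hist), hist[-1])  # → 5 1.0
```

In a real PyTorch script the same `if acc >= target: break` sits right after the `with torch.no_grad():` evaluation loop.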

How to do image rotation for a 3D MRA image
I want to rotate a tilted MRA image. I downloaded the IXI dataset, but some of the data is tilted like the one on the right. I want to orient it correctly, like the image the arrow points to. Is image registration needed for this task? If so, please let me know how to do it. Thank you.
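If the tilt angle is known (or estimated), a plain interpolation-based rotation is enough and no registration is required; registration helps when the angle must be recovered against a reference volume. A minimal sketch with SciPy on a toy volume (the 15-degree angle and axes choice are placeholders, not values from the IXI data):

```python
import numpy as np
from scipy import ndimage

volume = np.zeros((8, 8, 8))
volume[2:6, 3, 3] = 1.0  # a simple line structure standing in for the MRA volume

# Rotate 15 degrees in the plane of the first two axes, keeping the shape fixed.
rotated = ndimage.rotate(volume, angle=15, axes=(0, 1), reshape=False, order=1)
print(rotated.shape)  # → (8, 8, 8)
```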

How can I predict next step? from matlab online Time Series Forecasting example tutorial
I am studying the "Time Series Forecasting Using Deep Learning" example tutorial provided by MATLAB here: https://uk.mathworks.com/help/deeplearning/ug/time-series-forecasting-using-deep-learning.html
Right at the bottom of the example, it refers to "Update Network State with Observed Values". In this section: "you can update the network state with observed values instead of predicted values. Then predict on each time step. For each prediction, predict the next time step using the observed value of the previous time step."
But what if I wanted to predict the next month? or tomorrow? (assume the time series was a daily time series). How do I do this?
As the example shows, for each observed value it predicts the next time step. Or am I confused and is this example already doing what I am asking for? I want to predict t+1 right at the end of the chart (the predicted line).
I have attached an image below to demonstrate what I am asking for.
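The closed-loop idea from the MATLAB example carries over directly: to get t+1 past the end of the chart, feed the last observed window to the network once; to go further (say, tomorrow and beyond on a daily series), feed each prediction back in as if it were observed. A language-neutral sketch in Python (`model` is a hypothetical one-step predictor, not the tutorial's LSTM):

```python
def forecast(history, model, steps):
    """Predict `steps` values past the end of `history` in closed loop."""
    preds = []
    window = list(history)
    for _ in range(steps):
        nxt = model(window)      # one-step-ahead prediction from the window
        preds.append(nxt)
        window.append(nxt)       # reuse the prediction as the next "observation"
    return preds

# Toy model: the next value continues the last observed difference.
toy = lambda w: w[-1] + (w[-1] - w[-2])
print(forecast([1, 2, 3], toy, 3))  # → [4, 5, 6]
```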

What is the best Software Development Methodology for Deep Learning?
Like the common software development methodologies such as waterfall and agile, what is the best methodology for deep learning for face or body recognition? Or is there any other better approach?

How is stereo rectification implemented?
I am trying to recreate a stereo rectification algorithm (without using any libraries).
I have calibrated a stereo camera rig using MATLAB. With MATLAB's builtin stereo rectification algorithm, I obtain usable rectified images.
My current implementation (which uses the same calibration results as the MATLAB algorithm) produces images that are slightly off (corresponding features are not horizontally aligned). For reference, a cropped anaglyph of the undistorted images is given.
Code for rectification of the left image:
load('calibration.mat'); % load stereoParams
leftImages = imageDatastore('captures\left\');
distortedImage = readimage(leftImages, 1);
kL = stereoParams.CameraParameters1.IntrinsicMatrix';
k1 = stereoParams.CameraParameters1.RadialDistortion(1);
k2 = stereoParams.CameraParameters1.RadialDistortion(2);
translation = stereoParams.TranslationOfCamera2 .* [1 1 1]; % translation vector according to initial left camera frame
e1 = translation / norm(translation);
e2 = [-translation(2), translation(1), 0] / norm(translation(1:2));
e3 = cross(e1, e2);
transform = kL * [e1; e2; e3] / kL;
rectifiedImage(1 : 1000, 1 : 1000) = 0;
for r = 1 : 1000
    for c = 1 : 1000
        % map the rectified coordinate to its corresponding unrectified coordinate
        undistortedCoord = eye(3) / kL / transform * [c - 1; r - 1; 1];
        % normalize the unrectified coordinate with respect to its homogeneous component
        undistortedCoord = undistortedCoord / undistortedCoord(3);
        % compute distance from the image plane principal point
        radius = sqrt(undistortedCoord(1) ^ 2 + undistortedCoord(2) ^ 2);
        % compute the distorted coordinate from the lens distortion coefficients
        distortedCoord = [undistortedCoord(1:2) * (1 + k1 * radius ^ 2 + k2 * radius ^ 4); 1];
        % map the distorted coordinate into the distorted image
        imageCoord = kL * distortedCoord;
        if (imageCoord(1) > 0 && imageCoord(2) > 0 && imageCoord(1) < size(distortedImage, 2) - 1 && imageCoord(2) < size(distortedImage, 1) - 1)
            rectifiedImage(r, c) = double(distortedImage(round(imageCoord(2) + 1), round(imageCoord(1) + 1))) / 255;
        end
    end
end
imwrite(rectifiedImage, 'rectifiedL.png');
Code for rectification of the right image:
load('calibration.mat'); % load stereoParams
rightImages = imageDatastore('captures\right\');
distortedImage = readimage(rightImages, 1);
kL = stereoParams.CameraParameters1.IntrinsicMatrix';
kR = stereoParams.CameraParameters2.IntrinsicMatrix';
k1 = stereoParams.CameraParameters2.RadialDistortion(1);
k2 = stereoParams.CameraParameters2.RadialDistortion(2);
translation = stereoParams.TranslationOfCamera2 .* [1 1 1]; % translation vector according to initial left camera frame
e1 = translation / norm(translation);
e2 = [-translation(2), translation(1), 0] / norm(translation(1:2));
e3 = cross(e1, e2);
transform = kL * [e1; e2; e3] * stereoParams.RotationOfCamera2 / kR;
rectifiedImage(1 : 1000, 1 : 1000) = 0;
for r = 1 : 1000
    for c = 1 : 1000
        % map the rectified coordinate to its corresponding unrectified coordinate
        undistortedCoord = eye(3) / kR / transform * [c - 1; r - 1; 1];
        % normalize the unrectified coordinate with respect to its homogeneous component
        undistortedCoord = undistortedCoord / undistortedCoord(3);
        % compute distance from the image plane principal point
        radius = sqrt(undistortedCoord(1) ^ 2 + undistortedCoord(2) ^ 2);
        % compute the distorted coordinate from the lens distortion coefficients
        distortedCoord = [undistortedCoord(1:2) * (1 + k1 * radius ^ 2 + k2 * radius ^ 4); 1];
        % map the distorted coordinate into the distorted image
        imageCoord = kR * distortedCoord;
        if (imageCoord(1) > 0 && imageCoord(2) > 0 && imageCoord(1) < size(distortedImage, 2) - 1 && imageCoord(2) < size(distortedImage, 1) - 1)
            rectifiedImage(r, c) = double(distortedImage(round(imageCoord(2) + 1), round(imageCoord(1) + 1))) / 255;
        end
    end
end
imwrite(rectifiedImage, 'rectifiedR.png');
Besides the homographies applied, these blocks of code are identical.
MATLAB's stereo rectification algorithm is based on Bouguet's algorithm, while mine is based on a slightly different algorithm, so I do not expect identical results; still, I believe I should be getting valid rectified images.
I have tried a number of variations (negative translation vectors, transposed rotation matrices, different output intrinsic matrices, different orders of matrix multiplication, etc.), but I am unable to produce any valid results. According to the algorithm I am using, I should reverse the order of the rotation matrix multiplication, but doing this only shifts the right image vertically, thereby increasing the misalignment. I initially thought that the images were at slightly different scales, but surely this is not possible when using the same output intrinsic matrix for both images. The translation vector has been elementwise multiplied by [1 1 1] to match the left camera coordinate frame shown in MATLAB's calibration tool. I have tried other variants, such as [1 1 1], [1 1 1], etc., which can give similar results if certain matrices are transposed; however, none give valid rectification.
When I use the homographies produced internally by MATLAB, I get valid results. What is wrong with my homographies?

How to apply non-maximum suppression for corner detection
I finally have working code which detects the corners of the rectangles in an image. But the problem is that the code detects multiple points at the same corner. Now I am trying to introduce non-maximum suppression into my code, but it is not working. I have tried a previous suggestion, but it also did not work. How do I carry out this non-maximum suppression properly?
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as im
from scipy import ndimage

# 1. Before doing any operations convert the image into a grayscale image
img = im.imread('OD6.jpg')
plt.imshow(img)
plt.show()
# split channels
R = img[:, :, 0]
G = img[:, :, 1]
B = img[:, :, 2]
M, N = R.shape
gray_img = np.zeros((M, N), dtype=int)
for i in range(M):
    for j in range(N):
        gray_img[i, j] = (R[i, j] * 0.2989) + (G[i, j] * 0.5870) + (B[i, j] * 0.114)
plt.imshow(gray_img, cmap='gray')
plt.show()

# 2. Apply Sobel filters to find the gradients in the x and y directions and
# remove noise using a Gaussian filter with sigma=1
imarr = np.asarray(gray_img, dtype=np.float64)
ix = ndimage.sobel(imarr, 0)
iy = ndimage.sobel(imarr, 1)
ix2 = ix * ix
iy2 = iy * iy
ixy = ix * iy
ix2 = ndimage.gaussian_filter(ix2, sigma=1)
iy2 = ndimage.gaussian_filter(iy2, sigma=1)
ixy = ndimage.gaussian_filter(ixy, sigma=1)
c, l = imarr.shape
result = np.zeros((c, l))
r = np.zeros((c, l))
rmax = 0  # initialize the maximum value of the Harris response
for i in range(c):
    for j in range(l):
        m = np.array([[ix2[i, j], ixy[i, j]], [ixy[i, j], iy2[i, j]]], dtype=np.float64)
        r[i, j] = np.linalg.det(m) - 0.04 * (np.power(np.trace(m), 2))
        if r[i, j] > rmax:
            rmax = r[i, j]

# 3. Apply non-maximum suppression
for i in range(c - 1):
    for j in range(l - 1):
        if r[i, j] > 0.01 * rmax and r[i, j] > r[i-1, j-1] and r[i, j] > r[i-1, j+1]\
                and r[i, j] > r[i+1, j-1] and r[i, j] > r[i+1, j+1]:
            result[i, j] = 1

xy_coords = np.flip(np.column_stack(np.where(result == 1)), axis=1)
print(xy_coords)
pc, pr = np.where(result == 1)
plt.plot(pr, pc, "b.")
plt.imshow(img, 'gray')
plt.show()

Object detection on multiple cameras as inputs
Is it possible to run video analysis (using YOLO or some other similar algorithm) on a single system with multiple video inputs? For reference, I want to analyse multiple CCTV feeds from a factory on a single system.

Debugging pytorch code in pycharm (Feasibility)
I am trying to run code written in Python (PyTorch) which, when passed options as arguments, trains the neural network.
if __name__ == "__main__":
    args = docopt(__doc__)
    myparams = args["options"]
    # ... do work ...
Now, to run this code, I need to call it from the console:
python3 train.py option1 123
etc. But in that case the breakpoints won't work in PyCharm. Can anybody clarify how to debug in this scenario? (If you know a way, it would be great if you let me know.)
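One common workaround is to keep using PyCharm's debugger but fake the command line before the parser runs, either in a small wrapper script or via the "Parameters" field of a PyCharm run configuration. A sketch of the wrapper approach (`main` is a hypothetical stand-in for the docopt-based entry point in the question):

```python
import sys

# Emulate `python3 train.py option1 123` so breakpoints hit when this file
# is launched directly from PyCharm's Debug action.
sys.argv = ["train.py", "option1", "123"]

def main(argv):
    # docopt(__doc__) would parse argv here; we just echo it for the sketch.
    return {"script": argv[0], "options": argv[1:]}

result = main(sys.argv)
print(result["options"])  # → ['option1', '123']
```

The simpler route is Run → Edit Configurations → Parameters, where "option1 123" can be entered so the normal entry point runs unchanged under the debugger.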
Training results are different for Classification using Pytorch APIs and Fastai
I have two training Python scripts. One uses PyTorch's API for classification training and the other uses Fastai. The Fastai script gets much better results.
Training outcomes are as follows.
Fastai

epoch  train_loss  valid_loss  accuracy  time
0      0.205338    2.318084    0.466482  23:02
1      0.182328    0.041315    0.993334  22:51
2      0.112462    0.064061    0.988932  22:47
3      0.052034    0.044727    0.986920  22:45
4      0.178388    0.081247    0.980883  22:45
5      0.009298    0.011817    0.996730  22:44
6      0.004008    0.003211    0.999748  22:43

Using PyTorch

Epoch [1/10], train_loss : 31.0000 , val_loss : 1.6594, accuracy: 0.3568
Epoch [2/10], train_loss : 7.0000 , val_loss : 1.7065, accuracy: 0.3723
Epoch [3/10], train_loss : 4.0000 , val_loss : 1.6878, accuracy: 0.3889
Epoch [4/10], train_loss : 3.0000 , val_loss : 1.7054, accuracy: 0.4066
Epoch [5/10], train_loss : 2.0000 , val_loss : 1.7154, accuracy: 0.4106
Epoch [6/10], train_loss : 2.0000 , val_loss : 1.7232, accuracy: 0.4144
Epoch [7/10], train_loss : 2.0000 , val_loss : 1.7125, accuracy: 0.4295
Epoch [8/10], train_loss : 1.0000 , val_loss : 1.7372, accuracy: 0.4343
Epoch [9/10], train_loss : 1.0000 , val_loss : 1.6871, accuracy: 0.4441
Epoch [10/10], train_loss : 1.0000 , val_loss : 1.7384, accuracy: 0.4552
Training with PyTorch is not converging. I used the same network (Wideresnet22) and both are trained from scratch without a pretrained model.
The network is here.
Training using Pytorch is here.
Using Fastai is as follows.
from fastai.basic_data import DataBunch
from fastai.train import Learner
from fastai.metrics import accuracy

# DataBunch takes the data and internally creates data loaders
data = DataBunch.create(train_ds, valid_ds, bs=batch_size, path='./data')
# Learner uses Adam as the default optimizer
learner = Learner(data, model, loss_func=F.cross_entropy, metrics=[accuracy])
# Gradients are clipped
learner.clip = 0.1
# Learner finds its learning rate
learner.lr_find()
learner.recorder.plot()
# Weight decay helps to keep the weights small. Learn more at https://towardsdatascience.com/
learner.fit_one_cycle(5, 5e-3, wd=1e-4)
What could be wrong in my training algorithm using Pytorch?
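Two differences between the scripts stand out, independent of the network itself: the Fastai Learner clips gradients (`learner.clip = 0.1`) and `fit_one_cycle` uses one-cycle learning-rate scheduling. A hedged sketch of adding both to a plain PyTorch loop (toy model and data are stand-ins; the 5e-3 peak LR mirrors the Fastai call but is otherwise an assumption):

```python
import torch

model = torch.nn.Linear(4, 2)                     # toy stand-in for the network
opt = torch.optim.Adam(model.parameters(), lr=5e-3)
steps = 20
sched = torch.optim.lr_scheduler.OneCycleLR(opt, max_lr=5e-3, total_steps=steps)

lrs = []
for _ in range(steps):
    x = torch.randn(8, 4)
    y = torch.randint(0, 2, (8,))
    loss = torch.nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_value_(model.parameters(), 0.1)  # mimic learner.clip
    opt.step()
    sched.step()                                  # one-cycle warmup then anneal
    lrs.append(sched.get_last_lr()[0])

print(len(lrs))  # → 20
```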

Is NN just Bad at Solving this Simple Linear Problem, or is it because of Bad Training?
I was trying to train a very straightforward (I thought) NN model with PyTorch and skorch, but the bad performance really baffles me, so it would be great if you have any insight into this.
The problem is something like this: there are five objects, A, B, C, D, E (labeled by their fingerprints, e.g. (0, 0) is A, (0.2, 0.5) is B, etc.), each corresponding to a number, and the task is to find which number each corresponds to. The training data is a list of "collections" and the corresponding sums, for example: [A, A, A, B, B] == [(0,0), (0,0), (0,0), (0.2,0.5), (0.2,0.5)] -> 15, [B, C, D, E] == [(0.2,0.5), (0.5,0.8), (0.3,0.9), (1,1)] -> 30, and so on. Note that the number of objects in a collection is not constant.
There is no noise or anything, so it's just a linear system that can be solved directly. So I thought this would be very easy for a NN to find out. I'm actually using this example as a sanity check for a more complicated problem, but was surprised that the NN couldn't even solve this.
Now I'm just trying to pinpoint exactly where it went wrong. The model definition seems right, the data input is right; is the bad performance due to bad training, or are NNs just bad at these things?
here is the model definition:
class NN(nn.Module):
    def __init__(
        self,
        input_dim,
        num_nodes,
        num_layers,
        batchnorm=False,
        activation=Tanh,
    ):
        super(NN, self).__init__()
        self.activation_fn = activation
        self.model = MLP(
            n_input_nodes=input_dim,
            n_layers=num_layers,
            n_hidden_size=num_nodes,
            activation=activation,
            batchnorm=batchnorm,
        )

    def forward(self, batch):
        if isinstance(batch, list):
            batch = batch[0]
        with torch.enable_grad():
            fingerprints = batch.fingerprint.float()
            fingerprints.requires_grad = True
            # index of the current "collection" in the training list
            idx = batch.idx
            sorted_idx = torch.unique_consecutive(idx)
            o = self.model(fingerprints)
            total = scatter(o, idx, dim=0)[sorted_idx]
        return total

    @property
    def num_params(self):
        return sum(p.numel() for p in self.parameters())


class MLP(nn.Module):
    def __init__(
        self,
        n_input_nodes,
        n_layers,
        n_hidden_size,
        activation,
        batchnorm,
        n_output_nodes=1,
    ):
        super(MLP, self).__init__()
        if isinstance(n_hidden_size, int):
            n_hidden_size = [n_hidden_size] * n_layers
        self.n_neurons = [n_input_nodes] + n_hidden_size + [n_output_nodes]
        self.activation = activation
        layers = []
        for _ in range(n_layers - 1):
            layers.append(nn.Linear(self.n_neurons[_], self.n_neurons[_ + 1]))
            layers.append(activation())
            if batchnorm:
                layers.append(nn.BatchNorm1d(self.n_neurons[_ + 1]))
        layers.append(nn.Linear(self.n_neurons[-2], self.n_neurons[-1]))
        self.model_net = nn.Sequential(*layers)

    def forward(self, inputs):
        return self.model_net(inputs)
and the skorch part is straightforward
model = NN(2, 100, 2)
net = NeuralNetRegressor(
    module=model,
    ...
)
net.fit(train_dataset, None)
For a test run, the dataset looks like the following (16 collections in total):
[[0.7484336  0.5656401] [0. 0.] [0. 0.] [0. 0.]]
[[1. 1.] [0. 0.] [0. 0.]]
[[0.51311415 0.67012525] [0.51311415 0.67012525] [0. 0.] [0. 0.]]
[[0.51311415 0.67012525] [0.7484336 0.5656401] [0. 0.]]
[[0.51311415 0.67012525] [1. 1.] [0. 0.] [0. 0.]]
[[0.51311415 0.67012525] [0.51311415 0.67012525] [0. 0.] [0. 0.] [0. 0.] [0. 0.] [0. 0.] [0. 0.]]
[[0.51311415 0.67012525] [1. 1.] [0. 0.] [0. 0.] [0. 0.] [0. 0.]]
....
with corresponding total: [10, 11, 14, 14, 17, 18, ...]
It's easy to tell what the objects are and how many of them are in each collection just by eyeballing it, and the training process looks like:
epoch    train_energy_mae    train_loss    cp       dur
-------  ------------------  ------------  ----  --------
      1              4.9852        0.5425     +    0.1486
      2             16.3659        4.2273          0.0382
      3              6.6945        0.7403          0.0025
      4              7.9199        1.2694          0.0024
      5             12.0389        2.4982          0.0024
      6              9.9942        1.8391          0.0024
      7              5.6733        0.7528          0.0024
      8              5.7007        0.5166          0.0024
      9              7.8929        1.0641          0.0024
     10              9.2560        1.4663          0.0024
     11              8.5545        1.2562          0.0024
     12              6.7690        0.7589          0.0024
     13              5.3769        0.4806          0.0024
     14              5.1117        0.6009          0.0024
     15              6.2685        0.8831          0.0024
    ....
    290              5.1899        0.4750          0.0024
    291              5.1899        0.4750          0.0024
    292              5.1899        0.4750          0.0024
    293              5.1899        0.4750          0.0024
    294              5.1899        0.4750          0.0025
    295              5.1899        0.4750          0.0025
    296              5.1899        0.4750          0.0025
    297              5.1899        0.4750          0.0025
    298              5.1899        0.4750          0.0025
    299              5.1899        0.4750          0.0025
    300              5.1899        0.4750          0.0025
    301              5.1899        0.4750          0.0024
    302              5.1899        0.4750          0.0025
    303              5.1899        0.4750          0.0024
    304              5.1899        0.4750          0.0024
    305              5.1899        0.4750          0.0025
    306              5.1899        0.4750          0.0024
    307              5.1899        0.4750          0.0025
You can see that it just stopped improving after a while. I can confirm that the NN does give different results for different fingerprints, but somehow the final predicted value is just never good enough.
I have tried different NN sizes, learning rates, batch sizes, and activation functions (tanh, relu, etc.), and none of them seem to help. Do you have any insight into this? Is there anything I did wrong or could try, or are NNs just bad at this kind of task?
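As a check on the premise that this is a directly solvable linear system: with the collections encoded as a count matrix, ordinary least squares recovers the per-object values exactly. A sketch with toy numbers (illustrative, not the asker's data):

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0])     # hidden values of objects A, B, C
counts = np.array([[3, 2, 0],          # e.g. [A, A, A, B, B]
                   [0, 1, 1],          # [B, C]
                   [1, 0, 2],          # [A, C, C]
                   [2, 2, 1]])         # [A, A, B, B, C]
totals = counts @ values               # observed sums per collection

recovered, *_ = np.linalg.lstsq(counts, totals, rcond=None)
print(np.allclose(recovered, values))  # → True
```

So if a sum-pooled network cannot fit this, the representational capacity is not the bottleneck; the flat loss curve points at the optimization instead.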

How to use PyTorch’s DataLoader together with skorch’s GridSearchCV
I am running a PyTorch ANN model (for a classification task) and I am using skorch's GridSearchCV to search for the optimal hyperparameters. When I ran GridSearchCV with n_jobs=1 (i.e. doing one hyperparameter combination at a time), it ran really slowly. When I set n_jobs to greater than 1, I got a memory blowout error. So I am now trying to see if I could use PyTorch's DataLoader to split up the dataset into batches to avoid the memory blowout issue. According to this PyTorch Forum question (https://discuss.pytorch.org/t/how-to-use-skorch-for-data-that-does-not-fit-into-memory/70081/2), it appears we could use SliceDataset. My code for this is as below:
# Setting up artificial neural net model
class TabularModel(nn.Module):

    # Initialize parameters embeds, emb_drop, bn_cont and layers
    def __init__(self, emb_szs, n_cont, out_sz, layers, p=0.5):
        super().__init__()
        self.embeds = nn.ModuleList([nn.Embedding(ni, nf) for ni, nf in emb_szs])
        self.emb_drop = nn.Dropout(p)
        self.bn_cont = nn.BatchNorm1d(n_cont)

        # Create empty list for each layer in the neural net
        layerlist = []
        # Number of all embedded columns for categorical features
        n_emb = sum((nf for ni, nf in emb_szs))
        # Number of inputs for each layer
        n_in = n_emb + n_cont

        for i in layers:
            # Set the linear function for the weights and biases, wX + b
            layerlist.append(nn.Linear(n_in, i))
            # Using ReLU activation function
            layerlist.append(nn.ReLU(inplace=True))
            # Normalise all the activation function output values
            layerlist.append(nn.BatchNorm1d(i))
            # Set some of the normalised activation function output values to zero
            layerlist.append(nn.Dropout(p))
            # Reassign number of inputs for the next layer
            n_in = i

        # Append last layer
        layerlist.append(nn.Linear(layers[-1], out_sz))
        # Create sequential layers
        self.layers = nn.Sequential(*layerlist)

    # Function for feedforward
    def forward(self, x_cat_cont):
        x_cat = x_cat_cont[:, 0:cat_train.shape[1]].type(torch.int64)
        x_cont = x_cat_cont[:, cat_train.shape[1]:].type(torch.float32)

        # Create empty list for embedded categorical features
        embeddings = []
        # Embed categorical features
        for i, e in enumerate(self.embeds):
            embeddings.append(e(x_cat[:, i]))
        # Concatenate embedded categorical features
        x = torch.cat(embeddings, 1)
        # Apply dropout rates to categorical features
        x = self.emb_drop(x)
        # Batch normalize continuous features
        x_cont = self.bn_cont(x_cont)
        # Concatenate categorical and continuous features
        x = torch.cat([x, x_cont], 1)
        # Feed categorical and continuous features into neural net layers
        x = self.layers(x)
        return x

# Use cross entropy loss function since this is a classification problem
# Assign class weights to the loss function
criterion_skorch = nn.CrossEntropyLoss
# Use Adam solver with learning rate 0.001
optimizer_skorch = torch.optim.Adam

from skorch import NeuralNetClassifier
# Random seed chosen to ensure results are reproducible by using the same initial random weights and biases,
# and applying dropout rates to the same random embedded categorical features and neurons in the hidden layers
torch.manual_seed(0)
net = NeuralNetClassifier(module=TabularModel,
                          module__emb_szs=emb_szs,
                          module__n_cont=con_train.shape[1],
                          module__out_sz=2,
                          module__layers=[30],
                          module__p=0.0,
                          criterion=criterion_skorch,
                          criterion__weight=cls_wgt,
                          optimizer=optimizer_skorch,
                          optimizer__lr=0.001,
                          max_epochs=150,
                          device='cuda'
                          )

from sklearn.model_selection import GridSearchCV
param_grid = {'module__layers': [[30], [50, 20]],
              'module__p': [0.0],
              'max_epochs': [150, 175]
              }

from torch.utils.data import TensorDataset, DataLoader
from skorch.helper import SliceDataset

# cat_con_train and y_train are PyTorch tensors
tsr_ds = TensorDataset(cat_con_train.cpu(), y_train.cpu())
torch.manual_seed(0)  # set random seed so shuffling results are reproducible
d_loader = DataLoader(tsr_ds, batch_size=100000, shuffle=True)
d_loader_slice_X = SliceDataset(d_loader, idx=0)
d_loader_slice_y = SliceDataset(d_loader, idx=1)

models = GridSearchCV(net, param_grid, scoring='roc_auc', n_jobs=2).fit(d_loader_slice_X, d_loader_slice_y)
However, when I ran this code, I get the following error message:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-47-df3fc792ad5e> in <module>()
    104
--> 105 models = GridSearchCV(net, param_grid, scoring='roc_auc', n_jobs=2).fit(d_loader_slice_X, d_loader_slice_y)
    106

6 frames
/usr/local/lib/python3.6/dist-packages/skorch/helper.py in __getitem__(self, i)
    230     def __getitem__(self, i):
    231         if isinstance(i, (int, np.integer)):
--> 232             Xn = self.dataset[self.indices_[i]]
    233             Xi = self._select_item(Xn)
    234             return self.transform(Xi)

TypeError: 'DataLoader' object does not support indexing
How do I fix this? Is there a way to use PyTorch's DataLoader together with skorch's GridSearchCV (i.e. is there a way to load data in batches into skorch's GridSearchCV, to avoid memory blowout issues when I set n_jobs to greater than 1 in GridSearchCV)?
Many many thanks in advance!
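For what the traceback itself shows: SliceDataset indexes the object it wraps with `[]`, and a DataLoader is only iterable, never indexable, so SliceDataset needs to wrap the TensorDataset directly rather than the loader. A small demonstration of the distinction (skorch left out, since only the indexing behaviour matters here):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

X = torch.arange(12.0).reshape(6, 2)
y = torch.tensor([0, 1, 0, 1, 0, 1])
ds = TensorDataset(X, y)
loader = DataLoader(ds, batch_size=2)

sample = ds[0]          # a TensorDataset supports indexing
indexable = True
try:
    loader[0]           # a DataLoader does not
except TypeError:
    indexable = False
print(indexable)  # → False
```

So, hedging on skorch version details: `SliceDataset(tsr_ds, idx=0)` and `SliceDataset(tsr_ds, idx=1)` (wrapping the dataset, not `d_loader`) should remove this particular TypeError; batching during fitting is then controlled by skorch's own `batch_size`/iterator parameters rather than an outer DataLoader.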

PyTorch & skorch: How to fix my nn.Module to work with skorch's GridSearchCV
Using PyTorch, I have an ANN model (for a classification task) below:
import torch
import torch.nn as nn

# Setting up artificial neural net model which separates out categorical
# from continuous features, so that embedding could be applied to
# categorical features
class TabularModel(nn.Module):

    # Initialize parameters embeds, emb_drop, bn_cont and layers
    def __init__(self, emb_szs, n_cont, out_sz, layers, p=0.5):
        super().__init__()
        self.embeds = nn.ModuleList([nn.Embedding(ni, nf) for ni, nf in emb_szs])
        self.emb_drop = nn.Dropout(p)
        self.bn_cont = nn.BatchNorm1d(n_cont)

        # Create empty list for each layer in the neural net
        layerlist = []
        # Number of all embedded columns for categorical features
        n_emb = sum((nf for ni, nf in emb_szs))
        # Number of inputs for each layer
        n_in = n_emb + n_cont

        for i in layers:
            # Set the linear function for the weights and biases, wX + b
            layerlist.append(nn.Linear(n_in, i))
            # Using ReLU activation function
            layerlist.append(nn.ReLU(inplace=True))
            # Normalise all the activation function output values
            layerlist.append(nn.BatchNorm1d(i))
            # Set some of the normalised activation function output values to zero
            layerlist.append(nn.Dropout(p))
            # Reassign number of inputs for the next layer
            n_in = i

        # Append last layer
        layerlist.append(nn.Linear(layers[-1], out_sz))
        # Create sequential layers
        self.layers = nn.Sequential(*layerlist)

    # Function for feedforward
    def forward(self, x_cat_cont):
        x_cat = x_cat_cont[:, 0:cat_train.shape[1]].type(torch.int64)
        x_cont = x_cat_cont[:, cat_train.shape[1]:].type(torch.float32)

        # Create empty list for embedded categorical features
        embeddings = []
        # Embed categorical features
        for i, e in enumerate(self.embeds):
            embeddings.append(e(x_cat[:, i]))
        # Concatenate embedded categorical features
        x = torch.cat(embeddings, 1)
        # Apply dropout rates to categorical features
        x = self.emb_drop(x)
        # Batch normalize continuous features
        x_cont = self.bn_cont(x_cont)
        # Concatenate categorical and continuous features
        x = torch.cat([x, x_cont], 1)
        # Feed categorical and continuous features into neural net layers
        x = self.layers(x)
        return x
I am trying to use this model with skorch's GridSearchCV, as below:
from skorch import NeuralNetBinaryClassifier

# Random seed chosen to ensure results are reproducible by using the same
# initial random weights and biases, and applying dropout rates to the same
# random embedded categorical features and neurons in the hidden layers
torch.manual_seed(0)
net = NeuralNetBinaryClassifier(module=TabularModel,
                                module__emb_szs=emb_szs,
                                module__n_cont=con_train.shape[1],
                                module__out_sz=2,
                                module__layers=[30],
                                module__p=0.0,
                                criterion=nn.CrossEntropyLoss,
                                criterion__weight=cls_wgt,
                                optimizer=torch.optim.Adam,
                                optimizer__lr=0.001,
                                max_epochs=150,
                                device='cuda'
                                )

from sklearn.model_selection import GridSearchCV
param_grid = {'module__layers': [[30], [50, 20]],
              'module__p': [0.0, 0.2, 0.4],
              'max_epochs': [150, 175, 200, 225]
              }

models = GridSearchCV(net, param_grid, scoring='roc_auc').fit(cat_con_train.cpu(), y_train.cpu())
models.best_params_
But when I ran the code, I got the error message below:
/usr/local/lib/python3.6/dist-packages/sklearn/model_selection/_validation.py:536: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details:
ValueError: Expected module output to have shape (n,) or (n, 1), got (128, 2) instead
  FitFailedWarning)
(the same FitFailedWarning is repeated for each failing fit)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-86-c408d65e2435> in <module>()
     98
---> 99 models = GridSearchCV(net, param_grid, scoring='roc_auc').fit(cat_con_train.cpu(), y_train.cpu())
    100
    101 models.best_params_

11 frames
/usr/local/lib/python3.6/dist-packages/skorch/classifier.py in infer(self, x, **fit_params)
    303             raise ValueError(
    304                 "Expected module output to have shape (n,) or "
--> 305                 "(n, 1), got {} instead".format(tuple(y_infer.shape)))
    306
    307         y_infer = y_infer.reshape(-1)

ValueError: Expected module output to have shape (n,) or (n, 1), got (128, 2) instead
I am not sure what is wrong or how to fix this. Any help on this would really be appreciated.
Many thanks in advance!
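One reading of the error, hedged since the details vary by skorch version: NeuralNetBinaryClassifier expects the module to emit one logit per sample (shape (n,) or (n, 1)) for use with BCEWithLogitsLoss, while TabularModel with module__out_sz=2 emits two logits per sample, which is the shape NeuralNetClassifier with CrossEntropyLoss expects. A toy sketch of the single-logit shape (the Linear layer is a stand-in, not the question's model):

```python
import torch
import torch.nn as nn

head = nn.Linear(10, 1)                 # out_sz = 1, not 2
x = torch.randn(128, 10)
logits = head(x).squeeze(-1)            # shape (128,), as skorch expects
targets = torch.randint(0, 2, (128,)).float()
loss = nn.BCEWithLogitsLoss()(logits, targets)
print(tuple(logits.shape))  # → (128,)
```

The alternative is to keep out_sz=2 and nn.CrossEntropyLoss but switch to NeuralNetClassifier, which accepts (n, 2) outputs.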