Predicting follow-up purchases so that package deliveries can be consolidated
My problem: a customer buys something, and I want to predict whether that customer will buy something else in the next few days, so that the packages can be consolidated and delivered together instead of shipping each package individually. Has anyone seen a similar problem? My data looks like this:
customerid  order  article  mandatoryDeliveryDate
1           03.05  Shoes    05.05
1           04.05  Paper    05.05
2           10.04  PS5      11.04
2           12.04  Laptop   16.04
3           28.04  Clock    30.04
3           [XXX]
What I want to predict is whether customer 3 will buy something in the next few days. Is it possible to predict this while taking the mandatory delivery date into account? And can the model also give the probability of this prediction?
Is there a blog, paper, Jupyter notebook, or anything else where someone has implemented a similar prediction for package delivery?
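One common way to frame this is as a per-customer binary classification problem: build features from the order history (e.g. days since the last order, number of orders so far, days until the mandatory delivery date) and label each example with whether another order arrived within the consolidation window. Any classifier with predict_proba then gives exactly the requested probability. Below is a minimal sketch with made-up numbers; the feature set and the labels are assumptions for illustration, not derived from the table above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical per-customer features: [days since last order, orders so far,
# days until the mandatory delivery date]. Label: 1 if the customer placed
# another order within the next few days, 0 otherwise.
X = np.array([
    [1, 2, 2],   # ordered yesterday, delivery due soon -> bought again
    [2, 2, 1],
    [8, 1, 0],   # quiet for over a week -> no follow-up order
    [9, 1, 0],
], dtype=float)
y = np.array([1, 1, 0, 0])

clf = LogisticRegression().fit(X, y)

# Probability that customer 3 (last order 2 days ago, 1 order so far,
# delivery due in 2 days) buys again soon -- the requested probability.
p_buy_again = clf.predict_proba([[2, 1, 2]])[0, 1]
print(f"P(another order soon) = {p_buy_again:.2f}")
```

With real data, the features would be engineered per customer from the order table, and validation should use a time-based split so that future orders never leak into training.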
See also questions close to this topic

Training an ML model on two different datasets before using test data?
So I have the task of using a CNN for facial recognition, i.e. classifying faces into different classes of people, with each individual person being its own separate class. The training data I am given is very limited: I only have one image per class, and there are 100 classes (so 100 images in total, one image of each person). My approach is transfer learning with the GoogLeNet architecture.
However, instead of just training GoogLeNet on the images of the people I have been given, I want to first train it on a separate, larger set of face images, so that by the time I train it on my own data, the model has already learned the features it needs to classify faces in general. Does this make sense / will this work?
Using MATLAB, I have so far changed the fully connected layer and the classification layer and trained the network on the Yale Face Database, which consists of 15 classes, achieving 91% validation accuracy. Now I want to retrain this saved model on my provided data (100 classes with one image each). What would I have to do to this saved model to train it on the new dataset without losing the features it has learned from the Yale database? Do I just swap the last fully connected and classification layers again and retrain? Will that be pointless, i.e. will it create new weights from scratch, or will it build on the previously learned weights to fit my new dataset even better? Or should I train the model on my training data and the Yale database all at once?
I have a separate set of test data without labels, which is used to test the final model and produce my score/grade. Please help me understand whether what I'm proposing is viable or nonsense; I'm confused and would appreciate being pointed in the right direction.
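In Keras terms (the same idea carries over to MATLAB's transfer-learning workflow), the usual recipe is: keep the already-trained feature layers, replace only the classification head, and freeze the feature layers so that retraining on the 100-class data does not overwrite what was learned on the Yale database. A hedged sketch with a tiny stand-in network; the layer names and sizes are made up for illustration:

```python
import numpy as np
from tensorflow import keras

# Stand-in for the network already fine-tuned on the Yale database (15 classes).
pretrained = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(64,), name="features"),
    keras.layers.Dense(15, activation="softmax", name="yale_head"),
])

# Keep the learned feature layers, drop only the 15-class head, and attach a
# new 100-class head. Freezing the feature layers preserves their weights.
features = keras.Model(pretrained.input, pretrained.get_layer("features").output)
features.trainable = False
model = keras.Sequential([features, keras.layers.Dense(100, activation="softmax")])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# With one image per class, train only the new head first; optionally unfreeze
# the features later with a very small learning rate for gentle fine-tuning.
X, y = np.random.rand(100, 64), np.arange(100)
model.fit(X, y, epochs=1, verbose=0)
```

The key point: replacing the head does not reinitialize the frozen layers, so the face features learned on the larger dataset carry over rather than being lost.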

What's the best way to select variable in random forest model?
I am training RF models in R. What is the best way to select variables for my models? The datasets are pretty big; each has around 120 variables in total. I know there is a cross-validation approach to variable selection for other classification algorithms such as KNN. Does something similar exist for variable selection or parameter tuning when training RF models?
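The question is about R, but the two standard approaches are easy to illustrate with scikit-learn: rank variables by the forest's built-in importances, or run cross-validated recursive feature elimination (RFECV), which is the RF analogue of the CV-based selection used with KNN. (In R, randomForest::importance and caret::rfe play the same roles.) A sketch on toy data where only the first two variables carry signal:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV

# Toy data standing in for a dataset with many candidate variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # only variables 0 and 1 matter

# Option 1: rank variables by the forest's built-in importances.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
ranked = np.argsort(rf.feature_importances_)[::-1]

# Option 2: cross-validated recursive feature elimination -- repeatedly drop
# the least important variable and keep the subset with the best CV score.
selector = RFECV(RandomForestClassifier(n_estimators=50, random_state=0), cv=3).fit(X, y)
print("top-ranked variables:", ranked[:3], "| RFECV kept:", selector.n_features_)
```

With ~120 variables, the importance ranking is a cheap first pass; RFECV is more expensive but gives a cross-validated subset size rather than an arbitrary cutoff.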

How would I put my own dataset into this code?
I have been looking at a Tensorflow tutorial for unsupervised learning, and I'd like to put in my own dataset; the code currently uses the MNIST dataset. I know how to create my own datasets in Tensorflow, but I have trouble adapting the code used here to my own data. I am pretty new to Tensorflow, and the file paths to my dataset in my project are \data\training and \data\testval\.
# Python ≥3.5 is required
import sys
assert sys.version_info >= (3, 5)
# Scikit-Learn ≥0.20 is required
import sklearn
assert sklearn.__version__ >= "0.20"
# TensorFlow ≥2.0-preview is required
import tensorflow as tf
from tensorflow import keras
assert tf.__version__ >= "2.0"
# Common imports
import numpy as np
import os

(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()
X_train_full = X_train_full.astype(np.float32) / 255
X_test = X_test.astype(np.float32) / 255
X_train, X_valid = X_train_full[:-5000], X_train_full[-5000:]
y_train, y_valid = y_train_full[:-5000], y_train_full[-5000:]

def rounded_accuracy(y_true, y_pred):
    return keras.metrics.binary_accuracy(tf.round(y_true), tf.round(y_pred))

tf.random.set_seed(42)
np.random.seed(42)

conv_encoder = keras.models.Sequential([
    keras.layers.Reshape([28, 28, 1], input_shape=[28, 28]),
    keras.layers.Conv2D(16, kernel_size=3, padding="SAME", activation="selu"),
    keras.layers.MaxPool2D(pool_size=2),
    keras.layers.Conv2D(32, kernel_size=3, padding="SAME", activation="selu"),
    keras.layers.MaxPool2D(pool_size=2),
    keras.layers.Conv2D(64, kernel_size=3, padding="SAME", activation="selu"),
    keras.layers.MaxPool2D(pool_size=2)
])
conv_decoder = keras.models.Sequential([
    keras.layers.Conv2DTranspose(32, kernel_size=3, strides=2, padding="VALID",
                                 activation="selu", input_shape=[3, 3, 64]),
    keras.layers.Conv2DTranspose(16, kernel_size=3, strides=2, padding="SAME",
                                 activation="selu"),
    keras.layers.Conv2DTranspose(1, kernel_size=3, strides=2, padding="SAME",
                                 activation="sigmoid"),
    keras.layers.Reshape([28, 28])
])
conv_ae = keras.models.Sequential([conv_encoder, conv_decoder])

conv_ae.compile(loss="binary_crossentropy", optimizer=keras.optimizers.SGD(lr=1.0),
                metrics=[rounded_accuracy])
history = conv_ae.fit(X_train, X_train, epochs=5,
                      validation_data=[X_valid, X_valid])
conv_encoder.summary()
conv_decoder.summary()
conv_ae.save("\models")
Do note that I got this code from another StackOverflow answer.
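To swap the MNIST loader for a folder of images, one option in TF 2.x is keras.utils.image_dataset_from_directory (in older 2.x releases it lives under tf.keras.preprocessing), pointed at a directory like the asker's \data\training with one subfolder per class. The sketch below builds a throwaway directory of random images so it runs anywhere; with real data, point root at the actual folder. Note that the autoencoder trains on (x, x) pairs, not (x, label):

```python
import os
import tempfile
import numpy as np
from tensorflow import keras

# Build a tiny stand-in directory tree (class subfolders of images), mirroring
# the \data\training layout; replace `root` with the real dataset path.
root = tempfile.mkdtemp()
for cls in ("class_a", "class_b"):
    os.makedirs(os.path.join(root, cls))
    for i in range(3):
        keras.utils.save_img(os.path.join(root, cls, f"{i}.png"),
                             np.random.rand(28, 28, 1))

# Replaces keras.datasets.fashion_mnist.load_data(); image_size and color_mode
# must match the model's 28x28x1 input.
ds = keras.utils.image_dataset_from_directory(
    root, image_size=(28, 28), color_mode="grayscale", batch_size=2)

# The autoencoder needs inputs scaled to [0, 1] and targets equal to inputs.
ae_ds = ds.map(lambda x, y: (x / 255.0, x / 255.0))
```

From there, conv_ae.fit(ae_ds, epochs=5) replaces the fit call on X_train, with a second dataset built the same way from \data\testval\ passed as validation_data.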

How to make a chatbot for discord using python
I need advice and/or resources for making a Discord chatbot in Python. I have some knowledge of Python and the Discord API, but I know nothing about chatbots or how to implement them in Python. Can anyone point me to resources about chatbots and artificial intelligence?
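One common structure is to keep the "chatbot brain" as a plain Python function and bolt it onto Discord with the discord.py library; that way the conversational logic can grow (simple rules first, an ML model later) independently of the bot plumbing, and can be developed without a live server. A minimal sketch; the commands and token are placeholders:

```python
# Pure "chatbot brain": kept separate from Discord plumbing so it can be
# tested offline and later swapped for an NLP model without touching the bot.
def make_reply(text):
    text = text.lower().strip()
    if text.startswith("!hello"):
        return "Hello! I am a bot."
    if text.startswith("!help"):
        return "Commands: !hello, !help"
    return None  # stay silent on everything else

# Discord wiring with discord.py (pip install discord.py); the token comes
# from the Discord developer portal:
#
#   import discord
#   intents = discord.Intents.default()
#   intents.message_content = True
#   client = discord.Client(intents=intents)
#
#   @client.event
#   async def on_message(message):
#       if message.author == client.user:
#           return  # never answer our own messages
#       reply = make_reply(message.content)
#       if reply:
#           await message.channel.send(reply)
#
#   client.run("YOUR_BOT_TOKEN")
```

For the "artificial intelligence" part, the rule-based function can later be replaced by a trained model while the Discord event handler stays unchanged.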

Image Background Remover Using Python
I want to make an image background remover in Python, but I do not know how much data and time it will take to reach the accuracy of remove.bg. I am using the U2Net AI models (https://github.com/xuebinqin/U2Net/). Some results are comparable, but not every result is as good as remove.bg; as a rating, I would give my app 2/5 and remove.bg 4/5. Please tell me how I can achieve accuracy like remove.bg. Any help or suggestions are appreciated. Thanks.

how to print all parameters of a keras model
I am trying to print all the 1290 parameters in the dense_1 layer, but model.get_weights()[7] only shows 10 parameters. How can I print all 1290 parameters of the dense_1 layer? What is the difference between model.get_weights() and model.layer.get_weights()?
>model.get_weights()[7]
array([2.8552295e-04, 4.3254648e-03, 1.8752701e-04, 2.3482188e-03,
       3.4848123e-04, 7.6121779e-04, 2.7494309e-06, 1.9068648e-03,
       6.0777756e-04, 1.9550985e-03], dtype=float32)

>model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                 Output Shape              Param #
=================================================================
 conv2d (Conv2D)              (None, 26, 26, 32)        320
 conv2d_1 (Conv2D)            (None, 24, 24, 64)        18496
 max_pooling2d (MaxPooling2D) (None, 12, 12, 64)        0
 dropout (Dropout)            (None, 12, 12, 64)        0
 flatten (Flatten)            (None, 9216)              0
 dense (Dense)                (None, 128)               1179776
 dropout_1 (Dropout)          (None, 128)               0
 dense_1 (Dense)              (None, 10)                1290
=================================================================
Total params: 1,199,882
Trainable params: 1,199,882
Non-trainable params: 0
_________________________________________________________________
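The short answer: model.get_weights() returns one flat list containing every layer's weight arrays in order, so index [7] is just dense_1's 10-element bias vector; the other 1280 parameters are the (128, 10) kernel at the previous index. layer.get_weights() returns only that layer's arrays, which is usually what you want. A sketch with a toy model reduced to the two Dense layers from the summary above (layer names copied, the rest omitted):

```python
import numpy as np
from tensorflow import keras

# dense_1 has 1290 parameters = 128*10 kernel weights + 10 biases.
model = keras.Sequential([
    keras.layers.Dense(128, activation="relu", input_shape=(9216,), name="dense"),
    keras.layers.Dense(10, name="dense_1"),
])

# Fetch both arrays for just this layer by name, instead of indexing into
# the flat model.get_weights() list.
w, b = model.get_layer("dense_1").get_weights()
print(w.shape, b.shape)  # (128, 10) and (10,)

# NumPy truncates large arrays when printing; raise the threshold to see
# every one of the 1290 values.
np.set_printoptions(threshold=np.inf)
print(w)
print(b)
```

So model.get_weights()[7] showing 10 numbers is expected: it is the bias, not the whole layer.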

How does my LSTM model know about the testing data and simply copy previous values/patterns?
I have an Encoder-Decoder LSTM model that learns to predict 12 months of data in advance while looking back 12 months. If it helps, my dataset has around 10 years in total (120 months). I keep 8 years for training/validation and 2 years for testing. My understanding is that my model does not have access to the testing data at training time.
The puzzling thing is that my model's predictions are simply a shift of the previous points. But how did my model know the actual previous points at prediction time? I did not give the monthly values of the testing set to the model! If the explanation were that it simply copies the input you give it, then note that I give it 12 months with completely different values than the ones it predicts (so it does not copy the 12 months I am giving), yet the forecasted values are shifts of the actual ones, which it has never seen.
Below is an example:
My code source is from here:
Below is my code:
# train/test splitting
split_position = int(len(scaled_data) * 0.8)  # 8 years for training
train = scaled_data[0:split_position]
test = scaled_data[split_position:]
#print(train)
print('length of train=', len(train))
#print(test)
print('length of test=', len(test))

# split train and test data into yearly train/test sets (3d) [observation, year, month]
def split_data_yearly(train, test):
    # restructure into windows of yearly data
    train = array(split(train, len(train)/12))
    test = array(split(test, len(test)/12))
    return train, test

# evaluate one or more yearly forecasts against expected values
def evaluate_forecasts(actual, predicted):
    scores = list()
    # calculate an RMSE score for each month
    for i in range(actual.shape[1]):
        # calculate mse
        mse = mean_squared_error(actual[:, i], predicted[:, i])
        # calculate rmse
        rmse = math.sqrt(mse)
        # store
        scores.append(rmse)
    # calculate overall RMSE
    s = 0
    for row in range(actual.shape[0]):
        for col in range(actual.shape[1]):
            s += (actual[row, col] - predicted[row, col])**2
    score = math.sqrt(s / (actual.shape[0] * actual.shape[1]))
    ################ plot prediction vs actual ###############################
    predicted = predicted.reshape(predicted.shape[0], predicted.shape[1])
    jump = 12
    inv_scores = list()
    for i in range(len(predicted)):
        sample_predicted = predicted[i, :]
        sample_actual = actual[i, :]
        # inverse normalization
        sample_predicted_inv = scaler.inverse_transform(sample_predicted.reshape(-1, 1))
        sample_actual_inv = scaler.inverse_transform(sample_actual.reshape(-1, 1))
        #print(sample_actual_inv)
        #print(data_sd[(split_position+(i*jump)-1):(split_position+(i*jump-1))+len(sample_actual_inv)])
        # inverse differencing
        s = numpy.array(smoothed).reshape(-1, 1)
        sample_actual_inv = sample_actual_inv + s[(split_position+(i*jump)):(split_position+(i*jump))+len(sample_actual_inv)]
        sample_predicted_inv = sample_predicted_inv + s[(split_position+(i*jump)):(split_position+(i*jump))+len(sample_actual_inv)]
        months = ['August'+str(19+i), 'September'+str(19+i), 'October'+str(19+i),
                  'November'+str(19+i), 'December'+str(19+i), 'January'+str(20+i),
                  'February'+str(20+i), 'March'+str(20+i), 'April'+str(20+i),
                  'May'+str(20+i), 'June'+str(20+i), 'July'+str(20+i)]
        pyplot.plot(months, sample_actual_inv, 'b', label='Actual')
        pyplot.plot(months, sample_predicted_inv, '--', color="orange", label='Predicted')
        pyplot.legend()
        pyplot.xticks(rotation=25)
        pyplot.title('Encoder Decoder LSTM Prediction', y=1.08)
        pyplot.show()
        ################### determine RMSE after inversion ###################
        mse = mean_squared_error(sample_actual_inv, sample_predicted_inv)
        rmse = math.sqrt(mse)
        inv_scores.append(rmse)
    return score, scores, inv_scores

# summarize scores
def summarize_scores(name, score, scores):
    s_scores = ', '.join(['%.1f' % s for s in scores])
    print('%s: [%.3f] %s' % (name, score, s_scores))

# convert history into inputs and outputs
def to_supervised(train, n_input, n_out=12):
    # flatten data
    data = train.reshape((train.shape[0]*train.shape[1], train.shape[2]))
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end <= len(data):
            X.append(data[in_start:in_end, :])
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return array(X), array(y)

# train the model
def build_model(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # take a portion for validation
    val_size = 12
    test_x, test_y = train_x[-val_size:], train_y[-val_size:]
    train_x, train_y = train_x[0:-val_size], train_y[0:-val_size]
    # define parameters
    verbose, epochs, batch_size = 1, 25, 8
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(LSTM(64, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(64, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    #sgd = optimizers.SGD(lr=0.004, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='mse', optimizer='adam')
    # fit network
    train_history = model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size,
                              validation_data=(test_x, test_y), verbose=verbose)
    loss = train_history.history['loss']
    val_loss = train_history.history['val_loss']
    pyplot.plot(loss)
    pyplot.plot(val_loss)
    pyplot.legend(['loss', 'val_loss'])
    pyplot.show()
    return model

# make a forecast
def forecast(model, history, n_input):
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
    # retrieve last observations for input data
    input_x = data[-n_input:, :]
    # reshape into [1, n_input, n]
    input_x = input_x.reshape((1, input_x.shape[0], input_x.shape[1]))
    # forecast the next year
    yhat = model.predict(input_x, verbose=0)
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat

# evaluate a single model
def evaluate_model(train, test, n_input):
    # fit model
    model = build_model(train, n_input)
    # history is a list of yearly data
    history = [x for x in train]
    # walk-forward validation over each year
    predictions = list()
    for i in range(len(test)):
        # predict the year
        yhat_sequence = forecast(model, history, n_input)
        # store the predictions
        predictions.append(yhat_sequence)
        # get real observation and add to history for predicting the next year
        history.append(test[i, :])
    # evaluate predictions for each year
    predictions = array(predictions)
    score, scores, inv_scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores, inv_scores

# split into train and test
train, test = split_data_yearly(train, test)
# evaluate model and get scores
n_input = 12
score, scores, inv_scores = evaluate_model(train, test, n_input)
# summarize scores
summarize_scores('lstm', score, scores)
print('RMSE score after inversion:', inv_scores)
# plot scores
months = ['July', 'August', 'September', 'October', 'November', 'December',
          'January', 'February', 'March', 'April', 'May', 'June']
#pyplot.plot(months, scores, marker='o', label='lstm')
#pyplot.show()
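Before suspecting leakage, it is worth measuring how a naive persistence forecast ("next year equals last observed year") scores on the same walk-forward test: seq2seq models trained on MSE frequently converge to exactly that lag-12 copy, which looks like cheating but requires no access to the test set. A self-contained sketch on a synthetic monthly series; with the real data, series would be the inverse-transformed values:

```python
import numpy as np

# Synthetic stand-in for 10 years (120 months) of data.
rng = np.random.default_rng(1)
series = np.sin(np.arange(120) * 2 * np.pi / 12) + rng.normal(0, 0.1, 120)

def persistence_forecast(history, horizon=12):
    """Naive baseline: predict that next year repeats the last observed year."""
    return history[-horizon:]

# Walk forward over the 2 test years, mirroring the model's evaluation loop.
split = 96  # 8 years train, 2 years test
errors = []
for start in range(split, 120, 12):
    forecast = persistence_forecast(series[:start])
    actual = series[start:start + 12]
    errors.append(np.sqrt(np.mean((forecast - actual) ** 2)))
baseline_rmse = float(np.mean(errors))
print("persistence RMSE:", baseline_rmse)
```

If the LSTM's test RMSE is not clearly below this baseline, the model has effectively learned the lag-12 copy, which is MSE's favorite local minimum on strongly seasonal data, not evidence of test-set leakage.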

CNN Model classification prediction result problem
I have a model that distinguishes between cats and dogs, trained once with MobileNet and again with EfficientNet. The problem arises when I test the models on predictions: with MobileNet the predicted output is correct, (0.9, 0.1) for cat and (0.1, 0.9) for dog, while with EfficientNet I get (0.9, 0.1) for cat but (0.8, 0.2) for dog; the change is only about 0.1. What should I do?
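One frequent cause of one backbone predicting confidently while the other barely moves off (0.8, 0.2) is a preprocessing mismatch at prediction time: in tf.keras, MobileNet expects inputs scaled to roughly [-1, 1], while the EfficientNet application expects raw [0, 255] pixels (its preprocess_input is a pass-through because rescaling happens inside the model). A quick check of this difference, assuming the tf.keras application modules; whether this is the actual cause here is an assumption to verify against the prediction pipeline:

```python
import numpy as np
from tensorflow.keras.applications import mobilenet, efficientnet

# A fake image batch with raw pixel values in [0, 255].
img = np.random.randint(0, 256, size=(1, 224, 224, 3)).astype("float32")

# The two backbones expect differently preprocessed inputs; feeding one
# model the other's preprocessing typically yields weak, drifting
# probabilities instead of confident ones.
mn = mobilenet.preprocess_input(img.copy())      # scaled to roughly [-1, 1]
en = efficientnet.preprocess_input(img.copy())   # pass-through: stays [0, 255]

print("mobilenet range:", mn.min(), mn.max())
print("efficientnet range:", en.min(), en.max())
```

The thing to verify is that the exact same preprocessing used during each model's training is applied to the test images before predict; if the EfficientNet run was fed MobileNet-style inputs (or /255 inputs), near-uniform outputs like (0.8, 0.2) are expected.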