Training vs. validation accuracy and loss output error
Can anyone help me interpret these graphs?
I have been training some datasets on SENet for image classification (TensorLayer).
I started with CIFAR-10 and got some satisfying results with it. I use data augmentation for all my experiments.
Next I tried training it on Caltech-256, and the results were pretty unusual, maybe due to the low number of images per class (3050).
Finally I used Tiny ImageNet, which has enough training samples per class (500), but the results still look odd, somewhat similar to the Caltech-256 results.
Is this supposed to be a problem with the data?
See also questions close to this topic

Image classification features
I'm new to ML and trying to solve an image classification problem. I have a dataset of about 2 million photos from Flickr with 10 labels: music, food, wedding, nature, etc. Unfortunately I don't have the computational power to use all the photos, so I decided to use 30k photos for training and 10k photos for testing. I'm willing to reduce the number of classes (my classes have a broad spectrum) while keeping the same number of photos if it would improve my classification rate. Which algorithm would be best for feature extraction? Should I opt for an SVM or a CNN? Thank you for your answer.

How to plot a box-and-whisker plot in matplotlib.pyplot
How do I plot 2 columns in a box-and-whisker plot using matplotlib.pyplot?
When I use the syntax below to plot 2 columns of a dataframe with variable values, I see output with only 2 values on the x axis:
plt.boxplot([df['zipcode'],df['price']])
and if I change the syntax to
plt.boxplot(df['zipcode'],df['price'])
it results in a value error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
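A hedged sketch of why the two calls behave differently: plt.boxplot takes a single first argument, which may be a list of datasets, so two columns must be wrapped in a list. In the second form, df['price'] lands in boxplot's second positional parameter (notch), and evaluating a Series in a boolean context raises the ambiguous-truth-value error. The column values below are made up for illustration.

```python
# Illustrative dataframe (made-up values); plt.boxplot draws one box per
# dataset in the list it receives as its first argument.
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"zipcode": [98101, 98102, 98103, 98104],
                   "price": [300000, 450000, 500000, 350000]})

result = plt.boxplot([df["zipcode"], df["price"]])  # list => two boxes
plt.xticks([1, 2], ["zipcode", "price"])            # readable x labels
```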

How can I create ordinal labels in scikit-learn or PySpark? And how can I train a model on data that has ordinal labels?
I'm working on a problem where I'm trying to bin my target variable into classes, e.g. $0-5000, $5001-10000, $10001-15000, etc.
As is apparent, the labels have a certain order: the $0-5000 class has the foremost rank, followed by $5001-10000 and $10001-15000.
a) Can anyone please tell me how I can induce or maintain order in my target variable labeling in scikit-learn or PySpark?
b) Once I have the ordinal labels, how should I do the model training? Are there any specific things that I should be aware of during modelling, or can I just proceed as one would in a non-ordinal classification problem?
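For part a), one common approach (a sketch, not the only way) is to bin with pandas.cut, which produces an ordered categorical whose integer codes preserve the rank of the bins; the bin edges and label strings below follow the question's example, and the dollar amounts are illustrative.

```python
import pandas as pd

# Dollar amounts to bin (illustrative values)
y = pd.Series([1200, 7500, 14000, 3000, 9999])

bins = [0, 5000, 10000, 15000]                       # right-closed bin edges
labels = ["$0-5000", "$5001-10000", "$10001-15000"]  # ordered labels
y_binned = pd.cut(y, bins=bins, labels=labels)       # ordered categorical
y_codes = y_binned.cat.codes                         # 0 < 1 < 2 keeps the order
```

For part b), a plain classifier ignores this order; training on the integer codes with a model that respects ordering (or an ordinal-regression formulation) uses it, but treating the task as standard multi-class classification is also valid if losing the order information is acceptable.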

Passing references in svyglm and computing p-values with respect to all the reference variables; Error in svyglm.survey.design
I am performing a t-test on healthcare data using the svyglm function. In order to compute the p-values with respect to all the reference factors (groups) in the data, we need to pass the reference variable as follows:
a = svyglm(asthma2 ~ relevel(as.factor(county5), ref = 2), design = chs.stdgenhealth)
summary(a)
The county variable has five groups; we are making the comparison within the counties among the asthma group.
While automating and iterating the ref variable over all five counties, the following error is shown.
Here is the code:
lapply(1:length(unique(na.omit(chs$county5))), function(i) {
  po = rbind(data.table(coef(summary(substitute(svyglm(asthma2 ~ relevel(as.factor(county5), ref = i), design = chs.dsgn)))))[, 4])
  return(po)
})
Error:
Error in svyglm.survey.design(asthma2 ~ relevel(as.factor(county5), ref = i), : all variables must be in design= argument
How can we pass the ref variable using the external variable i to the function?

GOF logistic regression: Farrington test
I would like to know how I could perform a GOF test for a logistic regression with a low sample size (n=32).
Kuss (2002) highlighted the limitations of different GOF tests with sparse data. Consequently, I decided to perform a Farrington test. I tried to perform the test in SAS using code provided on GitHub, but I got some errors running the macro. Therefore, I decided to run the test in R, but I cannot find a package that allows it.
Does any of you know a package in R that allows performing the Farrington test or a code to do it in SAS?
Reference:
Oliver Kuss. Global goodness-of-fit tests in logistic regression with sparse data. Statist. Med. 2002; 21:3789–3801 (DOI: 10.1002/sim.1421)

Generalized Hough transform for calculating the center of an object based on SIFT points?
Do you have any suggestions for this?
I am following the steps in a research paper to reimplement it for my specific problem. I am not a professional programmer, and I have been struggling a lot, for more than a month. When I reach the scoring step with the generalized Hough transform, I do not get any result: what I get is a blank image, without the object center being found. What I did includes the following steps:
I define a spatially constrained area in a training image and extract SIFT features within that area. The red point in the center represents the object center in the template (training) image.
And these are the interest points extracted by SIFT in the query image:
Keypoints are matched according to some conditions: 1) they should be quantized to the same visual word and 2) be spatially consistent. So I get the following points after applying the matching conditions:
I have 15 and 14 points for the template and query images, respectively. I send these points, along with the object center coordinates of the template image, to the generalized Hough transform (code that I found on GitHub). The code works properly on its default images. However, given the few points I am getting from the algorithm, I do not know what I am doing wrong.
I thought maybe it is because of the theta calculation, so I changed this line to return the abs of the y and x differences, but it did not help. In line 20 they only consider 90 degrees for binning. Could I ask what the reason for this is, and how can I define a binning according to my problem and the range of angles of rotation around the center of the object? Does the binning range affect the center calculation? I would really appreciate it if you could let me know what I am doing wrong here.
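The voting step of a generalized Hough transform for center localization can be sketched as follows (a minimal illustration under simplifying assumptions: matched keypoint pairs are already given, and rotation and scale are ignored; the function and variable names are made up):

```python
import numpy as np

def vote_center(template_pts, template_center, query_pts, shape):
    """Accumulate center votes: each matched pair casts one vote at
    query point + (template center - template point)."""
    acc = np.zeros(shape, dtype=np.int32)
    # Offsets from keypoints to the object center play the role of R-table entries
    offsets = np.asarray(template_center) - np.asarray(template_pts)
    for q, off in zip(np.asarray(query_pts), offsets):
        y, x = (q + off).astype(int)
        if 0 <= y < shape[0] and 0 <= x < shape[1]:
            acc[y, x] += 1                              # one vote per matched pair
    peak = np.unravel_index(np.argmax(acc), acc.shape)  # strongest candidate center
    return peak, acc
```

With only 14-15 matched points the accumulator is sparse, so it is worth checking whether the votes land inside the image at all before suspecting the voting code itself.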
How to identify an issue with an image classification model and softmax probabilities?
I am trying to build an image classifier using transfer learning with the VGG16 model in Keras. I acquired a very small data set of 200 images per class and used 10 images for validation (I know the data set is small, but the requirement was for a controlled environment). When predicting, the model favors a particular class, and the probabilities of the other classes are almost zero (on the order of 2e-35). I can't figure out the exact issue. Some classes contain very varied sets of images (example: the vehicle class contains cars, bikes, etc.). My question is: what is wrong with the model? Is it the low number of images in the dataset, or the variation of images within a class?

Can I have Custom Vision training data in an on-premises cloud instead of the public cloud?
Can I have Custom Vision training data/images in an on-premises cloud instead of the public cloud?
I have Microsoft Azure cloud on-premises. Can I use and configure it for uploading images to an on-premises Custom Vision server?
Currently I am using this URL: https://customvision.ai

tf.while_loop only makes one loop
After days of trying to apply a tf.while_loop, I still fail to understand how it works (or rather, why it does not). The documentation and various questions here on StackOverflow haven't helped so far. The main idea is to train the different columns of a tensor trueY separately using a while_loop. The problem is that when I trace this code, I see that the while_loop gets called only once.
I'd like to dynamically assign names to variables created in the while_loop so as to be able to access them outside the while_loop after they have been created (hence the "gen_name" function, which tries to dynamically generate names for the dense layers created in each iteration), and make tf.while_loop run n times this way.
Here is a sample of my code with the issue (not the full code, and modified to demonstrate this problem):
...................
config['dim_y'] = 10
Xl = tf.placeholder( self.dtype, shape=(batchsize, config['dim_x']) )
Yl = tf.placeholder( self.dtype, shape=(batchsize, config['dim_y']) )
Gl = tf.placeholder( self.dtype, shape=(batchsize, config['dim_g']) )
costl, cost_m, self.cost_b = self.__cost( Xl, Yl, Gl, False )

def __eval_cost( self, A, X, Y, G, reuse ):
    AGXY = tf.concat( [A, G, X, Y], 1 )
    Z, mu_phi3, ls_phi3 = build_nn( AGXY, ...., reuse )
    _cost = tf.reduce_sum( ls_phi3, 1 )
    _cost += .5 * tf.reduce_sum( tf.pow( mu_phi3, 2 ), 1 )
    _cost += .5 * tf.reduce_sum( tf.exp( 2*ls_phi3 ), 1 )
    return _cost

def __cost( self, trueX, trueY, trueG, reuse ):
    ........
    columns = tf.unstack(trueY, axis=1)
    AGX = tf.concat( [ AX, G ], 1 )
    pre_Y = self.build_nn( AGX, ....., reuse )
    index_loop = (tf.constant(0), _cost, _cost_bl)

    def condition(index, _cost, _cost_supervised_bi_label):
        return tf.less(index, self.config['dim_y'])

    def bodylabeled(index, _cost, _cost_bl):
        def gen_name(var_name):
            # split e.g. 'cost/while/strided_slice_5:0' => '5'
            # split e.g. 'cost/while/strided_slice:0' => 'slice'
            iter = var_name.split('/')[1].split(':')[0].split('_')[1]
            if iter == "slice":
                return '0phi2y'
            else:
                return '{}phi2y'.format(int(iter) % self.config['dim_y'])

        y_i = tf.gather(columns, index)
        y = tf.expand_dims( tf.one_hot(tf.to_int32(y_i, name='ToInt32'), depth, dtype=self.dtype ), 0 )
        Y = tf.tile( y, [self.config['L'],1,1] )
        c = tf.constant(0, name='test')
        log_pred_Y = tf.layers.dense( pre_Y, 2, name=gen_name(iter[index].name), reuse=reuse )
        log_pred_Y = log_pred_Y - tf.reduce_logsumexp( log_pred_Y, 1, keep_dims=True )
        _cost += self.__eval_cost_given_axgy( A, X, Y, G, reuse=tf.AUTO_REUSE )
        _cost_bl += tf.reduce_sum( tf.multiply( Y, log_pred_Y ), 1 )
        return tf.add(index, 1), _cost, _cost_supervised_bi_label

    _cost, _bl = tf.while_loop(condition, bodylabeled, index_loop,
                               parallel_iterations=1,
                               shape_invariants=(index_loop[0].get_shape(),
                                                 tf.TensorShape([None, 100]),
                                                 tf.TensorShape([None, 100])))[1:]

op = costl + cost_m + cost_b
with tf.Session(config=config) as sess:
    sess.run( tf.global_variables_initializer() )
    sess.run(tf.local_variables_initializer())
    for batchl in batches:
        sess.run( op, feed_dict={Xl:Xl[batchl,:], Yl:Yl[batchl,:].toarray(), Gl:Gl[batchl,:].toarray(), is_training:True } )

for n in tf.get_default_graph().as_graph_def().node:
    print(n.name)

Training a custom estimator in TensorFlow
I am new to TensorFlow and am trying to train a custom CNN estimator with inputs provided from TFRecord files.
The Load_input() function is supposed to look into DATA_DIR for a TFRecords file and decode it through a call to the read_and_decode function (which is supposed to do the actual decoding of the records), store the information in an instance of _image_object, and return it. cnn_model is where I have defined the CNN architecture, and generate_input_fn is supposed to create the batches and feed them to estimator.train while training.
I just have an abstract understanding of the code and no idea of the internal mechanics, which is the primary reason why I am not able to debug it.
Here is my code:
import tensorflow as tf
import numpy as np
import os

DATA_DIR = "./TFRecords/train"  # path to tfrecords directory
TRAINING_SET_SIZE = 3
BATCH_SIZE = 3
IMAGE_SIZE = 224

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

# image object from protobuf
class _image_object:
    def __init__(self):
        self.image = tf.Variable([], dtype=tf.string)
        self.height = tf.Variable([], dtype=tf.int64)
        self.width = tf.Variable([], dtype=tf.int64)
        self.filename = tf.Variable([], dtype=tf.string)
        self.label = tf.Variable([], dtype=tf.int32)

def read_and_decode(filename_queue):
    # this module is responsible for extracting the features
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(serialized_example, features={
        "image/encoded": tf.FixedLenFeature([], tf.string),
        "image/height": tf.FixedLenFeature([], tf.int64),
        "image/width": tf.FixedLenFeature([], tf.int64),
        "image/filename": tf.FixedLenFeature([], tf.string),
        "image/class/label": tf.FixedLenFeature([], tf.int64),})
    image_encoded = features["image/encoded"]
    image_raw = tf.image.decode_jpeg(image_encoded, channels=3)
    image_object = _image_object()
    image_object.image = tf.image.resize_image_with_crop_or_pad(image_raw, IMAGE_SIZE, IMAGE_SIZE)  # resizes and crops
    image_object.height = features["image/height"]
    image_object.width = features["image/width"]
    image_object.filename = features["image/filename"]
    image_object.label = tf.cast(features["image/class/label"], tf.int64)
    return image_object

def Load_input():
    print 'Generating data from tfrecords...'
    filenames = [os.path.join(DATA_DIR, "train0000%dof00002.tfrecord" % i) for i in xrange(0, 1)]
    for f in filenames:
        if not tf.gfile.Exists(f):
            raise ValueError("Failed to find file: " + f)
    filename_queue = tf.train.string_input_producer(filenames)
    print 'decoding queue contents ::{}'.format(filename_queue)
    image_object = read_and_decode(filename_queue)
    image = tf.image.per_image_standardization(image_object.image)
    # image = image_object.image
    # image = tf.image.adjust_gamma(tf.cast(image_object.image, tf.float32), gamma=1, gain=1)  # Scale image to (0, 1)
    label = image_object.label
    filename = image_object.filename
    return image, label, filename

def cnn_model(features, labels, mode):
    print 'creating layers...'
    # Input layer
    # inp = tf.reshape(features['x'], [1, 28, 28, 1])
    inp = tf.reshape(features, [1, 224, 224, 3])
    print 'input shape ::{}'.format(inp.shape)
    # convolutional layer #1
    conv1 = tf.layers.conv2d(inputs=inp, filters=32, kernel_size=[5, 5], padding='same', activation=tf.nn.relu)
    print 'convolution1 shape ::{}'.format(conv1.shape)
    # pooling layer
    pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)
    print 'Pool1 shape ::{}'.format(pool1.shape)
    # convolutional layer #2
    conv2 = tf.layers.conv2d(inputs=pool1, filters=64, kernel_size=[5, 5], padding='same', activation=tf.nn.relu)
    print 'convolution2 shape ::{}'.format(conv2.shape)
    # pooling layer
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)
    print 'Pool2 shape ::{}'.format(pool2.shape)
    # dense layer
    pool2_flat = tf.reshape(pool2, [1, 56*56*64])  # dimension = [BATCH_SIZE, HEIGHT*WIDTH*CHANNELS of the last pooled layer]
    dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)  # units = number of neurons per layer
    dropout = tf.layers.dropout(inputs=dense, rate=0.4, training=(mode == tf.estimator.ModeKeys.TRAIN))
    # Logits layer
    logits = tf.layers.dense(inputs=dropout, units=2)  # has shape [batch_size, no_of_labels]
    predictions = {'classes': tf.argmax(input=logits, axis=1),
                   'probabilities': tf.nn.softmax(logits, name='softmax_tensor')}
    print 'Logits shape ::{}'.format(logits.shape)
    print 'Labels shape ::{}'.format(labels.shape)
    # Calculate loss for TRAIN and EVAL mode
    loss = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
    train_op = optimizer.minimize(loss=loss, global_step=tf.train.get_global_step())
    print 'Layers created...'
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

def generate_input_fn(image, label, batch_size=BATCH_SIZE):
    print("Filling queue with images before starting to train. "
          "This will take a few minutes.")
    num_preprocess_threads = 1
    def _input_fn():
        image_placeholder = tf.placeholder(tf.float32, shape=[batch_size, 224, 224, 3])
        label_placeholder = tf.placeholder(tf.int64, shape=[batch_size, 1])
        image_batch, label_batch = tf.train.shuffle_batch(
            [image_placeholder, label_placeholder],
            batch_size=batch_size,
            num_threads=num_preprocess_threads,
            capacity=8 * BATCH_SIZE,
            min_after_dequeue=4 * BATCH_SIZE)
        return image_batch, label_batch
    return _input_fn

def main(unused_argv):
    print 'program started...'
    image_data, label_data, filename = Load_input()
    print 'image_data::{} label_data::{}'.format(type(image_data), type(label_data))
    estimator = tf.estimator.Estimator(model_fn=cnn_model, model_dir='./')
    print 'Estimator ready...'
    tensors_to_log = {'probabilities': 'softmax_tensor'}
    logging_hook = tf.train.LoggingTensorHook(tensors=tensors_to_log, every_n_iter=1)
    print 'Logs ready...'
    print 'Starting training...'
    estimator.train(input_fn=generate_input_fn(image=image_data, label=label_data), steps=2, hooks=[logging_hook])

if __name__ == '__main__':
    tf.app.run()
    print 'Program ended...'
it gives me the following error :
ValueError: Dimension 0 in both shapes must be equal, but are 9 and 3. Shapes are [9,2] and [3,3]. for 'softmax_cross_entropy_with_logits_sg' (op: 'SoftmaxCrossEntropyWithLogits') with input shapes: [9,2], [3,3].
Also, the layer shapes are as follows:
conv1 output shape :: (9, 224, 224, 32)
pool1 shape :: (9, 112, 112, 32)
conv2 shape :: (9, 112, 112, 64)
pool2 shape :: (9, 56, 56, 64)
Logits shape :: (9, 2)
Labels shape :: (3, 3)
I don't understand why the batch size is 9 even though I try to explicitly set it to 3 in the code.
Note: if anyone has a better/easier solution, please post it. The aim is to use TFRecords to train a custom CNN.

Implementing WNGrad in PyTorch?
I'm trying to implement the WNGrad optimizer (technically WNAdam, algorithm 4 in the WNGrad paper) in PyTorch. I've never implemented an optimizer in PyTorch before, so I don't know if I've done it correctly (I started from the Adam implementation). The optimizer does not make much progress and stalls, as I would expect given that the bj values can only monotonically increase (which happens quickly, so no progress is made), but I'm guessing I have a bug. Standard optimizers (Adam, SGD) work fine on the same model I'm trying to optimize.
Does this implementation look correct?
import torch
from torch.optim import Optimizer

class WNAdam(Optimizer):
    """Implements the WNAdam algorithm.

    It has been proposed in `WNGrad: Learn the Learning Rate in Gradient Descent`_.

    Arguments:
        params (iterable): iterable of parameters to optimize or dicts
            defining parameter groups
        lr (float, optional): learning rate (default: 0.1)
        beta1 (float, optional): exponential smoothing coefficient for the
            gradient. When beta1=0 this implements WNGrad.

    .. _WNGrad\: Learn the Learning Rate in Gradient Descent:
        https://arxiv.org/abs/1803.02865
    """
    def __init__(self, params, lr=0.1, beta1=0.9):
        if not 0.0 <= beta1 < 1.0:
            raise ValueError("Invalid beta1 parameter: {}".format(beta1))
        defaults = dict(lr=lr, beta1=beta1)
        super().__init__(params, defaults)

    def step(self, closure=None):
        """Performs a single optimization step.

        Arguments:
            closure (callable, optional): A closure that reevaluates the
                model and returns the loss.
        """
        loss = None
        if closure is not None:
            loss = closure()
        for group in self.param_groups:
            for p in group['params']:
                if p.grad is None:
                    continue
                grad = p.grad.data
                state = self.state[p]
                # State initialization
                if len(state) == 0:
                    state['step'] = 0
                    # Exponential moving average of gradient values
                    state['exp_avg'] = torch.zeros_like(p.data)
                    # Learning rate adjustment
                    state['bj'] = 1.0
                exp_avg = state['exp_avg']
                beta1 = group['beta1']
                state['step'] += 1
                state['bj'] += (group['lr']**2) / (state['bj']) * grad.pow(2).sum()
                # update exponential moving average
                exp_avg.mul_(beta1).add_(1 - beta1, grad)
                bias_correction = 1 - beta1 ** state['step']
                p.data.sub_(group['lr'] / state['bj'] / bias_correction, exp_avg)
        return loss
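As a sanity check on the learning-rate dynamics (independent of PyTorch), the WNGrad bj update can be sketched in plain numpy on a 1-D quadratic; the constants here are illustrative, not from the paper:

```python
import numpy as np

lr, w, bj = 0.1, 5.0, 1.0           # step size, parameter, learning-rate state
for _ in range(100):
    g = 2.0 * w                     # gradient of f(w) = w**2
    bj += (lr ** 2) / bj * g ** 2   # bj grows with the squared gradients
    w -= lr / bj * g                # effective step lr/bj shrinks over time
```

bj increases monotonically by construction, so if it grows too fast relative to the gradients the effective step collapses; distinguishing that expected behavior from an outright bug is the crux of the question.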

Multi-class classification with a neural network from scratch
I am trying to implement multi-class classification with sigmoid activations in the hidden layers and softmax in the output layer. I have implemented binary classification using a sigmoid output layer before, but when I change my code to perform multi-class classification it does not work properly, and the loss is not reduced during training. Is there an error in my code, or have I implemented it incorrectly?
This is the architecture I used: input layer > 3 neurons with sigmoid > 3 neurons with sigmoid > softmax.
And this is my code so far:
# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix, log_loss
import matplotlib.pyplot as plt
import load_data as ld

X_train, X_test, Y_train, Y_test = ld.load_dataset()

input_dim = 4
hidden_dim = 20
output_dim = 3
num_epoch = 10
learning_rate = 0.01
learning_curve = []
model = {}

def init():
    np.random.seed(1)
    model['W1'] = np.random.randn(input_dim, hidden_dim) / np.sqrt(hidden_dim)
    model['W2'] = np.random.randn(hidden_dim, hidden_dim) / np.sqrt(hidden_dim)
    model['W3'] = np.random.randn(hidden_dim, output_dim) / np.sqrt(hidden_dim)
    model['b1'] = np.random.randn(1, hidden_dim)
    model['b2'] = np.random.randn(1, hidden_dim)
    model['b3'] = np.random.randn(1, output_dim)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def softmax(z):
    z = z - np.max(z)
    exp_scores = np.exp(z)
    return exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

def softmax_derivative(z, y):
    return np.ones(z.shape)

def sigmoid_derivative(z):
    return z * (1 - z)

def calculate_error(y, y_pred):
    temp = []
    for i in range(y.shape[0]):
        temp.append(log_loss(y[i,:], y_pred[i,:], normalize=True))
    temp = np.array(temp, dtype=np.float32)
    temp = temp.reshape([105, 1])
    return temp

def accuracy(y_actual, y_pred):
    for i in range(len(y_actual)):
        if (y_pred[i] > 0.5):
            y_pred[i] = 1
        else:
            y_pred[i] = 0
    cm = confusion_matrix(y_actual, y_pred)
    return (cm[0][0] + cm[1][1]) / len(y_actual)
    return y_pred

def forward_propagation(X_test):
    W1, W2, W3, b1, b2, b3 = model['W1'], model['W2'], model['W3'], model['b1'], model['b2'], model['b3']
    z1 = X_test.dot(W1) + b1
    a1 = sigmoid(z1)
    z2 = a1.dot(W2) + b2
    a2 = sigmoid(z2)
    z3 = a2.dot(W3) + b3
    a3 = softmax(z3)
    return a3

def backward_propagation(X_train, Y_train):
    for i in range(num_epoch):
        W1, W2, W3, b1, b2, b3 = model['W1'], model['W2'], model['W3'], model['b1'], model['b2'], model['b3']
        z1 = X_train.dot(W1) + b1
        a1 = sigmoid(z1)
        z2 = a1.dot(W2) + b2
        a2 = sigmoid(z2)
        z3 = a2.dot(W3) + b3
        a3 = softmax(z3)
        error_a3 = calculate_error(Y_train, a3)
        slope_a3 = softmax_derivative(a3, Y_train)
        delta_a3 = error_a3 * slope_a3
        error_a2 = delta_a3.dot(W3.T)
        slope_a2 = sigmoid_derivative(a2)
        delta_a2 = error_a2 * slope_a2
        error_a1 = delta_a2.dot(W2.T)
        slope_a1 = sigmoid_derivative(a1)
        delta_a1 = error_a1 * slope_a1
        model['W1'] = W1 + learning_rate * X_train.T.dot(delta_a1)
        model['W2'] = W2 + learning_rate * (a1.T.dot(delta_a2))
        model['W3'] = W3 + learning_rate * (a2.T.dot(delta_a3))
        model['b1'] = b1 + np.mean(delta_a1, axis=0, keepdims=True)
        model['b2'] = b2 + np.mean(delta_a2, axis=0, keepdims=True)
        model['b3'] = b3 + np.mean(delta_a3, axis=0, keepdims=True)
        learning_curve.append(np.mean(error_a3))
        if (i % 1 == 0):
            print ("epoch : ", i, ", error : ", np.mean(error_a3))

init()
before_backprop = forward_propagation(X_test)
backward_propagation(X_train, Y_train)
after_backprop = forward_propagation(X_test)
learning_curve = np.array(learning_curve, dtype=np.float32)
plt.plot(learning_curve)
plt.title('Learning Curve')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
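One place worth checking (a sketch of the standard result, not a diagnosis of the full script): with a softmax output layer trained under cross-entropy, the gradient with respect to the logits simplifies to y_pred - y_true, so returning ones from softmax_derivative and multiplying by the per-sample log loss does not implement this gradient. The function names below are illustrative:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def output_delta(y_pred, y_true):
    # d(cross_entropy)/d(logits) for a softmax output layer
    return y_pred - y_true
```

The delta is negative for the true class and positive elsewhere, and sums to zero per sample, which is what pushes the true-class logit up under gradient descent.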

MxNet metrics API to compute accuracy for multi-class logistic regression with vector labels
How do I use the MxNet metrics API to calculate the accuracy of a multi-class logistic regression classifier with vector labels? Here is an example of the labels:
Class1: [1,0,0,0]
Class2: [0,1,0,0]
Class3: [0,0,1,0]
Class4: [0,0,0,1]
The naive way of using this function produces wrong results, as argmax squashes the model output into the index with the maximum probability value while the labels stay one-hot:
def evaluate_accuracy(data_iterator, ctx, net):
    acc = mx.metric.Accuracy()
    for i, (data, label) in enumerate(data_iterator):
        data = data.as_in_context(ctx)
        label = label.as_in_context(ctx)
        out = net(data)
        p = nd.argmax(out, axis=1)
        acc.update(preds=p, labels=label)
    return acc.get()[1]
My current solution is a little hacky:
def evaluate_accuracy(data_iterator, ctx, net):
    acc = mx.metric.Accuracy()
    for i, (data, label) in enumerate(data_iterator):
        data = data.as_in_context(ctx)
        label = label.as_in_context(ctx)
        out = net(data)
        p = nd.argmax(out, axis=1)
        l = nd.argmax(label, axis=1)
        acc.update(preds=p, labels=l)
    return acc.get()[1]
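Taking the argmax of the one-hot labels is in fact the standard way to score accuracy in this setting, as a small numpy sketch (with made-up scores) illustrates:

```python
import numpy as np

labels = np.array([[1, 0, 0, 0],
                   [0, 0, 1, 0],
                   [0, 1, 0, 0]])              # one-hot labels
outputs = np.array([[0.7, 0.1, 0.1, 0.1],
                    [0.2, 0.1, 0.6, 0.1],
                    [0.1, 0.2, 0.3, 0.4]])     # model scores
pred = outputs.argmax(axis=1)   # predicted class ids
true = labels.argmax(axis=1)    # argmax recovers the class id from one-hot
accuracy = float((pred == true).mean())
```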

Why is variable importance not reflected in the variables actually used in tree construction?
I generated an (unpruned) classification tree in R with the following code:
fit <- rpart(train.set$line ~ CountryCode + OrderType + Bon + SupportCode +
             prev_AnLP + prev_TXLP + prev_ProfLP + prev_EVProfLP + prev_SplLP +
             Age + Sex + Unknown.Position + Inc + Can + Pre + Mol,
             data=train.set, control=rpart.control(minsplit=5, cp=0.001),
             method="class")
printcp(fit) shows:
Variables actually used in tree construction:
Age
CountryCode
SupportCode
OrderType
prev_AnLP
prev_EVProfLP
prev_ProfLP
prev_TXLP
prev_SplLP
Those are the same variables I can see at each node in the classification tree, so they are correct. What I do not understand is the result of summary(fit):
Variable importance:
29 prev_EVProfLP
19 prev_AnLP
16 prev_TXLP
15 prev_SplLP
9 prev_ProfLP
7 CountryCode
2 OrderType
1 Pre
1 Mol
From the summary(fit) results it seems that the variables Pre and Mol are more important than SupportCode and Age, but in the tree Pre and Mol are not used to split the data, while SupportCode and Age are used (just before two leaves, actually... but still used!). Why?