# Inconsistent training/validation loss when using the Caffe library to resume training from snapshots

As we know, Caffe supports resuming training from a given snapshot. An explanation of Caffe's training continuation scheme can be found here. However, I found that the training loss and validation loss are inconsistent after resuming. I give the following example to illustrate my point. Suppose I am training a neural network with a maximum of 1000 iterations, and a snapshot is saved every 100 training iterations. This is done using the following command:

```
caffe train -solver solver.prototxt
```

where the batch size is selected to be 64, and in solver.prototxt we have:

```
test_iter: 4
max_iter: 1000
snapshot: 100
display: 100
test_interval: 100
```

We select `test_iter=4` carefully so that testing covers nearly all of the validation dataset (there are 284 validation samples, a little more than 4*64).
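To make the coverage arithmetic explicit, here is a minimal sketch in Python. It assumes that the test-phase batch size is the same 64 mentioned above; the variable names are only illustrative and are not part of Caffe.

```python
import math

num_val_samples = 284  # validation set size stated above
test_batch_size = 64   # assumption: the test net also uses a batch size of 64
test_iter = 4          # value from solver.prototxt

# Samples evaluated in one test pass, and samples left out of it
covered = test_iter * test_batch_size
skipped = max(num_val_samples - covered, 0)

# Smallest test_iter that would touch every validation sample at least once
full_pass_iters = math.ceil(num_val_samples / test_batch_size)

print(f"covered={covered}, skipped={skipped}, full_pass_iters={full_pass_iters}")
```

Under that assumption, each test pass evaluates 256 of the 284 samples and leaves 28 out, so the reported validation loss is an average over a subset of the validation set rather than all of it.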

This gives us a list of .caffemodel and .solverstate files. For example, we may have solver_iter_300.solverstate and solver_iter_300.caffemodel. When these two files are generated, we can also see the training loss (13.7466) and validation loss (2.9385).

Now, if we use the snapshot solver_iter_300.solverstate to continue training:

```
caffe train -solver solver.prototxt -snapshot solver_iter_300.solverstate
```

Now the training loss and validation loss are 12.6 and 2.99, respectively. These differ from the values seen when the snapshot was taken (13.7466 and 2.9385). Any ideas? Thanks.