DL4J/ND4J: Can an INDArray instance be reused?
I have a model to train on a large data set that does not fit into RAM. My plan is to slice the data set, creating a DataSet instance with input vectors and associated labels for every chunk. For example, if I have 1M input vectors/labels, I'd split them into 10 chunks of 100K records each.
Then I'd put a chunk into two INDArray objects (one for inputs, one for labels), create a DataSet, and call model.fit() with that data set. I'd repeat this for every chunk, and repeat the whole pass until, say, the model's score reaches some value.
My questions are:
1. Do I understand the process correctly?
2. Can the INDArray instances be reused? Would it be right to allocate them once and then just fill them up with data set chunks over and over again?
You don't have to do any of this. Workspaces already solve your allocation problem: http://deeplearning4j.org/workspaces
Just use the standard DataVec -> RecordReaderDataSetIterator -> DataSet pattern. The iterator already handles minibatches for you, so there is no need to slice the data or refill arrays by hand.
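To make the minibatch idea concrete, here is a minimal, library-free sketch in plain Java. It is not DL4J's API: the class MiniBatchSketch, the nested BatchRanges iterator, and the runEpoch method are hypothetical names made up for illustration. It only shows the shape of what an iterator-based pipeline does: walk the data in fixed-size batches so that only one batch needs to be in memory at a time.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Library-free illustration of minibatch iteration. Hypothetical names, not DL4J classes. */
public class MiniBatchSketch {

    /** Yields [start, end) index ranges that cover totalRecords in steps of batchSize. */
    public static final class BatchRanges implements Iterator<int[]> {
        private final int totalRecords;
        private final int batchSize;
        private int cursor = 0;

        public BatchRanges(int totalRecords, int batchSize) {
            if (totalRecords < 0 || batchSize <= 0) {
                throw new IllegalArgumentException("totalRecords must be >= 0 and batchSize > 0");
            }
            this.totalRecords = totalRecords;
            this.batchSize = batchSize;
        }

        @Override
        public boolean hasNext() {
            return cursor < totalRecords;
        }

        @Override
        public int[] next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            int start = cursor;
            // The last batch may be shorter than batchSize.
            int end = Math.min(start + batchSize, totalRecords);
            cursor = end;
            return new int[] {start, end};
        }
    }

    /** Walks one full pass over the data; returns {number of batches, records seen}. */
    public static int[] runEpoch(int totalRecords, int batchSize) {
        int batches = 0;
        int records = 0;
        Iterator<int[]> it = new BatchRanges(totalRecords, batchSize);
        while (it.hasNext()) {
            int[] range = it.next();
            // A real pipeline would load records range[0]..range[1]-1 from disk here
            // and hand that single batch to the model; nothing else stays in memory.
            batches++;
            records += range[1] - range[0];
        }
        return new int[] {batches, records};
    }

    public static void main(String[] args) {
        // The question's numbers: 1M records in chunks of 100K.
        int[] stats = runEpoch(1_000_000, 100_000);
        System.out.println(stats[0] + " batches, " + stats[1] + " records");
        // prints "10 batches, 1000000 records"
    }
}
```

In DL4J itself the iterator plays the role of BatchRanges and also loads and converts the records, which is why the manual chunking in the question is unnecessary.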