Seq2Seq training with already tokenized ID files

I am training a 4-tuple event-to-event model (for story generation). My training data are two separate text files, one for the encoder and one for the decoder, where each line is a 4-tuple of integer IDs, e.g. `57, 59, 1, 3`.

I have a separate dictionary file that maps these IDs back to tokens.

My question is: how do I apply batching in this case, given that the data are already converted to integer IDs?

I am following the architecture in this notebook: https://colab.research.google.com/github/bentrevett/pytorch-seq2seq/blob/master/1%20-%20Sequence%20to%20Sequence%20Learning%20with%20Neural%20Networks.ipynb#scrollTo=qdvhfatmcV83
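For reference, here is a minimal sketch of how I imagine the loading and batching could work (the file names `encoder.txt` / `decoder.txt` are placeholders for my actual files, and I am assuming every line really is exactly four comma-separated integers):

```python
# Sketch (my assumption): load the two parallel ID files into tensors
# and batch them with a plain DataLoader.
import torch
from torch.utils.data import TensorDataset, DataLoader

def load_ids(path):
    # Each line looks like "57, 59, 1, 3" -> a fixed-length list of 4 ints
    with open(path) as f:
        rows = [[int(tok) for tok in line.split(",")]
                for line in f if line.strip()]
    return torch.tensor(rows, dtype=torch.long)  # shape: [num_examples, 4]

src = load_ids("encoder.txt")  # encoder-side 4-tuples (placeholder file name)
trg = load_ids("decoder.txt")  # decoder-side 4-tuples (placeholder file name)

dataset = TensorDataset(src, trg)
loader = DataLoader(dataset, batch_size=64, shuffle=True)

for src_batch, trg_batch in loader:
    # The notebook's model expects [seq_len, batch_size], so transpose
    src_batch = src_batch.t()  # [4, batch_size]
    trg_batch = trg_batch.t()  # [4, batch_size]
    break
```

Since every example has the same length (4), it seems like no padding or `BucketIterator` is needed, which is why a plain `TensorDataset` looks sufficient to me, but I am not sure this is the right approach.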

Thank you so much!
