torchtext provides its own counterpart to the DataLoader objects from PyTorch and torchvision, renamed Iterator and extended for text. In essence, an Iterator does the same three jobs as a DataLoader:
- Batching the data
- Shuffling the data
- Loading the data in parallel using multiprocessing workers
Loading the data in batches lets us process a dataset that is much larger than the GPU RAM, because only one batch needs to sit on the device at a time. torchtext's Iterators specialize this pattern for NLP/text processing applications.
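To make the first two jobs concrete, here is a minimal sketch in plain Python. It is not torchtext's implementation: `iterate_batches` is a hypothetical helper written only for illustration, and it leaves out parallel loading entirely.

```python
import random


def iterate_batches(examples, batch_size, shuffle=True, seed=None):
    """Yield lists of at most `batch_size` examples (hypothetical helper)."""
    indices = list(range(len(examples)))
    if shuffle:
        # Shuffle the indices, not the data, so the caller's list is untouched.
        random.Random(seed).shuffle(indices)
    # Walk over the (possibly shuffled) indices one batch at a time.
    for start in range(0, len(indices), batch_size):
        yield [examples[i] for i in indices[start:start + batch_size]]


# Toy "dataset" of five whitespace-tokenized sentences.
sentences = ["a b", "c d e", "f", "g h", "i j k l"]
batches = list(iterate_batches(sentences, batch_size=2, seed=0))
```

Because the function is a generator, only one batch is materialized at a time, which is the property that lets a dataset exceed the memory of the device that consumes it.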
We will use both Iterator and its cousin, BucketIterator, here:
from torchtext.data import Iterator, BucketIterator