Torchtext datasets not iterable
❓ Questions and Help
Description
I did this:
>> train_data, val_data, test_data = Multi30k(split=('train', 'valid', 'test'))
>> next(train_data)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[33], line 1
----> 1 next(train_data)
TypeError: 'ShardingFilterIterDataPipe' object is not an iterator
But looking at the docs here, it should be iterable. I also tried using .__iter__.
You are doing it right. It's just that the datasets are like Schrödinger's cat: you never know whether they will be alive and working when you need them. This has been an issue for years now.
Edit: I just looked into your code. You are using it wrong.
Here is the correct usage:
next(iter(train_data))
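This creates an iterator from the dataset. The dataset is iterable (it defines __iter__), but it is not itself an iterator (it has no __next__), which is why next(train_data) raises the TypeError. It still might not work, though, because as I said, something is wrong with the datasets.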
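
To show the iterable vs. iterator difference without depending on torchtext at all, here is a minimal sketch in plain Python. The Pairs class and its sample rows are made up for illustration and are not part of torchtext; the class only mimics the relevant behavior, which is defining __iter__ but not __next__.

```python
class Pairs:
    """Hypothetical stand-in for a dataset: iterable, but not an iterator."""

    def __init__(self, rows):
        self.rows = rows

    def __iter__(self):
        # Hand back a fresh iterator over the stored rows on every call.
        return iter(self.rows)


# Made-up sample data, only for illustration.
train_data = Pairs([("ein Hund", "a dog"), ("eine Katze", "a cat")])

# Calling next() directly fails, because Pairs has no __next__ method.
try:
    next(train_data)
    message = None
except TypeError as err:
    message = str(err)

# Wrapping the object in iter() first yields a real iterator.
first = next(iter(train_data))

# A for loop (or comprehension) calls iter() implicitly, so it works as well.
all_rows = [row for row in train_data]

print(message)
print(first)
```

Note that every call to iter(train_data) starts over from the beginning, so to step through several items by hand, keep the iterator in a variable (it = iter(train_data)) and call next(it) repeatedly.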