Support very large bucket directories
Right now the `DagsHubFilesystem` offers a `listdir` method that returns a list. When accessing a very large bucket directory, that list cannot be expected to fit in memory or to be returned promptly.
Example snippet that will time out:
```python
from dagshub.streaming import DagsHubFilesystem

fs = DagsHubFilesystem(".", repo_url="https://dagshub.com/DagsHub-Datasets/radiant-mlhub-dataset")
fs.listdir("s3://radiant-mlhub/bigearthnet")
```
I propose that the client implement an `fs.walk` method (analogous to `os.walk`) that returns a generator, so entries can be consumed lazily even when the directory's contents are effectively unbounded.