boto3
Use iterator for s3 object collection
>>> objects
s3.Bucket.objectsCollection(s3.Bucket(name='my-project'), s3.ObjectSummary)
>>> print(next(objects))
TypeError: 's3.Bucket.objectsCollection' object is not an iterator
https://wiki.python.org/moin/Iterator
It does support being wrapped with iter(objects), e.g.:
>>> obj_iter = iter(objects)
>>> obj_iter
<generator object ResourceCollection.__iter__ at 0x7f57efbff660>
But why is this necessary?
The objects collection is not meant to be consumed directly; instead, a filter of some kind is intended to be chained on the end:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Bucket.objects
Examples:
>>> objects = s3.Bucket(name='test').objects.filter(Prefix='file0.zip')
>>> for o in objects:
...     print(o)
...
s3.ObjectSummary(bucket_name='test', key='file0.zip')
Or all of them:
>>> objects = s3.Bucket(name='test').objects.all()
>>> for o in objects:
...     print(o)
...
s3.ObjectSummary(bucket_name='test', key='file0.zip')
s3.ObjectSummary(bucket_name='test', key='file1.zip')
s3.ObjectSummary(bucket_name='test', key='file3.zip')
s3.ObjectSummary(bucket_name='test', key='file4.zip')
Not sure if the latest release already supports this, but it seems like all() should return an iterator, so that the following works:
objects = s3.Bucket(name='test').objects.all()
next(objects)  # expected: s3.ObjectSummary
The docs, e.g. https://boto3.amazonaws.com/v1/documentation/api/latest/guide/collections.html, explicitly state that "A collection provides an iterable interface to a group of resources".
The distinction between an iterable and an iterator seems important here [1], and so does the question of which one the collections are intended to be. If the intention is for them to be iterables and not iterators, then this works as designed and the issue can be closed at will.
[1] https://www.geeksforgeeks.org/python-difference-iterable-iterator/
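The distinction can be checked without touching AWS. Below is a minimal sketch, assuming a hypothetical stand-in class `FakeCollection` whose `__iter__` is a generator function, which is what the `<generator object ResourceCollection.__iter__ ...>` repr earlier in the thread suggests boto3 does. The class and the key names are made up for illustration and are not boto3 code:

```python
from collections.abc import Iterable, Iterator


class FakeCollection:
    """Hypothetical stand-in for a boto3 collection; not boto3 code."""

    def __init__(self, keys):
        self._keys = list(keys)

    def __iter__(self):
        # Generator function: each call returns a fresh iterator.
        for key in self._keys:
            yield key


objects = FakeCollection(["file0.zip", "file1.zip"])

# An iterable only needs __iter__; an iterator also needs __next__.
is_iterable = isinstance(objects, Iterable)
is_iterator = isinstance(objects, Iterator)

try:
    next(objects)
    next_error = None
except TypeError as exc:
    # Same kind of failure as reported at the top of the thread.
    next_error = str(exc)

# Wrapping in iter() yields an iterator that next() accepts.
first_key = next(iter(objects))

print(is_iterable, is_iterator, first_key)
print(next_error)
```

Under these assumptions the object passes the `Iterable` check but not the `Iterator` check, which matches the behaviour described in the original report.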
Yeah, you should definitely be able to call next() on it. Marking as a feature request.
Until this is natively supported, I think you can do:
next(x for x in objects)
Reference: https://stackoverflow.com/a/2364277/3679900
Alternatively, you may also use something like:
...
bucket_iter = iter(objects)
next(bucket_iter)
...
This works because iter(objects) returns a generator (generators are what boto3 uses to yield API results); all generators are iterators, and all iterators are iterables.
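One caveat: both workarounds above raise StopIteration if the collection is empty. The built-in next() accepts a default as its second argument, which avoids that. Here is a rough sketch, assuming a hypothetical helper named `first` and a plain generator standing in for the collection (neither is part of boto3):

```python
from collections.abc import Iterator


def first(iterable, default=None):
    """Return the first item of any iterable, or `default` if it is empty.

    Hypothetical helper for illustration; not part of boto3.
    """
    return next(iter(iterable), default)


def keys():
    # Plain generator standing in for a collection's __iter__.
    yield "file0.zip"
    yield "file1.zip"


gen = keys()
# Generators implement __next__, so they pass the Iterator check.
gen_is_iterator = isinstance(gen, Iterator)

first_key = first(keys())
empty_result = first([], default="no objects")

print(gen_is_iterator, first_key, empty_result)
```

With a real bucket, something like `first(bucket.objects.all())` should presumably behave the same way, since the helper only relies on iter() working on its argument.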
map() should also be able to operate over the result of all(), shouldn't it?
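As far as the Python built-in goes, map() accepts any iterable and calls iter() on it internally, so it should not need an iterator. A small sketch, again using a hypothetical stand-in class rather than a real boto3 collection:

```python
class FakeCollection:
    """Hypothetical stand-in for the result of objects.all(); not boto3 code."""

    def __iter__(self):
        # Iterable only: there is no __next__ on this class.
        yield from ("file0.zip", "file1.zip", "file3.zip")


# map() obtains its own iterator from the iterable it is given.
upper_keys = list(map(str.upper, FakeCollection()))
print(upper_keys)
```

By the same reasoning, an expression such as `map(lambda o: o.key, bucket.objects.all())` would be expected to work against a real bucket, although that has not been run here.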