Peter Dannemann

Results 46 comments of Peter Dannemann

I do not suggest we make any code changes. Using an anonymous client is a rare use case and I think forcing someone to be explicit about it is OK....

> I do not suggest we make any code changes. Using an anonymous client is a rare use case and I think forcing someone to be explicit about it is...

> What's the down side of always trying the anonymous client when no credential is found?
> If that fails with permission issues an error can still be thrown.

It...

> > Different behavior than the google.cloud.storage API
>
> That's reasonable. However I thought the exact goal of this project is to provide simpler and more unified (in other...
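
For context, a minimal sketch of what "being explicit" about anonymous access could look like from the caller's side, assuming the GCS transport accepts a pre-built client through `transport_params` (the bucket and object names below are placeholders):

```python
from google.cloud import storage

import smart_open

# Explicitly opt in to anonymous access instead of relying on a silent
# fallback: build an unauthenticated client and hand it to smart_open.
# (Bucket/object names are placeholders.)
anonymous_client = storage.Client.create_anonymous_client()

with smart_open.open(
    'gs://some-public-bucket/some-public-object.txt',
    transport_params={'client': anonymous_client},
) as fin:
    print(fin.read())
```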

@piskvorky if I did a cost estimate, would you accept a donation?

I think I'd prefer to add a transport parameter for version IDs instead of tacking them onto the filename. Look at `version_id` in [s3.py](https://github.com/RaRe-Technologies/smart_open/blob/develop/smart_open/s3.py) for a similar example.
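
For illustration, a hedged sketch of how that reads on the caller's side, following the existing S3 pattern of passing `version_id` through `transport_params` (bucket, key, and version ID are placeholders):

```python
import smart_open

# Request a specific object version via a transport parameter rather than
# encoding it into the URI/filename. (Names and the ID are placeholders.)
with smart_open.open(
    's3://my-bucket/reports/2020.csv',
    transport_params={'version_id': 'EXAMPLE-VERSION-ID'},
) as fin:
    first_line = fin.readline()
```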

Thanks for reporting this and providing clear descriptions and solutions! Your possible solution seems like a good start, but how will we be able to handle compression formats other than gzip?...

I misunderstood; other compression formats would not have this problem, as they are not transparently decompressed by Google. I think we can get away with just creating a `raw_download` option...
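
As a rough sketch (not the eventual smart_open implementation), the underlying google-cloud-storage library already exposes a `raw_download` flag that skips the transparent decompression; the bucket and object names below are placeholders:

```python
from google.cloud import storage

# Download the stored bytes as-is, skipping Google's transparent gunzip for
# objects uploaded with Content-Encoding: gzip.
client = storage.Client()
blob = client.bucket('my-bucket').blob('logs/events.json.gz')

with open('events.json.gz', 'wb') as fout:
    blob.download_to_file(fout, raw_download=True)
```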

OK, @gdmachado's suggestion is probably what we want to do then. If `Blob.content_encoding == 'gzip' and file_extension != '.gz'`, then we save state so smart_open knows the file was transcoded...
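
A minimal sketch of that check, using the condition exactly as stated above (the helper name is made up, not a smart_open internal):

```python
def _gcs_transcoded(blob, blob_name):
    """Hypothetical helper: True when, per the condition above, we should
    record that GCS transcoded the object on download."""
    return blob.content_encoding == 'gzip' and not blob_name.endswith('.gz')
```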

You can see benchmarks by running `pytest integration-tests/test_s3.py::test_s3_performance`. This test uses the default buffer_size for `smart_open.s3.open`. You can probably increase performance substantially by increasing the buffer_size kwarg passed into `smart_open.open`....
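
For example, a hedged sketch of bumping the buffer size from the caller's side; in recent smart_open versions the S3-specific kwargs travel through `transport_params` (the bucket, key, and 128 MB figure are just placeholders):

```python
import smart_open

# A larger read buffer means fewer, bigger range GETs, which usually helps
# sequential read throughput. (Bucket/key and size are placeholders.)
BUFFER_SIZE = 128 * 1024 * 1024  # 128 MB

with smart_open.open(
    's3://my-bucket/large-object.bin',
    'rb',
    transport_params={'buffer_size': BUFFER_SIZE},
) as fin:
    payload = fin.read()
```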