object-store-python
Allow threads
This should release the GIL and allow use in multiple threads.
I tested with this script:
```python
from contextlib import contextmanager
import math
from time import time
from typing import Iterator

import anyio
import anyio.to_process
import anyio.to_thread
import object_store


@contextmanager
def timeit(name: str) -> Iterator[None]:
    start = time()
    yield
    print(f'{name} took {time() - start:.2f} seconds', flush=True)


def work() -> None:
    object_store.ObjectStore('gs://adriangb-public-bucket').get('yellow_tripdata_2024-01 (1).parquet')


async def awork(limiter: anyio.CapacityLimiter) -> None:
    await anyio.to_thread.run_sync(work, limiter=limiter)


async def main() -> None:
    limiter = anyio.CapacityLimiter(math.inf)
    with timeit('main'):
        async with anyio.create_task_group() as tg:
            for _ in range(32):
                tg.start_soon(awork, limiter)


if __name__ == '__main__':
    anyio.run(main)
```
Locally there isn't much difference because I'm IO bound, but on GCP compute the same script goes from ~15s to ~3s for me.
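For anyone who wants to see the effect without a bucket, here is a minimal sketch of why releasing the GIL matters for threaded callers. It does not use `object_store` at all: `gil_releasing_call`, `gil_holding_call`, and `timed_in_threads` are hypothetical stand-ins, with `time.sleep` playing the role of a native call that releases the GIL while it waits and a pure-Python loop playing the role of a call that holds it.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def gil_releasing_call() -> None:
    # Stand-in for a native call that releases the GIL while waiting.
    # time.sleep does this; it is NOT the object_store API.
    time.sleep(0.2)


def gil_holding_call() -> None:
    # Stand-in for a call that never gives other threads a chance to run
    # in parallel: a fixed amount of pure-Python work.
    total = 0
    for _ in range(2_000_000):
        total += 1


def timed_in_threads(fn: Callable[[], None], n_threads: int) -> float:
    # Run fn once in each of n_threads threads and return wall-clock seconds.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(fn) for _ in range(n_threads)]
        for future in futures:
            future.result()  # re-raise any exception from the worker
    return time.perf_counter() - start


if __name__ == '__main__':
    for fn in (gil_releasing_call, gil_holding_call):
        single = timed_in_threads(fn, 1)
        many = timed_in_threads(fn, 8)
        print(f'{fn.__name__}: 1 thread {single:.2f}s, 8 threads {many:.2f}s')
```

On a standard (GIL-enabled) CPython build, the sleeping variant should take about the same time with 8 threads as with 1, while the pure-Python variant should scale roughly with the thread count. Exact numbers depend on the machine and interpreter build.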
@roeap quick ping on this
@adriangb - sorry for being MIA for so long. Would you mind rebasing? I am happy to review / merge then.
@roeap will you push a 0.2.0 release after this one?