Impossible to set wr.config.distributed

Open kukushking opened this issue 3 years ago • 0 comments

Describe the bug

It is currently impossible to set the distributed flag to False through the config when ray and modin are installed, e.g.:

import awswrangler as wr

wr.config.distributed = False

wr.read_parquet(...)

This happens because ray.init runs on package import, so by the time wr.config.distributed is assigned, ray has already been initialized. The only way to disable the distributed code path is to set an environment variable before the import.

Expected behavior

We should provide a way to disable distributed mode through the wrangler config class.

Possible solution

One possibility to explore is to call ray.init lazily, i.e. only when application code runs a function that requires distributed functionality.
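A minimal sketch of the lazy-initialization idea, not the library's actual implementation. `Config`, `LazyEngine`, and `ensure_initialized` are hypothetical names used only for illustration, and `init_fn` stands in for a call such as ray.init so that the example runs without ray installed:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Config:
    """Hypothetical stand-in for the wrangler config object."""

    distributed: bool = True


class LazyEngine:
    """Defers an expensive init call (e.g. ray.init) until first use."""

    def __init__(self, config: Config, init_fn: Callable[[], None]) -> None:
        self._config = config
        self._init_fn = init_fn
        self._initialized = False

    def ensure_initialized(self) -> bool:
        """Run init_fn at most once, and only if distributed mode is enabled.

        Returns True when the distributed backend should be used.
        """
        if not self._config.distributed:
            return False
        if not self._initialized:
            self._init_fn()
            self._initialized = True
        return True


# Record init calls instead of starting a real cluster.
calls: List[str] = []
config = Config()
engine = LazyEngine(config, init_fn=lambda: calls.append("init"))

# Nothing is initialized at construction time, so the flag can still be changed.
config.distributed = False
engine.ensure_initialized()  # flag is off: init_fn is not called

config.distributed = True
engine.ensure_initialized()  # first distributed call triggers init_fn
engine.ensure_initialized()  # later calls reuse the existing initialization
```

With this shape, each function that needs distributed functionality would call `ensure_initialized()` first, so setting the flag after import but before the first such call takes effect. What should happen if the flag is switched off after initialization has already run (for example, whether to shut the backend down) is left open here.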

AWS DataWrangler version

3.0.0a2

kukushking, Aug 17 '22 18:08