DataProfiler
Update requirements so S3 is an optional package to reduce package bloat.
Is your feature request related to a problem? Please describe.
Currently, boto3 is installed as a default dependency of DataProfiler.
I suggest making it an optional dependency so that it is installed only when S3 support is needed.
The same approach might also be beneficial for things like Parquet support, given the package size.
Describe the outcome you'd like:
pip install dataprofiler # doesn't install boto3 + req packages for it
pip install 'dataprofiler[s3]' # installs boto3 + req packages for it
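One way the two install commands above could be supported is through setuptools' `extras_require`. The following is only a rough sketch: the contents of `CORE_REQUIREMENTS` are placeholders, and `build_setup_kwargs` is a hypothetical helper, not something that exists in DataProfiler's actual `setup.py`.

```python
# Hypothetical sketch of packaging metadata with an optional "s3" extra.
# The requirement lists are placeholders, not DataProfiler's real ones.

CORE_REQUIREMENTS = ["numpy", "pandas"]  # assumed core dependencies
OPTIONAL_REQUIREMENTS = {
    "s3": ["boto3"],  # only installed via: pip install 'dataprofiler[s3]'
}


def build_setup_kwargs():
    """Return keyword arguments suitable for setuptools.setup()."""
    return {
        "name": "dataprofiler",
        "install_requires": list(CORE_REQUIREMENTS),
        "extras_require": {
            extra: list(pkgs) for extra, pkgs in OPTIONAL_REQUIREMENTS.items()
        },
    }


# In an actual setup.py this would be followed by:
#     from setuptools import setup
#     setup(**build_setup_kwargs())
```

With metadata along these lines, a plain `pip install dataprofiler` would skip boto3, while the `[s3]` extra would pull it in.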
Additional context: This can limit the size of docker images or lambda jobs requiring DataProfiler.
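If boto3 becomes optional, any code path that uses S3 would presumably need to handle the package being absent. A minimal sketch of one possible guard follows; `require_optional` is a hypothetical helper name, not an existing DataProfiler function.

```python
import importlib


def require_optional(module_name, extra):
    """Import an optional dependency, or fail with an install hint.

    Hypothetical helper: `extra` is the name of the pip extra that
    would provide `module_name`.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install 'dataprofiler[{extra}]'"
        ) from exc
```

S3-specific code could then call `boto3 = require_optional("boto3", "s3")` at the point of use, so a default install never imports boto3.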
Thanks @JGSweets!