
Read importlib.metadata to find transport / compressor extensions

Open arthurlm opened this issue 2 years ago • 0 comments

Title

Read importlib.metadata to find transport / compressor extensions.

Motivation

For now, if a user installs a package that provides a new smart_open transport mechanism, it cannot be registered automatically. It would be great if smart_open could be extended using setuptools entry_points capabilities.

Example use case

Please note: the scheme name itself is not relevant here.

In the package smart_ext_transport, I have the following file:

# smart_ext_transport/custom_transport.py
SCHEME = "custom"  # register_transport looks for a SCHEME (or SCHEMES) attribute
...

Current situation

For now, the only ways to load an extension are listed below:

Option 1: calling register_transport everywhere and every time it might be required :cry:

from smart_open.transport import register_transport

register_transport("smart_ext_transport.custom_transport")

Option 2: using __init__.py and an import statement:

Adding:

# smart_ext_transport/__init__.py
from smart_open.transport import register_transport

register_transport("smart_ext_transport.custom_transport")

Then, every time the custom scheme might be needed:

import smart_ext_transport

Improved situation with setuptools

In the setup.py of smart_ext_transport, we may add:

from setuptools import setup

setup(
    name="smart_ext_transport",
    packages=["smart_ext_transport"],
    entry_points={
        "smart_open_transport": [
            "custom = smart_ext_transport.custom_transport",
        ],
    },
)

Once the package is installed, smart_open can automatically load every plugin that is registered through setuptools entry points. Users no longer have to import the package to register the extension manually.
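To illustrate the discovery step, here is a minimal sketch (not necessarily the code in this PR). It assumes the entry point group is named "smart_open_transport" as in the setup.py above, and the helper names `iter_transport_modules` and `load_transport_extensions` are hypothetical. The `register` argument stands for any callable that accepts a dotted module path.

```python
from importlib.metadata import entry_points

# Group name taken from the setup.py example; assumed here, not confirmed API.
GROUP = "smart_open_transport"


def iter_transport_modules(group=GROUP):
    """Yield the dotted module path declared by each entry point in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selectable interface.
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9 return a plain dict keyed by group name.
        selected = eps.get(group, [])
    for ep in selected:
        # For "custom = smart_ext_transport.custom_transport",
        # ep.value is "smart_ext_transport.custom_transport".
        yield ep.value


def load_transport_extensions(register, group=GROUP):
    """Call `register` once for every module path found in `group`."""
    for module_path in iter_transport_modules(group):
        register(module_path)
```

With smart_open installed, `load_transport_extensions(register_transport)` would pass each discovered module path to `smart_open.transport.register_transport`, which already accepts a module path given as a string.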

This is how pytest, the setuptools CLI, flake8 and Jupyter nbconvert work, for example.

Handling the correct version of importlib.metadata is mainly inspired by the pytest codebase.
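A minimal sketch of this kind of version guard, assuming the third-party `importlib_metadata` backport is installed on interpreters older than 3.8 (the exact code in this PR may differ):

```python
import sys

if sys.version_info >= (3, 8):
    # The standard library ships importlib.metadata from Python 3.8 onward.
    import importlib.metadata as importlib_metadata
else:
    # Older interpreters need the backport package (assumed to be installed).
    import importlib_metadata

# Either way, the rest of the code uses a single name.
entry_points = importlib_metadata.entry_points
```

The rest of the library can then call `entry_points()` without caring which implementation was imported.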

See my issue about this PR: #691

What may it change for the future?

Future extensions to smart_open may be moved out of the core library.

Example use cases may be new extension libraries like:

  • smart_open_zstd to add zstandard compression
  • smart_open_xxx_cloud to add a new cloud provider without having to implement it in the core library (like AliCloud).

The core library will only have to maintain a robust base API and will no longer have to support every possible URI provider / compression standard.

Checklist

Before you create the PR, please make sure you have:

  • [x] Picked a concise, informative and complete title
  • [x] Clearly explained the motivation behind the PR
  • [x] Linked to any existing issues that your PR will be solving
  • [x] Included tests for any new functionality
  • [x] Checked that all unit tests pass

arthurlm, Apr 30 '22 12:04