
Insert and remove from sys path in generic pipelines

Open nateraw opened this issue 4 years ago • 2 comments

Currently, the generic pipeline simply calls sys.path.append with the path to the snapshot repo. This is fine if running in a Docker container once, but for development it can be a bit of a nightmare, especially if you're playing with multiple different repos that have implemented generic pipelines. Because the new path is appended, directories added earlier are searched first, so you'll get a previously loaded pipeline instead of the one you expect.
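To make the search-order problem concrete, here is a small self-contained sketch. It is not the actual generic pipeline code: the two temporary repos, the module name `generic_pipeline_demo` (standing in for `pipeline.py`, chosen to avoid clashing with any real module), and the helper functions are all hypothetical. It only shows which file wins when two directories containing a same-named module are appended versus prepended to sys.path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module name standing in for pipeline.py.
MODULE_NAME = "generic_pipeline_demo"


def make_repo(root: Path, name: str) -> Path:
    """Create a fake repo directory holding a tiny module that records its repo."""
    repo = root / name
    repo.mkdir()
    (repo / f"{MODULE_NAME}.py").write_text(f"REPO_NAME = {name!r}\n")
    return repo


def resolve(repo_dirs, prepend: bool) -> str:
    """Add each repo to sys.path in order, import the module, report which repo won."""
    saved_path = list(sys.path)
    try:
        for repo in repo_dirs:
            if prepend:
                sys.path.insert(0, str(repo))
            else:
                sys.path.append(str(repo))
        importlib.invalidate_caches()
        module = importlib.import_module(MODULE_NAME)
        return module.REPO_NAME
    finally:
        # Restore global state so the two runs do not influence each other.
        sys.path[:] = saved_path
        sys.modules.pop(MODULE_NAME, None)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    repos = [make_repo(root, "repo_a"), make_repo(root, "repo_b")]
    appended = resolve(repos, prepend=False)   # repo_a is searched first
    prepended = resolve(repos, prepend=True)   # repo_b, the last one added, is searched first
    print(appended, prepended)
```

With append, the repo added last loses to the one added first, which matches the "previously loaded pipeline" symptom described above. The sketch also clears sys.modules between runs; in a long-lived development session the module cache can cause the same symptom on its own.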

I suggest we do what torch.hub does instead: sys.path.insert(0, repo_dir), import the module, and then sys.path.remove(repo_dir).

Something like:

import sys
import json
from pathlib import Path
from huggingface_hub import snapshot_download

PIPELINE_FILE = 'pipeline.py'
CONFIG_FILE = 'config.json'


# Taken directly from torch.hub
def import_module(name, path):
    import importlib.util
    from importlib.abc import Loader
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    assert isinstance(spec.loader, Loader)
    spec.loader.exec_module(module)
    return module


def load_pipeline(repo_id, **kwargs):

    if Path(repo_id).is_dir():
        repo_dir = Path(repo_id)
    else:
        repo_dir = Path(snapshot_download(repo_id))

    pipeline_path = repo_dir / PIPELINE_FILE
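    # sys.path entries must be strings; non-string entries are ignored during import.
    repo_path = str(repo_dir)
    sys.path.insert(0, repo_path)
    try:
        module = import_module(PIPELINE_FILE, pipeline_path)
    finally:
        # Always undo the path change, even if importing pipeline.py fails.
        sys.path.remove(repo_path)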

    return module.Pipeline(repo_dir)
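One way to keep the insert/remove pair from drifting apart is to wrap it in a context manager. The following is only a sketch of that idea, not something torch.hub or api-inference-community provides; the name `prepended_sys_path` is made up for illustration.

```python
import contextlib
import sys
from pathlib import Path


@contextlib.contextmanager
def prepended_sys_path(directory):
    """Temporarily put `directory` at the front of sys.path (hypothetical helper)."""
    # sys.path should only hold strings, so convert Path objects up front.
    entry = str(Path(directory))
    sys.path.insert(0, entry)
    try:
        yield entry
    finally:
        # Remove the first matching entry; tolerate it having been removed already.
        with contextlib.suppress(ValueError):
            sys.path.remove(entry)


# Usage: the path is restored even if the body raises.
before = list(sys.path)
try:
    with prepended_sys_path("some/hypothetical/repo") as entry:
        inside_first = sys.path[0]
        raise RuntimeError("simulated import failure")
except RuntimeError:
    pass
after = list(sys.path)
```

Inside load_pipeline, the import_module call would go in the body of the `with` block, so the cleanup runs whether or not pipeline.py imports cleanly.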

CC @osanseviero

nateraw, Sep 03 '21 22:09

cc @Narsil

osanseviero, Sep 06 '21 09:09

IMO generic is not meant to be used very much. If it's tedious to use it's OK.

AFAIK, generic is just meant as a demo for some external libraries, so they can be shown without fully implementing a pipeline. It is not meant to be a heavily used path, so I don't think we should do anything to optimize for it. If anything, I would remove generic by making implementing a new pipeline more trivial than the other way around.

Messing with the path in general is asking for trouble.

Narsil, Sep 06 '21 10:09