Nebullvm model optimization integration
Hello, this PR introduces an integration of nebullvm, a model optimization library that can significantly accelerate model inference. The library has been integrated for the transformer embeddings (TransformerDocumentEmbeddings and TransformerWordEmbeddings), following the code style of the existing ONNX export support.
The PR also includes a tutorial on how to leverage the optimization.
Moreover, this PR fixes a bug (#2930) that prevented the model from being correctly exported to ONNX.
Example Usage
from flair.data import Sentence
from flair.models import SequenceTagger
# Load model
model = SequenceTagger.load("ner-large")
# Define some example sentences
sentences = [
Sentence("Mars is the fourth planet from the Sun and the second-smallest planet in the Solar System."),
Sentence("In the fourth century BCE, Aristotle noted that Mars disappeared behind the Moon during an occultation."),
Sentence("Liquid water cannot exist on the surface of Mars due to low atmospheric pressure."),
Sentence("In 2004, Opportunity detected the mineral jarosite."),
]
# Optimize with nebullvm
model.embeddings = model.embeddings.optimize_nebullvm(sentences)
# Inference
sentence = Sentence('George Washington went to Washington.')
model.predict(sentence)
Results
With nebullvm, the inference speed of the model can be improved significantly. With the model used in the example above, we found the following results:
| Machine Type | Baseline (s) | Nebullvm-optimized (s) | Speedup |
|---|---|---|---|
| M1 | 0.181 | 0.0358 | 5.1x |
| Intel CPU | 0.206 | 0.0953 | 2.2x |
| GPU (Tesla T4) | 0.0266 | 0.0129 | 2.1x |
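For reference, a comparison of this kind can be reproduced roughly as follows. This is a minimal sketch, not the exact script used for the table above; the number of runs and the warm-up step are assumptions.

```python
import time

from flair.data import Sentence
from flair.models import SequenceTagger

model = SequenceTagger.load("ner-large")
opt_sentences = [Sentence("Mars is the fourth planet from the Sun.")]
test_sentence = Sentence("George Washington went to Washington.")

def timed_predict(tagger, sentence, runs=10):
    # one warm-up call, then average wall-clock time over several runs
    tagger.predict(sentence)
    start = time.perf_counter()
    for _ in range(runs):
        tagger.predict(sentence)
    return (time.perf_counter() - start) / runs

baseline = timed_predict(model, test_sentence)
model.embeddings = model.embeddings.optimize_nebullvm(opt_sentences)
optimized = timed_predict(model, test_sentence)
print(f"baseline: {baseline:.4f}s  optimized: {optimized:.4f}s  speedup: {baseline / optimized:.1f}x")
```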
Hello @valeriosofi, thanks a lot for adding this, many users will surely find it useful! The unit tests are failing though; it looks like a deprecated method call. Can you take a look?
Hello @alanakbik, thanks, I managed to fix the problem! The unit tests should pass now.
Hi @alanakbik, I tested nebullvm again and it works quite well; on my local branch it passes all the tests. As soon as you manage to check it, let me know if you need anything else from our side.
I'm getting this error when running import nebullvm:
/opt/conda/envs/tech_ner/lib/python3.9/site-packages/nebullvm/inference_learners/deepsparse.py:32: UserWarning: No deepsparse installation found. Trying to install it...
warnings.warn(
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
File /opt/conda/envs/tech_ner/lib/python3.9/site-packages/nebullvm/inference_learners/deepsparse.py:22
21 try:
---> 22 from deepsparse import compile_model, cpu
23 except ImportError:
ModuleNotFoundError: No module named 'deepsparse'
During handling of the above exception, another exception occurred:
FileNotFoundError Traceback (most recent call last)
Cell In [8], line 2
1 # %%
----> 2 import nebullvm
File /opt/conda/envs/tech_ner/lib/python3.9/site-packages/nebullvm/__init__.py:1
----> 1 from nebullvm.api.frontend.torch import optimize_torch_model # noqa F401
2 from nebullvm.api.frontend.tf import optimize_tf_model # noqa F401
3 from nebullvm.api.frontend.onnx import optimize_onnx_model # noqa F401
File /opt/conda/envs/tech_ner/lib/python3.9/site-packages/nebullvm/api/frontend/torch.py:27
19 from nebullvm.base import (
20 DeepLearningFramework,
21 ModelParams,
(...)
24 QuantizationType,
25 )
26 from nebullvm.converters import ONNXConverter
---> 27 from nebullvm.optimizers.pytorch import PytorchBackendOptimizer
28 from nebullvm.transformations.base import MultiStageTransformation
29 from nebullvm.utils.data import DataManager
File /opt/conda/envs/tech_ner/lib/python3.9/site-packages/nebullvm/optimizers/__init__.py:6
4 from nebullvm.optimizers.base import BaseOptimizer # noqa F401
5 from nebullvm.optimizers.blade_disc import BladeDISCOptimizer # noqa F401
----> 6 from nebullvm.optimizers.deepsparse import DeepSparseOptimizer # noqa F401
7 from nebullvm.optimizers.neural_compressor import (
8 NeuralCompressorOptimizer,
9 ) # noqa F401
10 from nebullvm.optimizers.onnx import ONNXOptimizer # noqa F401
File /opt/conda/envs/tech_ner/lib/python3.9/site-packages/nebullvm/optimizers/deepsparse.py:11
9 from nebullvm.config import CONSTRAINED_METRIC_DROP_THS
10 from nebullvm.converters import ONNXConverter
---> 11 from nebullvm.inference_learners.deepsparse import (
12 DEEPSPARSE_INFERENCE_LEARNERS,
13 DeepSparseInferenceLearner,
14 )
15 from nebullvm.measure import compute_relative_difference
16 from nebullvm.optimizers import BaseOptimizer
File /opt/conda/envs/tech_ner/lib/python3.9/site-packages/nebullvm/inference_learners/deepsparse.py:35
27 if (
28 os_ != "Darwin"
29 and get_cpu_arch() != "arm"
30 and not NO_COMPILER_INSTALLATION
31 ):
32 warnings.warn(
33 "No deepsparse installation found. Trying to install it..."
34 )
---> 35 install_deepsparse()
36 from deepsparse import compile_model, cpu
37 else:
File /opt/conda/envs/tech_ner/lib/python3.9/site-packages/nebullvm/installers/installers.py:188, in install_deepsparse()
185 python_minor_version = sys.version_info.minor
187 cmd = ["apt-get", "install", f"python3.{python_minor_version}-venv"]
--> 188 subprocess.run(cmd)
190 cmd = ["pip3", "install", "deepsparse"]
191 subprocess.run(cmd)
File /opt/conda/envs/tech_ner/lib/python3.9/subprocess.py:505, in run(input, capture_output, timeout, check, *popenargs, **kwargs)
502 kwargs['stdout'] = PIPE
503 kwargs['stderr'] = PIPE
--> 505 with Popen(*popenargs, **kwargs) as process:
506 try:
507 stdout, stderr = process.communicate(input, timeout=timeout)
File /opt/conda/envs/tech_ner/lib/python3.9/subprocess.py:951, in Popen.__init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, user, group, extra_groups, encoding, errors, text, umask)
947 if self.text_mode:
948 self.stderr = io.TextIOWrapper(self.stderr,
949 encoding=encoding, errors=errors)
--> 951 self._execute_child(args, executable, preexec_fn, close_fds,
952 pass_fds, cwd, env,
953 startupinfo, creationflags, shell,
954 p2cread, p2cwrite,
955 c2pread, c2pwrite,
956 errread, errwrite,
957 restore_signals,
958 gid, gids, uid, umask,
959 start_new_session)
960 except:
961 # Cleanup if the child failed starting.
962 for f in filter(None, (self.stdin, self.stdout, self.stderr)):
File /opt/conda/envs/tech_ner/lib/python3.9/subprocess.py:1821, in Popen._execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, gid, gids, uid, umask, start_new_session)
1819 if errno_num != 0:
1820 err_msg = os.strerror(errno_num)
-> 1821 raise child_exception_type(errno_num, err_msg, err_filename)
1822 raise child_exception_type(err_msg)
FileNotFoundError: [Errno 2] No such file or directory: 'apt-get'
Hi @klimentij, it looks like you don't have apt-get on your machine! What OS are you using? Does it work if you run apt-get update in the terminal?
@valeriosofi Amazon Linux 2, so no apt-get. I'd expect it to use a compatible package manager automatically...
@klimentij Yep, we will definitely fix this in the next nebullvm release, thanks for the report ;)
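In the meantime, a possible workaround is to pre-install deepsparse yourself so that nebullvm's auto-installer (the apt-get call in the traceback above) is never triggered. A minimal sketch, assuming deepsparse can be installed with pip on your platform:

```python
import importlib.util
import subprocess
import sys

# Install deepsparse up front so the import inside nebullvm succeeds and the
# apt-get based auto-installer is never reached on non-Debian systems.
if importlib.util.find_spec("deepsparse") is None:
    subprocess.run([sys.executable, "-m", "pip", "install", "deepsparse"], check=True)

import nebullvm  # noqa: E402
```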
Hello @valeriosofi thanks for submitting this PR.
I ran the above code and it proceeded to immediately install a bunch of libraries on my system, without asking for permission! It installed (at least) ONNX Runtime, OpenVINO, TensorRT, TVM, DeepSparse, and Neural Compressor. Some of the libraries did not install correctly, I think. You're using subprocess to run these install commands.
I find this very problematic from a security perspective.
After installing all these libraries for 30 minutes, the code failed with:
[ WARNING ] No optimized model has been created. This is likely due to a bug in Nebullvm. Please open an issue and report in details your use case.
Traceback (most recent call last):
File "/home/alan/PycharmProjects/flair/local_nebulum.py", line 25, in <module>
model.predict(sentence)
File "/home/alan/PycharmProjects/flair/flair/models/sequence_tagger_model.py", line 480, in predict
sentence_tensor, lengths = self._prepare_tensors(batch)
File "/home/alan/PycharmProjects/flair/flair/models/sequence_tagger_model.py", line 284, in _prepare_tensors
self.embeddings.embed(sentences)
File "/home/alan/PycharmProjects/flair/flair/embeddings/base.py", line 47, in embed
self._add_embeddings_internal(data_points)
File "/home/alan/PycharmProjects/flair/flair/embeddings/transformer.py", line 543, in _add_embeddings_internal
embeddings = self._forward_tensors(tensors)
File "/home/alan/PycharmProjects/flair/flair/embeddings/transformer.py", line 778, in _forward_tensors
return {"token_embeddings": self.model(*tensors.values())[0]}
TypeError: 'NoneType' object is not callable
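The crash happens because no optimized model was created (see the warning above), leaving the embeddings without a model to call. A guard along these lines in flair/embeddings/transformer.py (a sketch only, not part of this PR) would at least surface an explicit error instead of the NoneType crash:

```python
# Hypothetical defensive check, based only on the traceback above: fail loudly
# when nebullvm optimization did not produce a model, instead of leaving
# self.model as None and crashing later with "'NoneType' object is not callable".
def _forward_tensors(self, tensors):
    if self.model is None:
        raise RuntimeError(
            "nebullvm optimization did not produce a model; "
            "re-run optimize_nebullvm() or fall back to the unoptimized embeddings."
        )
    return {"token_embeddings": self.model(*tensors.values())[0]}
```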
Unfortunately, I cannot merge this PR: it does not seem to be working, and it runs a bunch of auto-installers that completely bloated my system without asking for permission. Happy to discuss here or via mail.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.