seldon-core icon indicating copy to clipboard operation
seldon-core copied to clipboard

Loosen v1 microservice dependencies

Open mwm5945 opened this issue 2 years ago • 47 comments

What this PR does / why we need it: Resolves https://github.com/SeldonIO/seldon-core/issues/4899. Loosens requirements to allow for ML frameworks to define them.

Which issue(s) this PR fixes:

Fixes #4899

Special notes for your reviewer:

mwm5945 avatar Jun 12 '23 15:06 mwm5945

Hi @mwm5945. Thanks for your PR.

I'm waiting for a SeldonIO or todo member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository.

seldondev avatar Jun 12 '23 15:06 seldondev

/test integration

RafalSkolasinski avatar Jun 12 '23 18:06 RafalSkolasinski

Will re-open in a bit--some "enterprise" stuff going on over here :)

mwm5945 avatar Jun 12 '23 20:06 mwm5945

@RafalSkolasinski should be gtg now! thanks!

mwm5945 avatar Jun 13 '23 13:06 mwm5945

/test integration

ukclivecox avatar Jun 17 '23 08:06 ukclivecox

/test notebooks

ukclivecox avatar Jun 17 '23 08:06 ukclivecox

There seem to be some python lint errors:

keyring 23.13.1 requires importlib-metadata>=4.11.4; python_version < "3.12", but you have importlib-metadata 4.2.0 which is incompatible.
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
black \
	--check ./ ../testing/scripts \
	--exclude "(proto|seldon_core/proto/|.eggs|.tox)"
would reformat setup.py
Oh no! 💥 💔 💥
1 file would be reformatted, 72 files would be left unchanged.
make: *** [Makefile:113: lint] Error 1

ukclivecox avatar Jun 17 '23 12:06 ukclivecox

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@igor-shai @cliveseldon should be fixed!

mwm5945 avatar Jun 20 '23 13:06 mwm5945

/test integration

adriangonz avatar Jun 21 '23 08:06 adriangonz

/test notebooks

adriangonz avatar Jun 21 '23 08:06 adriangonz

@adriangonz I don't have access to see why the above tests failed, any ideas?

mwm5945 avatar Jun 21 '23 14:06 mwm5945

It seems like flaky integration errors. I'll re-trigger the tests @mwm5945 .

adriangonz avatar Jun 22 '23 16:06 adriangonz

/retest

adriangonz avatar Jun 22 '23 16:06 adriangonz

/test integration

adriangonz avatar Jun 23 '23 13:06 adriangonz

@adriangonz anything i can do on my end?

mwm5945 avatar Jun 27 '23 19:06 mwm5945

It's unclear TBH - integration tests are not showing the logs with the failure for some reason. Most likely, it's something strange with our CI. I'll run them locally and get back to you.

In the meantime, could you rebase from master? There's a recent PR which removes flatbuffer completely from the seldon_core Python package.

adriangonz avatar Jun 28 '23 12:06 adriangonz

Just had a quick look locally @mwm5945, and the issue seems to be that protobuf is getting bumped to 4.x+, which causes the following error message:

Traceback (most recent call last):
  File "./conda_env_create.py", line 13, in <module>
    from seldon_core.microservice import PARAMETERS_ENV_NAME, parse_parameters
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/microservice.py", line 15, in <module>
    from seldon_core import wrapper as seldon_microservice
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/wrapper.py", line 10, in <module>
    import seldon_core.seldon_methods
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/seldon_methods.py", line 21, in <module>
    from seldon_core.proto import prediction_pb2
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/proto/prediction_pb2.py", line 19, in <module>
    from seldon_core.proto.tensorflow.core.framework import tensor_pb2 as tensorflow_dot_core_dot_framework_dot_tensor__pb2
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/proto/tensorflow/core/framework/tensor_pb2.py", line 16, in <module>
    from . import resource_handle_pb2 as tensorflow_dot_core_dot_framework_dot_resource__handle__pb2
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/proto/tensorflow/core/framework/resource_handle_pb2.py", line 16, in <module>
    from . import tensor_shape_pb2 as tensorflow_dot_core_dot_framework_dot_tensor__shape__pb2
  File "/opt/conda/lib/python3.8/site-packages/seldon_core/proto/tensorflow/core/framework/tensor_shape_pb2.py", line 36, in <module>
    _descriptor.FieldDescriptor(
  File "/opt/conda/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 561, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

To workaround it, it's best to constrain protobuf to >=3.20.2,<4.0.0 . @mwm5945 once you rebase from master and add that constrain, we can try to re-trigger the tests.

adriangonz avatar Jun 28 '23 13:06 adriangonz

hmm...that should be covered here, no? https://github.com/SeldonIO/seldon-core/blob/1c539262b044d73a507d640929fd7795b94c8d2c/python/setup.py#L8

mwm5945 avatar Jun 28 '23 14:06 mwm5945

The tensorflow extra is only used on certain cases though. If not used, it will just use the default protobuf requirement from the main set of deps (which only says protobuf <= 5.0.0).

The wider problem with that is that the generated protobuf code is linked to a major version of protobuf. That is, protobuf classes generated with protobuf==4.x won't work with protobuf==3.x and viceversa. Therefore, the same package wouldn't work for both pip install seldon_core and pip install seldon_core[tensorflow].

So what I would do is move that protobuf>=3.20.2,<4.0.0 line out of the tensorflow extra and into the main set of dependencies.

adriangonz avatar Jun 28 '23 15:06 adriangonz

problem there is that it creates conflicts with other sets of dependencies, such as RapisAI/cuML

mwm5945 avatar Jun 28 '23 16:06 mwm5945

Right, I see... one potential workaround is to ensure the "official" Docker images do use protobuf==3.20.3, while keeping the seldon_core package itself unconstrained. This can be as simple as a pip install protobuf==3.20.3 on the Dockerfile or having an actual lockfile.

Having said that though, it's unclear how that would work when seldon_core gets installed with protobuf==4.x on your own environment? As far as I can see you'd get the same error as the one shown in https://github.com/SeldonIO/seldon-core/pull/4911#issuecomment-1611428928

adriangonz avatar Jun 28 '23 16:06 adriangonz

BTW: Do you know how cuml introduces the incompatibility with protobuf==3.20.3? I was having look at their reqs but I can't see protobuf listed there.

adriangonz avatar Jun 28 '23 16:06 adriangonz

cuml/libcuml requires cudf, requires protobuf 4.x

mwm5945 avatar Jun 29 '23 14:06 mwm5945

Mmmh that's annoying - in that case, what do you think about the workaround suggested in https://github.com/SeldonIO/seldon-core/pull/4911#issuecomment-1611752305?

adriangonz avatar Jun 29 '23 16:06 adriangonz

very haha. Im actually wondering why TF doesn't declare a version of protobuf in their requirements list....seems like a hard requirement that should be defined on their side--which would solve the issue here because Pip should resolve the version on our behalf....

the workaround is certainly less than elegant.... im by no means a python expert, is there something we can do like how importlib is installed here?

mwm5945 avatar Jun 29 '23 17:06 mwm5945

As far as I'm aware, the main problem is that the protobuf stubs generated (which are statically generated and ship with the Python package - under the seldon_core.proto module) are dependent on the major version of protobuf. That is, stubs generated with protobuf==3.x will not work with protobuf==4.x and viceversa.

Having said that, I just noticed that tensorflow does list a requirement of protobuf>=3.20.3,<5.0.0dev. Therefore, perhaps there's a way to make the statically generated stubs compatible across major version of protobuf?

We'll have a deeper look at the above, and will post an update once we confirm if possible.

adriangonz avatar Jun 30 '23 08:06 adriangonz

great, thanks! from my perspective it doesn't look like the issue comes from seldon's protos, rather the dependency on TF protos

mwm5945 avatar Jun 30 '23 13:06 mwm5945

Hey @mwm5945,

We've had a deeper look at this one, and it does seem like gRPC stubs generated with protobuf==4.x continue working even if you downgrade later to protobuf==3.20.x. Therefore, the remaining bits for this PR (to make it work with protobuf==4.x) would be:

  1. Bump these dev deps in the python/requirements-dev.txt file (required to build the gRPC stubs with protobuf==4.x): grpcio-tools==1.56.0, mypy-protobuf==3.4.0.
  2. Re-generate the gRPC stubs (with protobuf==4.x installed locally!). This can be done with the make -C python build_apis target.

Once that's done, we can then re-trigger the integration tests to confirm that everything works as expected.

adriangonz avatar Jul 04 '23 14:07 adriangonz

working on this--bit of a delay dealing with policies/corporate proxies :)

mwm5945 avatar Jul 05 '23 15:07 mwm5945