
Response from Seldon-core unable to show Unicode characters

Open caozhuozi opened this issue 2 years ago • 1 comments

Describe the bug

We use seldon-core-microservice to run our own Python translation model. The predict method of our custom Python model class wrapper returns a numpy.ndarray, which is converted from a list of dicts.

import numpy as np

class MyTranslationModel:
    ...

    def predict(self, X, features_names=None, meta=None) -> np.ndarray:
        ...
        # translated_result is a Python list that looks like this: [{"translated_text": "कैसे"}]
        return np.array(translated_result)

Calling the predict method directly works as expected. If we print the returned value, it looks like this:

>>> print(MyTranslationModel().predict(X=...))
[{'translated_text': 'कैसे'}]

But when we run the model as a service with seldon-core-microservice, the response always contains \uXXXX escape sequences instead of the characters themselves. The response looks like this:

{
   "data":{
      "names":[],
      "ndarray":[{"translated_text":"\u0915\u0948\u0938\u0947"}]
   },
   "meta":{}
}
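
For context, a minimal sketch (standard-library `json` only, not seldon-core code) of what a client sees when it parses the body above. The variable names `raw_body`, `parsed`, and `text` are illustrative. It suggests the escaped form is still valid JSON that decodes to the original characters, so the problem is mainly the readability of the raw response rather than data loss:

```python
import json

# Response body copied from the report above. The raw string keeps the
# backslashes literal, exactly as they would arrive over the wire.
raw_body = r'''
{
   "data":{
      "names":[],
      "ndarray":[{"translated_text":"\u0915\u0948\u0938\u0947"}]
   },
   "meta":{}
}
'''

# Any standards-compliant JSON parser turns the \uXXXX escapes back into characters.
parsed = json.loads(raw_body)
text = parsed["data"]["ndarray"][0]["translated_text"]
print(text)
```

Clients that parse the JSON are therefore unaffected; the escapes only show up when the raw body is inspected directly (for example with curl or in logs).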

To reproduce

  1. Use any Python model class wrapper whose predict method returns non-ASCII Unicode characters.
  2. Run the model with seldon-core-microservice (on a local machine or inside a Docker container).
  3. Send a prediction request to the service and check the response.

Expected behaviour

We expect the response from the model service to contain the characters themselves instead of the \uXXXX escape sequences.

Environment

Seems to be environment-independent. (This issue has nothing to do with the Seldon deployment part.)

Model Details

See the description above.

caozhuozi · Aug 06 '22 03:08

Let me share some of my findings here. The root cause may be related to Flask's jsonify, which is widely used in seldon-core (please refer to this thread about why jsonify causes the encoding/decoding issue). Flask's app.config['JSON_AS_ASCII'] defaults to True, as documented in the official Flask docs, so non-ASCII characters are serialized as \uXXXX escape sequences. That is why the response from seldon-core always contains escape sequences instead of the Unicode characters.

To solve this, we need to set app.config['JSON_AS_ASCII'] to False. But according to the seldon-core Python source code, JSON_AS_ASCII is currently not an allowed config option. Can I raise a PR to add it to FLASK_CONFIGS_ALLOWED? (Translation is a very common task in AI/ML, so it would be really helpful if seldon-core could return the characters directly.)
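
To illustrate the behaviour in isolation, here is a small sketch using only the standard-library `json` module. It assumes (as I understand it) that Flask's JSON_AS_ASCII setting is passed through to the `ensure_ascii` argument of `json.dumps`; it is not seldon-core or Flask code, and the names `payload`, `escaped`, and `readable` are just for illustration:

```python
import json

# Same shape as the response in the report, with the real characters.
payload = {
    "data": {"names": [], "ndarray": [{"translated_text": "कैसे"}]},
    "meta": {},
}

# ensure_ascii=True is the json.dumps default: non-ASCII becomes \uXXXX.
escaped = json.dumps(payload)

# ensure_ascii=False keeps the characters as-is in the output string.
readable = json.dumps(payload, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode to the same Python object.
assert json.loads(escaped) == json.loads(readable) == payload
```

If the assumption holds, allowing JSON_AS_ASCII=False in seldon-core would correspond to the second form.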

caozhuozi · Aug 06 '22 03:08