seldon-core
Response from seldon-core is unable to show Unicode characters
Describe the bug
We use seldon-core-microservice to run our own Python translation model. The predict method of our custom Python model class wrapper returns a numpy.ndarray, which is converted from a List of Dict.
import numpy as np

class MyTranslationModel:
    ...
    def predict(self, X, features_names=None, meta=None) -> np.ndarray:
        ...
        # translated_result is a Python list that looks like this: [{"translated_text": "कैसे"}]
        return np.array(translated_result)
Everything works as expected when we call the predict method directly. If we print the returned value, it looks like this:
>>> print(MyTranslationModel.predict(X=...))
[{'translated_text':'कैसे'}]
But when we run our model as a service with seldon-core-microservice, the response always shows escaped Unicode code points (\uXXXX sequences) instead of the characters themselves. The response looks like this:
{
  "data": {
    "names": [],
    "ndarray": [{"translated_text": "\u0915\u0948\u0938\u0947"}]
  },
  "meta": {}
}
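Note that both forms are valid JSON and decode to the same string; the difference is only in how the serializer writes non-ASCII characters. A minimal sketch using only Python's standard json module (this is not seldon-core code, and the payload is just a copy of the response above) illustrates the two outputs:

```python
import json

# Same shape as the response above, with the real characters in the value.
payload = {
    "data": {"names": [], "ndarray": [{"translated_text": "कैसे"}]},
    "meta": {},
}

# ensure_ascii defaults to True, so non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(payload)

# With ensure_ascii=False the characters are written out as-is.
readable = json.dumps(payload, ensure_ascii=False)

# Both strings decode back to the identical Python object.
same_after_decoding = json.loads(escaped) == json.loads(readable) == payload

print(escaped)
print(readable)
```

So a JSON-aware client recovers the correct text either way, but the raw response body is unreadable for anyone inspecting it directly.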
To reproduce
- Use any Python model class wrapper whose predict method returns a value containing Unicode (non-ASCII) characters.
- Then use seldon-core-microservice to run the model (on a local machine or inside a Docker container).
- Try to send a prediction request to the service and check the response.
Expected behaviour
We expect the response from the model service to contain the actual characters instead of \uXXXX escape sequences.
Environment
(seems to be) environment independent. (This issue has nothing to do with the seldon deployment part.)
Model Details
See the description above.
Let me share some of my findings here.
I found that the root cause may be related to Flask's jsonify, which is widely used in seldon-core (please refer to this thread about why jsonify causes the encoding/decoding issue). Flask's app.config['JSON_AS_ASCII'] defaults to True, as documented in the official Flask docs, so non-ASCII characters are serialized as \uXXXX escapes. That is why the response from seldon-core always contains the escape sequences instead of the Unicode characters. To fix this, we need to set app.config['JSON_AS_ASCII'] to False. But according to the Python source code of seldon-core, JSON_AS_ASCII is not currently an allowed config option. Can I raise a PR to add this option to FLASK_CONFIGS_ALLOWED? (I think translation is a very common task in AI/ML, so it would be really helpful if seldon-core could return the characters directly.)
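To illustrate the effect of the setting in isolation, here is a minimal standalone Flask sketch. It is not seldon-core's code: the /predict route and the make_app and fetch_body helpers are hypothetical names made up for this example. It also assumes that newer Flask releases expose the same switch as app.json.ensure_ascii, so it sets both where available.

```python
from flask import Flask, jsonify


def make_app(ascii_only: bool) -> Flask:
    """Build a tiny app whose JSON responses are ASCII-escaped or not."""
    app = Flask(__name__)
    # The config key discussed above; older Flask releases read it in jsonify.
    app.config["JSON_AS_ASCII"] = ascii_only
    # Assumption: newer Flask releases use the JSON provider attribute instead.
    if hasattr(app, "json"):
        app.json.ensure_ascii = ascii_only

    @app.route("/predict")
    def predict():
        # Hypothetical payload mirroring the response shape in this issue.
        return jsonify({"data": {"ndarray": [{"translated_text": "कैसे"}]}})

    return app


def fetch_body(app: Flask) -> str:
    """Call the route through Flask's test client and return the raw body."""
    return app.test_client().get("/predict").get_data(as_text=True)


escaped_body = fetch_body(make_app(ascii_only=True))
readable_body = fetch_body(make_app(ascii_only=False))

print(escaped_body)
print(readable_body)
```

If the option were allowed through FLASK_CONFIGS_ALLOWED, users could presumably opt in to the second behaviour without patching the wrapper, while the default would stay unchanged.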