Zoltan Fedor
Hi @notkriswagner, thanks for the update. We are eagerly waiting... :-)
Do you by chance have the default WSDL caching on? I know I fell for that a few times. Try changing the name of the WSDL file to test whether...
The same error occurs even when no batching is used ``` sudo docker run -it --rm --gpus all \ -v $PWD/models:/project ghcr.io/els-rd/transformer-deploy:0.6.0 \ bash -c "pip3 install \".[GPU]\" && cd...
Use Flask-Caching instead: https://github.com/sh4nks/flask-caching
It seems the additional space in the subject appears because the subject line reaches the 78-character limit, where a line break followed by a space is added, that...
PS: I am using Python 3.5.1
I have a similar need to @cauvery. In my case I have some integration tests (through @pytest.mark.parametrize) which make modifications to a shared object and a fixture which always...
I have observed the same. The `flan-t5-xl` model's accuracy is terrible with `fp16`, but it is good with `bf16` and also with `fp32`. But I have seen slightly better `rougeLsum` stats...
I am not seeing any errors at normal verbosity, but let me try increasing it; maybe then something will be visible.
@marcklingen I ran it again with the Langfuse logs set to DEBUG. Lots of logs from Langfuse, no errors. It is simply that when making the `async` call, it makes fewer Langfuse...