AJHoeh
@dyastremsky Thanks for the fast reply! Here is the log:

```
root@51db6547ef04:/opt/tritonserver# tritonserver --model-repository=/models --strict-mode=False --model-control-mode="explicit" --load-model=tner --log-verbose=1
I0622 17:38:36.227140 1325 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/pytorch/libtriton_pytorch.so
I0622 17:38:36.497222 1325 libtorch.cc:1381] TRITONBACKEND_Initialize: pytorch
...
```
@Tabrizian I did not compile a Python backend stub myself, since my understanding of the [note here](https://github.com/triton-inference-server/python_backend#1-building-custom-python-backend-stub) is that this is only necessary if my conda environment's Python version differs...
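(For anyone checking the same condition: a minimal sketch for printing the conda environment's Python version to do that comparison; the version the default stub was built against depends on the Triton container you use.)

```python
import sys

# Print this environment's Python version. Per the python_backend README,
# a custom stub is only needed when this differs from the Python version
# the default stub was built against.
print(f"{sys.version_info.major}.{sys.version_info.minor}")
```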
Thanks to both of you!
> ```
> import lmql
> from transformers import (
>     AutoTokenizer,
> )
>
> tokenizer_string = "HuggingFaceH4/zephyr-7b-beta"
>
> lmql_model = lmql.model(
>     f"local:gpt2",
>     tokenizer=tokenizer_string, cuda=True
> )
> ...
> ```
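As an aside, a minimal sketch for smoke-testing a local LMQL model handle, based on my reading of the LMQL docs (`generate_sync` is the synchronous generation helper on `lmql.LLM` as I understand it; the prompt and `max_tokens` values are illustrative):

```python
import lmql

# Minimal local handle without the tokenizer override; cuda=True assumes
# a GPU is available.
m = lmql.model("local:gpt2", cuda=True)

# Illustrative smoke test: generate a short completion synchronously.
print(m.generate_sync("Hello", max_tokens=10))
```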