Ali Mukadam
I do, but I didn't use it here.
This is my isvc definition:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: huggingface-llama3
  namespace: kserve-test
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      args:
        - --model_name=llama3
        - --model_id=meta-llama/meta-llama-3-8b-instruct
      env:
        - name: HF_TOKEN...
```
With huggingface backend, it works now. Thanks a lot. I'll also try with a GPU and let you know how I go.
Hi,

When using the vLLM backend, I ran into the following error:

```
(RayWorkerWrapper pid=762) ERROR 07-01 11:05:45 worker_base.py:145] Error executing method init_device. This might cause deadlock in distributed execution....
```
Looks like this is done?
Hi, Do you have a ticket with oci-go-sdk we can use to track?
Did you run `lsmod` to confirm these modules are loaded in the latest images? Please also check the multi-cluster example to confirm it works without having to explicitly install...
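A quick sketch of the kind of check meant above; the module names (`br_netfilter`, `overlay`) are placeholders, since the actual modules in question aren't named here:

```shell
#!/bin/sh
# Check whether each expected kernel module is currently loaded.
# lsmod reads /proc/modules, so grepping that file directly is equivalent
# and works even where the lsmod binary isn't installed in the image.
for mod in br_netfilter overlay; do
  if grep -qw "^${mod}" /proc/modules 2>/dev/null; then
    echo "${mod}: loaded"
  else
    echo "${mod}: not loaded"
  fi
done
```

The script prints one status line per module either way, so it can run in CI without failing the job outright.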
Does the OKE API support taints?