Keming
> To load a saved `my_profile.json` in the profiler with working symbols, you need to use `samply load my_profile.json`.
>
> You're not the first person to run into...
I cannot reproduce this on Gitpod. Can you provide your `build.envd` file?
You need to install the NVIDIA GPU Operator in the cluster.
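For reference, installing the operator with Helm typically looks like the following sketch. The repo and chart names follow NVIDIA's published instructions, but the exact flags and namespace are assumptions you should adjust for your cluster and operator version:

```shell
# Add NVIDIA's Helm repository and install the GPU Operator
# into its own namespace. Flags may vary between operator versions.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install --wait gpu-operator \
  -n gpu-operator --create-namespace \
  nvidia/gpu-operator
```

After the operator pods are running, GPU nodes should advertise the `nvidia.com/gpu` resource to the scheduler.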
I don't have experience with Azure AKS. You can contact their customer support.
No. It's a standard CUDA image. You can try the base CUDA image.
I guess it's related to the CUDA image version?
The main LLM inference code is in https://github.com/tensorchord/modelz-llm/blob/main/src/modelz_llm/model.py. To add a new model, you need to check https://github.com/tensorchord/llmspec/blob/main/llmspec/model_info.py and add the corresponding Docker image in this repo.
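Roughly, adding a model means registering its metadata in `model_info.py` and pointing it at a published image. This is only an illustrative sketch of the shape of that change: the `ModelSpec` class, its field names, and all values below are assumptions, not the actual llmspec schema — check the real `llmspec/model_info.py` for the true structure.

```python
from dataclasses import dataclass


# Hypothetical stand-in for an llmspec model entry; the class and
# field names are illustrative, not the real llmspec API.
@dataclass
class ModelSpec:
    name: str          # model identifier exposed by the API
    hf_repo: str       # Hugging Face repo to load weights from
    docker_image: str  # image published in the modelz-llm repo


# Registering a new model would then amount to adding one entry:
NEW_MODEL = ModelSpec(
    name="my-new-llm",                       # assumption: placeholder name
    hf_repo="org/my-new-llm",                # assumption: placeholder repo
    docker_image="modelzai/llm-my-new-llm",  # assumption: placeholder image
)
```

The corresponding Docker image then needs to exist in the modelz-llm repo so the entry resolves to something deployable.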
Check the issue https://github.com/huggingface/transformers/issues/22222
@xieydd I tried this example but it didn't work. I'm not sure how to make it work when the remote has the pgvector extension `vector` while the local only has the pgvecto.rs extension...
This requires https://github.com/tensorchord/pgvecto.rs-enterprise/pull/8. The latency is mainly dominated by network latency; when testing within AWS us-west-2a, it is a few milliseconds.

### remote

```sql
create extension vectors;
set...
```