Max Makarov
> `nv(cuda)av1enc` is in order for GStreamer 1.26.
>
> https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6754

It's already merged. How can I use AV1 with NVENC today?

How do I make it use all GPUs in my system? I started it like this:

```
torchrun --nproc_per_node 8 server.py --ckpt_dir /var/llama/65B --tokenizer_path /var/llama/tokenizer.model
```

But it only uses one GPU:

Yes, example.py uses all GPUs
Could you please give an example of an HTTP request?
This request crashes the server:

```bash
curl -X POST http://127.0.0.1:8042/llama/ -H 'Content-Type: application/json' -d '{"prompts":["Hello. How are you?"], "max_gen_len": "256"}'
```

```bash
root@llama:/llama# torchrun --nproc_per_node 8 server.py --ckpt_dir /var/llama/65B --tokenizer_path...
```

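
For anyone who wants to reproduce this without curl, here is a rough Python sketch of the same request. The URL, header, and JSON fields are copied from the curl command; the `build_request` helper and the `LLAMA_URL` constant are names made up for illustration and are not part of `server.py`. It only builds the request object, so it can be inspected before anything is sent.

```python
import json
import urllib.request

# URL copied from the curl command; adjust host and port for your setup.
LLAMA_URL = "http://127.0.0.1:8042/llama/"


def build_request(prompts, max_gen_len, url=LLAMA_URL):
    """Build (but do not send) a JSON POST request.

    Hypothetical helper: it mirrors the curl payload field for field.
    """
    payload = {"prompts": prompts, "max_gen_len": max_gen_len}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same values as the curl call, including max_gen_len as the string "256".
req = build_request(["Hello. How are you?"], "256")
```

Sending it is then `urllib.request.urlopen(req)`. Note that the curl call passes `max_gen_len` as a string; I don't know whether the server expects an integer there, so passing `256` instead of `"256"` may be worth trying.
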
Any updates?
the same question
@kisak-valve Any chance to fix it?
Valve has fixed this bug.
> * Maybe we'll move to Linux

It's my dream. Any ideas how difficult it would be to implement?