9876691
9876691
Thanks, for reference it's here https://github.com/coreylowman/dfdx/blob/main/src/nn/num_params.rs Looks like a visitor pattern, so if it does visit all of the model (recursively?) then there might be a way to build up...
I would also like this functionality. I'm having trouble sending data to Digital Ocean (DO) Postgres as I think some of the things Replibyte does requires SuperUser access which DO...
I think I've figured out how SQLx do this. They set a variable like so ```rust let accept_invalid_certs = !matches!( options.ssl_mode, PgSslMode::VerifyCa | PgSslMode::VerifyFull ); ``` https://github.com/launchbadge/sqlx/blob/main/sqlx-postgres/src/connection/tls.rs#L50 Now when they...
Is using Onnx runtime an option here? There's a rust binding here https://github.com/microsoft/onnxruntime/tree/main/rust The compute graph is basically formed from a protobuf definition. So using a rust protoc compiler you...
For reference there's some ongoing work in ggml for graph support https://github.com/ggerganov/ggml/pull/108 > These are initial steps towards GPU support via computation graph export. Still figuring out the basics needed....
For me it would be great to switch on the continuous batching via the command line or env car. Then I could use the existing open air end points. Can...
I would also like to see this. In my company we often have confidential data under export control. So the cloud is not an option. Getting a server with a...
I'm also having this issue.
Same for me. Steps to reproduce. ## Inference ```sh model=TheBloke/Llama-2-7B-Chat-GPTQ volume=$PWD/data docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.4 --model-id $model --quantize gptq ``` ## Testing non...
Looks like TGI needs the template to squash the chat history. https://github.com/huggingface/text-generation-inference/blob/main/router/src/infer.rs#L94 Does anyone know how to provide the template? Looks like something like this is needed https://huggingface.co/docs/transformers/chat_templating So I...