Ling Li
FYI, have you tried building Ollama from source on your Raspberry Pi 4?
Rather than serialising the entire object, if it's a file, could you not store it in an S3 or R2 bucket, serialise the URL, and just add that to your...
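A rough sketch of what that suggestion could look like, assuming the payload is JSON and the file is uploaded separately. `upload_to_bucket`, `FileRef`, the bucket hostname, and the field names are all hypothetical placeholders; a real uploader would call the S3 or R2 API instead of just building a URL.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class FileRef:
    """Small reference to a file that lives in object storage (hypothetical shape)."""
    url: str
    content_type: str
    size_bytes: int


def upload_to_bucket(data: bytes, key: str) -> str:
    """Hypothetical uploader: a real one would send `data` to S3 or R2.

    Here it only returns the URL the object would be stored under.
    """
    return f"https://example-bucket.invalid/{key}"


def serialise_with_ref(payload: dict, file_bytes: bytes, key: str) -> str:
    """Serialise `payload` with a URL reference in place of the raw file bytes."""
    url = upload_to_bucket(file_bytes, key)
    ref = FileRef(url=url, content_type="application/octet-stream",
                  size_bytes=len(file_bytes))
    return json.dumps({**payload, "file": asdict(ref)})


# The serialised document stays small no matter how large the file is.
doc = serialise_with_ref({"title": "report"}, b"\x00" * 1024, "uploads/report.bin")
print(doc)
```

The trade-off is that whoever deserialises the payload needs read access to the bucket, so the URL may have to be signed or otherwise access-controlled.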
just to chime in here, we are building a custom video conferencing app using AWS chime and svelte. We get this warning when we are using the `` tag as...
hi @rido-min, it looks like the last run failed 2 weeks ago. Could you kindly rerun it? https://github.com/microsoft/BotFramework-Emulator/actions/workflows/pack.yaml
I have been taking a look at this and am wondering if we could create a new RPC_CMD_LOAD_TENSOR, which passes in the model path/name (or a hash of the model) and tensor/weights...
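To make the idea a bit more concrete, here is a minimal sketch of how such a request might be encoded, assuming the model is identified by a SHA-256 digest and the tensor by its name. Only the name RPC_CMD_LOAD_TENSOR comes from the comment above; the command id, the wire layout, and the helper names are assumptions for illustration and do not reflect any existing RPC protocol.

```python
import hashlib
import struct

# Hypothetical command id; not taken from any real header.
RPC_CMD_LOAD_TENSOR = 0x7F

# Assumed layout: 1-byte command, 32-byte model digest, 2-byte name length.
HEADER_FMT = "<B32sH"
HEADER_SIZE = struct.calcsize(HEADER_FMT)


def model_digest(model_bytes: bytes) -> bytes:
    """Identify a model by the SHA-256 digest of its contents."""
    return hashlib.sha256(model_bytes).digest()


def pack_load_tensor(digest: bytes, tensor_name: str) -> bytes:
    """Build a request asking the server to load a tensor from its local copy."""
    name = tensor_name.encode("utf-8")
    return struct.pack(HEADER_FMT, RPC_CMD_LOAD_TENSOR, digest, len(name)) + name


def unpack_load_tensor(msg: bytes) -> tuple[int, bytes, str]:
    """Decode a request produced by pack_load_tensor."""
    cmd, digest, name_len = struct.unpack_from(HEADER_FMT, msg, 0)
    name = msg[HEADER_SIZE:HEADER_SIZE + name_len].decode("utf-8")
    return cmd, digest, name


digest = model_digest(b"fake model file contents")
request = pack_load_tensor(digest, "blk.0.attn_q.weight")
print(len(request), unpack_load_tensor(request)[2])
```

The point of the sketch is that the request stays a few dozen bytes regardless of tensor size, since the weights themselves never travel over the wire.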
I think this would be a very interesting feature. There's a nifty little project called fastmcp (https://github.com/jlowin/fastmcp). Think FastAPI, but it allows your LLM to access external functionality, e.g. you...
so I've actually written my own CLI MCP tool using typer and Python, and it's open sourced here: https://github.com/lingster/mcp-llm. It only supports Claude for now, but PRs or suggestions for improvements...
looking at ktransformers, it seems like they have figured out which layers to load to GPU for improved performance: https://github.com/kvcache-ai/ktransformers/blob/main/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat.yaml. Looks like this PR will allow you to offload specific...
@matteobruni, yes, I agree it should be a toggle or a flag that can be set. I'll see if I can get some time over the next couple of...
Hi, just wondering if there are any updates for this PR and if it will be merged into master? I was about to submit a similar PR to solve this...