Evan Almloff
It looks like https://github.com/huggingface/candle/pull/1370 might solve this issue for the quantized version of llama. You could clear the cache after every request and then keep generating. A different approach is...
> @danielclough thanks! @ealmloff thanks! Kalosm looks great, I'll try to use it directly. Looks like you use both llm-rs and candle. What's your impression? llm-rs is faster, but it...
> @ealmloff just coming back to say - kalosm is REALLY REALLY great! just integrated into a service flawless. I didn't have a tokio runtime crash due to tokio shutdown...
We still use `postcard`. Postcard doesn't support all of serde's features.
`cbor` would be better for binary size because it is already pulled in for fullstack.
Thank you for letting me know about this bug! I believe this was fixed in [sledgehammer-bindgen](https://github.com/Demonthos/sledgehammer_bindgen) which is a generalized version of this library for any set of instructions. I'm...
This makes a difference for large strings, makes no difference for small strings, and adds some overhead on cache misses. See: https://github.com/Demonthos/sledgehammer_test. It uses a SIMD-accelerated non-cryptographic hash.
I would like to implement this either for &'static str only so that we can just use the pointer as a hash or only implement this for attribute and element...
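To make the "pointer as a hash" idea concrete, here is a minimal std-only sketch. It is not the code from the library under discussion: `StaticStrCache`, `get_or_insert`, and the `u32` id are hypothetical names chosen for illustration, and the id stands in for whatever the real cache would store.

```rust
use std::collections::HashMap;

/// Hypothetical cache keyed by the address and length of a `&'static str`.
/// The `u32` id is a placeholder for whatever the real cache would hand out.
pub struct StaticStrCache {
    ids: HashMap<(usize, usize), u32>,
    next_id: u32,
}

impl StaticStrCache {
    pub fn new() -> Self {
        Self {
            ids: HashMap::new(),
            next_id: 0,
        }
    }

    /// Returns `(id, was_cached)`.
    /// The string's bytes are never read: the key is only pointer + length,
    /// so the lookup cost does not grow with the length of the string.
    pub fn get_or_insert(&mut self, s: &'static str) -> (u32, bool) {
        let key = (s.as_ptr() as usize, s.len());
        if let Some(&id) = self.ids.get(&key) {
            return (id, true);
        }
        // Cache miss: hand out a fresh id and remember the address.
        let id = self.next_id;
        self.next_id += 1;
        self.ids.insert(key, id);
        (id, false)
    }
}
```

The `'static` bound is what makes this sound: the address stays valid and is never reused for different contents. The trade-off is that two equal strings stored at different addresses get separate entries, so they are treated as misses rather than deduplicated.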
I have some concerns about the binary size impact of typed HTML in WASM. One of the largest pieces of dioxus-web today is just converting a typed enum key in...
For enums, we could actually expand the values to const associated generics with &'static str values which wouldn't have a binary size impact. In: ```rust input { r#type: Type::Email, }...