candle
Minimalist ML framework for Rust
Adds the aforementioned methods to `Device`. `Device::best_device` provides the same functionality as `candle_examples::device`, and this PR changes `candle_examples::device` to use `best_device`. `metal_if_available` has been added for parity with `cuda_if_available`.
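A minimal, self-contained sketch of the fallback order such a helper would implement. The stand-in `Device` enum and the boolean availability flags here are hypothetical; the real helper would use candle's own `Device` type and its CUDA/Metal availability checks:

```rust
// Hypothetical stand-in for candle's Device, used only to illustrate
// the "best device" selection order: CUDA, then Metal, then CPU.
#[derive(Debug, PartialEq)]
enum Device {
    Cuda(usize),
    Metal(usize),
    Cpu,
}

// Prefer CUDA if available, fall back to Metal, and finally to CPU.
// In the real API the availability flags would come from runtime checks,
// not function arguments.
fn best_device(ordinal: usize, cuda_ok: bool, metal_ok: bool) -> Device {
    if cuda_ok {
        Device::Cuda(ordinal)
    } else if metal_ok {
        Device::Metal(ordinal)
    } else {
        Device::Cpu
    }
}

fn main() {
    // On a machine with neither accelerator, we end up on the CPU.
    println!("{:?}", best_device(0, false, false));
}
```

`cuda_if_available` and `metal_if_available` would each be the two-way special case of this (one accelerator or CPU).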
A while ago there was a [release from segmind](https://blog.segmind.com/introducing-sd-small-and-sd-tiny-stable-diffusion-models/) of two new stable diffusion models which are way smaller and faster to run. I think this would be a great...
Reasoning: 1) We use lots of elementwise operations: [masked_fill in every layer](https://github.com/huggingface/candle/blob/2be9bd211e34333b605695242896903231ab26da/candle-transformers/src/models/llama.rs#L328-L341), [elementwise addition and division](https://github.com/huggingface/candle/blob/main/candle-transformers/src/models/mistral.rs#L275-L283) in our attention implementations. 2) GEMM APIs like cuBLAS's [gemm](https://docs.nvidia.com/cuda/cublas/#cublas-level-3-function-reference) provide alpha and beta...
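The alpha/beta parameters mentioned above fold a scale and an accumulation into the matmul itself, i.e. `C = alpha * A·B + beta * C`, so an elementwise scale or residual add can ride along with the GEMM instead of launching a separate kernel. A naive, self-contained sketch of those semantics (row-major, single-threaded; not candle's or cuBLAS's actual kernel):

```rust
// Naive gemm with cuBLAS-style alpha/beta semantics:
// c[i][j] = alpha * sum_p(a[i][p] * b[p][j]) + beta * c[i][j]
// a is m x k, b is k x n, c is m x n, all row-major flat slices.
fn gemm(alpha: f32, a: &[f32], b: &[f32], beta: f32, c: &mut [f32], m: usize, k: usize, n: usize) {
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j];
            }
            // beta != 0 accumulates into the existing C, fusing an add.
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}

fn main() {
    let a = [1.0, 2.0, 3.0, 4.0]; // 2x2
    let b = [1.0, 0.0, 0.0, 1.0]; // 2x2 identity
    let mut c = [1.0, 1.0, 1.0, 1.0];
    // With alpha = 2 and beta = 1 this computes 2*A + C in one call.
    gemm(2.0, &a, &b, 1.0, &mut c, 2, 2, 2);
    println!("{:?}", c);
}
```

With `beta = 0.0` this degenerates to a plain (scaled) matmul, which is how most frameworks call it by default.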
This unifies the `masked_fill` implementations under Tensor. Addresses #2370 .
Commit fea46cb7 breaks image generation for the Metal pipeline; commit 8696cf64 still works.
```
git checkout 8696cf64947a7f3b712297426078dcf6ab0d199e
Previous HEAD position was fea46cb7 Metal bgemm min changes (#2364)
HEAD is...
```
These are a few utility functions which are often useful. Both implementations do not require operations on the CPU. I plan on following up this PR with one for bitwise...
This is my test code (candle version 0.6.0):
```rust
fn sam() {
    let result: Result = (|| {
        let directory = "/home/foliage/model/candle-sam".to_string();
        let device = Device::new_cuda(0)?;
        let mode = "ST".to_string();
        ...
```
This PR improves compat for older GPUs where the CC is less than 610. Refs #2348.
The equivalent to [torch.Tensor.masked_fill_](https://pytorch.org/docs/stable/generated/torch.Tensor.masked_fill_.html#torch.Tensor.masked_fill_).
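A minimal sketch of the `masked_fill` semantics on a flat slice: wherever the mask is set, the element is replaced by `value`. The Tensor version applies the same rule elementwise (typically via a `where_cond`-style select rather than in-place mutation):

```rust
// masked_fill on a flat slice: data[i] = value where mask[i] is true.
// The slice-and-bool-mask signature here is for illustration only;
// the Tensor API takes a mask tensor and broadcasts.
fn masked_fill(data: &mut [f32], mask: &[bool], value: f32) {
    assert_eq!(data.len(), mask.len());
    for (x, &m) in data.iter_mut().zip(mask.iter()) {
        if m {
            *x = value;
        }
    }
}

fn main() {
    let mut scores = [1.0f32, 2.0, 3.0];
    // Typical attention use: mask out disallowed positions with -inf
    // before a softmax.
    masked_fill(&mut scores, &[true, false, true], f32::NEG_INFINITY);
    println!("{:?}", scores);
}
```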
I implemented the [tiny NeRF example](https://github.com/bmild/nerf/blob/master/tiny_nerf.ipynb) using `candle` here: https://github.com/laptou/nerfy/blob/fc50dbd61c4012d1f12f556a72474b59a8b3c158/examples/tiny_nerf.rs The original example, written in TensorFlow, runs fine on my laptop, but my `candle` implementation consumes all available memory on...