candle

Minimalist ML framework for Rust

Results: 407 candle issues, sorted by recently updated

Adds `best_device` and `metal_if_available` to `Device`. `Device::best_device` has the same functionality as `candle_examples::device`, and this PR changes `candle_examples::device` to use `best_device`. `metal_if_available` has been added for parity with `cuda_if_available`.
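A minimal sketch of what the selection could look like, assuming candle's existing `cuda_is_available`/`metal_is_available` utilities (the PR's actual implementation may differ):

```rust
use candle_core::{Device, Result};

// Sketch: prefer CUDA, then Metal, falling back to CPU.
fn best_device(ordinal: usize) -> Result<Device> {
    if candle_core::utils::cuda_is_available() {
        Device::new_cuda(ordinal)
    } else if candle_core::utils::metal_is_available() {
        Device::new_metal(ordinal)
    } else {
        Ok(Device::Cpu)
    }
}
```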

A while ago there was a [release from segmind](https://blog.segmind.com/introducing-sd-small-and-sd-tiny-stable-diffusion-models/) of two new stable diffusion models which are way smaller and faster to run. I think this would be a great...

Reasoning:

1) We use lots of elementwise operations: [masked_fill in every layer](https://github.com/huggingface/candle/blob/2be9bd211e34333b605695242896903231ab26da/candle-transformers/src/models/llama.rs#L328-L341), and [elementwise addition and division](https://github.com/huggingface/candle/blob/main/candle-transformers/src/models/mistral.rs#L275-L283) in our attention implementations.
2) GEMM APIs like cuBLAS's [gemm](https://docs.nvidia.com/cuda/cublas/#cublas-level-3-function-reference) provide alpha and beta...
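For context, cuBLAS's `gemm` computes `C ← α·op(A)·op(B) + β·C`, so a scale (and an accumulation into an existing buffer) can be folded into the matmul itself rather than issued as separate elementwise kernels. A sketch of the unfused pattern as it looks in the attention code today (names here are illustrative, not candle's API):

```rust
use candle_core::{Result, Tensor};

// Unfused: the matmul runs first, then a separate elementwise
// division kernel applies the 1/sqrt(head_dim) scale.
fn attn_scores(q: &Tensor, k: &Tensor, head_dim: usize) -> Result<Tensor> {
    let scores = (q.matmul(&k.t()?)? / (head_dim as f64).sqrt())?;
    // With alpha exposed on the GEMM call, the scale could instead be
    // passed as alpha = 1.0 / sqrt(head_dim), saving one kernel launch.
    Ok(scores)
}
```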

This unifies the `masked_fill` implementations under `Tensor`. Addresses #2370.
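The helper that is currently duplicated across models looks roughly like this (adapted from the llama implementation linked above; the unified version would hang off `Tensor` instead):

```rust
use candle_core::{Result, Tensor};

// Where `mask` is non-zero, take `on_true`; elsewhere keep `on_false`.
fn masked_fill(on_false: &Tensor, mask: &Tensor, on_true: f32) -> Result<Tensor> {
    let shape = mask.shape();
    let on_true = Tensor::new(on_true, on_false.device())?.broadcast_as(shape.dims())?;
    mask.where_cond(&on_true, on_false)
}
```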

Commit fea46cb7 breaks image generation for the Metal pipeline. The last working commit is 8696cf64:

```
git checkout 8696cf64947a7f3b712297426078dcf6ab0d199e
Previous HEAD position was fea46cb7 Metal bgemm min changes (#2364)
HEAD is...
```

These are a few utility functions that are often useful. Neither implementation requires any CPU-side operations. I plan to follow up this PR with one for bitwise...

This is my test code; the version is 0.6.0:

```rust
fn sam() {
    let result: Result = (|| {
        let directory = "/home/foliage/model/candle-sam".to_string();
        let device = Device::new_cuda(0)?;
        let mode = "ST".to_string();
        // ...
```

This PR improves compatibility with older GPUs whose compute capability is below 6.1 (i.e. `__CUDA_ARCH__ < 610`). Refs #2348.

The equivalent of [torch.Tensor.masked_fill_](https://pytorch.org/docs/stable/generated/torch.Tensor.masked_fill_.html#torch.Tensor.masked_fill_).
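In PyTorch this call mutates the tensor in place; since candle tensors are immutable, the natural counterpart returns a new tensor. A hypothetical call shape (the method name and signature are assumptions, not the merged API):

```rust
use candle_core::{Result, Tensor};

// Hypothetical usage of a Tensor::masked_fill method:
// fill masked positions with -inf, e.g. before a softmax.
fn mask_scores(scores: &Tensor, mask: &Tensor) -> Result<Tensor> {
    scores.masked_fill(mask, f32::NEG_INFINITY)
}
```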

I implemented the [tiny NeRF example](https://github.com/bmild/nerf/blob/master/tiny_nerf.ipynb) using `candle` here: https://github.com/laptou/nerfy/blob/fc50dbd61c4012d1f12f556a72474b59a8b3c158/examples/tiny_nerf.rs The original example, written in TensorFlow, runs fine on my laptop, but my `candle` implementation consumes all available memory on...