
Implement `torch.bucketize`

Open EricLBuehler opened this issue 1 year ago • 2 comments

Hello all,

We are implementing the Idefics 2 model in mistral.rs, but the HF Transformers implementation relies on `torch.bucketize` as a critical step.

Is it possible to implement this using the Candle functions? Thank you in advance for any help!

EricLBuehler, May 14 '24 22:05

@LaurentMazare, is there a way to do this?

EricLBuehler, May 16 '24 18:05

I have implemented it here:


use candle_core::{Device, Result, Tensor};
use rayon::prelude::*;

/// torch.bucketize with right=True: each x maps to the number of boundaries <= x,
/// i.e. the index i such that boundaries[i - 1] <= x < boundaries[i].
/// `boundaries` must be sorted in ascending order.
/// Returns a 1d tensor of shape (xs.len(),) on the CPU; `device` is currently unused.
fn bucketize_right(xs: &[f64], boundaries: &[f64], device: &Device) -> Result<Tensor> {
    let accum = xs
        .par_iter()
        // Binary search for the first boundary strictly greater than x.
        .map(|x| boundaries.partition_point(|b| b <= x) as u32)
        .collect::<Vec<_>>();
    Tensor::from_vec(accum, (xs.len(),), &Device::Cpu)
}
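
For reference, the index lookup itself does not need Candle. Below is a minimal std-only sketch of the semantics described in the PyTorch documentation for both values of `right`, assuming `boundaries` is sorted in ascending order and contains no NaN. The helper name `bucketize` and its signature are hypothetical, not an existing Candle API.

```rust
/// Hypothetical helper (not part of Candle): returns the bucket index of each value in `xs`.
/// Assumes `boundaries` is sorted ascending and contains no NaN.
fn bucketize(xs: &[f64], boundaries: &[f64], right: bool) -> Vec<u32> {
    xs.iter()
        .map(|&x| {
            let idx = if right {
                // right = true: boundaries[i - 1] <= x < boundaries[i],
                // so count the boundaries that are <= x.
                boundaries.partition_point(|&b| b <= x)
            } else {
                // right = false: boundaries[i - 1] < x <= boundaries[i],
                // so count the boundaries that are strictly < x.
                boundaries.partition_point(|&b| b < x)
            };
            idx as u32
        })
        .collect()
}
```

With boundaries `[1.0, 3.0, 5.0]` and inputs `[0.0, 1.0, 4.0, 5.0, 6.0]`, this yields `[0, 1, 2, 3, 3]` for `right = true` and `[0, 0, 2, 2, 3]` for `right = false`. Values below the first boundary map to 0 and values above the last boundary map to `boundaries.len()`.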

I would be willing to write a CUDA/Metal kernel implementation for this. Is this something which you would like to see added?

EricLBuehler, May 20 '24 21:05