candle issues

does candle support nvidia 2080ti on windows 11?

2

I'm trying to use a 2080 Ti for inference, but it raises an error: `Error: DriverError(CUDA_ERROR_INVALID_PTX, "a PTX JIT compilation failed") when loading is_u32_f32`

kulame

BERT Safetensors variable mismatch

2

Hi, I was running the BERT example code and noticed that some of the variables didn't align correctly with the current Safetensors obtained via: `let repo: ApiRepo = api.model("bert-base-uncased".to_string());`...

Christof23

inefficient implementation of gelu for fp16

4

I'm running the dinov2 example on CPU on a Cortex-A76 computer, except I've quantised it to fp16. Looking at its perf profile, a large subset is due to running scalar...

j-baker

Recent revision for contiguous check has problems

14

The following code for the contiguous check in **shape.rs** will cause problems for a squeezed tensor (n-dim to 1-dim) because of the `if dim > 1` condition (recently added...

guoqingbao

vision dataset support load from img folder

This feature adds support for loading a vision dataset from an image folder, like `torchvision.datasets.ImageFolder`. In my projects, I need to load a dataset from an image folder to train my model, and I found candle...

wenhaozhao

Support for Nvidia unified memory?

1

All, I saw this morning that Tim Dettmers' bitsandbytes Python library uses Nvidia's [Unified Memory](https://developer.nvidia.com/blog/unified-memory-cuda-beginners/) by [default](https://x.com/stasbekman/status/1749968490155696612) (see `csrc/pythonInterface.c:377`). It doesn't look like candle, via cudarc, supports this. I'm interested...

jac-cbi

1.58 bit implementation

5

Would it be possible to implement 1.58-bit quantization in candle? It was proposed in the following paper (https://arxiv.org/pdf/2402.17764.pdf). The main inspiration behind using a 1.58-bit implementation is that you...

okpatil4u

Slow generation compared to transformers + PyTorch

13

I'm running the Llama example on a machine with an Nvidia T4 16GB to compare the performance with HF Transformers + PyTorch. Here's the Python example I'm running: ```python import...

hugoabonizio

Model reuse in TextGeneration examples

10

Hi, I'd like to rig one of the examples into an HTTP service that receives a prompt and runs `TextGeneration`. As it stands, `TextGeneration` wants to _own_ model...

jondot

best way to get `dims()` from a model?

For example, to get a particular embedding model's dimensions, *without* doing a test embedding first. At the moment I'm running just a dummy embedding to get to an output and...

jondot
