Tim Dettmers

Results: 106 comments by Tim Dettmers

Thank you for your kind words! I have not read the paper closely, but I skimmed it, and the main difference between "To Prune or not to Prune" and our...

This is a very interesting idea. I think from a computational perspective this would be a very promising direction — if successful, it would definitely help us to develop and...

One general problem with memory savings is that even though your weights are sparse, you still get dense outputs if you have relatively dense inputs. This will probably change in...
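A minimal NumPy sketch (illustrative only, not code from the thread) of why sparse weights alone do not give you sparse memory at the output: each output element is a dot product against a dense input, so it is almost never exactly zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Weight matrix with ~90% of entries zeroed out (sparse in value).
W = rng.standard_normal((256, 128))
W[rng.random(W.shape) < 0.9] = 0.0
x = rng.standard_normal((128, 32))  # relatively dense inputs

out = W @ x  # every output element mixes many dense input values
print(f"weight density: {np.count_nonzero(W) / W.size:.2f}")    # ~0.10
print(f"output density: {np.count_nonzero(out) / out.size:.2f}")  # ~1.00
```

Unless the activations themselves are sparsified (e.g. by ReLU thresholds or top-k selection), the memory for the outputs stays dense.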

If you use the GEMM formulation for convolution, you could reuse very similar code: just add an im2col and col2im...
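A rough sketch of the GEMM formulation (names and shapes are my own, not from the library): im2col unfolds each receptive field into a column, so the convolution itself collapses into a single matrix multiply; col2im would scatter gradients back in the backward pass.

```python
import numpy as np

def im2col(x, k):
    """Unfold a (C, H, W) input into a (C*k*k, out_h*out_w) matrix (stride 1, no padding)."""
    C, H, W = x.shape
    out_h, out_w = H - k + 1, W - k + 1
    cols = np.empty((C * k * k, out_h * out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each column is one flattened receptive field.
            cols[:, i * out_w + j] = x[:, i:i + k, j:j + k].ravel()
    return cols

def conv2d_gemm(x, weight):
    """Convolution as one GEMM: (F, C*k*k) @ (C*k*k, out_h*out_w)."""
    F, C, k, _ = weight.shape
    out_h, out_w = x.shape[1] - k + 1, x.shape[2] - k + 1
    cols = im2col(x, k)
    return (weight.reshape(F, -1) @ cols).reshape(F, out_h, out_w)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))        # C=3, H=W=8
w = rng.standard_normal((4, 3, 3, 3))     # F=4 filters, 3x3 kernels
out = conv2d_gemm(x, w)
print(out.shape)  # (4, 6, 6)
```

Because the heavy lifting is now a plain matrix multiply, a sparse GEMM kernel can be dropped in for the dense one with the im2col/col2im wrappers unchanged.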

cuSPARSE performance can be quite weak for certain matrix sizes and sparsity levels, and it can easily be slower than dense GEMM. Do you have some numbers for this, that is, speed/time taken...
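As an illustration of the kind of sweep I would want numbers from (SciPy on CPU here, purely as a sketch; the same size/sparsity sweep applies to cuSPARSE on GPU), with warm-up runs and a median over repetitions:

```python
import time
import numpy as np
from scipy import sparse

def bench(fn, warmup=2, reps=5):
    """Median wall-clock time of fn() over several repetitions, after warm-up."""
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(reps):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return sorted(times)[len(times) // 2]

rng = np.random.default_rng(0)
n = 512
x = rng.standard_normal((n, n))

for density in (0.01, 0.1, 0.3):
    W_dense = rng.standard_normal((n, n))
    W_dense[rng.random((n, n)) >= density] = 0.0   # keep ~density fraction
    W_sparse = sparse.csr_matrix(W_dense)
    t_dense = bench(lambda: W_dense @ x)
    t_sparse = bench(lambda: W_sparse @ x)
    print(f"density={density:.2f}  dense={t_dense * 1e3:.2f} ms  "
          f"sparse={t_sparse * 1e3:.2f} ms")
```

The crossover point where sparse beats dense depends heavily on density, matrix shape, and the backend, which is exactly why raw numbers matter.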

The example that you posted has a couple of problems. For benchmarking, you need to synchronize CUDA streams to get precise timings, since kernel executions are asynchronous. You also need to...
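A hedged sketch of the timing pattern this refers to (my own helper, assuming PyTorch; sizes are arbitrary): kernel launches return before the kernel finishes, so you either record CUDA events or synchronize before reading the clock. The snippet falls back to a plain CPU timer when no GPU is present.

```python
import time
import torch

def timed(fn, warmup=3, reps=10):
    """Average milliseconds per call of fn(), with warm-up and proper sync."""
    for _ in range(warmup):
        fn()  # warm-up: JIT/caching effects should not be measured
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # drain pending async kernels first
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        for _ in range(reps):
            fn()
        end.record()
        torch.cuda.synchronize()  # wait until 'end' has actually been reached
        return start.elapsed_time(end) / reps
    t0 = time.perf_counter()
    for _ in range(reps):
        fn()
    return (time.perf_counter() - t0) * 1e3 / reps

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1024, 1024, device=device)
print(f"{timed(lambda: x @ x):.3f} ms per matmul")
```

Without the synchronization, the timer only measures the kernel launch overhead, which makes any kernel look implausibly fast.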

I believe this is an issue with printing the wrong epoch number. From the code, it seems it should work right, but the logged epoch number [starts again from 1](https://github.com/TimDettmers/sparse_learning/blob/master/mnist_cifar/main.py#L264).

I never tested it for LSTMs. The library itself should work for LSTMs out of the box. However, I am not sure if there might be problems with inducing too...

I actually tried this in experiments and it did not help. We would be happy to bring this back; the implementation is not that complicated. Right now we are a bit...

This is an awesome project! Thank you for this. @Ying1123 I am interested in using SGLang for multi-LoRA deployments for a project. The alternative is currently vLLM, but I like...