Tianle Cai comments

Results: 8 comments of Tianle Cai

[New feature] llama.cpp support

> It looks like a lot of the groundwork is being laid out here with the parallel decoding implementation: [ggerganov/llama.cpp#3228](https://github.com/ggerganov/llama.cpp/pull/3228)

Yeah, that's also what I thought. The tree attention implementation...

[New feature] llama.cpp support

> I would like to help with finalizing the support for this. Is there any place where I can contact the group behind this project and ask questions?

Hi @kalomaze...

2d filters

Hi Alex, Thanks for your interest! It is a good idea to extend the SGConv to 2d filters, but we didn't try that because we focused on long sequence modeling...

2d filters

@Tylersuard Hi Tyler, we just pushed standalone SGConv code, so you can give it a try now! We tried running it on sequences with 1M tokens and model dimension 256,...

The question about gconv.py

Hi Jiayu, Thanks for your interest! For your questions, first, d_state was used in the original S4 code, and we haven't fully cleaned it up, so please ignore it. Second, the...

Can't get it to run with multi-GPU

Could you please try updating your PyTorch version? This may be related to the incompatibility between nn.DataParallel and nn.ParameterList (e.g., https://github.com/pytorch/pytorch/issues/36035). Also, please use x = torch.randn(4, 256,...

more examples

Thanks for your interest! The basic function of determining which existing tool to use and invoking ToolMaker to make new tools when needed is implemented by the Dispatcher (Please see...

Seeking details of the final SGConv model used for LRA results

Hi Madhur, sorry for the delay, I've been traveling a lot recently... As for your question, we just added a preview branch containing the whole repo (it hasn't been completely double-checked, and...