Sequoia

A scalable and robust tree-based speculative decoding algorithm.

Results 10 Sequoia issues

Hi, if I understand the `tree_search` algorithm correctly, the dynamic programming process should be able to find the optimal number of generated tokens for a given acceptance-rate vector. Also, given the...
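To make the question above concrete, here is a minimal sketch of how an acceptance-rate vector can be turned into an expected token count for a tree of a given size. It is not the repository's `tree_search` and it is not a dynamic program: it swaps in a greedy best-first expansion under a simplified model, assumed here for illustration, in which `accept[k]` is the probability that the child of rank `k` is accepted given that its parent was accepted, and the vector is sorted in non-increasing order. The function name and the model are hypothetical.

```python
import heapq
from typing import List


def expected_accepted_tokens(accept: List[float], budget: int) -> float:
    """Greedy estimate of expected accepted speculative tokens.

    Assumptions (illustrative, not taken from the Sequoia code):
    - accept[k] is the chance the rank-k child is accepted, given its parent was.
    - accept is non-increasing, so picking the highest-scoring candidates
      always yields a valid tree (parents and earlier siblings come first).
    - A node's score is the product of accept values along its path from the
      root; the tree's value is the sum of the scores of its nodes.
    """
    if budget <= 0 or not accept:
        return 0.0

    # Heap entries: (-score, parent_score, rank). Start with the root's first child.
    heap = [(-accept[0], 1.0, 0)]
    total = 0.0
    for _ in range(budget):
        if not heap:
            break
        neg_score, parent_score, rank = heapq.heappop(heap)
        score = -neg_score
        if score <= 0.0:
            break  # remaining candidates cannot add any value
        total += score
        # Candidate 1: the first child of the node just added.
        heapq.heappush(heap, (-(score * accept[0]), score, 0))
        # Candidate 2: the next sibling of the node just added.
        if rank + 1 < len(accept):
            heapq.heappush(
                heap, (-(parent_score * accept[rank + 1]), parent_score, rank + 1)
            )
    return total


if __name__ == "__main__":
    # Three speculated nodes with a two-entry acceptance vector.
    print(expected_accepted_tokens([0.5, 0.25], 3))
```

Under this model the returned value counts only speculated tokens; the token the target model always produces per step would be added on top. A depth limit or a hardware-aware cost term, if the real algorithm uses one, is not modeled here.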

Hi, I was trying to reproduce the numbers in the paper, but with the `demo-config.json`, plus the acceptance vector in the repo or the acceptance vector I tested myself, the...

Sorry for asking a possibly obvious question, but it would be better if the documentation made this clear.

Hi, I remember that vLLM support was on your TODO list. Has it been implemented yet? Was the main challenge in this direction that the batch size > 1 tree...

The current code is not compatible with transformers 4.39+ because the rotary functions changed. Fix: copy these functions from transformers==4.37.2.
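One way to apply a fix like this without breaking older installs is to choose between the library's rotary helpers and locally copied ones based on the installed version. The sketch below shows only the version check; the helper names and the idea of switching at import time are assumptions for illustration, and the 4.39 threshold is taken from the report above rather than verified independently.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Tuple


def version_tuple(text: str) -> Tuple[int, int, int]:
    """Parse the leading 'major.minor.patch' numbers; missing parts become 0."""
    numbers = []
    for part in text.split(".")[:3]:
        match = re.match(r"\d+", part)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return (numbers[0], numbers[1], numbers[2])


def needs_vendored_rotary(installed: str, first_incompatible: str = "4.39.0") -> bool:
    """True when the installed version is at or above the reported breaking release."""
    return version_tuple(installed) >= version_tuple(first_incompatible)


def installed_transformers_needs_vendored_rotary() -> bool:
    """Check the transformers package in the current environment, if any."""
    try:
        return needs_vendored_rotary(version("transformers"))
    except PackageNotFoundError:
        return False  # nothing installed, so nothing to work around


if __name__ == "__main__":
    print(needs_vendored_rotary("4.37.2"))
    print(needs_vendored_rotary("4.39.0"))
```

A model file could call `installed_transformers_needs_vendored_rotary()` once at import time and bind either the copied functions or the library's own, keeping the rest of the code unchanged.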

Fix the loading functions to save loading time and space. Only the first files in each dataset are needed. Addresses #4.

The dataset loading code is taking too long. It downloads whole huge datasets (70 GB for wiki, etc.) to use just a handful of examples. Setting `split="train[0:2000]"` is not helping since slicing...
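A possible workaround for the problem described above, assuming the loaders use the Hugging Face `datasets` library (the `split="train[0:2000]"` syntax suggests so), is streaming mode, where records are fetched lazily as they are iterated. The sketch below is not the repository's loader; `first_examples` and `load_head` are hypothetical helper names, and the dataset path is whatever the caller passes in.

```python
from itertools import islice
from typing import Any, Iterable, List


def first_examples(stream: Iterable[Any], n: int) -> List[Any]:
    """Materialize only the first n items of a (possibly huge) iterable."""
    return list(islice(stream, n))


def load_head(path: str, n: int = 2000, split: str = "train", **kwargs: Any) -> List[Any]:
    """Return the first n records of a dataset without a full download.

    Assumes the Hugging Face `datasets` package is installed. With
    streaming=True, load_dataset returns an iterable dataset whose
    records are read on demand instead of being downloaded up front.
    """
    # Imported lazily so first_examples stays usable without `datasets`.
    from datasets import load_dataset

    stream = load_dataset(path, split=split, streaming=True, **kwargs)
    return first_examples(stream, n)


if __name__ == "__main__":
    # Local demonstration with a plain generator; no network access needed.
    print(first_examples(({"id": i} for i in range(10**9)), 3))
```

Extra keyword arguments such as a configuration name are forwarded to `load_dataset` unchanged, so a call site would only need to swap its existing loader call for `load_head`.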

Hey @dreaming-panda, this looks really interesting. I wondered whether you would be interested in showing an integration with Lit-GPT: https://github.com/Lightning-AI/litgpt Best, T.C

Hi Sequoia team, can this framework run on CPU devices? If so, how can we do it? Any insights? Regards