Vedant Roy
If I hit an sglang server in parallel with 100 requests, will it automatically batch the requests to do as many in parallel as possible?
Did you make it by hand?
When running the model, especially in a serverless environment where there may be many cold starts, it would be desirable to cache the auto-tuning results. Is this possible?
I am working on integrating this repository with ComfyUI, so this project can get support for ControlNet, LoRA, etc. all for free from ComfyUI's infrastructure. The main issue is, I...
When running the chat Jupyter notebook, I get the above error. I'm using `transformers==4.35.2`, is that too recent?
Hi, I was wondering if the storage space specified in the README was for the clips, or for all the videos? I've downloaded 67M/70M clips (discarded the videos), but according...
I wrote a downloader using youtube-dlp, but a lot of the IPs get blocked after roughly 10K downloads. I'm surprised people are successfully downloading the dataset using the...
Is there a recommended way to use nsys / nsight? I know there's a profiling hook for using the PyTorch profiler, but I'm wondering how to use nsys instead. Can...
Exactly as the question sounds. Is there an async version of touch?
I've noticed that when using PyTorch's custom autograd functions, the stride of `dO` can sometimes be `(0, 0, 0, 0)`. Here's a very simple example: https://discuss.pytorch.org/t/getting-unusual-strides-when-using-pytorchs-autograd/208093. In my custom wrapper...