Swapnil Parekh

Results: 18 comments by Swapnil Parekh

+1 to this issue. Even using the DeepSpeech 1 LM binary causes massive RAM use.

Hi, I would like to contribute as a first-timer. Could you kindly give me some background on how to solve it?

Hi, I would like to work on this as a first-timer. Can you kindly brief me on it?

Hey @yzh119, any update on sm_75 support for the Punica LoRA kernels?

Hey @Yard1, I have addressed your comments on this soft prompt tuning PR. Some updates: a new `adapter_commons` folder with all the common code between `LoRA` and `Prompt Adapters` abstracted...

Hi @Yard1, just a friendly reminder to review this PR when you get a chance, thanks! Once this design is approved, I'm happy to update it with support for prefix tuning...

@Yard1 no worries, thank you! Yes, it should work; you can provide both `PromptAdapterRequest` and `LoRARequest` parameters. I just tested a tiny example of this, happy to add a test...

@g-eoj thanks for the OpenAPI PR! I will look at it once I finish the refactor requested by Antoni.

@Rallio67 thanks for testing the PR out. I have tested it with bloom, llama-2, and gptbigcode, and it seemed to work for me. Can you please share the llama-3 adapter...

Hey @Rallio67, the prompt works, but there is a pending change that would cast the prompt to the model's dtype so it works out of the box. Currently the...