drxmy
> I think Tim is working on the 4-bit inference kernel, which hopefully will be available in the coming weeks During inference, will the model also convert between fp16 and...
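As background for the fp16 question above, here is a rough, self-contained sketch of what a 4-bit weight path typically involves (this is an assumed illustration, not Tim's actual kernel): weights are stored as 4-bit codes plus a per-block fp16 scale, and are dequantized back to floating point on the fly before the matmul.

```python
# Hypothetical 4-bit block quantization sketch; names and scheme are
# illustrative, not the bitsandbytes implementation.

def quantize_4bit(block, levels=15):
    """Map a block of floats onto 4-bit codes (0..15) with one shared scale."""
    scale = max(abs(x) for x in block) / (levels / 2) or 1.0
    codes = [min(levels, max(0, round(x / scale + levels / 2))) for x in block]
    return codes, scale

def dequantize_4bit(codes, scale, levels=15):
    """Recover approximate float values from the 4-bit codes."""
    return [(c - levels / 2) * scale for c in codes]

block = [0.5, -1.0, 0.25, 1.0]
codes, scale = quantize_4bit(block)
restored = dequantize_4bit(codes, scale)
```

At inference time the dequantized values would feed an fp16 matmul, so the 4-bit form is a storage format rather than a compute format.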
For example, my model can do many different tasks like rewriting, GEC, or some NLU. In the simplest case, these tasks are solved by different prompts which are not visible...
I will give it a try first. Thank you!
LLaMA please
> For LLAMA or other generative AI model needs, you may check out HippoML: https://blog.hippoml.com/large-language-model-inference-from-datacenter-to-edge-ed2f94da4a81 > > @drxmy @dhawalkp Thank you! I just joined the waitlist. Is this another open...
I used AdamW with transformers' Trainer class (Hugging Face). It printed a trainable parameter count. The number was much smaller with LoRA.
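The smaller count is expected: with LoRA only the low-rank adapter matrices are trainable while the base weights stay frozen. A minimal sketch of the arithmetic (illustrative helper, not the transformers/peft API):

```python
# Hypothetical helper showing why the Trainer reports far fewer trainable
# parameters under LoRA: for a frozen d_out x d_in linear layer, only the
# adapters A (rank x d_in) and B (d_out x rank) are trained.

def lora_param_counts(d_in, d_out, rank):
    """Return (trainable, total) parameter counts for one LoRA-adapted layer."""
    base = d_out * d_in                      # frozen base weight
    adapters = rank * d_in + d_out * rank    # trainable LoRA factors
    return adapters, base + adapters

trainable, total = lora_param_counts(4096, 4096, 8)
print(f"trainable: {trainable} / total: {total} ({100 * trainable / total:.2f}%)")
```

For a 4096x4096 layer at rank 8, well under 1% of the parameters are trainable, which matches the much smaller number printed by the Trainer.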
Did you figure it out? I also only see a causal mask for training. Inference has padding, but the attention mask computed by get_ltor_masks_and_position_ids does not consider padding.
> > Did you figure it out? I also only see a causal mask for training. Inference has padding, but the attention mask computed by get_ltor_masks_and_position_ids does not consider padding. >...
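To make the masking concern above concrete, here is a small sketch (assumed semantics, not the get_ltor_masks_and_position_ids implementation) of combining a causal mask with a padding mask, so that padded positions cannot be attended to:

```python
# Illustrative mask construction: mask[i][j] is True when query position i
# may attend to key position j. A purely causal mask still lets real tokens
# attend to padded positions; combining it with a padding mask fixes that.

def causal_mask(seq_len):
    """Lower-triangular mask: each position sees itself and earlier ones."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def combined_mask(seq_len, pad_positions):
    """Causal mask with padded key positions masked out entirely."""
    causal = causal_mask(seq_len)
    return [[causal[i][j] and j not in pad_positions
             for j in range(seq_len)] for i in range(seq_len)]

# Example: length-4 sequence where position 3 is padding.
m = combined_mask(4, pad_positions={3})
```

With the purely causal mask alone, position 3 would be a visible key for itself, which is the gap being discussed for padded inference batches.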