generic-username0718
I'm running llama 65b on dual 3090s, and at longer contexts I'm noticing seriously long context load times (the delay between sending a prompt and tokens actually being received/streamed). It...
### Describe the bug

Load LoRA on desktop. LoRA says None on phone. Try changing LoRA to alpaca on phone. Reloads llama completely? Still says LoRA = None on phone......
### Describe the bug

Generation attempts clear the chat response.

### Is there an existing issue for this?

- [X] I have searched the existing issues

### Reproduction

python3 server.py...
New Model out. Any chance it'll be supported by you guys?