dblacknc
I am using Warp in K8s mode with 22 clients. A mix test with ~250M 4 KB objects in total, or about 11.3M per client, appeared to finish the prepare phase normally...
The vicuna-13b-int4 model is running very well on my RTX 3060. Out of curiosity I added --cpu to try running there for a performance comparison. On the first prompt, a...
### Describe the bug I expected that adding --verbose to the server.py args would print prompts and responses after they are issued, preferably without duplication, but I get the...
### Describe the bug Git pull as of this morning. When I change interface options in the UI and click restart, the web browser seems to refresh immediately...
### Describe the bug This is related to #1636 - trying to work around VRAM usage on my 12 GB RTX 3060 with the 4-bit model, trying --gpu-memory 7...
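For context, a minimal sketch of the kind of invocation this report describes. The model name is a placeholder; `--gpu-memory` is the server.py flag for capping GPU allocation in GiB, and whether it is honored for 4-bit models is exactly what the report questions:

```shell
# Hypothetical invocation: cap GPU allocation at ~7 GiB on a 12 GB card.
# Model name is a placeholder; this sketch only illustrates the flag usage
# described in the report, not a confirmed workaround.
python server.py --model vicuna-13b-int4 --wbits 4 --gpu-memory 7
```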
### Describe the bug I'm trying to use the llava extension with my 12 GB RTX 3060 card. It's working reasonably well, and I notice the VRAM usage idles at about...
### Describe the bug At times an OpenAssistant model will seemingly prompt and reply to itself after answering a basic question. It's definitely in Open Assistant mode and I see...
**Description** When using an RWKV model, the loading strategy must be given on the command line and is often model-size-specific. It can also be a relatively complex argument. Add...
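To illustrate why the argument is complex, a hedged sketch of a model-size-specific RWKV invocation. The model name and the strategy string are illustrative assumptions, not tuned recommendations; the strategy syntax (device, precision, per-layer split) follows the rwkv library's convention:

```shell
# Hypothetical example: load the first 10 layers as int8-quantized fp16 on
# the GPU, and the remaining layers as plain fp16 on the GPU. Values are
# illustrative; the right split depends on model size and available VRAM.
python server.py --model RWKV-4-Raven-7B --rwkv-strategy "cuda fp16i8 *10 -> cuda fp16"
```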