LaniakeaS
**Describe the bug** I have 8 RTX 3090 GPUs with 24 GB each, and I want to train starchat-beta with full parameters (15B). Clearly, the GPUs are insufficient even when using ZeRO-3, which leads me to try...
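For reference, a minimal sketch (not the poster's actual config) of DeepSpeed ZeRO-3 settings with CPU offload, which is the usual next step when ZeRO-3 alone still exceeds GPU memory; the batch size and precision here are placeholder assumptions:

```python
# Hedged sketch of a DeepSpeed ZeRO-3 config with optimizer and parameter
# offload to CPU RAM; values below are illustrative, not tuned for 8x3090.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,       # placeholder batch size
    "zero_optimization": {
        "stage": 3,                            # partition params, grads, optimizer state
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
    "bf16": {"enabled": True},                 # assumes bf16-capable setup
}
```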
Changing the batch size doesn't seem to affect GPU memory usage when running in INFERENCE MODE. That doesn't make sense to me. Is this normal?
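One way to check this is to measure peak allocated memory per batch size directly. A minimal sketch, assuming PyTorch on a CUDA device; `model` and `make_batch` are hypothetical placeholders for the reader's own model and input pipeline:

```python
import torch

def peak_memory_for_batch(model, batch):
    """Return peak GPU memory (MiB) for one forward pass in inference mode."""
    torch.cuda.reset_peak_memory_stats()
    with torch.inference_mode():   # no autograd graph, no activation storage
        model(**batch)
    return torch.cuda.max_memory_allocated() / 2**20

# Usage (placeholders):
# for bs in (1, 4, 16):
#     print(bs, peak_memory_for_batch(model, make_batch(bs)))
```

If the reported peak stays flat across batch sizes, the allocator's cached pool or pre-allocated buffers may be masking the difference; `max_memory_allocated()` reflects actual tensor allocations rather than the cached pool.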
os: Ubuntu 20.04
platform: Docker container
exceptions:
```
2024-07-23 14:33:07,876 [AnyIO worker] [WARNI] Tika server returned status: 404
2024-07-23T06:33:07.896537057Z WARNING [2024-07-23 14:33:07,876] [tika.py:562:callServer] Tika server returned status: 404
2024-07-23T06:33:07.913147602Z ERROR...
```
**Problem Description** The OpenAI API can set `response_format` to control the output format of the LLM. I didn't see any implementation of this in Chatbox. Is it possible to achieve that?
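For context, this is what the parameter looks like on the OpenAI side; a minimal sketch using the official `openai` Python SDK, with a placeholder model name:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

resp = client.chat.completions.create(
    model="gpt-4o-mini",                        # placeholder model name
    response_format={"type": "json_object"},    # ask the API for valid JSON output
    messages=[
        # The API requires the word "JSON" in the prompt when json_object is used.
        {"role": "system", "content": "Reply in JSON."},
        {"role": "user", "content": "List three colors."},
    ],
)
print(resp.choices[0].message.content)
```

A client like Chatbox would presumably need to expose this as a per-request or per-conversation setting and pass it through to the API call.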