Have you tried Transformers or vLLM? PyTorch compatibility with CUDA 12.3 is experimental.
It's 3.5-0106, because it uses OpenChat 3.2's conversation template and the Mistral base model.
`Correct` means verified correct answers. In addition, `GPT4` and `Human` were also used, indicating data with unknown correctness.
@bpucla 1. Yes, along with `Human User` / `Human Assistant`. 2. Yes; GPT-3.5 data is discarded in the 3.5 version.
Yes, it's deprecated now. Use `GPT4 Correct User` for best coding performance.
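For reference, a minimal sketch of building a single-turn prompt with the `GPT4 Correct User` template (the `<|end_of_turn|>` separator follows the published OpenChat 3.5 template; the helper function name is my own):

```python
def build_openchat_prompt(user_message: str) -> str:
    """Format a single-turn prompt using the GPT4 Correct template.

    `GPT4 Correct User` / `GPT4 Correct Assistant` are the role tags,
    and `<|end_of_turn|>` is the turn-separator special token.
    """
    return (
        f"GPT4 Correct User: {user_message}<|end_of_turn|>"
        "GPT4 Correct Assistant:"
    )

prompt = build_openchat_prompt("Write a hello-world in C.")
```

In practice you would pass `prompt` to your tokenizer/engine; multi-turn conversations repeat the same pattern per turn.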
`max_new_tokens` limits the number of tokens the model generates: if the output would exceed 300 tokens, generation stops there. If you want shorter responses, you may prompt the model to...
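To illustrate the cutoff semantics, here is a toy decoding loop (an illustration only, not the actual generation code): decoding ends at the EOS token or after `max_new_tokens` tokens, whichever comes first.

```python
def generate(next_token_fn, max_new_tokens: int, eos_token: str = "<eos>"):
    """Toy decoding loop: emit tokens from next_token_fn until it returns
    the EOS token, or until max_new_tokens tokens have been produced."""
    tokens = []
    for _ in range(max_new_tokens):
        tok = next_token_fn(tokens)
        if tok == eos_token:
            break
        tokens.append(tok)
    return tokens

# A "model" that never emits EOS is cut off exactly at the limit:
endless = lambda toks: "word"
assert len(generate(endless, max_new_tokens=300)) == 300
```

Note the cutoff is a hard truncation, which is why prompting for brevity is the better way to get genuinely shorter answers.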
Sorry for the inconvenience. We've updated it, should be published now.
When this parameter is enabled, losses are averaged per sequence; otherwise they are averaged per token (the same as the HF trainer). It is disabled by default because it causes worse...
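The difference can be sketched on plain Python lists (hypothetical per-token loss values, not the trainer code itself):

```python
def per_token_average(batch_losses):
    """Pool all tokens in the batch and average once (HF-trainer style).
    Long sequences contribute more tokens, so they weigh more."""
    flat = [loss for seq in batch_losses for loss in seq]
    return sum(flat) / len(flat)

def per_sequence_average(batch_losses):
    """Average each sequence first, then average the sequence means.
    Every sequence gets equal weight regardless of its length."""
    means = [sum(seq) / len(seq) for seq in batch_losses]
    return sum(means) / len(means)

# One long low-loss sequence and one short high-loss sequence:
batch = [[1.0, 1.0, 1.0, 1.0], [3.0, 3.0]]
# per-token: (4*1.0 + 2*3.0) / 6 ≈ 1.67; per-sequence: (1.0 + 3.0) / 2 = 2.0
```

Per-sequence averaging up-weights short sequences relative to per-token averaging, which is the behavioral difference the flag toggles.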
It may be a network problem, try downloading again? If it still fails, please paste the error message.
Thanks for your report! Yes, this is exactly the issue. Because SFT datasets are often small and CPU RAM is abundant, we prefetch the dataset into memory. This implementation takes...