Jack BAI

Results: 11 issues by Jack BAI

Hi, I'm wondering what tools to use for visualizing JSON. Do you use a NoSQL database to create the data and then export it? Or do you simply use some...

Excuse me, just want to ask whether there's any progress on Eval zero-shot perplexities on standard evals (e.g. LAMBADA? HELM? etc.). We're using this repo for downstream eval to show...

I observe that the loss converges around 100000 steps. Why do we need to further train the model until 600000 steps?

The simulator model text-davinci-003 is now deprecated, and the other models (babbage-002 and ada-002) are very unreliable. Is there a workaround for this?

### Your current environment

The output of `python collect_env.py`

```text
Collecting environment information...
PyTorch version: 2.6.0+cu124
Is debug build: False
CUDA used to build PyTorch: 12.4
ROCM used to build...
```

bug

@zRzRzRzRzRzRzR ### System Info python 3.10.0, Transformer 4.36.2, Linux ### Who can help? _No response_ ### Information - [X] The official example scripts /...

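As shown in the figure, what are the image ground-truth tokens that are fed to Qwen for teacher forcing? The figure draws them all as the same dashed box; does that mean all of these GT tokens share one particular special token id?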

**Describe the bug** In some use cases, we have to delete the training engine after training and load it again after some operations. What is the correct way to delete...

bug
training