Results 30 issues of QinLuo

### 🐛 Describe the bug The type of `error_msgs` is `str`, so we should not re-join it using `\n\t`. https://github.com/hpcaitech/ColossalAI/blob/d83c633ca63c4eef49f3473aa998515fa5ca573f/colossalai/checkpoint_io/general_checkpoint_io.py#L228 Otherwise it leads to weird output: ``` n\t\'\n\ts\n\tc\n\to\n\tr\n\te\n\t.\n\tw\n\te\n\ti\n\tg\n\th\n\tt\n\t\'\n\t]\n\t"\n\t.\n\t ``` ###...

bug
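
The `error_msgs` report above comes down to how `str.join` treats its argument: given a string rather than a list, it iterates character by character. Below is a minimal sketch of that behavior, not the ColossalAI code itself. The sample message and the `format_error_msgs` helper are hypothetical and used only for illustration.

```python
# Illustrative message; the real text in the issue is truncated.
already_joined = "size mismatch for score.weight"

# Joining a str iterates over its characters, so every character
# ends up separated by "\n\t".
garbled = "\n\t".join(already_joined)


def format_error_msgs(error_msgs):
    """Hypothetical helper (not part of ColossalAI).

    Returns a str unchanged and joins any other iterable of messages.
    """
    if isinstance(error_msgs, str):
        return error_msgs
    return "\n\t".join(error_msgs)


print(repr(garbled[:12]))
print(format_error_msgs(already_joined))
print(format_error_msgs(["first error", "second error"]))
```

Checking the type before joining is one possible guard; simply not re-joining a value that is already a string, as the issue suggests, works as well.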

### 🐛 Describe the bug While boosting the model using the `torch_fsdp` plugin and `LazyInitContext`, a RecursionError occurred: `RecursionError: maximum recursion depth exceeded` script: ``` from modeling_phi import PhiDecoderLayer, PhiForCausalLM...

bug

Through https://github.com/wandb/local/issues/64#issuecomment-1106030481, I got in --------------------------------- Pre: - Docker version 20.10.12, build e91ed57 - wandb, version 0.12.14 - wandb local image latest With the following actions: ``` # start a...

We followed the blog post [Leveraging TensorFlow-TensorRT integration for Low latency Inference](https://blog.tensorflow.org/2021/01/leveraging-tensorflow-tensorrt-integration.html) and got **a very large saved model** **ENV** - TensorFlow: 2.4.1 - TensorRT: 6.0.1 - CUDA: 10.1...

The parameters are the same as in test_inference.sh. Using test_prompt.txt as input, the generated output contains many extra problems and code ``` Current prompt: code translation Java: public class Solution { public static boolean hasCloseElements(int[] nums, int threshold) { for (int i = 0; i...

**Describe the bug** With the newest version of patchelf (patchelf-0.11 and the master branch), removing `so` causes a bug ``` Inconsistency detected by ld.so: dl-version.c: 205: _dl_check_map_versions: Assertion `needed !=...

bug

An error occurred when compiling the C++ code in `ltp/thirdparty/jsoncpp/src/lib_json/json_reader.cpp`. For Linux: `‘thread_local’ does not name a type`; just try a compiler that supports C++11. For Mac: `thread-local storage is not...

## 🚀 Feature Request The preference data looks like this: ``` { "chosen": [ {"role": "user", "content": "abcd"}, {"role": "assistant", "content": "abcef"}, ... ], "rejected": [ {"role": "user", "content": "abcd"},...

enhancement

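### Description The tech report includes this experiment; have you also compared the results of the following setups? A0: annealing on pretraining data. B0: annealing on pretraining data + SFT data. A1: annealing on pretraining data + 4B SFT. B1: annealing on pretraining data + SFT data -> 4B SFT. ### Case Explanation _No response_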

badcase