amulil
@pppppM, following your suggestion, my initial plan is to implement `DPODataset` under the dataset directory and `DPO` under the model directory; the other hooks stay the same as SFT for now and need no changes. One question, though: DPO involves two models, the policy model and the ref_model — do the DeepSpeed-related parts need to be modified?
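To make the two-model question concrete, here is a minimal sketch of the standard DPO objective, which needs log-probabilities from both the trainable policy model and the frozen ref_model. The function name `dpo_loss` and its signature are illustrative, not from the xtuner codebase. Since the ref_model is frozen and only runs forward passes, it generally needs no optimizer state; whether it also needs DeepSpeed wrapping (e.g. ZeRO-3 parameter sharding) depends on its size.

```python
import torch
import torch.nn.functional as F


def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss: -log sigmoid(beta * (policy log-ratio - reference log-ratio)).

    Each argument is a 1-D tensor of summed per-sequence log-probs.
    The ref_* tensors come from the frozen reference model, so they
    should be computed under torch.no_grad() in a real training loop.
    """
    pi_logratios = policy_chosen_logps - policy_rejected_logps
    ref_logratios = ref_chosen_logps - ref_rejected_logps
    logits = pi_logratios - ref_logratios
    # logsigmoid is numerically stable for large negative logits
    return -F.logsigmoid(beta * logits).mean()
```

With all log-probs equal, the loss is log(2) ≈ 0.693; as the policy prefers the chosen response more than the reference does, the loss drops below that baseline.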
I ran the command `xtuner check-custom-dataset $CONFIG` and got an error.

```python
Traceback (most recent call last):
  File "/xtuner/xtuner/tools/check_custom_dataset.py", line 157, in <module>
    main()
  File "/xtuner/xtuner/tools/check_custom_dataset.py", line 51, in main
    dataset...
```
```bash
# reproduce
srun -p debug --job-name=xtuner --nodes=2 --gres=gpu:8 --ntasks-per-node=8 --kill-on-bad-exit=1 \
    xtuner train yi_34b_qlora_oasst1_e3_gpu16 --deepspeed deepspeed_zero2 --launcher slurm
```

```python
# loginfo
  File "/data/miniconda3/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 2261, in broadcast_object_list
    object_tensor =...
```
### Motivation
I know I can edit `/path/to/turbomind-style/triton_models/weights/config.ini` to enable NTK-aware interpolation and LogN attention scaling. But where can I enable window attention? If I use NTK-aware interpolation and LogN attention...
## version
`05/09 21:16:21 - mmengine - INFO - 0.1.18`
## how to reproduce
`CUDA_VISIBLE_DEVICES=4,5,6,7 NPROC_PER_NODE=4 xtuner train qwen1_5_0_5b_chat_qlora_alpaca_e3`
## log
I only changed the batch_size to 4 in the config...
I tried to use it the same way as Llama 2, but it failed.