过拟合 comments

Results: 27 comments of 过拟合

Is the format of the dataset generated by generate_instruction.py inconsistent with Belle.train.json?

The Belle.train.json that was finally open-sourced is already in a format where the instruction and input have been concatenated together.
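To make the difference concrete, here is a minimal sketch of that concatenation step. It assumes the generated records carry `instruction`, `input`, and `output` fields and that the merged file uses `input` and `target`; those field names, the newline separator, and the helper name `merge_instruction_and_input` are illustrative assumptions, not something confirmed in this thread.

```python
def merge_instruction_and_input(record: dict, sep: str = "\n") -> dict:
    """Fold a record's instruction and optional input into a single prompt.

    Field names on both sides are assumptions made for illustration.
    """
    instruction = record["instruction"].strip()
    extra = record.get("input", "").strip()
    # Only add the separator when there is actual input text to append.
    prompt = instruction + sep + extra if extra else instruction
    return {"input": prompt, "target": record["output"]}


def merge_dataset(records: list) -> list:
    """Apply the merge to every record in a generated dataset."""
    return [merge_instruction_and_input(r) for r in records]
```

Applying `merge_dataset` to the output of generate_instruction.py would, under these assumptions, yield records shaped like the concatenated ones described above.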

The step2 scoring looks correct but the step3 model is talking gibberish

Same problem. Any update?

Megatron presence detection incorrect

Any update?

InitProcessGroupKwargs(timeout=timedelta(seconds=3600)) not work !!!!!

> Please give us the output of `accelerate env` and how you are creating your DataLoaders and Dataset (rough code will work)

The key issue is not with the dataset...

InitProcessGroupKwargs(timeout=timedelta(seconds=3600)) not work !!!!!

You can reproduce my issue by doing the following: my dataset tokenizes all data during loading, which takes longer than 30 minutes. I then set the waiting time to be...
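For reference, the usual way to raise the process-group timeout in Accelerate is to pass an `InitProcessGroupKwargs` handler when constructing the `Accelerator`. The sketch below assumes a recent `accelerate` install; the helper names `process_group_timeout` and `build_accelerator` and the 2x safety margin are hypothetical choices for illustration. Comments in this thread report that the setting was not honored in their setup, so treat this as the standard pattern rather than a fix.

```python
from datetime import timedelta


def process_group_timeout(expected_preprocessing_seconds: float,
                          margin: float = 2.0) -> timedelta:
    """Pick a timeout comfortably above the slowest rank's preprocessing time.

    The margin is an arbitrary illustrative safety factor.
    """
    if expected_preprocessing_seconds <= 0 or margin < 1.0:
        raise ValueError("need a positive duration and a margin of at least 1.0")
    return timedelta(seconds=expected_preprocessing_seconds * margin)


def build_accelerator(timeout: timedelta):
    """Create an Accelerator whose process group uses the given timeout."""
    # Imported lazily so the helper above stays usable without accelerate.
    from accelerate import Accelerator
    from accelerate.utils import InitProcessGroupKwargs

    handler = InitProcessGroupKwargs(timeout=timeout)
    return Accelerator(kwargs_handlers=[handler])
```

With tokenization taking about 30 minutes, `build_accelerator(process_group_timeout(30 * 60))` would request a one-hour timeout, matching the `timedelta(seconds=3600)` in the issue title.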

InitProcessGroupKwargs(timeout=timedelta(seconds=3600)) not work !!!!!

> ```shell
> - `Accelerate` version: 0.19.0.dev0
> - Platform: Linux-4.19.96-x86_64-with-glibc2.10
> - Python version: 3.8.13
> - Numpy version: 1.22.4
> - PyTorch version (GPU?): 2.0.0+cu117 (True)
> - System RAM: 503.82 GB
> - GPU...
> ```

InitProcessGroupKwargs(timeout=timedelta(seconds=3600)) not work !!!!!

> @muellerzr

InitProcessGroupKwargs(timeout=timedelta(seconds=3600)) not work !!!!!

> How are you creating your `Accelerator` object and the `Dataset`? Is it an `IterableDataset`?

```python
class customer_dataset:
    def __init__(self, df):
        self.df = pd.read_csv(df)
        self.text = self.df['text'].tolist()
        self.all_data = tokenizer(self.text)  # tokenizer...
```

InitProcessGroupKwargs(timeout=timedelta(seconds=3600)) not work !!!!!

It seems to be the same issue as #1129?

InitProcessGroupKwargs(timeout=timedelta(seconds=3600)) not work !!!!!

> Hello @bestpredicts, because the config has `zero3_init_flag` set to True, DeepSpeed ends up using only its default timeout. And you have mentioned the correct issue with respect to this...