Hasta comments

Results 7 comments of Hasta

Pin dependency versions: fix dependency errors when setting up the environment.

Coincidentally, I am using exactly torch 2.0.0 and gradio 3.23.0. So far I have not run into any dependency conflicts. What error are you seeing on your side?

Can we train llama-13B model with model parallelism?

> Thanks for your interest in LMFlow! I think `configs/ds_config_zero3.json` provides model parallelism (which also uses cpu offload for optimizer states and model parameters) and can be used for your...
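For readers unfamiliar with what such a file contains, here is a minimal sketch of a DeepSpeed ZeRO stage 3 configuration with CPU offload, built as a Python dict and rendered to JSON. ZeRO stage 3 partitions parameters, gradients and optimizer states across GPUs. The values below are illustrative assumptions and are not the actual contents of `configs/ds_config_zero3.json`; `render_config` is a hypothetical helper.

```python
import json

# Illustrative ZeRO stage 3 settings with CPU offload.
# Assumption: this is NOT a copy of LMFlow's configs/ds_config_zero3.json.
ds_config = {
    "zero_optimization": {
        "stage": 3,
        # Keep optimizer states in CPU memory.
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        # Keep model parameters in CPU memory when not in use.
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
    "train_micro_batch_size_per_gpu": 1,  # illustrative value
}


def render_config(config):
    """Serialize the config dict to the JSON text DeepSpeed reads from disk."""
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(render_config(ds_config))
```

Offloading trades GPU memory for extra host-device transfers, so it mainly helps when the model would not otherwise fit.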

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2)

Same issue here:

```shell
If your script expects `--local-rank` argument to be set, please change it to read from `os.environ['LOCAL_RANK']` instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for further instructions
  warnings.warn(
[2023-06-10 13:32:19,445] [INFO]...
```
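The warning above suggests reading the rank from the environment instead of from a command-line flag. A minimal sketch of that approach follows; `get_local_rank` is a hypothetical helper name, and defaulting to 0 when the variable is unset is an assumption made so that single-process runs still work.

```python
import os


def get_local_rank(environ=None, default=0):
    """Return the local rank from LOCAL_RANK, or `default` if it is unset.

    `environ` is injectable so the function can be tested without
    touching the real process environment.
    """
    env = os.environ if environ is None else environ
    return int(env.get("LOCAL_RANK", default))


if __name__ == "__main__":
    print(f"local rank: {get_local_rank()}")
```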

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2)

hostfile:

```
172.17.0.6 slots=8
```

[Question]: Fine-tuning with default parameters fails with aquila_chat.py: error: unrecognized arguments: --local-rank=1. What is the cause?

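I ran into the same problem, but it is not unsolvable, and it is not something to dismiss with a one-line complaint about "too many bugs". The original poster is probably, like me, using fairly recent versions, including Python and torch.

The first problem, `local-rank`, is likely a version issue: what gets passed to `aquila_chat.py` is `--local-rank=1`, but the parsing code only recognizes `--local_rank=1`. You can change this yourself; see https://github.com/FlagAI-Open/FlagAI/issues/336#issuecomment-1590618154

The second problem is probably a yaml version issue: from 5.1 onward, both the `load` and `load_all` methods need to be given a loader, e.g. `yaml.load_all(file_data, yaml.FullLoader)`; see https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation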
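One way to handle the flag-spelling mismatch is to register both spellings on the same argument. The sketch below is an assumption about how such a fix could look; it is not the actual parsing code of `aquila_chat.py`, and `build_parser` and `parse_local_rank` are hypothetical names.

```python
import argparse


def build_parser():
    """Build a parser that accepts both --local_rank and --local-rank."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--local_rank",
        "--local-rank",  # spelling passed by newer launchers
        dest="local_rank",
        type=int,
        default=0,
    )
    return parser


def parse_local_rank(argv):
    """Return the local rank from argv, ignoring unrelated arguments."""
    args, _unknown = build_parser().parse_known_args(argv)
    return args.local_rank


if __name__ == "__main__":
    print(parse_local_rank(["--local-rank=1"]))
```

Using `parse_known_args` keeps the sketch tolerant of other launcher-supplied flags; a real script may prefer strict parsing.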

[Question]: Fine-tuning with default parameters fails with aquila_chat.py: error: unrecognized arguments: --local-rank=1. What is the cause?

Has the bmt problem been resolved? I installed it directly by following the README:

```
git clone https://github.com/OpenBMB/BMTrain
cd BMTrain
python setup.py install
```

and pre-training currently works.

`preprocessing_num_workers` can not use in `scripts/run_finetune.sh`

Thank you for your reply. It works well if I do not set `preprocessing_num_workers`. However, I am just curious about why it does not work when this parameter is added....