ColossalAI-Examples
BERT Data Preprocessing
🐛 Describe the bug
NVIDIA DeepLearningExamples removed LDDL from the DLE tools on Aug 16, 2022. As a result, the guide at https://github.com/hpcaitech/ColossalAI-Examples/tree/main/language/bert/preprocessing no longer works, in the following respects:
1. pip install git+https://github.com/NVIDIA/DeepLearningExamples.git#subdirectory=Tools/lddl no longer works. Possible fixes are to install from the new URL, i.e. pip install git+https://github.com/NVIDIA/lddl.git, or to take lddl from the historical revision at https://github.com/NVIDIA/DeepLearningExamples/tree/29f5b7ab059025e4ead512e54037eddbdf740f19.
2. After installing lddl, running pip install boto3 produces a version conflict whose effect on the rest of the process is unknown.
3. In the preprocessing part, neither phase 1 nor phase 2 works. Details are given below.
4. Switching the lddl source from the new URL to the historical revision does not solve problem 3, and neither does skipping the boto3 installation.
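To compare the two install routes, it may help to record which versions of the affected packages end up installed after each attempt. The following is a minimal sketch using only the standard library (importlib.metadata, available since Python 3.8). The distribution names in PACKAGES, in particular "lddl", are assumptions taken from the package names that appear in this issue. The helper name installed_versions is hypothetical.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Assumed distribution names, taken from the packages mentioned in this issue.
PACKAGES = ["lddl", "boto3", "botocore", "s3transfer", "awscli"]


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in the current environment.
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

Running this once after each install route gives a record of the environment that can be attached to the report.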
Environment
python=3.8 pytorch=1.12.1 cudatoolkit=10.2.89 cuda=10.2
More details about problem 2:
Installing lddl first, then boto3:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
awscli 1.22.55 requires botocore==1.24.0, but you have botocore 1.27.61 which is incompatible.
awscli 1.22.55 requires s3transfer<0.6.0,>=0.5.0, but you have s3transfer 0.6.0 which is incompatible.
After installing boto3, reinstalling lddl:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
boto3 1.24.61 requires botocore<1.28.0,>=1.27.61, but you have botocore 1.24.0 which is incompatible.
boto3 1.24.61 requires s3transfer<0.7.0,>=0.6.0, but you have s3transfer 0.5.2 which is incompatible.
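The two messages show awscli and boto3 pinning botocore and s3transfer to ranges that do not overlap, so whichever package is installed last breaks the other. As a rough illustration, the conflict lines can be pulled out of pip's output with a regular expression. This is only a sketch: the pattern assumes the message wording shown in this issue, and the names Conflict and parse_conflicts are hypothetical.

```python
import re
from typing import List, NamedTuple


class Conflict(NamedTuple):
    requirer: str           # package that declares the requirement
    requirer_version: str
    requirement: str        # requirement specifier as printed by pip
    installed: str          # package actually installed
    installed_version: str


# Assumes the wording "<pkg> <ver> requires <spec>, but you have <pkg> <ver>
# which is incompatible." as seen in the output quoted in this issue.
_CONFLICT_RE = re.compile(
    r"(?P<requirer>\S+) (?P<requirer_version>\S+) requires (?P<requirement>\S+), "
    r"but you have (?P<installed>\S+) (?P<installed_version>\S+) "
    r"which is incompatible\."
)


def parse_conflicts(pip_output: str) -> List[Conflict]:
    """Return every dependency conflict found in a pip error message."""
    return [Conflict(**m.groupdict()) for m in _CONFLICT_RE.finditer(pip_output)]


# Sample text copied from the first error message in this issue.
SAMPLE_OUTPUT = (
    "ERROR: pip's dependency resolver does not currently take into account all "
    "the packages that are installed. This behaviour is the source of the "
    "following dependency conflicts. "
    "awscli 1.22.55 requires botocore==1.24.0, but you have botocore 1.27.61 "
    "which is incompatible. "
    "awscli 1.22.55 requires s3transfer<0.6.0,>=0.5.0, but you have "
    "s3transfer 0.6.0 which is incompatible."
)

for conflict in parse_conflicts(SAMPLE_OUTPUT):
    print(
        f"{conflict.requirer} wants {conflict.requirement}, "
        f"found {conflict.installed} {conflict.installed_version}"
    )
```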
More details about problem 3: output.txt
You can try pip install with the --no-dependencies flag to ignore the dependency conflict.
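A minimal sketch of that suggestion, driven from Python so it runs against the interpreter of the active environment. It uses --no-deps, the short spelling of the same pip option. The helper names pip_install_no_deps_cmd and install_without_deps are hypothetical, and the package to install is left to the caller.

```python
import subprocess
import sys
from typing import List


def pip_install_no_deps_cmd(requirement: str) -> List[str]:
    """Build a pip command that installs one requirement without its dependencies."""
    # sys.executable makes pip target the environment this script runs in.
    return [sys.executable, "-m", "pip", "install", "--no-deps", requirement]


def install_without_deps(requirement: str) -> None:
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.run(pip_install_no_deps_cmd(requirement), check=True)
```

For example, install_without_deps("boto3") would leave the already installed botocore and s3transfer untouched. The flag only skips dependency installation; it does not make the pinned versions compatible, so the installed package may still fail at runtime if it depends on features of the version it asked for.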