embedding
UER-py is a framework for pre-training models, implemented in PyTorch. UER-py facilitates the implementation of existing pre-training models (e.g. BERT) and supports further improvements. UER-py also open-sources a series of...
I encounter an error with the following quickstart command in a multi-GPU setting:
```
python3 finetune/run_classifier.py --pretrained_model_path models/cluecorpussmall_gatedcnn_lm_model.bin \
                                   --vocab_path models/google_zh_vocab.txt \
                                   --config_path models/gatedcnn_9_config.json \
                                   --train_path datasets/chnsenticorp/train.tsv \
                                   --dev_path datasets/chnsenticorp/dev.tsv \
...
```
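For context, a minimal sketch of the usual PyTorch multi-GPU pattern that classifier fine-tuning scripts rely on, assuming a DataParallel-style setup (the placeholder model below is illustrative, not UER-py's actual code):
```
import torch

# With CUDA_VISIBLE_DEVICES=0,1 set, DataParallel replicates the model
# on both visible GPUs and splits each batch between them.
model = torch.nn.Linear(768, 2)  # placeholder standing in for the classifier
if torch.cuda.device_count() > 1:
    model = torch.nn.DataParallel(model)
model = model.to("cuda" if torch.cuda.is_available() else "cpu")
```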
Run the following command on a CPU machine:
```
python3 finetune/run_classifier.py --pretrained_model_path models/book_review_model.bin \
                                   --vocab_path models/google_zh_vocab.txt \
                                   --config_path models/bert/base_config.json \
                                   --train_path datasets/douban_book_review/train.tsv \
                                   --dev_path datasets/douban_book_review/dev.tsv \
                                   --test_path datasets/douban_book_review/test.tsv \
                                   --epochs_num 3...
```
Could you provide some examples of using DeepSpeed to train gigantic models with the project?
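To anchor the question, DeepSpeed's generic launcher wraps the training script and reads a JSON config; a hypothetical invocation might look like the sketch below (the UER-py flags are assumed from the project's other commands, not verified against a DeepSpeed-enabled build):
```
deepspeed pretrain.py --deepspeed --deepspeed_config models/deepspeed_config.json \
                      --dataset_path dataset.pt \
                      --vocab_path models/google_zh_vocab.txt \
                      --config_path models/bert/base_config.json \
                      --output_model_path models/output_model.bin
```
Here models/deepspeed_config.json would hold standard DeepSpeed settings, e.g. `{"train_batch_size": 64, "fp16": {"enabled": true}, "zero_optimization": {"stage": 2}}`.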
run_classifier.py shuffles the training set at every epoch, but run_ner.py does not shuffle the training set at every epoch.
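For reference, per-epoch shuffling is a one-line addition at the top of the epoch loop; a minimal sketch (the variable names are placeholders, not run_ner.py's actual code):
```
import random

train_dataset = list(range(100))  # stands in for the loaded training instances
epochs_num, batch_size = 3, 16

for epoch in range(epochs_num):
    random.shuffle(train_dataset)  # reshuffle at the start of every epoch
    for i in range(0, len(train_dataset), batch_size):
        batch = train_dataset[i:i + batch_size]
        # ... run a training step on `batch` ...
```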
It is often the case that a piece of text comes with a label, e.g. restaurant comments with their ratings. It is beneficial to use the joint...
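A joint objective of this kind typically just sums a language-model loss over the text with a classification loss over the attached label; a minimal sketch with dummy tensors (shapes and names are illustrative, not UER-py's actual training targets):
```
import torch
import torch.nn.functional as F

vocab_size, num_labels, seq_len, batch = 21128, 2, 8, 4

# Dummy logits standing in for a model's LM head and classification head.
mlm_logits = torch.randn(batch, seq_len, vocab_size, requires_grad=True)
cls_logits = torch.randn(batch, num_labels, requires_grad=True)
mlm_labels = torch.randint(0, vocab_size, (batch, seq_len))
cls_labels = torch.randint(0, num_labels, (batch,))

# Joint loss: a masked-LM term on the text plus a classification term on
# the label; a weighting coefficient could balance the two.
loss = F.cross_entropy(mlm_logits.view(-1, vocab_size), mlm_labels.view(-1)) \
       + F.cross_entropy(cls_logits, cls_labels)
loss.backward()
```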
The project was redesigned, but the scripts on the HuggingFace uer page (https://huggingface.co/uer/) were not updated accordingly. For example, on the [RoBERTa page](https://huggingface.co/uer/chinese_roberta_L-2_H-128), *--data_processor mlm* should be specified in the preprocess stage, instead...
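Concretely, the corrected usage would move the flag into the preprocess command, along these lines (the corpus path and process count are illustrative):
```
python3 preprocess.py --corpus_path corpora/book_review.txt \
                      --vocab_path models/google_zh_vocab.txt \
                      --dataset_path dataset.pt --processes_num 8 \
                      --data_processor mlm
```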
UER-py was redesigned and the way of using the project has changed, but *.github/workflows/github-actions.yml* has not been modified accordingly.
```
CUDA_VISIBLE_DEVICES=0,1 /dockerdata/anaconda3-2/bin/python run_classifier.py --vocab_path models/google_zh_vocab.txt \
    --config_path models/gatedcnn_9_config.json \
    --train_path datasets/chnsenticorp/train.tsv --dev_path datasets/chnsenticorp/dev.tsv --test_path datasets/chnsenticorp/test.tsv \
    --learning_rate 1e-4 --batch_size 64 --epochs_num 5 \
    --embedding word --remove_embedding_layernorm --encoder gatedcnn --pooling...
```
Thank you very much for your contribution. I am not familiar with NMT. Could you provide me with the address for downloading sample.src and sample.trg?