
How can I get the data?

Open YeJinPaark opened this issue 1 year ago • 9 comments

First, I downloaded 'Transformer-en' and renamed it to './model/syngec/english_transformer_baseline.pt'. Then I downloaded the preprocessed data.

Then I ran './pipeline_gopar.sh', but I got this error:

Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py", line 17, in
    input_sentences = load(sys.argv[1])
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py", line 9, in load
    with open(filename, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt'
Loading resources...
Processing parallel files...
Traceback (most recent call last):
  File "/opt/conda/bin/errant_parallel", line 8, in
    sys.exit(main())
  File "/opt/conda/lib/python3.10/site-packages/errant/commands/parallel_to_m2.py", line 16, in main
    in_files = [stack.enter_context(open(i)) for i in [args.orig]+args.cor]
  File "/opt/conda/lib/python3.10/site-packages/errant/commands/parallel_to_m2.py", line 16, in
    in_files = [stack.enter_context(open(i)) for i in [args.orig]+args.cor]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/convert_gec_data_to_parsing_data_english.py", line 153, in
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt.conll_predict'
/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py:181: FutureWarning: The module torch.distributed.launch is deprecated and will be removed in future. Use torchrun. Note that --use-env is set by default in torchrun. If your script expects --local-rank argument to be set, please change it to read from os.environ['LOCAL_RANK'] instead. See https://pytorch.org/docs/stable/distributed.html#launch-utility for further instructions
  warnings.warn(
WARNING:torch.distributed.run:
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 30676) of binary: /opt/conda/bin/python
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 196, in
    main()
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 192, in main
    launch(args)
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 177, in launch
    run(args)
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

supar.cmds.biaffine_dep FAILED

Failures:
[1]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 30677)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[2]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 2 (local_rank: 2)
  exitcode  : 1 (pid: 30678)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[3]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 3 (local_rank: 3)
  exitcode  : 1 (pid: 30679)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[4]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 4 (local_rank: 4)
  exitcode  : 1 (pid: 30680)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[5]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 5 (local_rank: 5)
  exitcode  : 1 (pid: 30681)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[6]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 6 (local_rank: 6)
  exitcode  : 1 (pid: 30682)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[7]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 7 (local_rank: 7)
  exitcode  : 1 (pid: 30683)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
  time      : 2023-08-24_08:10:36
  host      : 309e7fc0781e
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 30676)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'

How can I fix it? Please help me.

YeJinPaark, Aug 24 '23 08:08
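The repeated `No module named supar.cmds.biaffine_dep` lines in the log above mean that the interpreter at `/opt/conda/bin/python` could not locate that module. A quick way to confirm this from the same interpreter is sketched below; `is_importable` is a hypothetical helper name for illustration, not part of SynGEC or SuPar.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate `module_name`."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec imports parent packages first, so a missing parent
        # package raises ModuleNotFoundError (a subclass of ImportError).
        return False


if __name__ == "__main__":
    # Module name taken from the error message in the log.
    name = "supar.cmds.biaffine_dep"
    print(name, "importable:", is_importable(name))
```

If this prints `False` when run with `/opt/conda/bin/python`, the package providing that module is not visible to that environment, which would be consistent with the error above.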

I notice that the file path in your error message looks strange, for example "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py". Please change into the corresponding directory and then re-run the bash script.

HillZhang1999, Aug 25 '23 07:08
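The scripts refer to files through relative paths such as `../../data/...`, so where they resolve depends on the directory the script is run from. The sketch below shows how such a path resolves against a working directory; `resolve` is a hypothetical helper, and the working directory is the one that appears in the traceback.

```python
import posixpath


def resolve(cwd: str, relative: str) -> str:
    """Join a relative path onto a working directory and collapse '..' parts.

    This is purely textual: it does not touch the filesystem or follow symlinks.
    """
    return posixpath.normpath(posixpath.join(cwd, relative))


if __name__ == "__main__":
    # Working directory as it appears in the traceback paths.
    cwd = "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp"
    print(resolve(cwd, "../../src/src_gopar/parse.py"))
    print(resolve(cwd, "../../data/wi_locness_train/tgt.txt"))
```

Run from `bash/english_exp`, `../../data` points at the `data` directory in the repository root, so that is where the scripts will look for the input files.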

OK, I'll try that soon.

Also, how can I get the data that this error says is missing?

FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt'

YeJinPaark, Aug 25 '23 08:08

You should download the preprocessed data, unzip it, and put it into https://github.com/HillZhang1999/SynGEC/tree/main/data

HillZhang1999, Aug 25 '23 08:08
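Before extracting, it can help to look at the top-level entries of the downloaded archive, to see which directory it should be unpacked in. This is a minimal sketch that assumes the download is a zip file; `top_level_entries` is a hypothetical helper, and the actual archive name and layout are not stated in this thread.

```python
import zipfile


def top_level_entries(archive_path: str) -> list[str]:
    """Return the sorted top-level file and directory names inside a zip archive."""
    with zipfile.ZipFile(archive_path) as zf:
        # Member names always use '/' as the separator inside a zip file.
        return sorted({name.split("/", 1)[0] for name in zf.namelist() if name})
```

For example, if `top_level_entries("preprocessed_data.zip")` (file name hypothetical) returned `['data']`, extracting inside `data/` would produce `data/data/...`, and the scripts would not find the files under `../../data/`.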

Is the preprocessed data the same as the data at this link: https://drive.google.com/file/d/1dIDfYhELrh3BEKgGpsPYAy5ehcobmMov/view

I downloaded that data and unzipped it into ./data/, but I got errors like this:

Apply BPE...
./preprocess_syngec_transformer.sh: line 22: ../../data/clang8_train/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 23: ../../data/clang8_train/tgt.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 24: ../../data/bea19_dev/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 25: ../../data/bea19_dev/tgt.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/clang8_train/src.txt'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt'
cp: cannot stat '../../data/clang8_train/src.txt': No such file or directory
cp: cannot stat '../../data/clang8_train/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/clang8_train/tgt.txt': No such file or directory
cp: cannot stat '../../data/clang8_train/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/clang8_train/src.txt.swm': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/clang8_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/clang8_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
cp: cannot stat '../../data/clang8_train/src.txt.conll_predict_gopar_np': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/clang8_train/src.txt.conll_predict_gopar'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/clang8_train/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/clang8_train/src.txt.conll_predict_gopar_np.probs': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu] [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE] [--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE] [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE] [--model-parallel-size MODEL_PARALLEL_SIZE] [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT] [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile] [--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}] [--tokenizer {space,moses,nltk}] [--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}] [--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}] [--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}] [--scoring {wer,sacrebleu,bleu,chrf}] [--task TASK] [-s SRC] [-t TARGET] [--source-lang-with-nt SRC] [--trainpref FP] [--validpref FP] [--testpref FP] [--align-suffix FP] [--conll-suffix FP [FP ...]] [--dpd-suffix FP [FP ...]] [--probs-suffix FP [FP ...]] [--swm-suffix FP] [--destdir DIR] [--thresholdtgt N] [--thresholdsrc N] [--tgtdict FP] [--srcdict FP] [--labeldict FP [FP ...]] [--nwordstgt N] [--nwordssrc N] [--alignfile ALIGN] [--dataset-impl FORMAT] [--joined-dictionary] [--only-source] [--padding-factor N] [--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!
Apply BPE...
./preprocess_syngec_transformer.sh: line 112: ../../data/error_coded_train/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 113: ../../data/error_coded_train/tgt.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 114: ../../data/bea19_dev/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 115: ../../data/bea19_dev/tgt.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/error_coded_train/src.txt'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt'
cp: cannot stat '../../data/error_coded_train/src.txt': No such file or directory
cp: cannot stat '../../data/error_coded_train/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/error_coded_train/tgt.txt': No such file or directory
cp: cannot stat '../../data/error_coded_train/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/error_coded_train/src.txt.swm': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/error_coded_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/error_coded_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
cp: cannot stat '../../data/error_coded_train/src.txt.conll_predict_gopar_np': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/error_coded_train/src.txt.conll_predict_gopar'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/error_coded_train/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/error_coded_train/src.txt.conll_predict_gopar_np.probs': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu] [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE] [--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE] [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE] [--model-parallel-size MODEL_PARALLEL_SIZE] [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT] [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile] [--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}] [--tokenizer {space,moses,nltk}] [--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}] [--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}] [--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}] [--scoring {wer,sacrebleu,bleu,chrf}] [--task TASK] [-s SRC] [-t TARGET] [--source-lang-with-nt SRC] [--trainpref FP] [--validpref FP] [--testpref FP] [--align-suffix FP] [--conll-suffix FP [FP ...]] [--dpd-suffix FP [FP ...]] [--probs-suffix FP [FP ...]] [--swm-suffix FP] [--destdir DIR] [--thresholdtgt N] [--thresholdsrc N] [--tgtdict FP] [--srcdict FP] [--labeldict FP [FP ...]] [--nwordstgt N] [--nwordssrc N] [--alignfile ALIGN] [--dataset-impl FORMAT] [--joined-dictionary] [--only-source] [--padding-factor N] [--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!
Apply BPE...
./preprocess_syngec_transformer.sh: line 202: ../../data/wi_locness_train/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 203: ../../data/wi_locness_train/tgt.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 204: ../../data/bea19_dev/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 205: ../../data/bea19_dev/tgt.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/src.txt'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt'
cp: cannot stat '../../data/wi_locness_train/src.txt': No such file or directory
cp: cannot stat '../../data/wi_locness_train/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/wi_locness_train/tgt.txt': No such file or directory
cp: cannot stat '../../data/wi_locness_train/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/wi_locness_train/src.txt.swm': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
cp: cannot stat '../../data/wi_locness_train/src.txt.conll_predict_gopar_np': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/src.txt.conll_predict_gopar'
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/wi_locness_train/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/wi_locness_train/src.txt.conll_predict_gopar_np.probs': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu] [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE] [--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE] [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE] [--model-parallel-size MODEL_PARALLEL_SIZE] [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT] [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile] [--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}] [--tokenizer {space,moses,nltk}] [--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}] [--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}] [--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}] [--scoring {wer,sacrebleu,bleu,chrf}] [--task TASK] [-s SRC] [-t TARGET] [--source-lang-with-nt SRC] [--trainpref FP] [--validpref FP] [--testpref FP] [--align-suffix FP] [--conll-suffix FP [FP ...]] [--dpd-suffix FP [FP ...]] [--probs-suffix FP [FP ...]] [--swm-suffix FP] [--destdir DIR] [--thresholdtgt N] [--thresholdsrc N] [--tgtdict FP] [--srcdict FP] [--labeldict FP [FP ...]] [--nwordstgt N] [--nwordssrc N] [--alignfile ALIGN] [--dataset-impl FORMAT] [--joined-dictionary] [--only-source] [--padding-factor N] [--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!
Apply BPE...
./preprocess_syngec_transformer.sh: line 290: ../../data/conll14_test/src.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/conll14_test/src.txt'
cp: cannot stat '../../data/conll14_test/src.txt': No such file or directory
cp: cannot stat '../../data/conll14_test/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/conll14_test/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/conll14_test/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/conll14_test/src.txt.swm'
cp: cannot stat '../../data/conll14_test/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/conll14_test/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/conll14_test/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/conll14_test/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu] [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE] [--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE] [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE] [--model-parallel-size MODEL_PARALLEL_SIZE] [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT] [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile] [--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}] [--tokenizer {space,moses,nltk}] [--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}] [--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}] [--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}] [--scoring {wer,sacrebleu,bleu,chrf}] [--task TASK] [-s SRC] [-t TARGET] [--source-lang-with-nt SRC] [--trainpref FP] [--validpref FP] [--testpref FP] [--align-suffix FP] [--conll-suffix FP [FP ...]] [--dpd-suffix FP [FP ...]] [--probs-suffix FP [FP ...]] [--swm-suffix FP] [--destdir DIR] [--thresholdtgt N] [--thresholdsrc N] [--tgtdict FP] [--srcdict FP] [--labeldict FP [FP ...]] [--nwordstgt N] [--nwordssrc N] [--alignfile ALIGN] [--dataset-impl FORMAT] [--joined-dictionary] [--only-source] [--padding-factor N] [--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!
Apply BPE...
./preprocess_syngec_transformer.sh: line 358: ../../data/bea19_test/src.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
    with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_test/src.txt'
cp: cannot stat '../../data/bea19_test/src.txt': No such file or directory
cp: cannot stat '../../data/bea19_test/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_test/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_test/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
    swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_test/src.txt.swm'
cp: cannot stat '../../data/bea19_test/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
  File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
    with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_test/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/bea19_test/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/bea19_test/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format LOG_FORMAT] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--seed SEED] [--cpu] [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE] [--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--min-loss-scale MIN_LOSS_SCALE] [--threshold-loss-scale THRESHOLD_LOSS_SCALE] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ] [--all-gather-list-size ALL_GATHER_LIST_SIZE] [--model-parallel-size MODEL_PARALLEL_SIZE] [--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT] [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile] [--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}] [--tokenizer {space,moses,nltk}] [--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}] [--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}] [--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}] [--scoring {wer,sacrebleu,bleu,chrf}] [--task TASK] [-s SRC] [-t TARGET] [--source-lang-with-nt SRC] [--trainpref FP] [--validpref FP] [--testpref FP] [--align-suffix FP] [--conll-suffix FP [FP ...]] [--dpd-suffix FP [FP ...]] [--probs-suffix FP [FP ...]] [--swm-suffix FP] [--destdir DIR] [--thresholdtgt N] [--thresholdsrc N] [--tgtdict FP] [--srcdict FP] [--labeldict FP [FP ...]] [--nwordstgt N] [--nwordssrc N] [--alignfile ALIGN] [--dataset-impl FORMAT] [--joined-dictionary] [--only-source] [--padding-factor N] [--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!

YeJinPaark, Aug 25 '23 10:08

Please cd into the directory that contains this bash script, then run cat ../../data/clang8_train/src.txt to check whether the file actually exists. If it does not, please check how you extracted the data.

HillZhang1999, Aug 25 '23 11:08
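The check suggested in the reply above can be sketched in Python so that several inputs are tested at once. This is only an illustration: the three relative paths are copied from the error messages in this thread, while the function name `find_missing_inputs` and the constant `EXPECTED_INPUTS` are hypothetical and not part of SynGEC.

```python
from pathlib import Path

# Relative paths copied from the error messages in this thread. The scripts
# appear to resolve them against the directory they are launched from.
EXPECTED_INPUTS = [
    "../../data/clang8_train/src.txt",
    "../../data/wi_locness_train/tgt.txt",
    "../../data/bea19_test/src.txt",
]


def find_missing_inputs(script_dir, relative_paths=EXPECTED_INPUTS):
    """Return the relative paths that do not point to a regular file."""
    base = Path(script_dir)
    return [rel for rel in relative_paths if not (base / rel).is_file()]


if __name__ == "__main__":
    # Assumed usage: run from the directory that holds the bash scripts.
    for rel in find_missing_inputs("."):
        print(f"missing: {rel}")
```

If every path is reported as missing, the raw text files were probably never placed under the data directory, which matches the FileNotFoundError messages in the log.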

First, I extracted the archive with "tar -zxvf syngec_preprocess.tar.gz"

and then the log is preprocess/ preprocess/chinese_hsk+lang8_with_syntax_transformer/ preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/ preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/dict.label.txt preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/dict.src.txt preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/dict.tgt.txt preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/preprocess.log preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.conll.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.conll.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.dpd.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.dpd.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.probs.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.probs.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.src-tgt.tgt.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.src-tgt.tgt.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.conll.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.conll.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.dpd.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.dpd.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.probs.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.probs.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.src-tgt.src.bin preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.src-tgt.src.idx preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.src-tgt.tgt.bin 
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.src-tgt.tgt.idx preprocess/chinese_mucgec_with_syntax_transformer/ preprocess/chinese_mucgec_with_syntax_transformer/bin/ preprocess/chinese_mucgec_with_syntax_transformer/bin/dict.label.txt preprocess/chinese_mucgec_with_syntax_transformer/bin/dict.src.txt preprocess/chinese_mucgec_with_syntax_transformer/bin/preprocess.log preprocess/chinese_mucgec_with_syntax_transformer/bin/test.conll.src-tgt.src.bin preprocess/chinese_mucgec_with_syntax_transformer/bin/test.conll.src-tgt.src.idx preprocess/chinese_mucgec_with_syntax_transformer/bin/test.dpd.src-tgt.src.bin preprocess/chinese_mucgec_with_syntax_transformer/bin/test.dpd.src-tgt.src.idx preprocess/chinese_mucgec_with_syntax_transformer/bin/test.probs.src-tgt.src.bin preprocess/chinese_mucgec_with_syntax_transformer/bin/test.probs.src-tgt.src.idx preprocess/chinese_mucgec_with_syntax_transformer/bin/test.src-tgt.src.bin preprocess/chinese_mucgec_with_syntax_transformer/bin/test.src-tgt.src.idx preprocess/english_bea19_with_syntax_bart/ preprocess/english_bea19_with_syntax_bart/bin/ preprocess/english_bea19_with_syntax_bart/bin/dict.label.txt preprocess/english_bea19_with_syntax_bart/bin/dict.src.txt preprocess/english_bea19_with_syntax_bart/bin/preprocess.log preprocess/english_bea19_with_syntax_bart/bin/test.conll.src-tgt.src.bin preprocess/english_bea19_with_syntax_bart/bin/test.conll.src-tgt.src.idx preprocess/english_bea19_with_syntax_bart/bin/test.dpd.src-tgt.src.bin preprocess/english_bea19_with_syntax_bart/bin/test.dpd.src-tgt.src.idx preprocess/english_bea19_with_syntax_bart/bin/test.probs.src-tgt.src.bin preprocess/english_bea19_with_syntax_bart/bin/test.probs.src-tgt.src.idx preprocess/english_bea19_with_syntax_bart/bin/test.src-tgt.src.bin preprocess/english_bea19_with_syntax_bart/bin/test.src-tgt.src.idx preprocess/english_bea19_with_syntax_transformer/ preprocess/english_bea19_with_syntax_transformer/dict.label.txt 
preprocess/english_bea19_with_syntax_transformer/dict.src.txt preprocess/english_bea19_with_syntax_transformer/preprocess.log preprocess/english_bea19_with_syntax_transformer/test.conll.src-tgt.src.bin preprocess/english_bea19_with_syntax_transformer/test.conll.src-tgt.src.idx preprocess/english_bea19_with_syntax_transformer/test.dpd.src-tgt.src.bin preprocess/english_bea19_with_syntax_transformer/test.dpd.src-tgt.src.idx preprocess/english_bea19_with_syntax_transformer/test.probs.src-tgt.src.bin preprocess/english_bea19_with_syntax_transformer/test.probs.src-tgt.src.idx preprocess/english_bea19_with_syntax_transformer/test.src-tgt.src.bin preprocess/english_bea19_with_syntax_transformer/test.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/ preprocess/english_clang8_with_syntax_bart/bin/ preprocess/english_clang8_with_syntax_bart/bin/dict.label.txt preprocess/english_clang8_with_syntax_bart/bin/dict.src.txt preprocess/english_clang8_with_syntax_bart/bin/dict.tgt.txt preprocess/english_clang8_with_syntax_bart/bin/preprocess.log preprocess/english_clang8_with_syntax_bart/bin/train.conll.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/train.conll.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/train.dpd.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/train.dpd.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/train.probs.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/train.probs.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/train.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/train.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/train.src-tgt.tgt.bin preprocess/english_clang8_with_syntax_bart/bin/train.src-tgt.tgt.idx preprocess/english_clang8_with_syntax_bart/bin/valid.conll.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/valid.conll.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/valid.dpd.src-tgt.src.bin 
preprocess/english_clang8_with_syntax_bart/bin/valid.dpd.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/valid.probs.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/valid.probs.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/valid.src-tgt.src.bin preprocess/english_clang8_with_syntax_bart/bin/valid.src-tgt.src.idx preprocess/english_clang8_with_syntax_bart/bin/valid.src-tgt.tgt.bin preprocess/english_clang8_with_syntax_bart/bin/valid.src-tgt.tgt.idx preprocess/english_clang8_with_syntax_transformer/ preprocess/english_clang8_with_syntax_transformer/bin/ preprocess/english_clang8_with_syntax_transformer/bin/dict.label.txt preprocess/english_clang8_with_syntax_transformer/bin/dict.src.txt preprocess/english_clang8_with_syntax_transformer/bin/dict.tgt.txt preprocess/english_clang8_with_syntax_transformer/bin/preprocess.log preprocess/english_clang8_with_syntax_transformer/bin/train.conll.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/train.conll.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/train.dpd.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/train.dpd.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/train.probs.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/train.probs.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/train.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/train.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/train.src-tgt.tgt.bin preprocess/english_clang8_with_syntax_transformer/bin/train.src-tgt.tgt.idx preprocess/english_clang8_with_syntax_transformer/bin/valid.conll.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/valid.conll.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/valid.dpd.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/valid.dpd.src-tgt.src.idx 
preprocess/english_clang8_with_syntax_transformer/bin/valid.probs.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/valid.probs.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/valid.src-tgt.src.bin preprocess/english_clang8_with_syntax_transformer/bin/valid.src-tgt.src.idx preprocess/english_clang8_with_syntax_transformer/bin/valid.src-tgt.tgt.bin preprocess/english_clang8_with_syntax_transformer/bin/valid.src-tgt.tgt.idx preprocess/english_conll14_with_syntax_bart/ preprocess/english_conll14_with_syntax_bart/bin/ preprocess/english_conll14_with_syntax_bart/bin/dict.label.txt preprocess/english_conll14_with_syntax_bart/bin/dict.src.txt preprocess/english_conll14_with_syntax_bart/bin/preprocess.log preprocess/english_conll14_with_syntax_bart/bin/test.conll.src-tgt.src.bin preprocess/english_conll14_with_syntax_bart/bin/test.conll.src-tgt.src.idx preprocess/english_conll14_with_syntax_bart/bin/test.dpd.src-tgt.src.bin preprocess/english_conll14_with_syntax_bart/bin/test.dpd.src-tgt.src.idx preprocess/english_conll14_with_syntax_bart/bin/test.probs.src-tgt.src.bin preprocess/english_conll14_with_syntax_bart/bin/test.probs.src-tgt.src.idx preprocess/english_conll14_with_syntax_bart/bin/test.src-tgt.src.bin preprocess/english_conll14_with_syntax_bart/bin/test.src-tgt.src.idx preprocess/english_conll14_with_syntax_transformer/ preprocess/english_conll14_with_syntax_transformer/dict.label.txt preprocess/english_conll14_with_syntax_transformer/dict.src.txt preprocess/english_conll14_with_syntax_transformer/preprocess.log preprocess/english_conll14_with_syntax_transformer/test.conll.src-tgt.src.bin preprocess/english_conll14_with_syntax_transformer/test.conll.src-tgt.src.idx preprocess/english_conll14_with_syntax_transformer/test.dpd.src-tgt.src.bin preprocess/english_conll14_with_syntax_transformer/test.dpd.src-tgt.src.idx preprocess/english_conll14_with_syntax_transformer/test.probs.src-tgt.src.bin 
preprocess/english_conll14_with_syntax_transformer/test.probs.src-tgt.src.idx preprocess/english_conll14_with_syntax_transformer/test.src-tgt.src.bin preprocess/english_conll14_with_syntax_transformer/test.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/ preprocess/english_error_coded_with_syntax_bart/bin/ preprocess/english_error_coded_with_syntax_bart/bin/dict.label.txt preprocess/english_error_coded_with_syntax_bart/bin/dict.src.txt preprocess/english_error_coded_with_syntax_bart/bin/dict.tgt.txt preprocess/english_error_coded_with_syntax_bart/bin/preprocess.log preprocess/english_error_coded_with_syntax_bart/bin/train.conll.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/train.conll.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/train.dpd.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/train.dpd.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/train.probs.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/train.probs.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/train.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/train.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/train.src-tgt.tgt.bin preprocess/english_error_coded_with_syntax_bart/bin/train.src-tgt.tgt.idx preprocess/english_error_coded_with_syntax_bart/bin/valid.conll.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/valid.conll.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/valid.dpd.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/valid.dpd.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/valid.probs.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/valid.probs.src-tgt.src.idx preprocess/english_error_coded_with_syntax_bart/bin/valid.src-tgt.src.bin preprocess/english_error_coded_with_syntax_bart/bin/valid.src-tgt.src.idx 
preprocess/english_error_coded_with_syntax_bart/bin/valid.src-tgt.tgt.bin preprocess/english_error_coded_with_syntax_bart/bin/valid.src-tgt.tgt.idx preprocess/english_error_coded_with_syntax_transformer/ preprocess/english_error_coded_with_syntax_transformer/bin/ preprocess/english_error_coded_with_syntax_transformer/bin/dict.label.txt preprocess/english_error_coded_with_syntax_transformer/bin/dict.src.txt preprocess/english_error_coded_with_syntax_transformer/bin/dict.tgt.txt preprocess/english_error_coded_with_syntax_transformer/bin/preprocess.log preprocess/english_error_coded_with_syntax_transformer/bin/train.conll.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/train.conll.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/train.dpd.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/train.dpd.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/train.probs.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/train.probs.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/train.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/train.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/train.src-tgt.tgt.bin preprocess/english_error_coded_with_syntax_transformer/bin/train.src-tgt.tgt.idx preprocess/english_error_coded_with_syntax_transformer/bin/valid.conll.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/valid.conll.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/valid.dpd.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/valid.dpd.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/valid.probs.src-tgt.src.bin preprocess/english_error_coded_with_syntax_transformer/bin/valid.probs.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/valid.src-tgt.src.bin 
preprocess/english_error_coded_with_syntax_transformer/bin/valid.src-tgt.src.idx preprocess/english_error_coded_with_syntax_transformer/bin/valid.src-tgt.tgt.bin preprocess/english_error_coded_with_syntax_transformer/bin/valid.src-tgt.tgt.idx preprocess/english_wi_locness_with_syntax_bart/ preprocess/english_wi_locness_with_syntax_bart/bin/ preprocess/english_wi_locness_with_syntax_bart/bin/dict.label.txt preprocess/english_wi_locness_with_syntax_bart/bin/dict.src.txt preprocess/english_wi_locness_with_syntax_bart/bin/dict.tgt.txt preprocess/english_wi_locness_with_syntax_bart/bin/preprocess.log preprocess/english_wi_locness_with_syntax_bart/bin/train.conll.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/train.conll.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/train.dpd.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/train.dpd.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/train.probs.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/train.probs.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/train.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/train.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/train.src-tgt.tgt.bin preprocess/english_wi_locness_with_syntax_bart/bin/train.src-tgt.tgt.idx preprocess/english_wi_locness_with_syntax_bart/bin/valid.conll.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/valid.conll.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/valid.dpd.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/valid.dpd.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/valid.probs.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/valid.probs.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_bart/bin/valid.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_bart/bin/valid.src-tgt.src.idx 
preprocess/english_wi_locness_with_syntax_bart/bin/valid.src-tgt.tgt.bin preprocess/english_wi_locness_with_syntax_bart/bin/valid.src-tgt.tgt.idx preprocess/english_wi_locness_with_syntax_transformer/ preprocess/english_wi_locness_with_syntax_transformer/bin/ preprocess/english_wi_locness_with_syntax_transformer/bin/dict.label.txt preprocess/english_wi_locness_with_syntax_transformer/bin/dict.src.txt preprocess/english_wi_locness_with_syntax_transformer/bin/dict.tgt.txt preprocess/english_wi_locness_with_syntax_transformer/bin/preprocess.log preprocess/english_wi_locness_with_syntax_transformer/bin/train.conll.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/train.conll.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/train.dpd.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/train.dpd.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/train.probs.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/train.probs.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/train.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/train.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/train.src-tgt.tgt.bin preprocess/english_wi_locness_with_syntax_transformer/bin/train.src-tgt.tgt.idx preprocess/english_wi_locness_with_syntax_transformer/bin/valid.conll.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/valid.conll.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/valid.dpd.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/valid.dpd.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/valid.probs.src-tgt.src.bin preprocess/english_wi_locness_with_syntax_transformer/bin/valid.probs.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/valid.src-tgt.src.bin 
preprocess/english_wi_locness_with_syntax_transformer/bin/valid.src-tgt.src.idx preprocess/english_wi_locness_with_syntax_transformer/bin/valid.src-tgt.tgt.bin preprocess/english_wi_locness_with_syntax_transformer/bin/valid.src-tgt.tgt.idx

Then I ran the bash script:

root@309e7fc0781e:/mnt/ssd_mnt/pyj/SynGEC/data# cd /mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/ root@309e7fc0781e:/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp# ls generate_syngec_bart.sh preprocess_syngec_bart.sh generate_syngec_transformer.sh preprocess_syngec_transformer.sh nohup.out train_syngec_bart.sh pipeline_gopar.sh train_syngec_transformer.sh root@309e7fc0781e:/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp# ./pipeline_gopar.sh Traceback (most recent call last): File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py", line 17, in input_sentences = load(sys.argv[1]) File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py", line 9, in load with open(filename, 'r') as f: FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt' Loading resources... Processing parallel files... Traceback (most recent call last): File "/opt/conda/bin/errant_parallel", line 8, in sys.exit(main()) File "/opt/conda/lib/python3.10/site-packages/errant/commands/parallel_to_m2.py", line 16, in main in_files = [stack.enter_context(open(i)) for i in [args.orig]+args.cor] File "/opt/conda/lib/python3.10/site-packages/errant/commands/parallel_to_m2.py", line 16, in in_files = [stack.enter_context(open(i)) for i in [args.orig]+args.cor] FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt' Traceback (most recent call last): File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/convert_gec_data_to_parsing_data_english.py", line 153, in with open(conll_file, "r") as f1: FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt.conll_predict' /opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py:181: FutureWarning: The module torch.distributed.launch is deprecated and will be removed in future. Use torchrun. Note that --use-env is set by default in torchrun. 
If your script expects --local-rank argument to be set, please change it to read from os.environ['LOCAL_RANK'] instead. See https://pytorch.org/docs/stable/distributed.html#launch-utility for further instructions

warnings.warn( WARNING:torch.distributed.run:


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


/opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep /opt/conda/bin/python: No module named supar.cmds.biaffine_dep ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 34107) of binary: /opt/conda/bin/python Traceback (most recent call last): File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code exec(code, run_globals) File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 196, in main() File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 192, in main launch(args) File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 177, in launch run(args) File "/opt/conda/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run elastic_launch( File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in call return launch_agent(self._config, self._entrypoint, list(args)) File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

supar.cmds.biaffine_dep FAILED

Failures: [1]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 1 (local_rank: 1) exitcode : 1 (pid: 34108) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html [2]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 2 (local_rank: 2) exitcode : 1 (pid: 34109) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html [3]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 3 (local_rank: 3) exitcode : 1 (pid: 34110) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html [4]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 4 (local_rank: 4) exitcode : 1 (pid: 34111) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html [5]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 5 (local_rank: 5) exitcode : 1 (pid: 34112) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html [6]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 6 (local_rank: 6) exitcode : 1 (pid: 34113) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html [7]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 7 (local_rank: 7) exitcode : 1 (pid: 34114) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure): [0]: time : 2023-08-25_12:30:09 host : 309e7fc0781e rank : 0 (local_rank: 0) exitcode : 1 (pid: 34107) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

nohup: appending output to 'nohup.out' nohup: appending output to 'nohup.out' nohup: appending output to 'nohup.out' nohup: appending output to 'nohup.out' nohup: appending output to 'nohup.out' nohup: appending output to 'nohup.out'

What's the problem, and how can I fix it?

YeJinPaark, Aug 25 '23 12:08
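Besides the missing data files, the log above also reports "No module named supar.cmds.biaffine_dep". A minimal sketch for checking whether a dotted module name can be resolved in the active Python environment is shown below. The helper name `module_available` is hypothetical, and the sketch only diagnoses the situation; it does not say which package version would provide the module.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if the module can be located by the import system."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example "supar") is itself missing.
        return False


if __name__ == "__main__":
    # Module name taken from the error message in the log above.
    name = "supar.cmds.biaffine_dep"
    print(f"{name}: {'found' if module_available(name) else 'not found'}")
```

Run it with the same interpreter the script uses (/opt/conda/bin/python in the log), because a module installed for a different interpreter will not be found.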

If you don't want to re-train the parser, you can skip the data preprocessing step entirely: the preprocessed files can be downloaded directly from our Google Drive. If you do want to re-train the parser, you must download the required datasets from their official websites and put them into the corresponding directories (as src.txt and tgt.txt, one sentence per line).

HillZhang1999, Aug 29 '23 05:08
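For anyone preparing src.txt and tgt.txt by hand, the "one sentence per line" layout described above can be sanity-checked with a short script. This sketch assumes that the two files are meant to be line-aligned (line i of src.txt corresponds to line i of tgt.txt), which is the usual convention for parallel correction data but is not stated explicitly in this thread. The function names are hypothetical.

```python
def read_sentences(path):
    """Read a text file as a list of lines without trailing newlines."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def check_parallel(src_path, tgt_path):
    """Return (sentence_count, line numbers where either side is empty).

    Raises ValueError if the two files have different line counts.
    """
    src = read_sentences(src_path)
    tgt = read_sentences(tgt_path)
    if len(src) != len(tgt):
        raise ValueError(
            f"line count mismatch: {len(src)} in {src_path}, "
            f"{len(tgt)} in {tgt_path}"
        )
    empty = [
        i
        for i, (s, t) in enumerate(zip(src, tgt), start=1)
        if not s.strip() or not t.strip()
    ]
    return len(src), empty
```

A mismatch in line counts usually means that a sentence was split across lines or dropped during conversion, so it is worth fixing before running the pipeline scripts.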

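Hello, have you solved this problem? After extracting the archive, I don't see any src.txt or tgt.txt files either. The extracted files look like this; what should I do? [Screenshot 2024-01-17 203001]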

hwlys, Jan 17 '24 12:01

Due to copyright issues, we do not provide the text files, only the preprocessed binary files, which can be used directly for training.

HillZhang1999, Jan 18 '24 07:01
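The point made in the last reply can be confirmed without reading the full extraction log. The sketch below lists the regular files in a tar archive, counts them by extension, and looks for raw text files. The archive name syngec_preprocess.tar.gz comes from the tar command quoted earlier in the thread; the function names and the `RAW_TEXT_NAMES` constant are hypothetical. File names are compared exactly, so dictionary files such as dict.src.txt are not mistaken for src.txt.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath

# Raw text file names that the pipeline scripts in this thread look for.
RAW_TEXT_NAMES = {"src.txt", "tgt.txt"}


def list_files(archive_path):
    """Return the names of all regular files in a tar archive."""
    with tarfile.open(archive_path, "r:*") as tar:
        return [m.name for m in tar.getmembers() if m.isfile()]


def count_by_suffix(names):
    """Count file names by their final extension, e.g. '.bin' or '.idx'."""
    return Counter(PurePosixPath(n).suffix for n in names)


def raw_text_files(names):
    """Return the names whose base name is exactly src.txt or tgt.txt."""
    return [n for n in names if PurePosixPath(n).name in RAW_TEXT_NAMES]


if __name__ == "__main__":
    names = list_files("syngec_preprocess.tar.gz")
    print(count_by_suffix(names))
    print(raw_text_files(names) or "no raw src.txt/tgt.txt in the archive")
```

Judging from the listing posted above, the archive holds only .bin and .idx data plus dictionary and log files, so the second call would be expected to find no raw text files.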