Assertion error during training

Open uk-ci-github opened this issue 4 years ago • 9 comments

Hello, I'm trying to reproduce the paper's results, but I get an assertion error after a few epochs while running "make train". Do you have any suggestions? Thanks!

Example 700
Original Source: b'Write unbuffered output of "python -u client.py" to standard output and to "logfile"'
Source: [b'w', b'r', b'i', b't', b'e', b' ', b'_', b'_', b'S', b'P', b'_', b'_', b'U', b'N', b'K', b' ', b'o', b'u', b't', b'p', b'u', b't', b' ', b'o', b'f', b' ', b'_', b'_', b'S', b'P', b'_', b'_', b'U', b'N', b'K', b' ', b't', b'o', b' ', b's', b't', b'a', b'n', b'd', b'a', b'r', b'd', b' ', b'o', b'u', b't', b'p', b'u', b't', b' ', b'a', b'n', b'd', b' ', b't', b'o', b' ', b'_', b'_', b'S', b'P', b'_', b'_', b'U', b'N', b'K']
GT Target 1: b'python -u client.py | tee logfile'
Prediction 1: b'' (0.0, 0)
Prediction 2: b'echo __SP__UNK | tee __SP__UNK' (0.36514837167011077, 0.2033717397090786)
Prediction 3: b'ls -l -R __SP__UNK | tee __SP__UNK' (0.3086066999241838, 0.18043239916836057)

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/workspace/nl2bash/encoder_decoder/translate.py", line 378, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/root/workspace/nl2bash/encoder_decoder/translate.py", line 374, in main
    eval(dataset, verbose=True)
  File "/root/workspace/nl2bash/encoder_decoder/translate.py", line 176, in eval
    return eval_tools.automatic_eval(prediction_path, dataset, top_k=3, FLAGS=FLAGS, verbose=verbose)
  File "/root/workspace/nl2bash/eval/eval_tools.py", line 249, in automatic_eval
    top_k, num_samples, verbose)
  File "/root/workspace/nl2bash/eval/eval_tools.py", line 361, in get_automatic_evaluation_metrics
    bleu = token_based.corpus_bleu_score(command_gt_asts_list, pred_ast_list)
  File "/root/workspace/nl2bash/eval/token_based.py", line 70, in corpus_bleu_score
    gt_tokens_list = [[data_tools.bash_tokenizer(ast, ignore_flag_order=True) for ast in gt_asts] for gt_asts in gt_asts_list]
  File "/root/workspace/nl2bash/eval/token_based.py", line 70, in <listcomp>
    gt_tokens_list = [[data_tools.bash_tokenizer(ast, ignore_flag_order=True) for ast in gt_asts] for gt_asts in gt_asts_list]
  File "/root/workspace/nl2bash/eval/token_based.py", line 70, in <listcomp>
    gt_tokens_list = [[data_tools.bash_tokenizer(ast, ignore_flag_order=True) for ast in gt_asts] for gt_asts in gt_asts_list]
  File "/root/workspace/nl2bash/bashlint/data_tools.py", line 58, in bash_tokenizer
    with_prefix=with_prefix, with_flag_argtype=with_flag_argtype)
  File "/root/workspace/nl2bash/bashlint/data_tools.py", line 250, in ast2tokens
    return to_tokens_fun(node)
  File "/root/workspace/nl2bash/bashlint/data_tools.py", line 102, in to_tokens_fun
    assert(loose_constraints or node.get_num_of_children() == 1)
AssertionError
Makefile:41: recipe for target 'train' failed
make: *** [train] Error 1

uk-ci-github avatar Feb 21 '20 14:02 uk-ci-github

The error is triggered by this line: https://github.com/TellinaTool/nl2bash/blob/master/eval/token_based.py#L70

You may temporarily bypass this by changing line 70 to

gt_tokens_list = [[data_tools.bash_tokenizer(ast, loose_constraints=True, ignore_flag_order=True) for ast in gt_asts] for gt_asts in gt_asts_list]

The loose_constraints flag tells the Bash tokenizer to produce a tokenization despite any parse tree errors it encounters. Without this flag, it throws an assertion error.

It looks like something might be wrong with the data since this error shouldn't be triggered when parsing a ground truth command.

Looks like it choked on this one:

python -u client.py | tee logfile

I'm going to debug the bash_tokenizer further and get back to you.
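For readers unfamiliar with the flag, the failing check follows a common strict/loose tokenizer pattern: in strict mode the tree walker asserts that a wrapper node has exactly one child, while loose mode tokenizes whatever children are present. A hypothetical minimal sketch (Node and to_tokens here are illustrative stand-ins, not the actual classes in bashlint/data_tools.py):

```python
# Hypothetical sketch of the strict/loose assertion pattern;
# Node and to_tokens are illustrative, not the repo's real classes.
class Node:
    def __init__(self, kind, children=None):
        self.kind = kind
        self.children = children or []

    def get_num_of_children(self):
        return len(self.children)


def to_tokens(node, loose_constraints=False):
    if node.children:
        # Strict mode insists a wrapper node has exactly one child;
        # loose mode tokenizes whatever children are present.
        assert loose_constraints or node.get_num_of_children() == 1
        tokens = []
        for child in node.children:
            tokens.extend(to_tokens(child, loose_constraints))
        return tokens
    return [node.kind]


# A parse that left two children under a wrapper node trips the strict
# assert but still tokenizes with loose_constraints=True.
broken = Node("root", [Node("python"), Node("tee")])
print(to_tokens(broken, loose_constraints=True))  # ['python', 'tee']
```

With loose_constraints=True the walker degrades gracefully instead of asserting, which is why the workaround unblocks training.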

todpole3 avatar Mar 05 '20 09:03 todpole3

Thanks a lot for your suggestion; I'll try it in the meantime.

uk-ci-github avatar Mar 05 '20 09:03 uk-ci-github

Hello. When I ran the restored code, I hit the same problem. I had also modified loose_constraints, but I still get an error. I hope you can guide me. Thank you very much.

Makefile:41: recipe for target 'train' failed
make: *** [train] Error 1

cjy-cc avatar Mar 31 '20 10:03 cjy-cc

Hello, after adding loose_constraints=True I managed to train 5 of the 7 models successfully. Their performance is much lower than the numbers reported in NL2Bash's Table 15 for the automatic evaluation on the dev set. I'm not sure the BLEU score is computed correctly, because I get suspicious warnings. For example, bash-token.sh --decode --gpu 0 reports

[...]
The hypothesis contains 0 counts of 4-gram overlaps.
Therefore the BLEU score evaluates to 0, independently of
how many N-gram overlaps of lower order it contains.
Consider using lower n-gram order or use SmoothingFunction()
[...]
The hypothesis contains 0 counts of 2-gram overlaps.
[...]
The hypothesis contains 0 counts of 3-gram overlaps.
[...]
701 examples evaluated
Top 1 Template Acc = 0.000
Top 1 Command Acc = 0.000
Average top 1 Template Match Score = 0.066
Average top 1 BLEU Score = 0.236
Top 3 Template Acc = 0.001
Top 3 Command Acc = 0.000
Average top 3 Template Match Score = 0.156
Average top 3 BLEU Score = 0.335
Corpus BLEU = 0.035
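Those warnings come from NLTK's BLEU implementation: when a hypothesis shares zero 3- or 4-grams with its references, the unsmoothed geometric mean collapses to 0 regardless of lower-order overlap, unless a smoothing function supplies epsilon counts. A self-contained sketch of that behavior (this mimics, but does not call, NLTK's SmoothingFunction().method1):

```python
import math
from collections import Counter


def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu(reference, hypothesis, max_n=4, epsilon=None):
    """Sentence BLEU with uniform weights. epsilon smoothing mimics
    NLTK's SmoothingFunction().method1 for zero n-gram counts."""
    log_sum = 0.0
    for n in range(1, max_n + 1):
        hyp = Counter(ngrams(hypothesis, n))
        ref = Counter(ngrams(reference, n))
        overlap = sum(min(c, ref[g]) for g, c in hyp.items())
        total = max(sum(hyp.values()), 1)
        if overlap == 0:
            if epsilon is None:
                return 0.0      # unsmoothed: any zero count zeroes BLEU
            overlap = epsilon   # method1-style: replace zero with a tiny count
        log_sum += math.log(overlap / total) / max_n
    # brevity penalty for hypotheses shorter than the reference
    bp = min(1.0, math.exp(1 - len(reference) / len(hypothesis)))
    return bp * math.exp(log_sum)


ref = "python -u client.py | tee logfile".split()
hyp = "echo x | tee y".split()
print(bleu(ref, hyp))                # 0.0: no 3-gram overlap
print(bleu(ref, hyp, epsilon=0.1))   # small but nonzero
```

So a near-zero corpus BLEU with these warnings does not necessarily mean the evaluation code is broken; it can simply reflect predictions that never match any higher-order n-gram.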

uk-ci-github avatar Apr 03 '20 07:04 uk-ci-github

bash-char.sh crashes at the end of the 1st epoch with this output

100%|█████████████████████████████████████| 4000/4000 [4:32:36<00:00,  4.09s/it]
Training loss = nan is too large.
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/workspace/nl2bash/encoder_decoder/translate.py", line 378, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/root/workspace/nl2bash/encoder_decoder/translate.py", line 353, in main
    train(train_set, dataset)
  File "/root/workspace/nl2bash/encoder_decoder/translate.py", line 111, in train
    raise graph_utils.InfPerplexityError
encoder_decoder.graph_utils.InfPerplexityError

uk-ci-github avatar Apr 03 '20 07:04 uk-ci-github

bash-copy-partial-token.sh crashes even before starting the training with this output

Bashlint grammar set up (124 utilities)

Reading data from /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash
Saving models to /root/workspace/etsiCuts/nl2bash/encoder_decoder/../model/seq2seq
Loading data from /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash
source file: /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash/train.nl.filtered
target file: /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash/train.cm.filtered
9985 data points read.
[...]
Loading data from /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash
source vocabulary size = 1570
target vocabulary size = 1214
max source token size = 19
max target token size = 40
source file: /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash/train.nl.filtered
target file: /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash/train.cm.filtered
source tokenized sequence file: /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash/train.nl.partial.token
target tokenized sequence file: /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash/train.cm.partial.token
9985 data points read.
max_source_length = 181
max_target_length = 205
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/workspace/etsiCuts/nl2bash/encoder_decoder/translate.py", line 378, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/root/workspace/etsiCuts/nl2bash/encoder_decoder/translate.py", line 320, in main
    train_set, dev_set, test_set = data_utils.load_data(FLAGS, use_buckets=True)
  File "/root/workspace/etsiCuts/nl2bash/encoder_decoder/data_utils.py", line 137, in load_data
    use_buckets=use_buckets, add_start_token=True, add_end_token=True)
  File "/root/workspace/etsiCuts/nl2bash/encoder_decoder/data_utils.py", line 251, in read_data
    sc_copy_tokens, tg_copy_tokens, vocab.tg_vocab, token_ext)
  File "/root/workspace/etsiCuts/nl2bash/encoder_decoder/data_utils.py", line 738, in compute_copy_indices
    assert(len(sc_tokens) == len(sc_copy_tokens))
AssertionError
Makefile:41: recipe for target 'train' failed
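The failing assert compares the lengths of two tokenizations of the same source sentence, so one way to narrow this down is a pre-flight scan for training examples where the two tokenizers disagree. A hypothetical sketch (find_mismatches and the toy tokenizers are illustrative; you would substitute the repo's actual tokenizer functions):

```python
# Hypothetical pre-flight check: find examples whose two tokenizations
# disagree in length -- the condition the assert in compute_copy_indices
# enforces. tokenize / partial_tokenize are toy stand-ins.
def find_mismatches(sources, tokenize, partial_tokenize):
    bad = []
    for i, source in enumerate(sources):
        if len(tokenize(source)) != len(partial_tokenize(source)):
            bad.append((i, source))
    return bad


def tokenize(s):
    """Toy stand-in for the full tokenizer: whitespace split."""
    return s.split()


def partial_tokenize(s):
    """Toy stand-in that additionally splits on '.'."""
    return s.replace(".", " ").split()


print(find_mismatches(["list files", "run client.py"],
                      tokenize, partial_tokenize))
# -> [(1, 'run client.py')]: 2 full tokens vs 3 partial tokens
```

Running such a scan over train.nl.filtered before training would point at the specific data points that trip the assertion.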

uk-ci-github avatar Apr 03 '20 07:04 uk-ci-github

I ran into the same problem; any idea how to fix it?

bash-copy-partial-token.sh crashes even before starting the training, with the same AssertionError at data_utils.py line 738 (traceback identical to the one above).

QuinVIVER avatar Oct 10 '21 02:10 QuinVIVER

bash-char.sh crashes at the end of the 1st epoch with "Training loss = nan is too large." and the same InfPerplexityError traceback as above.

I hit this too.

QuinVIVER avatar Oct 11 '21 03:10 QuinVIVER

Hello, have you solved this problem (the AssertionError)? I have tried many times but failed. Please share your solution. Thank you very much!

NingYueran avatar Jun 26 '22 10:06 NingYueran