thot_tm_train -s ${src_train_corpus} -t ${trg_train_corpus} -o tm_outdir error

Open buubuu opened this issue 8 years ago • 34 comments

Hello, I am currently having issues with thot_tm_train -s ${src_train_corpus} -t ${trg_train_corpus} -o tm_outdir
It keeps giving me this:

cat: /home/oluwasegun/thot_pbs_gen_batch_sw_model_sdir_3814_3821/models_per_chunk/__proc_n2.log: No such file or directory
cat: /home/oluwasegun/thot_pbs_gen_batch_sw_model_sdir_3814_3821/models_per_chunk/__proc_n3.log: No such file or directory
cat: /home/oluwasegun/thot_pbs_gen_batch_sw_model_sdir_3814_3821/models_per_chunk/__proc_n4.log: No such file or directory
cat: /home/oluwasegun/thot_pbs_gen_batch_sw_model_sdir_3814_3821/models_per_chunk/__proc_n5.log: No such file or directory
cat: /home/oluwasegun/thot_pbs_gen_batch_sw_model_sdir_3814_3821/curr_tables/generate_final_model.log: No such file or directory
Error during the execution of thot_pbs_gen_batch_sw_model (proc_chunk)
File /home/oluwasegun/tm_outdir/main/src_trg_swm.genswm_err contains information for error diagnosing

Any help? Thanks ;)

buubuu avatar Jun 23 '16 01:06 buubuu

Hi,

could you tell me what happens if after compiling and installing the package you execute: "make installcheck" ?

daormar avatar Jun 23 '16 09:06 daormar

Hello, thanks for reaching out. I can't remember exactly, but it gave an error for a particular file whose name I don't recall. Is that related to why thot_tm_train isn't working?

buubuu avatar Jun 23 '16 09:06 buubuu

Could you tell me which operating system you are using?

Additionally, could you please paste the output you get when executing echo $0 ?

daormar avatar Jun 23 '16 10:06 daormar

I'm using Ubuntu and the output is bash.

buubuu avatar Jun 23 '16 16:06 buubuu

Ok thanks, could you please execute the following commands and attach the debug.log file that is created?

echo "hello" > a
echo "hola" > b
bash -x ${THOT_INSTALLATION_DIR}/bin/thot_pbs_gen_batch_sw_model -s a -t b -o test -n 1 -pr 1 2> debug.log

daormar avatar Jun 24 '16 11:06 daormar

Attached is the debug file.

buubuu avatar Jun 24 '16 22:06 buubuu

debug.log https://drive.google.com/file/d/0B1FPkCwCdY6PUllUbk12NWcwdzlwbG4xSDNqRXZya1RDaFMw/view?usp=drivesdk

buubuu avatar Jun 24 '16 22:06 buubuu

Hi again,

could you please follow these instructions and paste the results? thanks.

  1. Go to the directory where you installed Thot.
  2. Execute the following:

echo "hello" > a
echo "hola" > b
bin/thot_gen_sw_model -s a -t b -o test -n 5 -eb

NOTE: ensure that bin/thot_gen_sw_model executes the thot_gen_sw_model command (the output should not be that the command was not found).

daormar avatar Jun 27 '16 08:06 daormar

Hello again too, Output is

bash: bin/thot_gen_sw_model: No such file or directory

buubuu avatar Jun 27 '16 09:06 buubuu

Hello sorry for the first one. Here is the command with modification.

thot_gen_sw_model -s a -t b -o test -n 5 -eb

And output:

Reading dynamic class information file: /usr/local/share/thot/ini_files/master.ini
Found entry for class BaseWordPenaltyModel, so file: /usr/local/lib/word_penalty_model_factory.so, init parameters:
Found entry for class BaseNgramLM, so file: /usr/local/lib/incr_jel_mer_ngram_lm_factory.so, init parameters:
Found entry for class BaseSwAligModel, so file: /usr/local/lib/incr_hmm_p0_alig_model_factory.so, init parameters:
Found entry for class BasePhraseModel, so file: /usr/local/lib/wba_incr_phrase_model_factory.so, init parameters:
Found entry for class BaseErrorCorrectionModel, so file: /usr/local/lib/pfsm_ecm_for_wg_factory.so, init parameters:
Found entry for class BaseEcModelForNbUcat, so file: /usr/local/lib/non_pb_ec_model_for_nb_ucat_factory.so, init parameters:
Found entry for class BaseWgProcessorForAnlp, so file: /usr/local/lib/wg_processor_for_anlp__pfsm_factory.so, init parameters:
Found entry for class BaseScorer, so file: /usr/local/lib/mira_bleu_factory.so, init parameters:
Found entry for class BaseLogLinWeightUpdater, so file: /usr/local/lib/kb_mira_ll_wu_factory.so, init parameters:
Found entry for class BaseTranslationConstraints, so file: /usr/local/lib/translation_constraints_factory.so, init parameters:
Found entry for class BaseStackDecoder, so file: /usr/local/lib/multi_stack_decoder_rec__swli_factory.so, init parameters:
Found entry for class BaseAssistedTrans, so file: /usr/local/lib/wg_uncoupled_assisted_trans__swli_factory.so, init parameters:
Opening module /usr/local/lib/incr_hmm_p0_alig_model_factory.so ... Done, typeid: IncrHmmP0AligModel
Initializing sentence handler...
Reading sentence pairs from files: a and b
#Sentence pairs in files: 1
Starting EM iterations...
Iter: 1 , log-likelihood= -0.124529 , norm-ll= -0.124529
Iter: 2 , log-likelihood= -0.0779615 , norm-ll= -0.0779615
Iter: 3 , log-likelihood= -0.0779615 , norm-ll= -0.0779615
Iter: 4 , log-likelihood= -0.0779615 , norm-ll= -0.0779615
Iter: 5 , log-likelihood= -0.0779615 , norm-ll= -0.0779615
Closing module /usr/local/lib/incr_hmm_p0_alig_model_factory.so
oluwasegun@oluwasegun:~/thot$

buubuu avatar Jun 27 '16 09:06 buubuu

Hi again,

it seems that this tool is working correctly, so I need to return to the previous check. Could you please generate again the debug.log file that I mentioned in a previous post? Use the following instructions:

  1. Go to the directory where you installed Thot.
  2. Execute the following:

bash -x bin/thot_pbs_gen_batch_sw_model -s share/thot/toy_corpus/sp_tok_lc.train -t share/thot/toy_corpus/en_tok_lc.train -o test -n 1 -pr 1 2> debug.log

daormar avatar Jun 27 '16 09:06 daormar

Here is the output in the debug.log file generated:

NOTE: see file /home/oluwasegun/thot_pbs_gen_batch_sw_model_sdir_2285_4470/log to track model estimation progress

buubuu avatar Jun 27 '16 09:06 buubuu

Sorry, I have edited the command. Since the tool appears to be working as well, I have redefined the command and added "bash -x" at the beginning. Could you please generate the file again? Thanks.
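For context, "bash -x" prints each command to standard error before running it, so "2> debug.log" collects that trace in a file. A minimal sketch with a throwaway script follows; the trace_script helper and the file names are illustrative assumptions, not part of Thot:

```shell
#!/usr/bin/env bash
# trace_script is a hypothetical helper, not a Thot tool.
# It runs a script under "bash -x" and stores the trace (stderr) in a log file.
trace_script() {
    local script="$1" log="$2"
    shift 2
    bash -x "$script" "$@" 2> "$log"
}

# Example with a throwaway script that only echoes a word.
demo_dir=$(mktemp -d)
printf 'echo "hello"\n' > "$demo_dir/demo.sh"
trace_script "$demo_dir/demo.sh" "$demo_dir/debug.log"

# The log now holds the traced commands, each prefixed with "+".
cat "$demo_dir/debug.log"
```

Each traced line starts with "+", which is why the resulting debug.log shows which step of a longer script was the last one to run.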

daormar avatar Jun 27 '16 09:06 daormar

This seems to be working as well. Perhaps the problem has something to do with the texts you are using to train the systems. Could you please execute the previous command but using your files (since debug.log will be a large file, probably it is easier if you include it as an attachment):

bash -x bin/thot_pbs_gen_batch_sw_model -s ${src_train_corpus} -t ${trg_train_corpus} -o test -n 1 -pr 1 2> debug.log

NOTE: ${src_train_corpus} and ${trg_train_corpus} represent the files you initially used to train the models

daormar avatar Jun 27 '16 10:06 daormar

debug.log https://drive.google.com/file/d/0B1FPkCwCdY6POUdvRDZjandzamZIM3NMSF9GazE3dUFQOHdj/view?usp=drivesdk

buubuu avatar Jun 27 '16 11:06 buubuu

Thanks again, so it appears to be failing with your training files. Could you please execute this very similar command and attach again the debug.log file?

bash -x bin/thot_gen_batch_sw_model_mr -s ${src_train_corpus} -t ${trg_train_corpus} -o test -n 1 2> debug.log

daormar avatar Jun 27 '16 11:06 daormar

debug.log https://drive.google.com/file/d/0B1FPkCwCdY6PUFYwSFgxZ0dCS3lvMC1ORll0eFE0ajBJQUZJ/view?usp=drivesdk

buubuu avatar Jun 27 '16 12:06 buubuu

Thanks. Would it be possible to share with me the files you are using to train the system?

Otherwise, I can continue looking for the cause of the problem without the files. I suspect that the tool thot_gen_sw_model is hitting a segmentation fault with your training files. To find the exact cause I would need to see the output of a tool called valgrind, which you can install with apt. I could give you the details in a new post. Please tell me which option you prefer.
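Before reaching for valgrind, a quick way to check the segmentation fault suspicion is to look at the exit status: a shell reports 128 plus the signal number when a program is killed by a signal, so 139 usually means SIGSEGV (signal 11 on Linux). A rough sketch, assuming bash; report_exit is a hypothetical helper and not part of Thot:

```shell
#!/usr/bin/env bash
# report_exit is a hypothetical helper, not a Thot tool.
# It runs a command and says whether the shell saw it die from a signal.
report_exit() {
    local status=0
    "$@" || status=$?
    if [ "$status" -gt 128 ]; then
        # Shells report 128 + signal number for signal deaths (139 = SIGSEGV).
        # A program can also pick such a status on purpose, so treat this as a hint.
        echo "exit status $status: killed by signal $((status - 128))"
    elif [ "$status" -ne 0 ]; then
        echo "exit status $status: ordinary failure"
    else
        echo "exit status 0: success"
    fi
    return "$status"
}
```

It could be wrapped around the command tried earlier in this thread, for example: report_exit thot_gen_sw_model -s ${src_train_corpus} -t ${trg_train_corpus} -o test -n 5 -eb. This only shows that a crash happened; finding where it happened still needs a tool such as valgrind.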

daormar avatar Jun 27 '16 14:06 daormar

Which option is faster and easier?

buubuu avatar Jun 27 '16 14:06 buubuu

If I am able to reproduce the error you are getting by using your training files, then that option is by far the faster one.

daormar avatar Jun 27 '16 14:06 daormar

Ok then. Attached are the training files.

Thanks!

buubuu avatar Jun 27 '16 14:06 buubuu

Hello again, thanks for the support yesterday, just checking in to know if there's any update.

Thanks again :)

buubuu avatar Jun 28 '16 07:06 buubuu

Hi again,

the problem is that I cannot find the files. Could you make sure that they are really attached?

daormar avatar Jun 28 '16 07:06 daormar

Ohk... I'll attach them again. The name of the file is toy_corpus.zip

toy_corpus.zip https://drive.google.com/file/d/0B1FPkCwCdY6PRnFWTUtTeW1OUlk1WjFoblRrYTJJRWl1WGVJ/view?usp=drivesdk

buubuu avatar Jun 28 '16 07:06 buubuu

Hi,

it seems that the problem is the length of the training sentences. In your .train files I see very long ones (for instance, there is one English sentence composed of 491 words). Such long sentences are problematic, since the memory requirements of model estimation increase enormously with sentence length.

It is very important to clean the training corpus before estimating the models (it seems that you have cleaned only the development and test corpora). By default, the cleaning tool included in Thot filters out all sentences with more than 80 words. In your corpus, 99% of the sentences are of this length or shorter, so the cleaning process discards only a very small portion of the training set while greatly reducing the computational requirements of training.
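The length filter described above can be sketched as follows. This is not Thot's own cleaning tool, only a generic illustration with awk: the function names and output file names are hypothetical, dropping empty lines is an added assumption, and only the 80-word threshold comes from the comment above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Thot) for inspecting and filtering
# a parallel corpus with one sentence per line in each file.

# Print the length, in whitespace-separated words, of the longest line.
max_sentence_length() {
    awk 'NF > max { max = NF } END { print max + 0 }' "$1"
}

# Keep only the sentence pairs in which both sides have between 1 and
# max_len words. Pair i is line i of src together with line i of trg.
filter_parallel_corpus() {
    local src="$1" trg="$2" max_len="$3" out_src="$4" out_trg="$5"
    # Create (or empty) the output files so they exist even if no pair is kept.
    : > "$out_src"
    : > "$out_trg"
    awk -v trg="$trg" -v max="$max_len" -v osrc="$out_src" -v otrg="$out_trg" '
    {
        src_line = $0
        n_src = NF
        # Read the matching line of the target file; stop if it runs out.
        if ((getline trg_line < trg) <= 0) exit
        n_trg = split(trg_line, tmp, " ")
        if (n_src >= 1 && n_src <= max && n_trg >= 1 && n_trg <= max) {
            print src_line > osrc
            print trg_line > otrg
        }
    }' "$src"
}
```

For example, max_sentence_length ${trg_train_corpus} shows whether very long sentences are present, and filter_parallel_corpus ${src_train_corpus} ${trg_train_corpus} 80 src.clean trg.clean writes the filtered pair of files. Both sides are filtered together so that the two files stay aligned line by line.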

Could you please try to clean the training corpus and see if now it works?

daormar avatar Jun 28 '16 10:06 daormar

Hello again, It worked! :) Thanks! But then this command

thot_prepare_sys_for_test -c tune/tuned_for_dev.cfg -t ${src_test_corpus} -o systest

Gives this error:

Error! file tune/tuned_for_dev.cfg does not exist

buubuu avatar Jun 28 '16 23:06 buubuu

Hi again,

could you show me the command line that you executed before thot_prepare_sys_for_test? (the command is thot_smt_tune, but I would like to see the parameters you used). Do you receive any error message when you execute such command line?
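Since thot_prepare_sys_for_test reads a file that the tuning step is expected to write, a small guard can make a failed or skipped tuning run visible before the next command starts. A minimal sketch; require_file is a hypothetical helper and not part of Thot:

```shell
#!/usr/bin/env bash
# require_file is a hypothetical helper, not a Thot tool.
# It fails with a message when an expected input file is missing.
require_file() {
    local path="$1"
    if [ ! -f "$path" ]; then
        echo "Missing file: $path (did the previous step finish without errors?)" >&2
        return 1
    fi
}
```

With it, the command from the previous comment could be written as: require_file tune/tuned_for_dev.cfg && thot_prepare_sys_for_test -c tune/tuned_for_dev.cfg -t ${src_test_corpus} -o systest.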

daormar avatar Jun 29 '16 08:06 daormar

Hello again, attached is the command-line output after rerunning thot_smt_tune and then thot_prepare_sys_for_test.

buubuu avatar Jun 29 '16 14:06 buubuu

Hi again, I cannot find the files. Could you please ensure they are attached?

daormar avatar Jun 29 '16 17:06 daormar

Hello, sorry about that. Will do so now.

buubuu avatar Jun 29 '16 19:06 buubuu