neural-template-gen

Where is the "train_tgt_lines.txt"?

Open happycjksh opened this issue 3 years ago • 4 comments

Please help me, thank you

happycjksh, Jun 29 '21 11:06

In make_e2e_labedata.py, you load train_tgt_lines.txt, but I can't find that file anywhere. What should I do?

happycjksh, Jun 29 '21 14:06

@swiseman Hello! I have the same problem.

I want to create my own E2E training data, and in the process I found that I need make_e2e_labedata.py.

So I ran it as cd data && python make_e2e_labedata.py "train" and found that the file train_tgt_lines.txt was missing.

https://github.com/harvardnlp/neural-template-gen/blob/master/data/make_e2e_labedata.py#L6-L9

I cannot find the following information about train_tgt_lines.txt:

  • What are the contents of the file?
  • Is there a train_tgt_lines.txt that the already existing src_train.txt was derived from?

If @swiseman has any info on this, please let me know!

Best regards,

p0x0q, Jul 21 '22 06:07

These are just the reference generations for the training set; each line of train_tgt_lines.txt has the reference generation for the corresponding line in src_train.txt.

swiseman, Jul 21 '22 14:07

@swiseman Thanks for the reply! I thought about it for a bit, and now I understand.

For the benefit of others, here is what each file looks like:

Example of src_train.txt

__start_name__ The Vaults __end_name__ __start_eatType__ pub __end_eatType__ __start_priceRange__ more than £ 30 __end_priceRange__ __start_customerrating__ 5 out of 5 __end_customerrating__ __start_near__ Café Adriatic __end_near__
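
In case it helps to see the structure of that line, here is a minimal sketch that pulls the field/value pairs out of it. The helper name parse_src_line is my own and is not part of the repository; it only assumes that every field is wrapped as __start_FIELD__ ... __end_FIELD__, as in the example above.

```python
import re

# Hypothetical helper (not from the neural-template-gen repo).
# Assumes each field looks like: __start_FIELD__ value tokens __end_FIELD__
FIELD_RE = re.compile(r"__start_(\w+?)__\s+(.*?)\s+__end_\1__")


def parse_src_line(line):
    """Return a dict mapping field name to its whitespace-tokenized value."""
    return {name: value for name, value in FIELD_RE.findall(line)}


example = (
    "__start_name__ The Vaults __end_name__ "
    "__start_eatType__ pub __end_eatType__ "
    "__start_priceRange__ more than £ 30 __end_priceRange__ "
    "__start_customerrating__ 5 out of 5 __end_customerrating__ "
    "__start_near__ Café Adriatic __end_near__"
)

fields = parse_src_line(example)
for name, value in fields.items():
    print(f"{name}: {value}")
```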

Example of train_tgt_lines.txt

The Vaults pub near Café Adriatic has a 5 star rating . Prices start at £ 30 .
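
Since line N of train_tgt_lines.txt is the reference for line N of src_train.txt, a quick sanity check before running the script is to confirm that the two files have the same number of lines. This is only a sketch: pair_lines is a hypothetical helper of mine, and it works on lists of lines so you can feed it whatever you read from your own files.

```python
def pair_lines(src_lines, tgt_lines):
    """Pair each source line with the reference on the same line number.

    Hypothetical helper (not from the repo). Raises ValueError when the
    two inputs have different lengths, since the pairing is by position.
    """
    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"line count mismatch: {len(src_lines)} source lines "
            f"vs {len(tgt_lines)} target lines"
        )
    return list(zip(src_lines, tgt_lines))


# In practice the lists would come from something like
# open("src_train.txt", encoding="utf-8").read().splitlines()
src_lines = ["__start_name__ The Vaults __end_name__ __start_eatType__ pub __end_eatType__"]
tgt_lines = ["The Vaults pub near Café Adriatic has a 5 star rating . Prices start at £ 30 ."]

pairs = pair_lines(src_lines, tgt_lines)
print(len(pairs), "aligned pair(s)")
```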

Result of running cd data && python make_e2e_labedata.py "train".

The Vaults pub near Café Adriatic has a 5 star rating . Prices start at £ 30 . <eos>|||0,2,0 2,3,1 4,6,6 11,12,7 17,18,7 18,19,8
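
To read that output, here is a small sketch that splits the line at ||| and maps each triple back to tokens. I am assuming each triple is start,end,label over whitespace-separated tokens, with end exclusive; that reading is consistent with this one example (0,2,0 covers "The Vaults", 4,6,6 covers "Café Adriatic"), but I have not checked it against the script itself. parse_labeled_line is a hypothetical name, not something in the repo.

```python
def parse_labeled_line(line):
    """Split 'tokens ... <eos>|||s,e,label s,e,label ...' into its parts.

    Hypothetical helper. Assumes spans are token indices with an
    exclusive end, which matches the example in this thread.
    """
    text, _, span_part = line.rstrip("\n").partition("|||")
    tokens = text.split()
    spans = []
    for triple in span_part.split():
        start, end, label = (int(x) for x in triple.split(","))
        spans.append((tokens[start:end], label))
    return tokens, spans


example = (
    "The Vaults pub near Café Adriatic has a 5 star rating . "
    "Prices start at £ 30 . <eos>|||0,2,0 2,3,1 4,6,6 11,12,7 17,18,7 18,19,8"
)

tokens, spans = parse_labeled_line(example)
for span_tokens, label in spans:
    print(label, " ".join(span_tokens))
```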

The output of make_e2e_labedata.py can then be used as train.txt.

Thanks for the answer! Very helpful!

p0x0q, Jul 21 '22 20:07