style-transfer-paraphrase Add example scripts for training CDS models

Additionally, add dict.txt files for these dataset folders

Oct 27 '21 13:10 martiansideofthemoon

Hello @martiansideofthemoon, I hope you are doing well. I am trying to set up a custom dataset for training. I have converted the text file into BPE format, but I am now getting a "dict.txt not found" error. I have attached a screenshot; please have a look and let me know. Thank you.

[Screenshot: dict.txt file not found]

Feb 08 '23 18:02 TufailAhmadSiddiq

Hi @TufailAhmadSiddiq, could you share the output of ls datasets/new_dataset/*/*?

Feb 08 '23 18:02 martiansideofthemoon

Hi, I have executed that command, and here is the result.

[Screenshot: ls datasets new_dataset]

Feb 08 '23 19:02 TufailAhmadSiddiq

Please follow instructions here, especially the first paragraph: https://github.com/martiansideofthemoon/style-transfer-paraphrase#custom-datasets

You need .txt, .label files to get it started. The first script will create the input0.bpe files for you.
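Before running the first script, a quick check of the input folder can catch a missing file early. The following is a minimal sketch, not part of the repository: the split names train, dev, and test, the expectation that each .label file has one line per line of the matching .txt file, and the helper name check_dataset are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed split names; adjust if your dataset uses different ones.
SPLITS = ("train", "dev", "test")


def check_dataset(folder):
    """Return a list of human-readable problems found in a dataset folder."""
    folder = Path(folder)
    problems = []
    for split in SPLITS:
        txt = folder / f"{split}.txt"
        label = folder / f"{split}.label"
        # Report any file that is absent and skip the line-count comparison.
        missing = [p.name for p in (txt, label) if not p.is_file()]
        if missing:
            problems.append(f"missing: {', '.join(missing)}")
            continue
        # Assumption: one label per text instance, so line counts should match.
        n_txt = len(txt.read_text(encoding="utf-8").splitlines())
        n_label = len(label.read_text(encoding="utf-8").splitlines())
        if n_txt != n_label:
            problems.append(f"{split}: {n_txt} text lines but {n_label} labels")
    return problems


if __name__ == "__main__":
    for problem in check_dataset("datasets/new_dataset"):
        print(problem)
```

An empty result means only that the files this sketch looks for are present and aligned; it says nothing about what the preprocessing scripts do afterwards.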

Feb 08 '23 19:02 martiansideofthemoon

I have created .txt and .label files for training, validation, and testing and placed them inside new_dataset. Here is the screenshot.

Feb 08 '23 19:02 TufailAhmadSiddiq

[Screenshot: inside new_dataset]

Feb 08 '23 19:02 TufailAhmadSiddiq

Hi @martiansideofthemoon, I have sent you a screenshot of my directory. Can you please tell me why this problem is occurring?

Feb 09 '23 16:02 TufailAhmadSiddiq

What's the error you get with this directory in place?

Feb 09 '23 16:02 martiansideofthemoon

The following error is occurring. However, the file dict.txt is present in datasets/new_dataset-bin.

Feb 09 '23 17:02 HassanBinAli

I suspect this error is coming from fairseq preprocessing. I think it creates the dict files for you (the entire bin folder, in fact). Maybe try running the code after temporarily renaming the bin folder to something else?

Feb 09 '23 19:02 martiansideofthemoon

The error persists. However, the run does create a folder named "new_dataset-bin". I am attaching a screenshot of what is inside the datasets/new_dataset-bin folder below.

There are two folders and one file. The input0 folder is empty; the label folder, however, has the following contents.

I am also attaching a screenshot of what I have in dict.txt below.

I have some articles on which I am trying to fine-tune this model so that it can learn the writing style used in my articles. I named this style "custom_style". 15720 is the number of entries in my train set. These files seem fine to me. So my question is: can I proceed to the fine-tuning step with input0 having nothing in it?
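For anyone who would rather share the state of the new_dataset-bin folder as text than as screenshots, here is a minimal sketch that counts the files under each entry of the folder. The helper name summarize_bin_folder is hypothetical, and the sketch only reports what is on disk; it makes no claim about which files fairseq is expected to produce.

```python
from pathlib import Path


def summarize_bin_folder(bin_folder):
    """Map each entry directly under bin_folder to its file count.

    Sub-folders map to the number of files they contain (recursively);
    plain files map to None.
    """
    summary = {}
    for entry in sorted(Path(bin_folder).iterdir()):
        if entry.is_dir():
            summary[entry.name] = sum(1 for p in entry.rglob("*") if p.is_file())
        else:
            summary[entry.name] = None
    return summary


if __name__ == "__main__":
    target = Path("datasets/new_dataset-bin")
    if target.is_dir():
        for name, count in summarize_bin_folder(target).items():
            note = "file" if count is None else f"{count} file(s)"
            print(f"{name}: {note}")
    else:
        print(f"{target} does not exist")
```

A sub-folder reported with 0 files, such as the input0 folder described above, is the one to look at when tracing where preprocessing stopped.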

Feb 11 '23 16:02 HassanBinAli
