ai-job-title-area-classification

regarding test data

Open GV67 opened this issue 4 years ago • 5 comments

Hi, when I use different test data, the sgd/mlp_results output shows only ",,,,,,,,". Could you please help with that?

GV67 • Oct 19 '20 10:10

Hello, can you show me your code using different data?

brunomichetti • Oct 19 '20 12:10

Thank you for replying. I have solved this problem; I had made some mistakes in the script. However, I have another question: would it be possible to include text in languages such as Chinese or Arabic in the TSV dataset file? If yes, could you tell me how?

GV67 • Oct 20 '20 11:10

And if we were to use a CSV file instead of a TSV file, how would we do it?

GV67 • Oct 20 '20 11:10

Hello, just change line 8 of the file data_process/tsv_to_dataset.py so that it points to the CSV file, and set the corresponding delimiter.
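For illustration only, here is a minimal sketch of what switching the delimiter can look like. It does not reproduce the actual line 8 of tsv_to_dataset.py; the helper name `load_rows` and the file name in the usage note are hypothetical, and it uses Python's standard `csv` module, which may differ from what the script really uses.

```python
import csv


def load_rows(path, delimiter="\t"):
    """Read a delimited text file into a list of rows (each row is a list of strings).

    Hypothetical helper, not taken from the repository.
    """
    # newline="" is recommended by the csv module docs when opening files for csv.reader;
    # encoding="utf-8" keeps non-ASCII text intact.
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f, delimiter=delimiter))
```

With a helper like this, a TSV file would be read with `load_rows("titles.tsv")` and a CSV file with `load_rows("titles.csv", delimiter=",")`; only the path and the delimiter change.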

mikaelapisani • Oct 22 '20 14:10

@GV67 regarding other languages: in theory it should work, since the algorithm is based on term frequencies. You would need to train and test the network with another dataset, replacing this file: data_process/data_sets/classified_titles.tsv. However, I don't know how well it would work, since supporting other languages was not one of our goals.
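One thing to check for languages such as Chinese is how titles are split into terms. The sketch below is an assumption for illustration, not the repository's tokenizer: it compares whitespace splitting with character bigrams, and all function names are hypothetical. Text written without spaces becomes a single term under whitespace splitting, so term counts carry little information.

```python
from collections import Counter


def word_terms(title):
    # Whitespace tokenization: works for space-separated languages.
    return title.lower().split()


def char_ngrams(title, n=2):
    # Character n-grams: a common fallback for text written without spaces.
    text = "".join(title.split())
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def term_frequencies(titles, tokenizer):
    # Count how often each term occurs across all titles.
    counts = Counter()
    for title in titles:
        counts.update(tokenizer(title))
    return counts
```

For example, `word_terms("软件工程师")` returns the whole title as one term, while `char_ngrams("软件工程师")` yields overlapping two-character terms that can be shared between titles.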

mikaelapisani • Oct 22 '20 15:10