cnn-text-classification-tf

More than two labels.

Open johnp2266 opened this issue 7 years ago • 2 comments

Hi,

I'm new to this. My dataset has 7 different types of data. What do I need to change in eval.py to load my dataset?

johnp2266, Jul 02 '18 19:07

Take a look at the file data_helpers.py. There is a method, load_data_and_labels(positive_data_file, negative_data_file), which returns the data and labels.

You can add a new method that reads in your own data, turns your labels into one-hot encoded form, and returns both. In eval.py, change the code to call your new method.

Something like:

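import numpy as np

def load_multilabel_data_and_labels():
    # Add code here to read in your own data. It has to produce:
    #   new_texts: one string per example
    #   labels:    one row of raw label values per example
    # The empty lists below are placeholders only.
    new_texts, labels = [], []

    # Turn your labels into one-hot encoded form.
    # Here there are 5 classes, so each label becomes a vector of length 5,
    # e.g. [1,0,0,0,0], where exactly one entry is 1 and the rest are 0.
    # Meaning [class1=1, class2=0, class3=0, ...]

    nr_classes = 5
    nr_lines = len(labels)
    new_labels = np.zeros((nr_lines, nr_classes))

    # The conditions below are specific to this example's raw labels;
    # replace them with the mapping that fits your own data.
    for labelnr, value in enumerate(labels):
        if value[0] == 1:
            new_labels[labelnr][0] = 1  # set the one-hot entry for class 1

        elif value[0] == 0.7:
            new_labels[labelnr][1] = 1

        elif value[0] == 0.5:
            new_labels[labelnr][2] = 1

        elif value[1] == 0.7:
            new_labels[labelnr][3] = 1

        elif value[0] == 0:
            new_labels[labelnr][4] = 1

    x_text = new_texts
    y = new_labels
    return [x_text, y]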
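
If your labels are already integer class ids (0 to 6 for 7 classes), the one-hot step can be done without an if/elif chain. This is only a sketch: the helper name one_hot and the example ids are made up for illustration and are not part of data_helpers.py.

```python
import numpy as np

def one_hot(class_ids, nr_classes):
    """Return an array of shape (len(class_ids), nr_classes) with a single 1 per row."""
    class_ids = np.asarray(class_ids, dtype=int)
    if class_ids.size and (class_ids.min() < 0 or class_ids.max() >= nr_classes):
        raise ValueError("class ids must be in the range [0, nr_classes)")
    # Row i of the identity matrix is the one-hot vector for class i,
    # so indexing with the ids picks one row per example.
    return np.eye(nr_classes, dtype=np.float32)[class_ids]

# Hypothetical example: four lines of text, labels drawn from 7 classes.
example_ids = [0, 3, 6, 3]
y = one_hot(example_ids, nr_classes=7)
print(y.shape)  # (4, 7)
```

The number of columns in y is the number of classes, so for 7 classes every label row has length 7.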

In eval.py, find this call and replace it with a call to your new method:

x_text, y = data_helpers.load_data_and_labels(FLAGS.positive_data_file, FLAGS.negative_data_file)
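
If the evaluation code compares predicted class indices against the true labels (an assumption worth checking in your copy of eval.py), the one-hot rows can be converted back to class indices with np.argmax. A small sketch with made-up labels:

```python
import numpy as np

# Hypothetical one-hot labels for three examples and 7 classes.
y = np.zeros((3, 7))
y[0, 2] = 1
y[1, 0] = 1
y[2, 6] = 1

# argmax along axis 1 returns the column index of the 1 in each row,
# which is the class index of that example.
class_ids = np.argmax(y, axis=1)
print(class_ids)  # [2 0 6]
```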

jannenev, Jul 11 '18 08:07

Is this enough? Do I need to change the definition of the model?

applepieiris, Jun 29 '21 03:06