bert-toxic-comments-multilabel

Wrong number of labels

Open pablogps opened this issue 5 years ago • 0 comments

The function get_labels reads the labels from the source CSV files, and the length of the resulting list sets the size of the last layer in the model. However, the first column is the ID of the document, not a label, so the count is one too high. This causes a size mismatch in the model, which is then unable to train.

if self.labels == None:
    self.labels = list(pd.read_csv(os.path.join(self.data_dir, "classes.txt"), header=None)[0].values)

Removing the first value (or setting num_labels = len(labels) - 1) fixes this problem.
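A minimal sketch of the first option, assuming the labels come from the header row of a CSV whose first column is the document ID. The sample header, the helper name get_label_names, and its id_columns parameter are hypothetical and only illustrate the idea; they are not taken from the repository.

```python
import csv
import io

# Hypothetical sample data: the first column is the document ID,
# the remaining columns are the labels. Real column names may differ.
SAMPLE_CSV = "id,toxic,obscene,insult\n0001,0,1,0\n"


def get_label_names(csv_file, id_columns=1):
    """Return the header names, skipping the leading non-label columns."""
    header = next(csv.reader(csv_file))
    return header[id_columns:]


labels = get_label_names(io.StringIO(SAMPLE_CSV))

# The size of the last layer should match the number of real labels only.
num_labels = len(labels)
```

Dropping the ID column where the labels are read keeps len(labels) correct everywhere it is used, which seems less error-prone than subtracting 1 at each call site.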

pablogps, Apr 09 '19 15:04