Refactor Bert Classifier to output int classes instead of numpy.int64

Open acriptis opened this issue 6 years ago • 1 comments

I have code with a custom config for the BERT paraphraser:

from deeppavlov import train_model

# BASE_PATH is assumed to be defined earlier in the script.
DATA_PATH = BASE_PATH + "data/"

para_config = {
  "dataset_reader": {
    "class_name": "csv_reader",
    "data_path": DATA_PATH,
    "do_lower_case": False,
    "delimiter": ",",
    "train_ds_fname": "micro_train.csv",
    "valid_ds_fname": "micro_valid.csv",
    "test_ds_fname": "micro_test.csv"
  },
  "dataset_iterator": {
    "class_name": "siamese_iterator",
    "seed": 243,
    "len_valid": 500
  },
  "chainer": {
    "in": ["text_a", "text_b"],
    "in_y": ["y"],
    "pipe": [
      {
        "class_name": "bert_preprocessor",
        "vocab_file": "{DOWNLOADS_PATH}/bert_models/rubert_cased_L-12_H-768_A-12_v1/vocab.txt",
        "do_lower_case": False,
        "max_seq_length": 160,
        "in": ["text_a", "text_b"],
        "out": ["bert_features"]
      },
      {
        "class_name": "bert_classifier",
        "n_classes": 2,
        "one_hot_labels": False,
        "bert_config_file": "{DOWNLOADS_PATH}/bert_models/rubert_cased_L-12_H-768_A-12_v1/bert_config.json",
        "pretrained_bert": "{DOWNLOADS_PATH}/bert_models/rubert_cased_L-12_H-768_A-12_v1/bert_model.ckpt",
        "save_path": "{MODELS_PATH}/paraphraser_rubert/model_rubert",
        "load_path": "{MODELS_PATH}/paraphraser_rubert/model_rubert",
        "keep_prob": 0.5,
        "optimizer": "tf.train:AdamOptimizer",
        "learning_rate": 2e-05,
        "learning_rate_drop_patience": 3,
        "learning_rate_drop_div": 2.0,
        "in": ["bert_features"],
        "in_y": ["y"],
        "out": ["predictions"]
      }
    ],
    "out": ["predictions"]
  },
  "train": {
    "batch_size": 32,
    "train_metrics": ["acc"],
    "metrics": ["acc"],
    "validation_patience": 7,
    "val_every_n_batches": 2,
    "log_every_n_batches": 1,
    
    "tensorboard_log_dir": "{MODELS_PATH}/paraphraser_rubert/logs",
    "show_examples": True,
  },
  "metadata": {
    "variables": {
      "ROOT_PATH": "~/.deeppavlov",
      "DOWNLOADS_PATH": "{ROOT_PATH}/downloads",
      "MODELS_PATH": "{ROOT_PATH}/models",
      "DATA_PATH": DATA_PATH
    },
    "requirements": [
      "{DEEPPAVLOV_PATH}/requirements/tf.txt",
      "{DEEPPAVLOV_PATH}/requirements/bert_dp.txt"
    ],
    "download": [      
      {
        "url": "http://files.deeppavlov.ai/deeppavlov_data/bert/rubert_cased_L-12_H-768_A-12_v1.tar.gz",
        "subdir": "{DOWNLOADS_PATH}/bert_models"
      },
      {
        "url": "http://files.deeppavlov.ai/deeppavlov_data/classifiers/paraphraser_rubert_v0.tar.gz",
        "subdir": "{ROOT_PATH}/models"
      }
    ]
  }
}

paraphraser_model = train_model(para_config)

This code fails with the following error:

Traceback (most recent call last):
  File "runparaphraser_train.py", line 100, in <module>
    paraphraser_model = train_model(para_config)
  File "/home/alx/Cloud/dns/.venv3/lib/python3.6/site-packages/deeppavlov-0.4.0-py3.6.egg/deeppavlov/__init__.py", line 31, in train_model
    train_evaluate_model_from_config(config, download=download, recursive=recursive)
  File "/home/alx/Cloud/dns/.venv3/lib/python3.6/site-packages/deeppavlov-0.4.0-py3.6.egg/deeppavlov/core/commands/train.py", line 121, in train_evaluate_model_from_config
    trainer.train(iterator)
  File "/home/alx/Cloud/dns/.venv3/lib/python3.6/site-packages/deeppavlov-0.4.0-py3.6.egg/deeppavlov/core/trainers/nn_trainer.py", line 294, in train
    self.train_on_batches(iterator)
  File "/home/alx/Cloud/dns/.venv3/lib/python3.6/site-packages/deeppavlov-0.4.0-py3.6.egg/deeppavlov/core/trainers/nn_trainer.py", line 234, in train_on_batches
    self._validate(iterator)
  File "/home/alx/Cloud/dns/.venv3/lib/python3.6/site-packages/deeppavlov-0.4.0-py3.6.egg/deeppavlov/core/trainers/nn_trainer.py", line 178, in _validate
    print(json.dumps(report, ensure_ascii=False))
  File "/usr/lib/python3.6/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'int64' is not JSON serializable

The code fails because BertClassifier returns its predictions as numpy.int64 elements (https://github.com/deepmipt/DeepPavlov/blob/master/deeppavlov/models/bert/bert_classifier.py#L242), while during validation the trainer builds a report, serializes it with json.dumps, and prints it. The standard json encoder cannot serialize numpy.int64 values.
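A minimal sketch of the failure outside DeepPavlov, assuming only that NumPy is installed. The `report` dictionary is a made-up stand-in, not the trainer's actual report structure:

```python
import json

import numpy as np

# Hypothetical stand-in for a validation report that holds a model prediction.
report = {"examples": [{"y_predicted": np.int64(1), "y_true": 1}]}

try:
    json.dumps(report, ensure_ascii=False)
    failed = False
except TypeError as err:
    # The json encoder only knows built-in types, so the NumPy scalar is rejected.
    failed = True
    print(f"TypeError: {err}")

# Casting the prediction to a built-in int makes the report serializable.
report["examples"][0]["y_predicted"] = int(report["examples"][0]["y_predicted"])
serialized = json.dumps(report, ensure_ascii=False)
print(serialized)
```

The exact wording of the error message varies between Python versions, but the exception type is the same as in the traceback above.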

@mu-arkhipov helped me resolve the problem in my case by adding the following lines to BertClassifierModel.__call__ just before the return:

        if pred.ndim == 1:
            pred = [int(p) for p in pred]

This resolves the problem for my case only. Could anybody offer a more general solution? @dilyararimovna

acriptis avatar Jul 19 '19 12:07 acriptis

Yes, we could cast the report outputs to built-in Python types in nn_trainer. @yoptar

yurakuratov avatar May 14 '20 09:05 yurakuratov
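One way the cast suggested above could look, as a rough sketch rather than the change actually made in nn_trainer. The helper name `to_builtin` and the report layout are hypothetical; the only library calls used are `json.dumps` and NumPy's `tolist()`:

```python
import json

import numpy as np


def to_builtin(obj):
    """Recursively replace NumPy scalars and arrays with built-in Python types."""
    if isinstance(obj, dict):
        # Keys are left untouched; only values are converted.
        return {key: to_builtin(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_builtin(item) for item in obj]
    if isinstance(obj, (np.generic, np.ndarray)):
        # tolist() yields built-in int/float/bool values (nested lists for arrays).
        return obj.tolist()
    return obj


# Hypothetical report; the real trainer report has a different layout.
report = {
    "metrics": {"acc": np.float32(0.5)},
    "examples": [{"y_predicted": np.int64(1)}, {"y_predicted": np.array([0, 1])}],
}
serialized = json.dumps(to_builtin(report), ensure_ascii=False)
print(serialized)
```

Converting once, just before serialization, would cover every component that returns NumPy values, so individual models such as BertClassifierModel would not each need their own cast.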

@acriptis, sorry for the late response. We tested this issue on DeepPavlov version 1.2.0 and didn’t encounter the described problem. If you are still having difficulties with this, please let us know.

Kolpnick avatar Jul 06 '23 13:07 Kolpnick