Error when using --summarize with matcher.py
Hi,
In your README it says that the `--summarize` flag needs to be passed to matcher.py if it was also used at training time. When I do so, I get the following error:
```
Defaults for this optimization level are:
enabled                : True
opt_level              : O2
cast_model_type        : torch.float16
patch_torch_functions  : False
keep_batchnorm_fp32    : True
master_weights         : True
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O2
cast_model_type        : torch.float16
patch_torch_functions  : False
keep_batchnorm_fp32    : True
master_weights         : True
loss_scale             : dynamic
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "matcher.py", line 242, in <module>
    dk_injector=dk_injector)
  File "matcher.py", line 149, in predict
    pairs.append((to_str(row[0], summarizer, max_len, dk_injector),
  File "matcher.py", line 49, in to_str
    content = summarizer.transform(content, max_len=max_len)
  File "/content/drive/My Drive/Master Thesis/ditto/repo/ditto/ditto/summarize.py", line 75, in transform
    sentA, sentB, label = row.strip().split('\t')
ValueError: not enough values to unpack (expected 3, got 1)
```
Without the `--summarize` flag, matcher.py runs fine.
Is there any workaround to use matcher.py with summarization?
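For what it's worth, the traceback suggests that `transform()` in summarize.py assumes a tab-separated training row (`sentA\tsentB\tlabel`), while matcher.py hands it a single serialized entry with no tabs, so the three-way unpack fails. A possible workaround is a defensive split that tolerates rows with fewer fields. This is only a sketch of the idea, not the repo's actual API; `split_row` is a hypothetical helper name:

```python
def split_row(row: str):
    """Split a Ditto-style input row on tabs, padding missing
    fields with None instead of raising ValueError.

    Training rows look like "sentA\tsentB\tlabel"; at matching
    time a row may be a single serialized entry with no tabs.
    """
    fields = row.strip().split('\t')
    sentA = fields[0]
    sentB = fields[1] if len(fields) > 1 else None
    label = fields[2] if len(fields) > 2 else None
    return sentA, sentB, label


# A three-field training row unpacks as before:
print(split_row("COL name VAL iphone\tCOL name VAL galaxy\t0"))
# A single serialized entry no longer crashes:
print(split_row("COL name VAL iphone"))
```

Inside `transform()`, the summarization step could then be applied to whichever of `sentA`/`sentB` is present, skipping the missing fields.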
I have encountered a similar issue.
```
Downloading: 100%|███████████████████████████| 28.0/28.0 [00:00<00:00, 21.7kB/s]
Downloading: 100%|██████████████████████████████| 483/483 [00:00<00:00, 401kB/s]
Downloading: 100%|███████████████████████████| 226k/226k [00:00<00:00, 38.4MB/s]
Downloading: 100%|███████████████████████████| 455k/455k [00:00<00:00, 40.8MB/s]
Downloading: 100%|███████████████████████████| 256M/256M [00:05<00:00, 49.4MB/s]
Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertModel: ['vocab_layer_norm.weight', 'vocab_transform.bias', 'vocab_projector.bias', 'vocab_layer_norm.bias', 'vocab_projector.weight', 'vocab_transform.weight']
- This IS expected if you are initializing DistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Selected optimization level O2: FP16 training with FP32 batchnorm and FP32 master weights.
Defaults for this optimization level are:
enabled                : True
opt_level              : O2
cast_model_type        : torch.float16
patch_torch_functions  : False
keep_batchnorm_fp32    : True
master_weights         : True
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O2
cast_model_type        : torch.float16
patch_torch_functions  : False
keep_batchnorm_fp32    : True
master_weights         : True
loss_scale             : dynamic
Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ModuleNotFoundError("No module named 'amp_C'",)
/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/apex/amp/_initialize.py:25: UserWarning: An input tensor was not cuda.
  warnings.warn("An input tensor was not cuda.")
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 32768.0
/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
step: 0, loss: 0.7943093180656433
Traceback (most recent call last):
  File "train_ditto.py", line 92, in
```