ditto
Code for the paper "Deep Entity Matching with Pre-trained Language Models"
training
I was trying to execute the training code on a CPU with the following hyperparameters: python train_ditto.py \ --task Structured/Beer \ --batch_size 64 \ --max_len 64 \ --lr 3e-5 \...
`!CUDA_VISIBLE_DEVICES=0 python train_ditto.py \ --task Textual/Company \ --batch_size 32 \ --max_len 128 \ --lr 3e-5 \ --n_epochs 20 \ --finetuning \ --lm roberta \ --fp16 \ --da drop_col` `step: 0,...
When I run the code I get three F1 scores from different epochs. Which F1 should we report as the final accuracy, following the paper? This is the example...
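For questions like the one above, the usual protocol in entity-matching papers (and, as far as I can tell, the one Ditto follows; treat this as an assumption, not a statement of the authors' exact code) is to report the *test* F1 from the epoch with the highest *validation* F1, rather than the best test F1 directly. A toy sketch with made-up numbers:

```python
# Toy per-epoch scores (illustrative values only, not real Ditto output).
epochs = [
    {"epoch": 1, "dev_f1": 0.71, "test_f1": 0.69},
    {"epoch": 2, "dev_f1": 0.78, "test_f1": 0.74},
    {"epoch": 3, "dev_f1": 0.75, "test_f1": 0.76},
]

# Model selection: pick the epoch by validation F1, then report its test F1.
best = max(epochs, key=lambda e: e["dev_f1"])
print(best["test_f1"])  # 0.74 — test F1 at the best-validation epoch, not 0.76
```

Note that epoch 3 has the higher test F1 (0.76), but selecting on test would be peeking at the test set; the reported number is 0.74.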
When I try to use data augmentation with drop_col, I get the error below. I did not change anything about the model or data, is there something I'm missing?
Hi, in your readme it says that the --summarize flag needs to be specified for matcher.py if it was also specified at training time. When I do so I get...
How do I run inference on unseen data after training?
Hello, I am trying to run the training code but I come to this error: ``` 2020-11-02 07:36:08.658676: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1 Downloading: 100% 232k/232k [00:00
Hi ditto creators, I am currently working on a short Masters project, and I discovered your library. We are performing de-duplication on the Cora citations dataset using **py_entitymatching** and **deepmatcher**....
As the title says, I am wondering whether you also added special tokens like [COL] and [VAL], along with the attribute names, to the BERT vocabulary?
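For context on the question above: with HuggingFace `transformers`, adding markers like [COL] and [VAL] is typically done via `tokenizer.add_special_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`; whether Ditto does this or lets the tokenizer split the markers into subwords is exactly what the issue asks. The dependency-free sketch below only illustrates the vocabulary-extension idea, using a toy stand-in for BERT's vocab:

```python
# Toy stand-in for a pre-trained vocabulary (real BERT has ~30k entries).
vocab = {"[CLS]": 0, "[SEP]": 1, "beer": 2, "name": 3}

# Appending serialization markers so they map to single, dedicated ids
# instead of being split into subword pieces.
special_tokens = ["[COL]", "[VAL]"]
for tok in special_tokens:
    if tok not in vocab:
        vocab[tok] = len(vocab)  # new ids are appended after existing ones

print(vocab["[COL]"], vocab["[VAL]"])  # 4 5
```

The corresponding embedding matrix would then need extra rows for the new ids (the `resize_token_embeddings` step), which are trained from scratch during fine-tuning.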
Can we add the --save_model flag to the "train the matching model" example? This would show users how to produce the .pt checkpoints needed to run the matching models...