
How to use Model to get transaction embeddings?

Open bjchaudhari29 opened this issue 4 years ago • 13 comments

Hi Team, thanks a lot for sharing the code. I was able to train the BERT model on the card dataset, but I am facing an issue when loading the saved model to generate embeddings. Could you please show me how to load the model weights and how to generate an embedding for a transaction?

After creating an instance of the TabFormerBertLM class, I try to load the weights with the following command.

tab_net.from_pretrained('/content/drive/MyDrive/TabFormer/checkpoint-500/pytorch_model.bin')

Running this gives the following error: AttributeError: 'TabFormerBertLM' object has no attribute 'from_pretrained'

It would be very helpful if you could guide me in solving this problem.

Thank you.

bjchaudhari29 avatar Jul 08 '21 10:07 bjchaudhari29

Hey, did you solve the problem?

sevstafiev avatar Jul 23 '21 15:07 sevstafiev

Not yet.

bjchaudhari29 avatar Jul 24 '21 09:07 bjchaudhari29

Hi @bjchaudhari29 / @sevstafiev :

Apologies for not getting back on this earlier.

I know why you are seeing this issue! I will try to get back on this later this week, when I get time, and share a code snippet showing how to load the model properly. However, if you can't wait until then, please look at the branch gpt_cc_user_eval, which does exactly what you are attempting, but on the GPT model.

ink-pad avatar Jul 27 '21 14:07 ink-pad

It would be amazing if you could share a code snippet for the BERT model! Thanks.

sevstafiev avatar Jul 28 '21 15:07 sevstafiev

Hi @ink-pad, thank you for the reply. However, I am not able to access the gpt_cc_user_eval branch.

bjchaudhari29 avatar Jul 30 '21 05:07 bjchaudhari29

@bjchaudhari29 :

My bad - I have fixed the link now!

ink-pad avatar Jul 30 '21 17:07 ink-pad

Hi @ink-pad, did you get time to work on loading the BERT model to get transaction-level embeddings? It would be great if you could share that code.

Thank you.

bjchaudhari29 avatar Aug 11 '21 05:08 bjchaudhari29

Hi, I am trying to load the BERT model the same way the GPT model is loaded in the given link (https://github.com/IBM/TabFormer/blob/gpt_cc_user_eval/gpt_eval.py), but it gives the following error.

inferencer = tab_net.model.from_pretrained('/content/drive/MyDrive/TabFormer/checkpoint-500', vocab=dataset.vocab).to(device)

File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_utils.py", line 844, in from_pretrained
    config, model_kwargs = cls.config_class.from_pretrained(
AttributeError: 'NoneType' object has no attribute 'from_pretrained'

bjchaudhari29 avatar Aug 11 '21 07:08 bjchaudhari29

@bjchaudhari29 try this to load the model.

config = TabFormerBertConfig.from_pretrained(
    "/output/checkpoint-2000/config.json"
)
model = tab_net.model.from_pretrained(
    "/output/checkpoint-2000/pytorch_model.bin",
    config=config,
    vocab=dataset.vocab,
).to(device)
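
A note on why passing the config helps: the traceback reported earlier fails at `cls.config_class.from_pretrained(...)`, which suggests that the model class leaves `config_class` unset, so the loader has nothing to build a config from when none is supplied. The sketch below is a simplified, pure-Python stand-in for that lookup, not the transformers implementation. `ToyConfig`, `ToyModel`, `try_load`, and `hidden_size` are hypothetical names used only for illustration.

```python
class ToyConfig:
    """Hypothetical stand-in for a model config object."""

    def __init__(self, hidden_size):
        self.hidden_size = hidden_size


class ToyModel:
    """Hypothetical stand-in for a model class that does not set config_class."""

    config_class = None  # assumption: mirrors what the traceback implies

    def __init__(self, config):
        self.config = config

    @classmethod
    def from_pretrained(cls, path, config=None):
        if config is None:
            # With no config supplied, the loader falls back to config_class,
            # which is None here, so this line raises AttributeError.
            config = cls.config_class.from_pretrained(path)
        return cls(config)


def try_load(config=None):
    """Return the loaded model, or the error message if loading fails."""
    try:
        return ToyModel.from_pretrained("checkpoint-dir", config=config)
    except AttributeError as err:
        return str(err)


without_config = try_load()                       # error message string
with_config = try_load(ToyConfig(hidden_size=8))  # ToyModel instance
```

In this toy version, supplying an explicit config skips the `config_class` lookup entirely, which is the same effect the snippet above relies on.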

kekayan avatar Dec 18 '21 05:12 kekayan

Hi @ink-pad ,

Thanks for open-sourcing the code. I have a question: to get the transaction/row embeddings for each row, do we need to change line 118 in tabformer_bert.py from outputs = (prediction_scores,) + outputs[2:] to outputs = (prediction_scores,) + outputs?
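
To make the question concrete, here is a small sketch of what the two variants keep. It assumes, following the Hugging Face BERT convention that this code appears to use, that the encoder returns a tuple starting with `(sequence_output, pooled_output)`. The strings are placeholders for tensors, not real outputs, and the variable names are hypothetical.

```python
# Placeholder strings stand in for tensors; the tuple layout is an assumption
# based on the Hugging Face BERT convention (sequence_output, pooled_output, ...).
encoder_outputs = ("sequence_output", "pooled_output", "hidden_states")
prediction_scores = "prediction_scores"

# Current line: drops sequence_output and pooled_output.
current = (prediction_scores,) + encoder_outputs[2:]

# Proposed change: keeps every encoder output after the prediction scores.
proposed = (prediction_scores,) + encoder_outputs

# With the proposed change, the final-layer hidden states sit at index 1.
candidate_embeddings = proposed[1]
```

If the forward pass also prepends a loss when labels are passed, every index in the returned tuple would shift by one.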

kekayan avatar Dec 20 '21 12:12 kekayan

Any update on this issue?

monk1337 avatar Jan 31 '22 22:01 monk1337

Hi @kekayan ,

Have you found the correct way to get the row embeddings for each row? I want to reproduce the results on the fraud detection task and have the same problem.

Thanks for your help!

shaoyijia avatar Feb 03 '22 15:02 shaoyijia

Hi, I have the same query. Any update on how to obtain the row embeddings for each row? Some code level guidance would help :)

shamgane avatar Aug 23 '22 13:08 shamgane