TabFormer

data format for regression task

Open r-matsuzaka opened this issue 2 years ago • 2 comments

Hi. I want to try a regression task similar to prsa. For prsa, I understand that the training data is prepared in dataset/prsa.py. For a new regression task, where and how can I set the target value and the feature data?

r-matsuzaka avatar Mar 01 '22 02:03 r-matsuzaka

+1

ianbenlolo avatar Jul 08 '22 20:07 ianbenlolo

For me, loading the model looks something like this:

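from transformers.modeling_utils import load_sharded_checkpoint

# custom_special_tokens, vocab, args, dataset and base_path are defined earlier in the script.
# TabFormerBertLM is the model class from the TabFormer repo.
tab_net = TabFormerBertLM(custom_special_tokens,
                          vocab=vocab,
                          field_ce=args.field_ce,
                          flatten=args.flatten,
                          ncols=dataset.ncols,
                          field_hidden_size=args.field_hs)

# Load the sharded checkpoint weights into the wrapped model in place.
load_sharded_checkpoint(tab_net.model, base_path + "checkpoints1/checkpoint-80/")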

I guess I need that because, in my case, the model is sharded when it is saved.

It seems like the steps are to generate the dataset, load the model, and predict. The output has shape (dset_size, seq_len*ncols, vocab_size), which I think can be reshaped to (dset_size, seq_len, -1) for prediction.
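
As a rough sketch of that reshape, here is a NumPy version with made-up sizes. The array is random stand-in data, not real TabFormer output, and it assumes the flattened token axis is ordered row by row (all ncols fields of row 0, then row 1, and so on):

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
dset_size, seq_len, ncols, vocab_size = 4, 10, 3, 7

# Stand-in for the model output: one vocab-sized score vector per field token.
rng = np.random.default_rng(0)
logits = rng.normal(size=(dset_size, seq_len * ncols, vocab_size))

# Group the ncols field tokens of each row back together.
# Assumes tokens are laid out row by row along axis 1.
per_row = logits.reshape(dset_size, seq_len, -1)

print(per_row.shape)  # (4, 10, 21), i.e. the last axis is ncols * vocab_size
```

If the token order were different (for example, field by field), the same reshape would silently mix rows, so it is worth checking one sample by hand.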

I think from there you simply pass return_labels=True to the dataloader and then run a regression task somehow.
I would love any tips from the authors if possible.
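
One possible reading of "run a regression task", purely as a sketch and not the authors' method: mean-pool the reshaped features over the sequence axis and fit an ordinary least-squares model against the labels. The features and labels below are synthetic placeholders, and all sizes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
dset_size, seq_len, feat_dim = 200, 10, 21  # hypothetical sizes

# Placeholder for the reshaped model output, shape (dset_size, seq_len, feat_dim).
features = rng.normal(size=(dset_size, seq_len, feat_dim))

# Mean-pool over the sequence axis to get one feature vector per sample.
pooled = features.mean(axis=1)

# Synthetic regression targets: a linear function of the pooled features plus noise.
true_w = rng.normal(size=feat_dim)
labels = pooled @ true_w + 0.01 * rng.normal(size=dset_size)

# Ordinary least squares with a bias column.
X = np.hstack([pooled, np.ones((dset_size, 1))])
w, *_ = np.linalg.lstsq(X, labels, rcond=None)

preds = X @ w
mse = float(np.mean((preds - labels) ** 2))
print(f"train MSE: {mse:.6f}")
```

With real data the labels would come from the dataset rather than being generated, and a held-out split would be needed to judge the fit.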

ianbenlolo avatar Jul 27 '22 15:07 ianbenlolo