Shape Mismatch error for new data set

Open rudra0713 opened this issue 5 years ago • 6 comments

Hey, I have been trying to train on a sentiment analysis dataset together with the imdb problem (from the notebook) in a multitask setup.

This is the sample format of the sentiment data:

    train_data = [['I', 'am', 'going', 'to', 'school', '.'], ['I', 'am', 'not', 'feeling', 'good', '.']]
    train_labels = [0, 1]
    test_data = [['I', 'wass', 'so', 'sick', 'yesterday', '.']]
    test_labels = [1]

Unfortunately, this runs into the following error:

ValueError: generator yielded an element of shape (48,) where an element of shape () was expected.

Can you kindly help me solve this issue?

rudra0713, Oct 21 '19 09:10

It seems the data and labels are getting mixed up. Did you use exactly the same preprocessing function as in the notebook?
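
One way to test that hypothesis without running the full pipeline is to inspect the pair that the preprocessing function returns. The sketch below is only an illustration: `check_cls_outputs` is a hypothetical helper, not part of bert-multitask-learning, and it assumes a `cls` problem expects exactly one scalar label per example.

```python
def check_cls_outputs(inputs, targets):
    """Return a list of problems found in an (inputs, targets) pair.

    Hypothetical helper, not part of bert-multitask-learning. It assumes a
    classification problem wants exactly one scalar label per example.
    """
    problems = []
    if len(inputs) != len(targets):
        problems.append(f"{len(inputs)} inputs but {len(targets)} targets")
    for i, target in enumerate(targets):
        # A list or tuple here means a sequence ended up where a label belongs.
        if isinstance(target, (list, tuple)):
            problems.append(f"target {i} is a sequence of length {len(target)}")
    return problems


train_data = [['I', 'am', 'going', 'to', 'school', '.'],
              ['I', 'am', 'not', 'feeling', 'good', '.']]
train_labels = [0, 1]

print(check_cls_outputs(train_data, train_labels))  # pair in the expected order
print(check_cls_outputs(train_labels, train_data))  # data and labels swapped
```

If the first call reports no problems for your data, the returned pair itself is probably fine, and the mismatch would more likely come from a later step.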

JayYip, Oct 22 '19 01:10

Thanks for your response. This is my preprocessing function:

```python
@preprocessing_fn
def sentiment_cls(params, mode):
    # train_data = pickle.load(open("data/sentiment_train_data.p", "rb"))
    # train_labels = pickle.load(open("data/sentiment_train_label.p", "rb"))
    # test_data = pickle.load(open("data/sentiment_test_data.p", "rb"))
    # test_labels = pickle.load(open("data/sentiment_test_label.p", "rb"))

    train_data = [['I', 'am', 'going', 'to', 'school', '.'], ['I', 'am', 'going', 'to', 'college', '.']]
    train_labels = [0, 1]
    test_data = [['I', 'am', 'going', 'to', 'university', '.']]
    test_labels = [0]

    label_encoder = get_or_make_label_encoder(params, 'sentiment_cls', mode, train_labels + test_labels)

    if mode == TRAIN:
        input_list = train_data
        target_list = train_labels
    else:
        input_list = test_data
        target_list = test_labels
    return input_list, target_list
```

The first four (commented-out) lines load the actual dataset. Since that was not working, I tried toy examples, which do not work either.

This is the new problem dictionary:

    new_problem_type = {'imdb_cls': 'cls', 'sentiment_cls': 'cls'}
    new_problem_process_fn_dict = {'imdb_cls': imdb_cls, 'sentiment_cls': sentiment_cls}

Please let me know if I am missing something very simple.

rudra0713, Oct 22 '19 18:10

Could you please try changing the input data to ['I am going to school .', 'I am going to college .']?
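
If the data is already tokenized, one way to get that format is a whitespace join. A minimal sketch, where `train_sentences` is just an illustrative name:

```python
train_data = [['I', 'am', 'going', 'to', 'school', '.'],
              ['I', 'am', 'going', 'to', 'college', '.']]

# Join each token list back into a single whitespace-separated string.
train_sentences = [' '.join(tokens) for tokens in train_data]
print(train_sentences)
```

The same join would apply to `test_data` before returning it from the preprocessing function.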

JayYip, Oct 23 '19 06:10

I tried that, but the error does not change.

rudra0713, Oct 23 '19 21:10

Hmm, maybe it's a bug. I'll check it out once I get back from traveling.

JayYip, Oct 25 '19 23:10

Thanks. Please let me know if you find anything.

rudra0713, Oct 30 '19 08:10