
Error during training because of float length of sequence(?)

Open littlewine opened this issue 4 years ago • 1 comments

Describe the bug

Hi, when I try to train my model with trainer.run(), I get the following error:

Traceback (most recent call last):
  File "/Users/xx/.conda/envs/QL_QA/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-67-041e2033e90a>", line 1, in <module>
    trainer.run()
  File "/Users/xx/.conda/envs/QL_QA/lib/python3.7/site-packages/matchzoo/trainers/trainer.py", line 227, in run
    self._run_epoch()
  File "/Users/xx/.conda/envs/QL_QA/lib/python3.7/site-packages/matchzoo/trainers/trainer.py", line 251, in _run_epoch
    for step, (inputs, target) in pbar:
  File "/Users/xx/.conda/envs/QL_QA/lib/python3.7/site-packages/tqdm/std.py", line 1091, in __iter__
    for obj in iterable:
  File "/Users/xx/.conda/envs/QL_QA/lib/python3.7/site-packages/matchzoo/dataloader/dataloader.py", line 112, in __iter__
    self._handle_callbacks_on_batch_unpacked(x, y)
  File "/Users/xx/.conda/envs/QL_QA/lib/python3.7/site-packages/matchzoo/dataloader/dataloader.py", line 134, in _handle_callbacks_on_batch_unpacked
    self._callback.on_batch_unpacked(x, y)
  File "/Users/xx/.conda/envs/QL_QA/lib/python3.7/site-packages/matchzoo/dataloader/callbacks/padding.py", line 158, in on_batch_unpacked
    self._pad_word_value, dtype=dtype)
  File "/Users/xx/.conda/envs/QL_QA/lib/python3.7/site-packages/numpy/core/numeric.py", line 325, in full
    a = empty(shape, dtype, order)
TypeError: 'numpy.float64' object cannot be interpreted as an integer

I am not 100% sure, but the error seems to be caused by length_right being a float instead of an int in my preprocessed datapack. In the toy datasets it is an int:

>> toy_datapack.frame()[['length_right','length_left']]
Out[13]: 
    length_right  length_left
0             58           29
1             41           29
2             41           29
3             61           29
4            128           29
5            126           85
6            128           85

while

train_pack = mz.DataPack(
    relation=relation[relation.id_left.isin(qids['train'])].reset_index(drop=True),
    left=left[left.index.isin(qids['train'])],
    # right=right_train,
    right=right_dict['train'],
)
train_pack.frame().head().dtypes

Out[78]: 
id_left          object
text_left        object
id_right         object
text_right       object
length_right    float64
label           float64
dtype: object

This also seems odd to me, since as far as I understand, Python's built-in len function should return an int.

right_train['length_right'] = right_train.text_right.apply(len)
Out[15]: 

                                                                  text_right  length_right
id_right                                                                                  
clueweb09-en0007-21-42346  Welcome | Logout Log In | Sign Up The Huffingt...          4039
clueweb09-enwp03-01-16807  Ann Dunham From Wikipedia, the free encycloped...         32225
clueweb09-en0010-93-11767  Home Contact Us Bookmark Us Receive Family Tre...          5112
clueweb09-enwp01-36-17161  Maya Soetoro-Ng From Wikipedia, the free encyc...          8279
clueweb09-enwp00-34-05344  Barack Obama, Sr. From Wikipedia, the free enc...         14448
clueweb09-enwp00-34-05347  Barack Obama, Sr. From Wikipedia, the free enc...         14478
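
One possible explanation, offered as an assumption and not a confirmed diagnosis: pandas cannot store a missing value in a plain integer column, so it upcasts the column to float64 as soon as a NaN is introduced, for example when rows are selected by ids that do not all exist in the frame. The sketch below reproduces that effect with made-up ids and texts; none of the names or values come from the actual dataset.

```python
import pandas as pd

# Hypothetical stand-in for the right-hand documents (not the real data).
right = pd.DataFrame(
    {"text_right": ["a b c", "d e"]},
    index=pd.Index(["doc1", "doc2"], name="id_right"),
)

# len() returns Python ints, so the new column gets an integer dtype.
right["length_right"] = right["text_right"].apply(len)
dtype_before = right["length_right"].dtype

# Selecting an id that is absent from `right` ("doc3") creates a row of NaN.
# An integer column cannot hold NaN, so pandas upcasts it to float64.
selected = right.reindex(["doc1", "doc2", "doc3"])
dtype_after = selected["length_right"].dtype
n_missing = int(selected["length_right"].isna().sum())

print(dtype_before, dtype_after, n_missing)
```

If this is what happens here, counting the NaN values in length_right (as n_missing does above) would show which documents were lost during selection.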

I am preparing my data using mz.autoprepare. I have tried both KNRM and DRMM, and the same issue occurs with each.

My matchzoo.__version__ is 1.1.1.

littlewine avatar Mar 30 '20 19:03 littlewine

I confirmed the cause I suspected earlier. I reverted my script to a previous version that used a different document selection process (basically the initial retrieval), and training ran normally, with length_right being int instead of float. Any ideas on what might be going wrong here, or whether the framework should include a safeguard for this (e.g. converting floats to int)?

littlewine avatar Mar 31 '20 12:03 littlewine
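
As a workaround sketch for the conversion asked about above, the length column can be checked and cast back to an integer dtype before the frame is passed to mz.DataPack. The helper name ensure_int_length and the sample frame are hypothetical and are not part of MatchZoo; the sketch assumes the float values are whole numbers and fails loudly when values are missing, since casting NaN to int would hide the underlying data problem.

```python
import pandas as pd


def ensure_int_length(frame, column="length_right"):
    """Return a copy of `frame` with `column` cast to int64.

    Hypothetical helper, not a MatchZoo function. Raises ValueError when
    the column contains missing values, because those cannot be cast.
    """
    if frame[column].isna().any():
        raise ValueError(f"{column} contains missing values; fix the data first")
    out = frame.copy()
    out[column] = out[column].astype("int64")
    return out


# Made-up frame whose length column has already been upcast to float64.
right = pd.DataFrame(
    {"text_right": ["a b c", "d e"], "length_right": [5.0, 3.0]},
    index=pd.Index(["doc1", "doc2"], name="id_right"),
)

fixed = ensure_int_length(right)
print(fixed.dtypes)
```

The helper returns a copy, so the original frame keeps its float column and the two can be compared while debugging.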