"Error: Index out of bounds for axix 0" While making prediction with DRMM

Open thiziri opened this issue 4 years ago • 2 comments

Describe the bug

I've trained and evaluated the DRMM model successfully. When I tried to make predictions on a new dataset, I ran this code:

test_generator = mz.DataGenerator(data_pack=valid_pack_pp[:10], mode='point', callbacks=[hist_callback])
test_x, test_y = test_generator[:]
prediction = drmm_model.predict(test_x)

and I got the following error message:


IndexError                                Traceback (most recent call last)
      1 test_generator = mz.DataGenerator(data_pack=valid_pack_pp, mode='point', callbacks=[hist_callback])
----> 2 test_x, test_y = test_generator[:]
      3 prediction = drmm_model.predict(test_x)

C:\ProgramData\Anaconda3\lib\site-packages\matchzoo\data_generator\data_generator.py in __getitem__(self, item)
    133         self._handle_callbacks_on_batch_data_pack(batch_data_pack)
    134         x, y = batch_data_pack.unpack()
--> 135         self._handle_callbacks_on_batch_unpacked(x, y)
    136         return x, y
    137

C:\ProgramData\Anaconda3\lib\site-packages\matchzoo\data_generator\data_generator.py in _handle_callbacks_on_batch_unpacked(self, x, y)
    197     def _handle_callbacks_on_batch_unpacked(self, x, y):
    198         for callback in self._callbacks:
--> 199             callback.on_batch_unpacked(x, y)
    200
    201     @property

C:\ProgramData\Anaconda3\lib\site-packages\matchzoo\data_generator\callbacks\histogram.py in on_batch_unpacked(self, x, y)
     32     def on_batch_unpacked(self, x, y):
     33         """Insert match_histogram to x."""
---> 34         x['match_histogram'] = _build_match_histogram(x, self._match_hist_unit)
     35
     36

C:\ProgramData\Anaconda3\lib\site-packages\matchzoo\data_generator\callbacks\histogram.py in _build_match_histogram(x, match_hist_unit)
     62                                   x['length_right'].tolist())
     63     for pair in zip(text_left, text_right):
---> 64         match_hist.append(match_hist_unit.transform(list(pair)))
     65     return np.asarray(match_hist)

C:\ProgramData\Anaconda3\lib\site-packages\matchzoo\preprocessors\units\matching_histogram.py in transform(self, input_)
     47         matching_hist = np.ones((len(text_left), self._hist_bin_size),
     48                                 dtype=np.float32)
---> 49         embed_left = self._embedding_matrix[text_left]
     50         embed_right = self._embedding_matrix[text_right]
     51         matching_matrix = embed_left.dot(np.transpose(embed_right))

IndexError: index 740 is out of bounds for axis 0 with size 385
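
The last frame fails on a plain NumPy row lookup, so the same kind of error can be reproduced without MatchZoo. The following is a minimal sketch, not the library's code: the 385 rows and the id 740 are taken from the message above, while the embedding width of 50 and the other token ids are arbitrary placeholders.

```python
import numpy as np

# Stand-in for the embedding matrix: 385 rows as reported in the error.
# The width (50) is an arbitrary placeholder, not a value from the issue.
embedding_matrix = np.zeros((385, 50), dtype=np.float32)

# Token ids of one text. 740 is the id from the error message; it is
# larger than the last valid row index (384).
text_left = [3, 17, 740]

try:
    embed_left = embedding_matrix[text_left]
except IndexError as err:
    message = str(err)
    print(message)
```

In other words, the data contains a token id that the embedding matrix has no row for.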

To Reproduce

Here is a data sample:

valid_pack_pp = preprocessor.transform(valid_pack)
valid_pack_pp.frame()

[image: output of valid_pack_pp.frame()]

Describe your attempts

  • [x] I checked the documentation and found no answer
  • [x] I checked to make sure that this is not a duplicate issue

@yangliuy @pl8787 @wordreference @zenogantner @faneshion could someone help, please?

thiziri, Mar 10 '20 11:03

Did you process the train/valid/test data using the same Preprocessor?

faneshion, Apr 07 '20 00:04

Normally yes :/

thiziri, Apr 17 '20 08:04
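
One way to check whether the data and the embedding matrix really share a vocabulary is to look for token ids that fall outside the matrix. The sketch below uses only NumPy. The helper name `find_out_of_range_ids` and the sample data are hypothetical and not part of MatchZoo.

```python
import numpy as np


def find_out_of_range_ids(token_id_lists, num_rows):
    """Return the sorted unique ids that have no row in a matrix with num_rows rows."""
    bad = set()
    for ids in token_id_lists:
        for i in ids:
            # Negative ids would silently wrap around in NumPy indexing,
            # so they are flagged as well.
            if i < 0 or i >= num_rows:
                bad.add(int(i))
    return sorted(bad)


# Hypothetical stand-ins: a 385-row matrix and three tokenized texts.
embedding_matrix = np.zeros((385, 50), dtype=np.float32)
texts = [[1, 2, 3], [10, 740], [384, 385]]

bad_ids = find_out_of_range_ids(texts, embedding_matrix.shape[0])
print(bad_ids)
```

With the real data, the id lists could presumably come from the text_left and text_right columns of valid_pack_pp.frame(), assuming those columns hold the token ids as the traceback suggests. A non-empty result would indicate that the pack was transformed with a different vocabulary than the one the embedding matrix was built from.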