MatchZoo
Preprocessor.fit_transform does not initialise preprocessor.context
Describe the bug
When calling
preprocessor = mz.preprocessors.DSSMPreprocessor()
train_processed = preprocessor.fit_transform(train_pack)
preprocessor.context is not populated, whereas it is populated after calling
preprocessor.fit(train_pack)
To Reproduce
import matchzoo as mz
import pandas as pd
path = "/results/DPH_3.res"  # any file
table = pd.read_csv(path, sep='\t')
df = pd.DataFrame({  # any format
    'text_left': table['q'],
    'text_right': table['doc'],
    'id_left': table['q_id'],
    'id_right': table['doc_id'],
    'label': table['label']
})
pack = mz.pack(df)
train_pack = pack[:10000]
valid_pack = pack[10000:15000]
predict_pack = pack[15000:20000]
preprocessor = mz.preprocessors.DSSMPreprocessor()
preprocessor.fit_transform(train_pack)
print(preprocessor.context)  # output is {}
preprocessor.fit(train_pack)
print(preprocessor.context)  # output is not empty, all params are initialised
train_processed = preprocessor.transform(train_pack)
valid_processed = preprocessor.transform(valid_pack)
predict_processed = preprocessor.transform(predict_pack)
Describe your attempts
- [x] I checked the documentation and found no answer
- [x] I checked to make sure that this is not a duplicate issue
Current workaround: call preprocessor.fit() and preprocessor.transform() separately.
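The workaround can be wrapped in a small helper. This is a minimal sketch, not part of MatchZoo: fit_then_transform is a hypothetical name, and it assumes only that the preprocessor exposes fit(), transform() and a context dict, as in the snippet above. StubPreprocessor is a stand-in so the block runs without MatchZoo installed.

```python
def fit_then_transform(preprocessor, data_pack):
    """Fit and transform in two explicit steps (hypothetical helper).

    Raises if the context is still empty after fit(), so the problem
    described in this issue surfaces immediately instead of later.
    """
    preprocessor.fit(data_pack)
    if not preprocessor.context:
        raise RuntimeError("preprocessor.context is empty after fit()")
    return preprocessor.transform(data_pack)


class StubPreprocessor:
    """Stand-in for illustration only; NOT a MatchZoo class."""

    def __init__(self):
        self.context = {}

    def fit(self, data_pack):
        # Record something learned from the data, as a real fit() would.
        self.context["vocab_size"] = len(set(data_pack))
        return self

    def transform(self, data_pack):
        return [item.lower() for item in data_pack]


stub = StubPreprocessor()
processed = fit_then_transform(stub, ["Hello", "World", "Hello"])
```

With MatchZoo, the same helper would presumably be called as fit_then_transform(preprocessor, train_pack), followed by preprocessor.transform() on the validation and prediction packs.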
Context
- OS : macOS 10.13
- Hardware : CPU only
- MatchZoo version : 2.1.0
Since I don't have your data, I tested it with our toy data. I could not reproduce the bug you are reporting.
Here is what I tried:
import matchzoo as mz
pp = mz.preprocessors.DSSMPreprocessor()
dp = mz.datasets.toy.load_data()
pp.fit_transform(dp)
print(pp.context)  # prints a correctly fitted context
pp.fit(dp)
print(pp.context) # prints the same thing
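To compare the two call paths without one masking the other, each can be run on a fresh instance. The following is a rough sketch under assumptions: compare_context_keys is a hypothetical helper, not a MatchZoo function, and it relies only on fit(), fit_transform() and a context dict being present. It compares keys rather than values, since context values may be objects that do not compare equal. StubWithFitTransform is a stand-in so the block runs without MatchZoo.

```python
def compare_context_keys(make_preprocessor, data_pack):
    """Return sorted context keys after fit_transform() and after fit().

    Each call path gets its own fresh preprocessor so that a later fit()
    cannot hide an empty context left by fit_transform().
    """
    via_fit_transform = make_preprocessor()
    via_fit_transform.fit_transform(data_pack)

    via_fit = make_preprocessor()
    via_fit.fit(data_pack)

    return sorted(via_fit_transform.context), sorted(via_fit.context)


class StubWithFitTransform:
    """Stand-in for illustration only; NOT a MatchZoo class."""

    def __init__(self):
        self.context = {}

    def fit(self, data_pack):
        self.context["vocab_size"] = len(set(data_pack))
        return self

    def transform(self, data_pack):
        return list(data_pack)

    def fit_transform(self, data_pack):
        # Chain fit() and transform(), so the context is filled here too.
        return self.fit(data_pack).transform(data_pack)


keys_a, keys_b = compare_context_keys(StubWithFitTransform, ["a", "b", "a"])
```

With MatchZoo, passing mz.preprocessors.DSSMPreprocessor as make_preprocessor and the toy data pack as data_pack should show whether both paths produce the same context keys in a given environment.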