
Is there a script file for model inference in GEM?

gaoshan2006 opened this issue 2 years ago • 3 comments

I am looking through the code in "apps/pretrained_compound/ChemRL/GEM". I see there are pre-training and fine-tuning scripts for training the model, but I have not found code for running inference. Is there an available script for inference if I want to do a simple test (load the model and a dataset, then run inference)? Thanks a lot!

gaoshan2006 · Jun 10 '22 07:06

Hi gaoshan, thank you for using PaddleHelix! For model inference, you can refer to Part III: Downstream Inference of the pretrained_compound tutorial. Hope this helps.
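For reference, the inference step in that tutorial boils down to roughly the following (a minimal sketch, assuming model, transform_fn, and collate_fn have already been built as in the earlier parts of the tutorial; the SMILES string is just an example):

    # Minimal sketch of the tutorial's Part III inference step. Assumes
    # `model`, `transform_fn`, and `collate_fn` are constructed as in the
    # earlier parts of the tutorial; the SMILES string is an arbitrary example.
    SMILES = "CCO"
    graph = collate_fn([transform_fn({'smiles': SMILES})])
    preds = model(graph.tensor()).numpy()[0]
    print('SMILES: %s' % SMILES)
    print('Predictions: %s' % preds)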

Noisyntrain · Jun 10 '22 08:06


Thanks! I will try

gaoshan2006 · Jun 10 '22 08:06

I followed the tutorial pointed to by @Noisyntrain above to run an inference case for a fine-tuned qm7 model. However, it looks like the tutorial code does not work for the qm7 model. Here is my code, with just a few slight changes from the tutorial example.

All the JSON config files used for inference are the same as those used for model training.

def main(args):
    compound_encoder_config = load_json_config('./model_configs/geognn_l8.json')
    task_type = 'regr'
    dataset_name = 'qm7'
    task_names = get_downstream_task_names(dataset_name, './chemrl_downstream_datasets/qm7')

    model_config = load_json_config('model_configs/down_mlp3.json')
    model_config['task_type'] = task_type
    model_config['num_tasks'] = len(task_names)

    compound_encoder = GeoGNNModel(compound_encoder_config)
    model = DownstreamModel(model_config, compound_encoder)
    # this model was trained with the same config file
    model.set_state_dict(paddle.load('./model/model.pdparams'))
    transform_fn = DownstreamTransformFn(is_inference=True)

    collate_fn = DownstreamCollateFn(
        atom_names=compound_encoder_config['atom_names'],
        bond_names=compound_encoder_config['bond_names'],
        bond_float_names=compound_encoder_config['bond_float_names'],
        bond_angle_float_names=compound_encoder_config['bond_angle_float_names'],
        is_inference=True,
        task_type=task_type)

    SMILES = "Cc1c(O)nc2ccccn2c1=O"

    graph = collate_fn([transform_fn({'smiles': SMILES})])
    preds = model(graph.tensor()).numpy()[0]

    print('SMILES:%s' % SMILES)
    print('Predictions:')
    print(str(preds))
    for name, pred in zip(task_names, preds):
        print("  %s:\t%s" % (name, pred))

Then I got the following error:

Traceback (most recent call last):
  File "inference_regr.py", line 206, in <module>
    main(args)
  File "inference_regr.py", line 177, in main
    preds = model(graph.tensor()).numpy()[0]
AttributeError: 'tuple' object has no attribute 'tensor'

I do not understand the difference between the 'qm7' model and the tutorial model. It looks like the tutorial should be applicable to all the fine-tuned models here, right? Could you give me some hints please? Thanks in advance.
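Update: judging from the AttributeError, collate_fn returns a tuple here rather than a single graph. Since GEM's GeoGNN works on two graphs (an atom-bond graph and a bond-angle graph), my guess is that the tuple holds those two graphs and should be unpacked, roughly like the sketch below, but I am not sure this is the intended usage:

    # Guess: with is_inference=True, DownstreamCollateFn returns the pair
    # (atom_bond_graph, bond_angle_graph) for GEM, so unpack the tuple and
    # pass both graphs to the model instead of calling .tensor() on the tuple.
    atom_bond_graph, bond_angle_graph = collate_fn([transform_fn({'smiles': SMILES})])
    preds = model(atom_bond_graph.tensor(), bond_angle_graph.tensor()).numpy()[0]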

gaoshan2006 · Jun 15 '22 10:06