
Inconsistent Relationship Extraction Results Compared to PubTator3 Website

Open • dillonl opened this issue 5 months ago • 1 comment

I've encountered an issue where I'm unable to reproduce the relationship extraction results from the PubTator3 website (example: 19394258) using BioREx. When I run the tool using the suggested model, it only outputs one relationship, whereas the PubTator3 site identifies twelve.

Additionally, when I run the code as is (run_test_pred.sh), it crashes because an intermediate file (out_processed.tsv) is empty. If I hardcode the relationships in the src_tgt_pairs variable in src/convert_pubtator_2_tsv.py, the process gets past this point, but the output still doesn't match what the website shows.
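
A minimal sketch of a guard that would surface the empty intermediate file before a later step crashes on it. Only the file name out_processed.tsv comes from the report above; the helper name count_nonblank_lines and the assumption that the file sits in the current working directory are hypothetical and not part of BioREx.

```python
from pathlib import Path


def count_nonblank_lines(path):
    """Return the number of non-blank lines in a text file (0 if it is missing)."""
    p = Path(path)
    if not p.exists():
        return 0
    with p.open(encoding="utf-8") as fh:
        return sum(1 for line in fh if line.strip())


if __name__ == "__main__":
    # Hypothetical location: adjust to wherever the script writes the file.
    tsv_path = "out_processed.tsv"
    n = count_nonblank_lines(tsv_path)
    if n == 0:
        print(f"{tsv_path} is missing or empty: no candidate pairs were generated")
    else:
        print(f"{tsv_path} contains {n} non-blank line(s)")
```

Running a check like this between the conversion step and the prediction step would show whether the conversion produced no candidate pairs at all, which would separate the crash from the question of which model the website uses.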

I suspect this might be due to differences in the models used. The README lists several models, but none seem to produce output that matches what the website provides.

Could you clarify whether the model used on the website is available in the repository? Also, any guidance on how you run this tool on the PubTator3 website would be appreciated.

Thank you.

dillonl • Sep 17 '24 23:09