ontobio

Enrichment notebook not working

Open lpalbou opened this issue 4 years ago • 4 comments

https://nbviewer.jupyter.org/github/biolink/ontobio/blob/master/notebooks/Phenotype_Enrichment.ipynb

and binder: https://hub.gke.mybinder.org/user/biolink-ontobio-actww75g/notebooks/notebooks/Phenotype_Enrichment.ipynb

The notebook launches, but step 7 doesn't produce an enrichment list: [image: Screen Shot 2019-09-19 at 6 51 22 PM] https://user-images.githubusercontent.com/24249870/65293025-8cd05780-db0e-11e9-9d95-5a3345e181d7.png

I have tried locally and get the same result. I also changed the threshold, but that isn't the cause either.

Possible reason: the NCBIGene ids are not found as subjects of the associations?

@cmungall @deepakunni3

[EDIT]

  • the aset variable seems correctly initialized: the ontology is there, and association_map is populated and does contain NCBIGene:xxx ids
  • none of the NCBIGene ids provided in the example seem to be in the association_map, so maybe it's just a bad example
  • however, I also built a random list of NCBIGene ids and ran an enrichment with a non-limiting threshold, and aset.enrichment_test() still gives me no results
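The checks in the list above can be sketched as a small helper that compares the sample ids against the association subjects and tallies the CURIE prefixes on each side. This is only an illustration: `diagnose_sample` and `prefix_of` are hypothetical names, not part of ontobio, and the ids below are made-up placeholders. It assumes the subjects can be obtained as an iterable of CURIE strings (for example, the keys of `association_map`).

```python
from collections import Counter


def prefix_of(curie):
    """Return the namespace prefix of a CURIE such as 'NCBIGene:1001'."""
    return curie.split(":", 1)[0]


def diagnose_sample(sample_ids, association_subjects):
    """Report which sample ids appear as association subjects.

    Hypothetical helper, not an ontobio API. Both arguments are
    iterables of CURIE strings.
    """
    subjects = set(association_subjects)
    return {
        "found": [s for s in sample_ids if s in subjects],
        "missing": [s for s in sample_ids if s not in subjects],
        "sample_prefixes": Counter(prefix_of(s) for s in sample_ids),
        "subject_prefixes": Counter(prefix_of(s) for s in subjects),
    }


# Placeholder ids for illustration only.
report = diagnose_sample(
    ["NCBIGene:1001", "NCBIGene:1002"],
    ["HGNC:11", "HGNC:12"],
)
print(report["missing"])
print(report["subject_prefixes"])
```

If every sample id ends up in `missing` while the two prefix tallies differ, a namespace mismatch is a more likely cause than the threshold.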

lpalbou, Sep 20 '19 01:09

Let's look at this when Deepak is back.

Can we use notebooks as unit tests?

cmungall, Oct 02 '19 15:10

After digging into the ontobio code and the notebook: the notebook doesn't work because the input list uses NCBIGene ids, whereas the call to Monarch for gene-phenotype associations returns HGNC ids (the clique leader) as the subject instead of NCBIGene.

The resulting AssociationSet created by Ontobio has associations for HGNC.

At enrichment time, these genes are not remapped back to the original form (NCBIGene), so no enrichment is observed.

@cmungall The proposal to fix this would be to perform remapping after fetching associations in assocmodel.py or golr_associations.py.
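A minimal sketch of what such a remapping step could look like, assuming the associations are available as a plain dict from subject id to a set of term ids, and that an equivalence table between ids is available from somewhere. `remap_subjects` and the `equivalents` table are hypothetical, not existing ontobio code, and all ids are placeholders.

```python
def remap_subjects(association_map, equivalents, target_prefix):
    """Re-key associations under ids from the target namespace.

    association_map: dict of subject CURIE -> set of term ids.
    equivalents: dict of subject CURIE -> list of equivalent CURIEs
        (hypothetical lookup table, e.g. built from clique members).
    target_prefix: namespace used by the input list, e.g. 'NCBIGene'.
    Subjects with no equivalent in the target namespace keep their id.
    """
    remapped = {}
    for subject, terms in association_map.items():
        targets = [
            e for e in equivalents.get(subject, [])
            if e.startswith(target_prefix + ":")
        ]
        for key in targets or [subject]:
            remapped.setdefault(key, set()).update(terms)
    return remapped


# Placeholder ids for illustration only.
associations = {
    "HGNC:11": {"HP:0000001"},
    "HGNC:12": {"HP:0000002"},
    "HGNC:13": {"HP:0000003"},
}
equivalents = {
    "HGNC:11": ["NCBIGene:1001", "ENSEMBL:X1"],
    "HGNC:12": ["NCBIGene:1002"],
}
print(remap_subjects(associations, equivalents, "NCBIGene"))
```

Keeping the original id when no equivalent exists avoids silently dropping associations, at the cost of leaving a mixed-namespace map.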

As an additional note: this notebook was originally written when NCBIGene was the clique leader for NCBITaxon:9606, which is why it worked at the time. The store it fetches associations from has changed since.

deepakunni3, Oct 03 '19 21:10

Of course, this gets complicated when an input list contains a mix of two separate namespaces.

deepakunni3, Oct 03 '19 22:10

Shouldn't be an issue. Either you do an initial normalization step and maintain an internal map, or the remote service returns all synonymous IDs with the payload.
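The first option might look roughly like the sketch below: normalize every input id to its clique leader up front and keep a reverse map so results can be reported in the caller's original ids, even when the input mixes namespaces. `normalize_sample` and the `to_leader` table are hypothetical names, and the ids are placeholders.

```python
def normalize_sample(sample_ids, to_leader):
    """Normalize input ids to clique leaders and remember the originals.

    to_leader: dict of CURIE -> clique-leader CURIE (hypothetical table);
        ids without an entry are assumed to already be leaders.
    Returns (normalized_ids, back_map), where back_map maps each leader
    to the list of original ids that resolved to it.
    """
    normalized = []
    back_map = {}
    for original in sample_ids:
        leader = to_leader.get(original, original)
        if leader not in back_map:
            normalized.append(leader)
        back_map.setdefault(leader, []).append(original)
    return normalized, back_map


# Mixed-namespace input with placeholder ids.
sample = ["NCBIGene:1001", "HGNC:2002", "NCBIGene:3003"]
to_leader = {"NCBIGene:1001": "HGNC:11", "NCBIGene:3003": "HGNC:2002"}

normalized, back_map = normalize_sample(sample, to_leader)
print(normalized)  # ['HGNC:11', 'HGNC:2002']
print(back_map)
```

The enrichment would then run on `normalized`, and `back_map` lets any gene-level output be translated back to what the user supplied.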

cmungall, Oct 03 '19 22:10