feat: Automatic GEM scoring using custom memote integration

Open JonathanRob opened this issue 4 years ago • 10 comments

Description of the issue:

The memote package provides a really nice standardized tool for evaluating the quality of a GEM, and has been used during the curation of Human-GEM to identify problems or weak points in the model. Although I feel a complete integration of memote may be overkill and potentially incompatible with our current repo framework, I think a custom lightweight integration using GitHub Actions could be a nice way to automatically track some scores of interest, such as % reactions balanced, annotation coverage, etc.

The implementation would be very straightforward, and would return a JSON that could be parsed and presented/stored in whichever format we choose. However, some questions remain:

  1. When should the scoring action be run?
    • On every PR (to devel/master/etc)?
    • Only with new releases?
  2. Where should the scores be shown?
    • As a comment on a PR?
    • In a newly generated text/markdown file somewhere in the repo?
  3. Should the scores be stored in some sort of log file?
    • Keep a historical log of scores over time
    • Would likely be a headache since we may change which tests are run
    • Maybe replace the existing "score file" with a new one anytime the action is run?
  4. How should the output be formatted?
    • Will depend on some of the questions above
    • tsv, markdown, etc.
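
As a rough illustration of the parsing and formatting step, here is a minimal Python sketch that reads a scores JSON and renders it as either a markdown table or TSV. The flat JSON layout, the metric names, and the numbers are made-up placeholders, not memote's actual output schema; a real integration would first need to extract the scores of interest from memote's report.

```python
import json

# Hypothetical, simplified payload with placeholder values. memote's real
# JSON output is structured differently and would need its own extraction step.
EXAMPLE_JSON = """
{
  "reactions_mass_balanced": 0.95,
  "reactions_charge_balanced": 0.9,
  "metabolite_annotation_coverage": 0.8
}
"""


def load_scores(text: str) -> dict:
    """Parse a flat JSON object mapping score names to fractions in [0, 1]."""
    return {name: float(value) for name, value in json.loads(text).items()}


def to_markdown(scores: dict) -> str:
    """Render the scores as a small markdown table, e.g. for a PR comment."""
    lines = ["| Metric | Score |", "| --- | --- |"]
    for name, value in scores.items():
        lines.append(f"| {name} | {value:.1%} |")
    return "\n".join(lines)


def to_tsv(scores: dict) -> str:
    """Render the same scores as tab-separated text, e.g. for a score file."""
    lines = ["metric\tscore"]
    for name, value in scores.items():
        lines.append(f"{name}\t{value:.3f}")
    return "\n".join(lines)


if __name__ == "__main__":
    scores = load_scores(EXAMPLE_JSON)
    print(to_markdown(scores))
    print(to_tsv(scores))
```

Keeping the rendering separate from the parsing means the choice between a PR comment and a stored file can be made later without touching the scoring step.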

Expected feature/value/output:

A lightweight, automated scoring script to periodically report a few model statistics of interest.

I hereby confirm that I have:

  • [X] Checked that a similar issue does not exist already

JonathanRob avatar Apr 21 '21 15:04 JonathanRob

I'd suggest having a look at how this is set up for Yeast-GEM in this PR. It uses GH Actions to do both a simple memote run and also memote history. I imagine these workflows can be copy-pasted to a large extent.

mihai-sysbio avatar Apr 21 '21 15:04 mihai-sysbio

@JonathanRob very good point about getting memote tests integrated.

@mihai-sysbio indeed, it would be ideal to maintain a similar (or the same) GH Action as Yeast-GEM.

haowang-bioinfo avatar Apr 23 '21 07:04 haowang-bioinfo

Initial thoughts:

  1. on every PR (as long as the output is compact)
  2. as a PR comment (to keep the repo clean)
  3. don't keep anything, as memote's history report can comb through the entire history when producing the report
  4. in the PR comment, something markdown-like and easy to read

A longer output can be printed out in the Action run, where it will be stored for 90 days (I think that's the default setting). One could also generate a full html report, and store that in the main branch, but that is outside the scope as stated:

A lightweight, automated scoring script to periodically report a few model statistics of interest.

mihai-sysbio avatar Apr 23 '21 17:04 mihai-sysbio

Some comments on your thoughts, @mihai-sysbio:

  1. on every PR (as long as the output is compact)

let's start with the PR from develop to master

  2. as a PR comment (to keep the repo clean)

~agree~ this seems infeasible

  3. don't keep anything, as memote's history report can comb through the entire history when producing the report
  4. in the PR comment, something markdown-like and easy to read

Markdown report with a few statistics sounds good

haowang-bioinfo avatar Apr 24 '21 10:04 haowang-bioinfo

let's start with the PR from develop to master

This is going to be weird to test and merge before it gets to master.

mihai-sysbio avatar Apr 24 '21 10:04 mihai-sysbio

This is going to be weird to test and merge before it gets to master.

@mihai-sysbio not sure what you mean. Fun fact: memote accepts models only in xml format, and the xml file exists only in master and is updated only after a PR from develop has been merged into master.

haowang-bioinfo avatar Apr 24 '21 12:04 haowang-bioinfo

@Hao-Chalmers I think memote requires cobra to be installed, in which case maybe the xml format can be obtained with

cobra.io.write_sbml_model

Edit: this approach might yield low scores on the annotations, so these should not be included in the PR comments.
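
A minimal sketch of that conversion, assuming the branch holds a YAML model that cobrapy can read. The file path and the helper names are hypothetical; `cobra.io.load_yaml_model` and `cobra.io.write_sbml_model` are the cobrapy functions doing the work. As noted in the edit above, an xml produced this way may score low on annotations.

```python
from pathlib import Path


def sbml_path_for(yaml_path: str) -> Path:
    """Return the path of the temporary xml file next to the YAML model."""
    return Path(yaml_path).with_suffix(".xml")


def convert_yaml_to_sbml(yaml_path: str) -> Path:
    """Load a YAML model with cobrapy and write it out as SBML (xml)."""
    # Imported inside the function so this module can be loaded even where
    # cobra is not installed (for example, in a lint-only job).
    import cobra.io

    model = cobra.io.load_yaml_model(yaml_path)
    out_path = sbml_path_for(yaml_path)
    cobra.io.write_sbml_model(model, str(out_path))
    return out_path


if __name__ == "__main__":
    # Hypothetical location of the model file in the repository.
    print(convert_yaml_to_sbml("model/Human-GEM.yml"))
```

The resulting xml would be a temporary file for the memote run only and would not need to be committed.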

mihai-sysbio avatar Apr 24 '21 13:04 mihai-sysbio

Yes, scores will be low in any PR, because the xml file with integrated annotation fields is updated only in the master branch, to which a markdown report might be added. I'm not sure whether that report should be a standalone file, a section of the README, or something else.

haowang-bioinfo avatar Apr 26 '21 06:04 haowang-bioinfo

@Hao-Chalmers, with the help of an Action runner with Matlab, the model could be exported via Raven in xml format including all the annotations, exactly like on master, before running memote. These temporary files would not be committed, but some memote scores may be kept in a file or (my preference) in the PR comments.

mihai-sysbio avatar Apr 28 '21 14:04 mihai-sysbio

A workflow to run memote on a PR has been merged and released. It shouldn't be too time consuming to adopt this, or other memote actions, in Human-GEM. However, in line with previous comments, several questions need clear answers:

  • [ ] under which conditions should the action be triggered?
  • [ ] what parameters should be used in the memote command?
  • [ ] which parts of the report should be posted as PR comments (e.g. a small selection of scores)?

mihai-sysbio avatar Aug 12 '21 10:08 mihai-sysbio