runDLKcat: I/O of DLKcat

Open edkerk opened this issue 2 years ago • 1 comments

Description of the new feature:

As based on #169, runDLKcat:

  • Gathers required information from the model, to be used as input for DLKcat.
  • Either runs DLKcat directly from the command line, or writes the input to a text file that the user can use separately to run DLKcat.
  • Collects the DLKcat output (either directly, or by reading an output file) and prepares it for selectKcatValue, which selects the optimal kcat value per reaction and maps it back to the model (in model.ec.kcat, also updating model.ec.source).
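To make the three steps above concrete, here is a rough Python sketch of the round trip (GECKO itself is MATLAB, so this only illustrates the data flow). Everything in it is an assumption made for illustration: the column layout of the tab-separated file, the helper names, the `rxns` field, the `"DLKcat"` source label, and especially the selection rule (highest predicted kcat per reaction), which stands in for whatever selectKcatValue ends up doing.

```python
import csv
import io


def write_dlkcat_input(rows, handle):
    """Write one tab-separated line per (reaction, gene, substrate) combination.

    Hypothetical column order: reaction ID, gene, substrate name, SMILES, sequence.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for rxn_id, gene, substrate, smiles, sequence in rows:
        writer.writerow([rxn_id, gene, substrate, smiles, sequence])


def read_dlkcat_output(handle):
    """Yield (reaction ID, kcat) pairs, assuming the kcat is the last column."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        try:
            kcat = float(row[-1])
        except ValueError:
            continue  # header line or missing prediction
        yield row[0], kcat


def select_kcat(predictions):
    """Placeholder rule: keep the highest predicted kcat for each reaction."""
    best = {}
    for rxn_id, kcat in predictions:
        if kcat > best.get(rxn_id, float("-inf")):
            best[rxn_id] = kcat
    return best


def apply_to_model(ec, best, source="DLKcat"):
    """Store selected values in ec['kcat'] and label them in ec['source']."""
    for i, rxn_id in enumerate(ec["rxns"]):
        if rxn_id in best:
            ec["kcat"][i] = best[rxn_id]
            ec["source"][i] = source
```

Reactions without a usable prediction are left untouched, so kcat values from other sources would not be overwritten.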

To gather the input data, some of the code below (which was later moved out of makeEcModel again, when deciding to have only one kcat entry per reaction) could be useful: https://github.com/SysBioChalmers/GECKO/blob/8f34ae394d91e8a9e65f9385741ca0919160d07a/geckomat/change_model/makeEcModel.m#L87-L111

edkerk, Jul 08 '22 23:07

Having the basis implemented in #173, a to-do list remains:

  • [x] Write function that can generate metSmiles field by querying PubChem with model.metNames, to be run before making writeDLKcatInput.
  • [x] Match the currency metabolites (and other metabolites to ignore?) by their SMILES, instead of by metabolite name. Requires the above function to be run first. Edit: on hold for now, this is not so straightforward.
  • [x] Provide the option to run DLKcat directly, by packaging the required DLKcat files in a download (similar to the pre-trained HMMs in RAVEN). On Windows, this will run via WSL (similar to how HMMER is run in RAVEN), and the necessary dependencies will be installed via pipenv, all in a subdirectory of the local GECKO installation that will be listed in .gitignore.
  • [x] Have reaction IDs as part of the DLKcatInput.tsv file, removing the need for the DLKcatIDs structure that is now being used to retain this information.
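As an illustration of the first item, a minimal Python sketch of looking up SMILES by metabolite name through the PubChem PUG REST interface. The helper names are hypothetical, the property name `CanonicalSMILES` is an assumption that may need adjusting to what PubChem currently serves, and a real implementation would also want rate limiting. Names that cannot be resolved get an empty string, so the result lines up with model.metNames.

```python
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def smiles_url(met_name, prop="CanonicalSMILES"):
    """Build the PUG REST URL that returns one property as plain text."""
    return f"{PUG_REST}/compound/name/{quote(met_name, safe='')}/property/{prop}/TXT"


def fetch_smiles(met_name):
    """Query PubChem; return the first SMILES, or '' if the lookup fails."""
    try:
        with urlopen(smiles_url(met_name), timeout=30) as response:
            lines = response.read().decode("utf-8").splitlines()
    except OSError:  # covers HTTP errors (e.g. 404 for unknown names) and network failures
        return ""
    return lines[0].strip() if lines else ""


def build_met_smiles(met_names, fetch=fetch_smiles):
    """Return one SMILES string per metabolite name, querying each unique name once."""
    cache = {}
    result = []
    for name in met_names:
        if name not in cache:
            cache[name] = fetch(name)
        result.append(cache[name])
    return result
```

Caching by name matters here because the same metabolite typically appears in several compartments, so model.metNames contains many duplicates.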

edkerk, Sep 21 '22 11:09