RAVEN
RAVEN copied to clipboard
help: randomSampling with human-GEM
Hello everyone! I tried to perform random sampling on a model generated by tINIT from the human1 model but the program appears to be hung. I set the show Progress argument to true but still don't get anything after Preparing Random Sampling: ready with 3800/3863 rxns. It's been stuck for about 24 hours so I'm wondering if anyone else had the issue. I set the number of samples to just 2 to test it but still don't get anything
I'm uploading a model I'm trying to perform random sampling as a zip file.
I'm also pasting the code I used first to build the model with tINIT and then the randomSampling command I used:
function build_models(human_GEM, transcriptomic_file, essentialTasks, output_path)
transcriptomic_data = readtable(transcriptomic_file);
% extract the tissue and gene names
data_struct.tissues = transcriptomic_data.Properties.VariableNames(2:end)'; % sample (tissue) names
data_struct.genes = transcriptomic_data.Genes; % gene names
data_struct.levels = table2array(transcriptomic_data(:, 2:end)); % gene TPM values
data_struct.threshold = 1;
for k = 1:length(data_struct.tissues)
tissue = data_struct.tissues{k}; % must match the tissue name in data_struct.tissues
celltype = []; % used if tissues are subdivided into cell type, which is not the case here
hpaData = []; % data structure containing protein abundance information (not used here)
arrayData = data_struct; % data structure with gene (RNA) abundance information
metabolomicsData = []; % list of metabolite names if metabolomic data is available
removeGenes = true; % (default) remove lowly/non-expressed genes from the extracted GEM
taskFile = []; % we already loaded the task file, so this input is not required
useScoresForTasks = true; % (default) use expression data to decide which reactions to keep
printReport = true; % (default) print status/completion report to screen
taskStructure = essentialTasks; % metabolic task structure (used instead "taskFile")
params = []; % additional optimization parameters for the INIT algorithm
paramsFT = []; % additional optimization parameters for the fit-tasks algorithm
params.TimeLimit = 5000;
built_model = getINITModel2(human_GEM, tissue, celltype, hpaData, arrayData, metabolomicsData, removeGenes, taskFile, useScoresForTasks, printReport, taskStructure, params, paramsFT);
built_model.id = tissue;
checkTasks(built_model, [], true, false, false, essentialTasks);
write_path = [pwd '\' output_path '\' tissue '_model.mat'];
save(write_path, 'built_model');
end
end
[solutions, goodrxns] = randomSampling(all_models{1}, 2, true, false, true, [], false);
System information
- Please report:
- RAVEN version: RAVEN 2.0
- Windows 10
I hereby confirm that I have:
- [X ] Followed the guidelines to install RAVEN.
- [ X] Checked that a similar issue does not already exist
- [ X] If suitable, needed, asked first in the Gitter chat room about the issue
@manas-kohli thanks for reporting this. Which specific version of Human1 was used in this analysis?
Hi hao, I used the 1.11 version released in February of this year, I haven't updated yet to the 1.12 version yet (don't know if that can be causing an issue)
don't know if that can be causing an issue
no, it shouldn't. will then try to duplicate your work
In #439 I have somewhat changed the reporting of the function progress, as it actually got stock trying to get the first sample. Rerunning your model with the code in that PR will not solve the problem that you encounter, it would just more accurately report the function's progress.
The issue is that the model cannot support any non-zero fluxes (ignoring reactions that can form loops). Running solveLP(built_model)
also gives a zero value of the objective value. I cannot directly spot the problem, as it seems like you have exchange reactions open. What solver do you use, particularly when running getINITModel2
? If it is glpk
, then you might want to try it with gurobi
instead.
Running
solveLP(built_model)
also gives a zero value of the objective value.
If this is the case, then I have two additional questions @manas-kohli (sorry if they have been asked before):
- Were boundary metabolites added to the Human-GEM model (using
closeModel
oraddBoundaryMets
) before using it withgetINITModel2
? - Did your
checkTasks(built_model, [], true, false, false, essentialTasks);
command run after building the model show that all tasks were passed?
Hi ed and jon,
To answer your questions:
- Yes I did add boundary metabolites to the ihuman model: I ran these commands:
load('Human-GEM.mat'); ihuman = addBoundaryMets(ihuman);
- I did run the checkTasks command and every task was passed
Steps 1) and 2) were done with getINITModel2 to build the model initially
-
I now noticed that the solveLP(model) gives me 0 as ed pointed out. I fixed this and removed the boundary metabolites as well as opened the transport reactions necessary for growth (I basically ran the function setHamsMedium). Now I get a solveLP value of -53.8195 for the objective function. I'm attaching the model but once again am getting the same issue with randomSampling.
-
I use gurobi and not glpk for my solver
As a quick follow up, I don't know if anyone else can reproduce this but when I try to run randomSampling on the base ihuman (with or without boundary metabolites), I'm running into the same problem of randomSampling not reporting anything after the preparation phase. I can be doing something really wrong so apologies if the error is very trivial and I'm missing something obvious
Same behaviour here: randomSampling
directly on human-GEM 1.12.0 cannot get any random samples.
I also tried with earlier versions: human-GEM 1.9.0 and RAVEN 2.5.0: same behaviour.
Not sure if randomSampling
has routinely been used on human-GEM derived models? @JonathanRob
@edkerk I have never used randomSampling
with Human-GEM or Human-GEM-derived models, but I took a look at the code to see what could be happening.
For me, when using Gurobi, I get infeasible solutions for just default biomass maximization after changing Human-GEM max bounds to +/- Inf. It's strange, because in the past this wasn't the case, so I'm not sure if it's due to changes in the model or updates to Gurobi. Regardless, I would recommend moving the feasibility check in randomSampling
to take place after the max bounds adjustment, otherwise you're just going to get all (or nearly all) infeasible solutions.
I was able to run randomSampling
with the third input (replaceBoundsWithInf
) set to false, though I realize this may result in more solutions with loops.