goatools

Error: Only few genes/proteins in the study are found in the background population.

Open arumds opened this issue 5 years ago • 6 comments

I am using the latest zipped version of GOATOOLS and get the error below, which says that only a few of the input genes are in the background population. Is there a way around this error, or does the tool not work with this number of input genes?

$ /goatools-master/scripts/find_enrichment.py --obo ../go-basic.obo --pval=0.05 --indent --method fdr_bh,bonferroni --outfile GO_Genes.tsv,GO_Genes.xlsx RDA.Genes ../EnsemblKnownProteinCodingbackgroundID.txt ../SlimEnsemblGenes_GOAssocationID.txt

../go-basic.obo: fmt(1.2) rel(2019-05-09) 47,407 GO Terms
ARGS GoeaCliFnc Namespace(alpha=0.05, annofmt=None, compare=False, ev_exc=None, ev_help=True, ev_help_short=True, ev_inc=None, filenames=['RDA.Genes', '../EnsemblKnownProteinCodingbackgroundID.txt', '../SlimEnsemblGenes_GOAssocationID.txt'], goslim='goslim_generic.obo', id2sym=None, indent=True, method='fdr_bh,bonferroni', min_overlap=0.7, no_propagate_counts=False, ns='BP,MF,CC', obo='../go-basic.obo', outfile='GO_Genes.tsv,GO_Genes.xlsx', outfile_detail=None, pval=0.05, pval_field=None, pvalcalc='fisher', ratio=None, sections=None, taxid=9606)
HMS:0:00:00.408382 93,186 annotations READ: ../SlimEnsemblGenes_GOAssocationID.txt
Study: 295 vs. Population 20197

WARNING: only 0.586440677966 fraction of genes/proteins in study are found in the population  background.


ERROR: only 0.586440677966 of genes/proteins in the study are found in the background population. Please check.
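
As a rough check on the figures in this log (a sketch, not GOATOOLS code): multiplying the reported fraction by the study size gives the number of study genes that were found in the population. The 0.7 cutoff is taken from `min_overlap=0.7` in the ARGS output; that this setting is what triggers the error is an assumption.

```python
import math

# Figures copied from the log above.
study_size = 295                     # "Study: 295 vs. Population 20197"
reported_fraction = 0.586440677966   # from the WARNING/ERROR lines
min_overlap = 0.7                    # from the ARGS output (assumed to be the error cutoff)

# Study genes that also appear in the population, and those that do not.
found = round(study_size * reported_fraction)
missing = study_size - found

# Smallest number of matching genes that would reach the assumed cutoff.
needed = math.ceil(min_overlap * study_size)

print(f"found={found} missing={missing} needed_for_cutoff={needed}")
```

Under that assumption, 173 of the 295 study genes matched, 122 did not, and 207 matches would be needed to reach 0.7.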

arumds avatar May 27 '19 12:05 arumds

I have encountered the same error. I suppose genes that appear only in the study file cause it, but I don't know how to make it work.
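
One way to test that guess is to list the study identifiers that are absent from the population file. This is a minimal sketch, not part of GOATOOLS: `read_ids` and `overlap_report` are hypothetical helpers, the gene names are made up, and the files are assumed to hold one identifier per line.

```python
from pathlib import Path

def read_ids(path):
    """Read one identifier per line, skipping blank lines and surrounding whitespace."""
    lines = Path(path).read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}

def overlap_report(study_ids, population_ids):
    """Return (fraction of study IDs found in the population, sorted missing IDs)."""
    study_ids = set(study_ids)
    if not study_ids:
        return 0.0, []
    missing = sorted(study_ids - set(population_ids))
    fraction = (len(study_ids) - len(missing)) / len(study_ids)
    return fraction, missing

# Toy identifiers, made up for illustration only.
study = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
population = {"GENE_A", "GENE_B", "GENE_X"}

fraction, missing = overlap_report(study, population)
print(fraction, missing)
```

With real inputs, the two sets would come from `read_ids(study_file)` and `read_ids(population_file)`. A long `missing` list with a consistent pattern (for example a different ID type or a version suffix) would point to an identifier mismatch between the two files.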

mayupsc avatar Jun 26 '19 04:06 mayupsc

@mehar-GIT and @mayupsc,

To investigate what you are seeing, can you provide these items?

  1. A log of all messages to the screen during your run
  2. A copy of the top 20 lines of the population file
  3. A copy of the top 20 lines of the study file
  4. A copy of the top 20 lines of the annotation file
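
The three file excerpts can be collected in one go. This is a minimal sketch in plain Python, not a GOATOOLS utility; `head` and `print_heads` are hypothetical helper names.

```python
from itertools import islice

def head(path, n=20):
    """Return the first n lines of a text file, without trailing newlines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]

def print_heads(paths, n=20):
    """Print the first n lines of each file under a header naming the file."""
    for path in paths:
        print(f"==> {path} <==")
        print("\n".join(head(path, n)))
        print()
```

Calling `print_heads` with the study, population, and annotation file paths from your own command line prints the excerpts in a form that can be pasted into a comment.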

With this information, we can proceed further...

Thank you for taking the time to contact us and for your interest in GOATOOLS.

dvklopfenstein avatar Jun 28 '19 15:06 dvklopfenstein

Hopefully, you are up and running and so did not need to provide the additional information necessary to help solve your issue. In the hope that this is the case, I am closing this issue now.

Please open a new issue if you need us to take a look. Thank you for taking the time to write to us and for your interest in GOATOOLS.

dvklopfenstein avatar Jul 29 '19 14:07 dvklopfenstein

I'm seeing the same error with the sample data (downloaded from https://github.com/tanghaibao/goatools/tree/main/tests/data):

$ find_enrichment.py small_study small_population small_association --outfile=1.xlsx --pval=0.05 --method=fdr_bh --pval_field=fdr_bh
go-basic.obo: fmt(1.2) rel(2021-05-01) 47,284 GO Terms
HMS:0:00:00.069865   6,309 annotations READ: small_association 
Study: 38 vs. Population 2000


WARNING: only 0.39473684210526316 fraction of genes/proteins in study are found in the population background.


ERROR: only 0.39473684210526316 of genes/proteins in the study are found in the background population. Please check.

barrantesisrael avatar May 12 '21 11:05 barrantesisrael

I am seeing the same error.

find_enrichment.py --alpha=$alpha --pval=$p_val --indent --obo $obo $study $population $association > $output

WARNING: only 0.018268156424581006 fraction of genes/proteins in study are found in the population background.

ERROR: only 0.018268156424581006 of genes/proteins in the study are found in the background population. Please check.

akaur1988 avatar Jul 27 '22 13:07 akaur1988

Hi, did anyone solve this problem? Thanks a lot.

HelloWorldLTY avatar Oct 15 '22 01:10 HelloWorldLTY