gseapy.prerank - errors for small ranked lists
Hi, thanks for implementing GSEA in Python; it's now quicker to perform analyses.
Please note a bug with gseapy.prerank. With default parameters and when the ranked list is small (<= 15), the function returns
No gene sets passed through filtering condition!!!, try new parameters again! Note: check gene name, gmt file format, or filtering size.
I resolved the problem by lowering the min_size parameter. So, as long as min_size is not higher than the ranked list size, there are no errors. But this parameter should apply to the gene sets and not to the expression dataset, according to the documentation of the GSEA software.
Even when no elements in the ranked list match the gene sets, the function should return a warning, not an error.
Moreover, in case of errors, the gene set list passed to the function is emptied.
Best regards, Michaël
This could be done. However, if the ranked gene list is smaller than 15 genes, does a prerank analysis still make sense? @michaelpierrelee
Moreover, in case of errors, the gene set list passed to the function is emptied.
This is really annoying, why is that the case? Would it be possible to change the behaviour? Thank you
@michaelpierrelee Thank you for your post. I had the same error and had no idea what to do until I saw this post.
For me, I have to increase the max_size parameter to a number that is larger than the ranked list size.
Even though this error no longer occurs, I still do not quite understand the meaning of the min_size and max_size parameters.
I ran into this error too! How do I solve the problem?
min_size and max_size are used for filtering how many gene members of a pathway (gene set) should overlap with your ranked list.
A ranked list containing all expressed genes in your experiment (e.g. the whole transcriptome) is recommended for running the GSEA analysis.
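To make the filtering rule concrete, here is a minimal sketch in plain Python. It is not gseapy's actual code: `filter_gene_sets` is a hypothetical helper, the inclusive bounds are an assumption, and the default values of 15 and 500 are assumed to mirror the defaults of gseapy.prerank. It shows why a ranked list shorter than min_size can never keep a gene set: the overlap cannot be larger than the ranked list itself.

```python
def filter_gene_sets(gene_sets, ranked_genes, min_size=15, max_size=500):
    """Keep gene sets whose overlap with the ranked list is within bounds.

    Hypothetical helper for illustration only. It builds a new dict and
    leaves the input `gene_sets` untouched.
    """
    ranked = set(ranked_genes)
    kept = {}
    for name, members in gene_sets.items():
        # Only genes present in the ranked list count towards the size.
        overlap = ranked.intersection(members)
        if min_size <= len(overlap) <= max_size:
            kept[name] = sorted(overlap)
    return kept


# A small ranked list of 10 genes (made-up names).
ranked_genes = [f"GENE{i}" for i in range(10)]
gene_sets = {
    "PATHWAY_A": [f"GENE{i}" for i in range(8)],   # 8 genes overlap
    "PATHWAY_B": ["GENE1", "OTHER1", "OTHER2"],    # 1 gene overlaps
}

# With min_size=15 nothing can pass, because no overlap can exceed 10.
default_result = filter_gene_sets(gene_sets, ranked_genes)

# Lowering min_size to 5 keeps PATHWAY_A only.
relaxed_result = filter_gene_sets(gene_sets, ranked_genes, min_size=5)
```

In this sketch the size that is tested is the overlap, not the size of the gene set as written in the GMT file, which is why a short ranked list forces min_size to be lowered.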
Hi, I now get the same error, "Exception: No gene sets passed through filtering condition". I had 92 genes saved in a dataframe with their P-values, ranked. It keeps returning this error, even when I try different min_size and permutation_num values.
For everyone still encountering this issue, here's a solution in Python: https://decoupler-py.readthedocs.io/en/latest/generated/decoupler.run_gsea.html
Hi @Cher-HAN, you need to make your gene symbol identifiable for the GMT file you've chosen. By default, gene symbols should be all capitalized when using Enrichr libraries as GMT input
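A quick way to check this before calling prerank is to count how many of your gene names actually appear in the gene set, before and after upper-casing them. The snippet below is a minimal sketch in plain Python with made-up data; `normalize_symbols` and `count_matches` are hypothetical helpers, not part of gseapy.

```python
def normalize_symbols(ranked):
    """Strip whitespace and upper-case each gene symbol (hypothetical helper)."""
    return [(gene.strip().upper(), score) for gene, score in ranked]


def count_matches(ranked, gene_set):
    """Count ranked genes whose symbol is found in the gene set."""
    return sum(1 for gene, _ in ranked if gene in gene_set)


# Made-up ranked list with mixed-case symbols and stray whitespace.
ranked = [("Tp53", 2.1), (" brca1", 1.5), ("Myc", -0.7)]
# Made-up gene set using capitalized symbols.
gene_set = {"TP53", "BRCA1", "EGFR"}

matches_before = count_matches(ranked, gene_set)
matches_after = count_matches(normalize_symbols(ranked), gene_set)
```

If the number of matches stays at zero after normalization, the ranked list probably uses a different identifier type than the GMT file (for example Ensembl IDs versus gene symbols), and changing min_size will not help.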