GOSemSim
:golf: GO-terms Semantic Similarity Measures
Hi, See in the picture what I tried, can't follow the code to find the error. Where the error should be I've tried to manually run the other lines to...
@GuangchuangYu , @huerqiang At the Bioconductor support forum an issue/error was reported regarding the function `buildGOmap`: https://support.bioconductor.org/p/9156358/ I had a quick look at it, and it seems this is because...
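1. Standardize the output of `read.gaf` and `read.blast2go` as a data.frame. We currently have [read.gaf](https://github.com/YuLab-SMU/GOSemSim/blob/devel/R/parseGAF.R) and [read.blast2go](https://github.com/YuLab-SMU/GOSemSim/blob/devel/R/readBlast2go.R), and the outputs of the two functions are not consistent. Change both to return a `data.frame` with two columns whose names and order are fixed: Gene, GO. In fact, both kinds of data should carry GO branch information; when it is available, add a third column, Ontology, standardized to MF, CC and BP. The GO Domain field in blast2go output, for example, is this information. `read.gaf()` currently returns additional output; split that part out. 2. Support data.frame as input. Once the two functions above are standardized, if we can support analysis from a data frame, their output can be used directly, and users can also supply their own data.frame and use it directly. This only requires hooking into the [godata](https://github.com/YuLab-SMU/GOSemSim/blob/devel/R/godata.R) function so that OrgDb accepts a data.frame as input, and adjusting the functions it calls. With that, GOSemSim is connected end to end, because everything above is based on the output of `godata()`.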
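A minimal base-R sketch of the standardized layout proposed above (columns Gene, GO and, when known, Ontology). The helper name `standardize_go_annotation`, its arguments and the sample table are hypothetical and are not part of GOSemSim; the set of accepted branch spellings is an assumption.

```r
# Hypothetical helper (not a GOSemSim function): coerce an annotation table
# to the proposed layout with columns Gene, GO and, if known, Ontology.
standardize_go_annotation <- function(x, gene_col, go_col, ont_col = NULL) {
  out <- data.frame(
    Gene = as.character(x[[gene_col]]),
    GO   = as.character(x[[go_col]]),
    stringsAsFactors = FALSE
  )
  if (!is.null(ont_col)) {
    # Map common spellings of the GO branch to MF, CC or BP (assumed set);
    # unrecognized values become NA.
    key <- toupper(gsub("[^A-Za-z]", "", x[[ont_col]]))
    lookup <- c(
      MF = "MF", F = "MF", MOLECULARFUNCTION = "MF",
      CC = "CC", C = "CC", CELLULARCOMPONENT = "CC",
      BP = "BP", P = "BP", BIOLOGICALPROCESS = "BP"
    )
    out$Ontology <- unname(lookup[key])
  }
  # Drop rows that are identical after normalization.
  unique(out)
}

# Invented sample input mixing GAF-style (F/P/C) and spelled-out branches.
raw <- data.frame(
  id     = c("g1", "g1", "g2", "g1"),
  term   = c("GO:0003674", "GO:0008150", "GO:0005575", "GO:0003674"),
  domain = c("molecular_function", "P", "Cellular Component", "F"),
  stringsAsFactors = FALSE
)
ann <- standardize_go_annotation(raw, "id", "term", "domain")
```

Fixing the column names and order in one place would let downstream code accept the table regardless of which reader produced it.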
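1. A very large number of species are now supported, because support is not built in; instead, the OrgDb is processed externally first to prepare the GO annotation. + An OrgDb can be obtained through AnnotationHub; if none is available, you can build one yourself with AnnotationForge, and [an OrgDb retrieved from AnnotationHub can also be packaged as an R package](https://support.bioconductor.org/p/92004/) for later use. + Support for GAF files (to connect with proteome annotation). + Blast2GO output + Support for user-supplied data.frame input. Even a new genome annotated by the user can be analyzed, which removes the hurdle of building an OrgDb. 2. The information-content-based methods were rewritten by me in C++ and later [optimized by @alyst](https://github.com/GuangchuangYu/GOSemSim/pull/1); they are now faster. 3. Implemented the new method TCSS 4. Supports semantic similarity measures for other packages, including disease ontology, human phenotype ontology, mouse phenotype ontology, Medical Subject Headings, etc. +...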
Hi, thank you for this package. It is very useful! This is not a bug but a potential issue. IC is calculated on the full set of Gene-GO annotations at...
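For orientation, a toy base-R sketch of how an information content (IC) value can be derived from annotation counts: each term's count includes its descendants, and IC is the negative log of the resulting probability. The ontology, counts and normalization by the root count are invented for illustration; GOSemSim's own computation may differ in its details.

```r
# Toy data, invented for illustration: direct annotation counts per term
# and, for each term, its descendants in a tiny made-up ontology.
direct_counts <- c(root = 0, A = 2, B = 1, C = 5)
offspring <- list(
  root = c("A", "B", "C"),
  A    = "C",
  B    = character(0),
  C    = character(0)
)

# Cumulative count: a term's own annotations plus those of all descendants.
cumulative <- sapply(names(direct_counts), function(term) {
  sum(direct_counts[c(term, offspring[[term]])])
})

# Probability relative to the root count (an assumed normalization),
# then IC = -log(p): rarer terms get higher IC.
p  <- cumulative / max(cumulative)
ic <- -log(p)
```

Because the counts come from whatever annotation set is supplied, changing that set changes every IC value, which is why the choice of background matters.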
Hi, I'm trying to use GOSemSim to compute pairwise similarity scores for a large set of genes and it seems that the Resnik measure at least is incorrect because the...
Although I downloaded the GOSemSim package to R (V4.1.2) without any problems with the codes below, I cannot find the package in the library after the download is finished. What...
I have found that different types of `hsGO` have no effect on the similarity results. R codes as below: ```R go1
Is it possible to replace the `sapply` and `apply`'s in the code with `BiocParallel::bplapply`? For example `mclusterSim` becomes really slow when the number of clusters / the size of the...
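A rough sketch of the kind of substitution asked about here: a wrapper that calls `BiocParallel::bplapply` when BiocParallel is installed and falls back to `lapply` otherwise. The wrapper `parallel_lapply`, the `jaccard` stand-in score and the sample clusters are hypothetical; this is not how `mclusterSim` is implemented.

```r
# Hypothetical wrapper: use BiocParallel::bplapply when the package is
# available, otherwise fall back to a plain serial lapply.
parallel_lapply <- function(X, FUN, ..., BPPARAM = NULL) {
  if (requireNamespace("BiocParallel", quietly = TRUE)) {
    if (is.null(BPPARAM)) BPPARAM <- BiocParallel::bpparam()
    BiocParallel::bplapply(X, FUN, ..., BPPARAM = BPPARAM)
  } else {
    lapply(X, FUN, ...)
  }
}

# Stand-in for an expensive per-pair computation (a Jaccard index here,
# NOT a semantic similarity measure).
jaccard <- function(pair, clusters) {
  x <- clusters[[pair[1]]]
  y <- clusters[[pair[2]]]
  length(intersect(x, y)) / length(union(x, y))
}

# Invented clusters; every unordered pair is scored independently, so the
# pairs can be distributed across workers.
clusters <- list(a = 1:3, b = 2:5, c = 6:7)
pairs    <- combn(names(clusters), 2, simplify = FALSE)
scores   <- unlist(parallel_lapply(pairs, jaccard, clusters = clusters))
```

Parallelizing over pairs rather than inside each pair keeps the per-task work coarse, which usually matters more than the choice of backend.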
Dear authors, I am trying to calculate semantic similarity between some GO terms based on the information content methods. ```R # GO.db retrieved by 2020-09-10, Bioconductor version 3.12 atGO