Question about differences between propr versions

Open suzannejin opened this issue 3 years ago • 6 comments

Hi @tpq! While running propr with an older version (v4.0.0), I found that for some datasets it produces different coefficients from the latest propr version. Were any recent changes introduced that affect the computation of the proportionality coefficients, or am I missing something here?

Below is an example run on a single-cell dataset from Skinnider et al. 2019 (https://github.com/skinnider/SCT-MoA/blob/master/data/geo/filtered/GSE51254.txt.gz).

# Compute rho with propr 4.0.0 (installed in the default library)
get_perb_ori <- function(X) {
    # detach any propr version that is already attached
    if ('propr' %in% (.packages())) {
        detach('package:propr', unload=TRUE)
    }
    library(propr)  # load propr 4.0.0
    print(packageVersion('propr'))
    rho = perb(X)
    return(rho@matrix)
}
# Compute rho with the newest propr (installed under <first library path>/propr_tpq)
get_perb_tpq <- function(X) {
    if ('propr' %in% (.packages())) {
        detach('package:propr', unload=TRUE)
    }
    library(propr, lib.loc=file.path(.libPaths()[1], "propr_tpq"))  # load newest propr version
    print(packageVersion('propr'))
    rho = perb(X)
    return(rho@matrix)
}

# load data and run propr
expr = read.delim("data/one-per-publication/GSE51254.txt.gz")
rho_ori = get_perb_ori(expr)
rho_tpq = get_perb_tpq(expr)
identical(rho_ori, rho_tpq)
# [1] FALSE
a = rho_ori[lower.tri(rho_ori)]
b = rho_tpq[lower.tri(rho_tpq)]
summary(a-b)
#     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
# -1.71878 -0.05474  0.03054  0.03111  0.11754  1.47488 

suzannejin avatar May 24 '21 17:05 suzannejin

Hello again! Thanks for a challenging question. I don't think there were any substantial changes, but I checked the source code and it seems I updated the default zero-handling procedure (diff checker screenshot pasted below).

Could you try:

expr = read.delim("data/one-per-publication/GSE51254.txt.gz")
rho_ori = get_perb_ori(expr+1)
rho_tpq = get_perb_tpq(expr+1)
identical(rho_ori, rho_tpq)
a = rho_ori[lower.tri(rho_ori)]
b = rho_tpq[lower.tri(rho_tpq)]
summary(a-b)

and let me know what you get?

[Image: diff checker screenshot]

tpq avatar May 24 '21 20:05 tpq

Thanks a lot Thom!!! The values are the same for both propr versions when using expr+1.
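
A minimal sketch of why zero handling alone can shift the coefficients, assuming rho is defined on centered log-ratio (clr) values A as rho(i, j) = 1 - var(A_i - A_j) / (var(A_i) + var(A_j)). This is a hand-rolled base R illustration, not propr's implementation; `clr`, `rho_manual`, and the two zero-replacement strategies are hypothetical and are not claimed to match what either propr version does. With expr+1 there are no zeros left, so presumably neither version's zero handling is triggered, which would explain the identical results.

```r
# Hand-rolled rho on clr-transformed data (samples in rows, features in columns).
# Illustration only: this is NOT propr's code.
clr <- function(counts) {
  logx <- log(counts)
  sweep(logx, 1, rowMeans(logx))  # subtract each sample's mean log value
}

rho_manual <- function(counts) {
  A <- clr(counts)
  v <- apply(A, 2, var)           # per-feature variance of the clr values
  p <- ncol(A)
  rho <- matrix(1, p, p, dimnames = list(colnames(A), colnames(A)))
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      rho[i, j] <- 1 - var(A[, i] - A[, j]) / (v[i] + v[j])
    }
  }
  rho
}

# Synthetic counts (not the GSE51254 data), with at least one zero
set.seed(1)
counts <- matrix(rpois(60, lambda = 3), nrow = 12,
                 dimnames = list(NULL, paste0("g", 1:5)))
counts[1, 1] <- 0

# Hypothetical strategy A: add a pseudocount of 1 to every entry
rho_plus1 <- rho_manual(counts + 1)

# Hypothetical strategy B: replace only the zeros with 1
replaced <- counts
replaced[replaced == 0] <- 1
rho_repl <- rho_manual(replaced)

# Compare the two results the same way as in the thread
isTRUE(all.equal(rho_plus1, rho_repl))
summary(rho_plus1[lower.tri(rho_plus1)] - rho_repl[lower.tri(rho_repl)])
```

Because the log is taken before any variance is computed, every choice about how zeros are replaced changes the clr values of all features in the affected samples, not just the feature that contained the zero.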

Now, I still have one question. I am trying to benchmark on Skinnider's dataset, and I failed to reproduce their values for rho and phs (whereas other metrics such as pearson, spearman, and zi_kendall reproduced without problems). Do you think some part of the propr code might depend more on the working environment (packages, libraries, etc.) than the conventional stats::cor(mat, method = 'spearman', ...)?
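
One way to separate environment effects from algorithmic ones is to compare results within a tolerance and record the environment next to them. The sketch below uses synthetic matrices `m1` and `m2` as hypothetical stand-ins for coefficient matrices computed in two setups; it only relies on base R. Platform or library differences usually show up as floating-point-level noise, whereas differences as large as those in the summary above (up to about 1.7) point to a change in the computation itself.

```r
# Synthetic stand-ins for two coefficient matrices from different setups
set.seed(42)
m1 <- cor(matrix(rnorm(200), nrow = 20))
m2 <- m1 + 1e-12                  # simulate floating-point-level noise

identical(m1, m2)                             # exact, bit-for-bit comparison
isTRUE(all.equal(m1, m2, tolerance = 1e-8))   # comparison within a tolerance
max(abs(m1 - m2))                             # largest absolute discrepancy

# Record the environment alongside the results so runs can be compared later
R.version.string
sessionInfo()
```

Comparing the `sessionInfo()` output of both setups then shows which package versions actually differ.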

suzannejin avatar May 28 '21 12:05 suzannejin

Ah, sorry! I never responded! Is it possible things changed from R 3.6 to R 4.0?

tpq avatar Jun 12 '21 11:06 tpq

Hey Thom, sorry for the late response! I don't think the problem lies between R 3.6 and R 4.0, since I have already tried both R 3.5 and R 4.0+. I am currently checking with Skinnider to see where the actual problem is. I will keep you updated once we find the cause.

suzannejin avatar Jul 10 '21 15:07 suzannejin

Hi @suzannejin, did you manage to solve this problem? I know it was some time ago but I'm experiencing a similar issue right now.

AlSzmigiel avatar Feb 01 '24 10:02 AlSzmigiel

Hey @AlSzmigiel what is the problem for you exactly? Do you find differences between different propr versions, or between different R versions?

suzannejin avatar Feb 01 '24 13:02 suzannejin