PG.Quantity == PG.Top in 85 % cases

Open • bhagwataditya opened this issue 2 years ago • 1 comment

Dear Vadim,

Thank you for developing DIA-NN. It is very useful and we love it! We have a dilution series experiment, so MaxLFQ cannot be used (the samples will change). Instead we want to use PG.Quantity. We noticed that in 85% of cases this quantity equals the intensity of the Top 1 precursor (the precursor with the highest Precursor.Quantity for that protein group and run), but in the remaining 15% of cases it does not. We would like to understand the logic behind this :) Could you help?

A reproducible example

    # Read
    library(magrittr)
    library(data.table)
    url <- 'https://bitbucket.org/graumannlabtools/autonomics/downloads/szymanski22.report.tsv'
    file <- file.path(tempdir(), basename(url))
    download.file(url, destfile = file, mode = 'wb')
    dt <- fread(file)

    # Shorten File.Name to '_' plus the third-from-last character of the original name
    dt$File.Name %<>% factor()
    levels(dt$File.Name) %<>% substr(nchar(.)-2, nchar(.)-2)
    levels(dt$File.Name) %<>% paste0('_', .)
    dt$File.Name %<>% as.character()
    
    # Keep the relevant columns, one protein and four runs
    dt %<>% extract(, c('Protein.Names', 'File.Name', 'Precursor.Id', 
                        'Precursor.Normalised', 'Precursor.Quantity', 'PG.Quantity'), with = FALSE)
    dt %<>% extract('EIF3J_HUMAN', on = 'Protein.Names')
    dt %<>% extract(c('_0', '_3', '_6', '_9'), on = 'File.Name')
    
    dt$Precursor.Quantity %<>% as.numeric()
    dt$Precursor.Normalised %<>% as.numeric()
    dt$PG.Quantity %<>% as.numeric()

    # Sort by decreasing precursor quantity and add the Top 1 quantity per protein and run
    dt %<>% extract(order(Protein.Names, File.Name, -Precursor.Quantity))
    dt[, PG.Top1 := max(Precursor.Quantity), by = c('Protein.Names', 'File.Name')]
    dt


bhagwataditya, Sep 21 '22 11:09

I am not aware of any scenarios in which MaxLFQ cannot be used. It is perfectly fine for a dilution series. If you don't want normalisation, you can turn it off in the GUI, and MaxLFQ will still work just fine; the quantities just will not be normalised.

PG.Quantity should equal Top 1 in 100% of cases. If it doesn't, this means the output is filtered in such a way that some precursors are not shown (they don't pass some threshold) but are still used to calculate PG.Quantity. Also, you can recalculate it in whatever way you prefer using the diann_matrix() function of the diann R package.
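To see which protein groups and runs are affected, one option is to compare PG.Quantity with the largest visible Precursor.Quantity per protein and run. The following is only a minimal sketch on a made-up toy table (the values, the names `toy`, `pg`, `matches.top1` and `frac.match`, and the relative tolerance are all assumptions for illustration, not taken from the report above). Protein 'B' mimics the case described here: its PG.Quantity is larger than any precursor shown, as if the top precursor had been filtered out of the output.

```r
library(data.table)

# Toy precursor-level table with hypothetical numbers
toy <- data.table(
  Protein.Names      = c('A', 'A', 'A', 'B', 'B'),
  File.Name          = c('_0', '_0', '_3', '_0', '_0'),
  Precursor.Id       = c('p1', 'p2', 'p1', 'p3', 'p4'),
  Precursor.Quantity = c(100, 250, 80, 40, 30),
  PG.Quantity        = c(250, 250, 80, 55, 55)
)

# Largest precursor quantity among the rows visible in this table
toy[, PG.Top1 := max(Precursor.Quantity), by = .(Protein.Names, File.Name)]

# One row per protein and run, flagged by whether PG.Quantity matches Top 1
# (relative tolerance of 1e-6 is an arbitrary choice to absorb rounding)
pg <- unique(toy[, .(Protein.Names, File.Name, PG.Quantity, PG.Top1)])
pg[, matches.top1 := abs(PG.Quantity - PG.Top1) <= 1e-6 * PG.Top1]

# Fraction of protein/run combinations where the two agree
frac.match <- pg[, mean(matches.top1)]
print(pg)
print(frac.match)
```

Rows of `pg` with `matches.top1 == FALSE` are the candidates to inspect in a less strictly filtered version of the report.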

Best, Vadim

vdemichev, Sep 29 '22 12:09