cohorts
Utilities for analyzing mutations and neoepitopes in patient cohorts
From @iskandr re https://github.com/hammerlab/mhctools/pull/86:

> Followup to mhctools PR, a much smaller change to make Topiary work: https://github.com/hammerlab/topiary/pull/66
>
> The only API changes that should matter to you are `epitope_lengths`...
@jburos wrote some useful utilities in https://github.com/hammerlab/rcc-analyses/blob/master/analyses/epidisco.py and https://github.com/hammerlab/ovarian-msk-chemo/blob/master/epidisco/epidisco.py (the latter contains some additional utilities, I believe) that we should factor out into public code. One idea is a `discohorts`...
Per https://github.com/hammerlab/cohorts/pull/179#issuecomment-273334798
Per https://github.com/hammerlab/cohorts/pull/177#discussion_r96321375
Will require a test BAM.
From @tavinathanson:

> I'm fairly unclear right now about how best, when using cohorts, to use epidisco results; for example using epidisco neoepitopes rather than https://github.com/hammerlab/cohorts/blob/master/cohorts/cohort.py#L767
>
> One obstacle...
Currently, regular neoantigens use `topiary` while expressed neoantigens use `isovar` + `mhctools` because `topiary` doesn't use isovar. This depends on `topiary` making use of `isovar`, I think. Related: https://github.com/hammerlab/vaxrank/issues/31
Would be useful to keep track of `batch` identifiers so we can estimate batch effects as part of standard protocol.
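As a rough illustration of what tracking `batch` could enable, the sketch below groups samples by a batch identifier and compares a per-batch summary. Everything in it is hypothetical: the dictionary keys (`patient_id`, `batch`, `mutation_count`) and the helper names are assumptions made for the example, not part of the existing `cohorts` API, and a per-batch mean is only a first look at batch effects, not a proper estimate.

```python
from collections import defaultdict
from statistics import mean


def group_by_batch(samples):
    """Group sample dicts by their (hypothetical) 'batch' key.

    Samples without a batch identifier are collected under 'unknown'.
    """
    groups = defaultdict(list)
    for sample in samples:
        groups[sample.get("batch", "unknown")].append(sample)
    return dict(groups)


def per_batch_mean(samples, key):
    """Return the mean of `key` within each batch."""
    return {
        batch: mean(member[key] for member in members)
        for batch, members in group_by_batch(samples).items()
    }


# Illustrative data only; these values are made up for the example.
samples = [
    {"patient_id": "p1", "batch": "A", "mutation_count": 100},
    {"patient_id": "p2", "batch": "A", "mutation_count": 140},
    {"patient_id": "p3", "batch": "B", "mutation_count": 300},
    {"patient_id": "p4", "mutation_count": 90},  # no batch recorded
]

batch_means = per_batch_mean(samples, "mutation_count")
```

A large gap between per-batch means would be a prompt to look more closely, for instance by including batch as a covariate in downstream models.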
Maybe out of scope for this project, but for TCGA data it would be useful to support PFS or OS being set to NaN. Right now, this gives an error...
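One possible way to tolerate missing survival values is to set aside patients whose PFS or OS is NaN (or absent) before running an analysis that needs that field, rather than raising an error. The sketch below is a minimal, standard-library illustration of that idea. The `PatientRecord` type, its field names, and the helper functions are hypothetical and are not taken from `cohorts`.

```python
import math
from typing import List, NamedTuple, Optional, Tuple


class PatientRecord(NamedTuple):
    # Hypothetical record type; field names are assumptions for this sketch.
    patient_id: str
    overall_survival: Optional[float]
    progression_free_survival: Optional[float]


def is_missing(value: Optional[float]) -> bool:
    """True when a survival value is absent (None or NaN)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def split_by_availability(
    records: List[PatientRecord], field: str
) -> Tuple[List[PatientRecord], List[PatientRecord]]:
    """Partition records into (usable, skipped) for one survival field."""
    usable, skipped = [], []
    for record in records:
        if is_missing(getattr(record, field)):
            skipped.append(record)
        else:
            usable.append(record)
    return usable, skipped


# Illustrative data only; these values are made up for the example.
records = [
    PatientRecord("p1", 400.0, 250.0),
    PatientRecord("p2", float("nan"), 120.0),
    PatientRecord("p3", 310.0, None),
]

usable_os, skipped_os = split_by_availability(records, "overall_survival")
usable_pfs, skipped_pfs = split_by_availability(
    records, "progression_free_survival"
)
```

Splitting per field means a patient with missing OS can still contribute to a PFS analysis, and the skipped list can be reported so that dropped patients are not silently lost.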