
Issues with repDiversity

Open fabio-t opened this issue 3 years ago • 2 comments

I've been using repDiversity and noticed several issues, which I'm collecting here for simplicity:

documentation issues

  1. The .max.q default is actually 6, not 5 as the documentation states:
.max.q | The max hill number to calculate (default: 5).
  2. "True diversity" is just the Hill number of order q, is that correct? That is, using Hill's formula, if we set the parameter .q=3, then "true diversity" equals the value shown on the Hill curve at x=3. If this is correct, it would be worth stating explicitly in the documentation.
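To make the question concrete, here is a minimal sketch of the textbook Hill number of order q, written in Python purely for illustration. It is not immunarch code, and the function name `hill_number` and the example counts are hypothetical. Under this definition, "true diversity" at q=3 is by construction the point of the Hill curve at x=3; whether repDiversity computes it the same way is exactly what would need confirming.

```python
import math

def hill_number(counts, q):
    """Textbook Hill number (effective number of clonotypes) of order q.

    counts: clone counts per clonotype (hypothetical input format).
    """
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if q == 1:
        # The formula is undefined at q = 1; the limit is exp(Shannon entropy).
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))

# Made-up clone counts, only to show the shape of a Hill curve for q = 1..6.
example_counts = [50, 30, 10, 5, 5]
hill_curve = {q: hill_number(example_counts, q) for q in range(1, 7)}
```

For a perfectly even repertoire the value equals the number of clonotypes at every q, and for uneven repertoires it does not increase as q grows.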

bugs

  1. When using .method="d50", the plot labels actually show 51 instead of 50:
D51 diversity index
Number of clonotypes occupying the 51% of repertoires

The same happens if I manually specify .perc=50.

  2. With .method="raref", the "Sample size" axis stops too early, and I'm not sure why. For example, the samples in my repertoire have 200-300 clonotypes each, summing up to thousands of clones (UMIs, in my case). When running raref, the x-axis only goes as far as 30 to 40, and the estimated diversity is in the 1-2 range. What is the logic behind this?
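For the d50 report, this is what I would expect the index to mean, as a minimal Python sketch. It is an assumption about the intended definition based on the plot subtitle, not immunarch's implementation, and `d_index` is a hypothetical name. With perc=50 the label and the threshold should both be 50.

```python
def d_index(counts, perc=50.0):
    """Smallest number of top clonotypes whose clones cover `perc` percent
    of the repertoire (assumed definition, for illustration only)."""
    total = sum(counts)
    threshold = total * perc / 100.0
    running = 0
    # Walk clonotypes from most to least abundant, accumulating clone counts.
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        running += count
        if running >= threshold:
            return rank
    return len(counts)

# Made-up counts: the two largest clonotypes (40 + 30) first reach 50%.
d50_example = d_index([40, 30, 20, 10], perc=50)
```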

fabio-t, Aug 18 '21 14:08

Ah, regarding raref, I got it: it's the normalisation step. To be honest, it's not really clear to me what kind of normalisation is being performed and why. Can this be expanded a bit in the documentation? Does the .laplace parameter matter for rarefaction curves?
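For comparison, this is the classical rarefaction I had in mind, sketched in Python on raw clone counts. It is the standard expected-richness formula, not immunarch's procedure: it applies no normalisation and has nothing corresponding to .laplace, and the name `rarefied_richness` is hypothetical. In this formulation the sample size can run all the way up to the total number of clones.

```python
from math import comb

def rarefied_richness(counts, n):
    """Expected number of distinct clonotypes in a random subsample of n
    clones drawn without replacement (classical rarefaction)."""
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total number of clones")
    denom = comb(total, n)
    # A clonotype with c clones is missed with probability C(total - c, n) / C(total, n).
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)

# Made-up counts: a rarefaction curve over every possible sample size.
example_counts = [5, 3, 2]
raref_curve = [rarefied_richness(example_counts, n) for n in range(sum(example_counts) + 1)]
```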

fabio-t, Aug 18 '21 14:08

Hi Fabio, noted, thank you!

vadimnazarov, Aug 26 '21 08:08