scanpy icon indicating copy to clipboard operation
scanpy copied to clipboard

n_bins not respected in highly_variable_genes(..., flavour='cell_ranger')

Open flying-sheep opened this issue 5 years ago • 3 comments

This code creates the range [10, 15, …, 100], which is 19 values. Taking those percentiles together with -inf and +inf creates 20 bins (from 21 bin borders), understood.

https://github.com/theislab/scanpy/blob/e6e08e51d63c78581bb9c86fe6e302b80baef623/scanpy/preprocessing/_highly_variable_genes.py#L90

But 1. why start at 10% and 2. why is n_bins ignored and there are always 20 bins created?

@Koncopd @falexwolf you introduced this in #330, is this what cellranger does?

flying-sheep avatar Oct 24 '19 14:10 flying-sheep

We also tried to address this in #572 and #624 but neither of those PRs have been merged since they're not fully backwards compatible.

davidhbrann avatar Oct 30 '19 02:10 davidhbrann

Well, we really need to get that release out of the door.

But on the other hand, as @gokceneraslan said the current behavior is a bug, so we can change it now!

Gökçen, do you still plan on consolidating those PRs?

flying-sheep avatar Oct 30 '19 10:10 flying-sheep

5 years later, this is still not fixed … where is the decision recorded that they shouldn’t be merged?

flying-sheep avatar Jan 19 '24 12:01 flying-sheep