scanpy: Why make feature names unique instead of aggregating them?

Open VladimirShitov opened this issue 5 months ago • 6 comments

What kind of feature would you like to request?

Additional function parameters / changed functionality / changed defaults?

Please describe your wishes

Hi scanpy team! I have a rather conceptual question. Since the early days of single-cell analysis, one of the standard preprocessing steps has been to make feature names unique (e.g. with adata.var_names_make_unique()) by appending suffixes to duplicated names. This is recommended in the scanpy tutorial and in the best practices book. It is clear that identical feature names make downstream data processing difficult, but why do we handle it this way? Wouldn't it make more sense to aggregate features with identical names by summing their counts? From a biological point of view, the same gene name means the same feature, so why split it into several features and corrupt their names?
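To make the two options concrete, here is a minimal, dependency-free sketch on a toy cells x features count matrix. The helper names `make_unique` and `aggregate_by_name` are hypothetical and are not part of scanpy or anndata. The suffixing scheme (first occurrence unchanged, later duplicates get `-1`, `-2`, ...) is my understanding of the default behaviour of `var_names_make_unique()`, simplified to ignore collisions with names that already exist.

```python
from collections import Counter


def make_unique(names, join="-"):
    """Suffix repeated names: the first occurrence is kept as-is,
    later occurrences become name-1, name-2, ... (simplified assumption)."""
    seen = Counter()
    out = []
    for name in names:
        out.append(name if seen[name] == 0 else f"{name}{join}{seen[name]}")
        seen[name] += 1
    return out


def aggregate_by_name(names, counts):
    """Sum the columns of a cells x features matrix that share a feature name."""
    unique_names = list(dict.fromkeys(names))  # keeps first-seen order
    col = {name: j for j, name in enumerate(unique_names)}
    summed = [[0] * len(unique_names) for _ in counts]
    for i, row in enumerate(counts):
        for name, value in zip(names, row):
            summed[i][col[name]] += value
    return unique_names, summed


# Toy data: two cells, three features, "GeneA" appears twice.
names = ["GeneA", "GeneB", "GeneA"]
counts = [
    [1, 0, 2],
    [0, 3, 4],
]

# Option 1: keep three separate features and rename the duplicate.
suffixed = make_unique(names)

# Option 2: collapse duplicates into one feature by summing counts.
agg_names, agg_counts = aggregate_by_name(names, counts)
```

With option 1 the matrix keeps its shape and only the labels change. With option 2 the matrix shrinks to one column per distinct name, which is the behaviour I am asking about.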

Sep 18 '24 09:09 VladimirShitov
