seqc Duplicate gene names in sparse count matrix

Open vincent6liu opened this issue 4 years ago • 0 comments

Since multiple ENSEMBL IDs sometimes correspond to a single gene name, the sparse count matrix can contain columns with the same gene name (i.e., entries in _sparse_counts_genes.csv are not unique). I'm not sure how this is handled in the filtered dense matrix. It might be good to add a suffix to duplicated gene names that map to different ENSEMBL IDs, something like WDFY4 (1), WDFY4 (2).
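A minimal sketch of the suffixing idea, assuming the gene names are available as a plain Python list in column order. The function name `suffix_duplicate_gene_names` is hypothetical and not part of seqc; how seqc actually reads or writes _sparse_counts_genes.csv is not shown here.

```python
from collections import Counter


def suffix_duplicate_gene_names(gene_names):
    """Append ' (1)', ' (2)', ... to names that occur more than once.

    Hypothetical helper for illustration only; names that occur once
    are returned unchanged, and the original order is preserved.
    """
    totals = Counter(gene_names)  # how often each name occurs overall
    seen = Counter()              # how often each name has been seen so far
    labeled = []
    for name in gene_names:
        if totals[name] > 1:
            seen[name] += 1
            labeled.append(f"{name} ({seen[name]})")
        else:
            labeled.append(name)
    return labeled


# Two columns share the name WDFY4 (e.g. two different ENSEMBL IDs).
example = ["WDFY4", "GAPDH", "WDFY4", "ACTB"]
print(suffix_duplicate_gene_names(example))
# ['WDFY4 (1)', 'GAPDH', 'WDFY4 (2)', 'ACTB']
```

Numbering follows column order, so the suffix is only stable as long as the column order is. Keeping the ENSEMBL ID alongside the name would probably be the more robust identifier.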

Dec 30 '19 16:12 vincent6liu
