Duplicates in index throw error during make_index_unique
My system: Ubuntu 23.10, anndata version: 0.10.5.post1
When working with anndata.var.index, the function make_index_unique in utils kept raising a pandas-related error:
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
Stepping into the utils file showed that the underlying error was:
TypeError: Cannot setitem on a Categorical with a new category (X-1), set the categories first
It happened while removing the duplicates. The code ends up with a pd.Categorical at https://github.com/scverse/anndata/blob/d7643e966b7cfaf8f5c732f1f020b0674db1def9/src/anndata/utils.py#L244 and then tries to write the tentative new name into that pd.Categorical, whose categories do not include the tentative new name.
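For anyone who wants to see the pandas behaviour in isolation, here is a minimal sketch that does not involve anndata at all. The values ("X", "Y", "X-1") are made up for illustration; the only assumption is that assigning a value outside the existing categories is rejected, which is what the traceback above reports. ValueError is caught as well, in case a different pandas version uses that type.

```python
import pandas as pd

# A categorical with a duplicated entry, similar to a non-unique var index.
values = pd.Categorical(["X", "Y", "X"])

error = None
try:
    # "X-1" is not among the categories ["X", "Y"], so pandas refuses the write.
    values[2] = "X-1"
except (TypeError, ValueError) as exc:
    error = exc

print(type(error).__name__, error)
```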
In summary, changing these lines to a more generic version solved the problem. I just wanted to share it in case someone needs it.
For example, instead of values_dup = values[indices_dup]
use values_dup = np.array(values[indices_dup])
and replace values with a copy of itself that contains the right categories, like this:
values = pd.Categorical(values, categories=list(values.categories) + list(values_dup))
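The two changes above can be tried outside anndata with a small standalone sketch. This is not the library's make_index_unique, only an illustration of the workaround on a made-up CategoricalIndex. The renaming loop is simplified to always append "-1", and dtype=object is a defensive addition of mine so the renamed strings are stored in full.

```python
import numpy as np
import pandas as pd

# Hypothetical index with one duplicate, standing in for adata.var.index.
index = pd.CategoricalIndex(["X", "Y", "X"])
values = index.values.copy()                  # a pd.Categorical
indices_dup = index.duplicated(keep="first")  # boolean mask of later duplicates

# Plain NumPy array, so new names are not restricted to existing categories.
values_dup = np.array(values[indices_dup], dtype=object)
for i, v in enumerate(values_dup):
    values_dup[i] = f"{v}-1"  # simplified stand-in for the real renaming loop

# Rebuild the categorical with the new names added to its categories.
values = pd.Categorical(
    values, categories=list(values.categories) + list(values_dup)
)
values[indices_dup] = values_dup

result = pd.Index(values)
print(list(result))
```

With the categories extended first, the final assignment is accepted and the resulting index is unique.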