log1p warns adata.X is logged when it may not be (when other layers are logged)

Open gheimberg opened this issue 5 years ago • 5 comments

When I call sc.pp.log1p(adata) and then sc.pp.log1p(adata, layer='other'), I get a warning that the data has already been log-transformed, even though the second call transforms a layer rather than adata.X.

It would be nice to track log-transformation per layer, instead of flagging the whole object as soon as anything has been logged.

import scanpy as sc

adata = sc.datasets.pbmc3k_processed()
adata.layers['other'] = adata.X
sc.pp.log1p(adata, layer='other')
sc.pp.log1p(adata)
##>>> WARNING: adata.X seems to be already log-transformed.

Versions:

scanpy==1.5.2.dev5+ge5d246aa anndata==0.7.3 umap==0.3.10 numpy==1.18.5 scipy==1.5.0 pandas==1.0.5 scikit-learn==0.23.1 statsmodels==0.11.1 python-igraph==0.7.1 louvain==0.6.1 leidenalg==0.7.0

gheimberg, Jul 27 '20 22:07

Happens here:

https://github.com/theislab/scanpy/blob/3558a42e747856cbf55c4d118566a155c6717178/scanpy/preprocessing/_simple.py#L286-L287

Where does .uns['log1p'] get set other than there?

flying-sheep, Jul 28 '20 08:07

Hi @gheimberg,

In your example you are not using a copy when assigning adata.X to adata.layers['other']; you are only passing a reference. So when you log-transform the data in the layer, the data in adata.X is log-transformed as well. That being said, this is still a bug, as the warning is given even with adata.X.copy().
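
The reference-versus-copy point can be illustrated with plain NumPy arrays. This is only a sketch: the arrays below stand in for adata.X and a layer, and whether AnnData keeps the same underlying object on layer assignment is taken from the comment above rather than verified here.

```python
import numpy as np

X = np.array([[0.0, 1.0], [3.0, 7.0]])  # stand-in for adata.X

# Plain assignment: both names refer to the same underlying buffer.
alias = X
np.log1p(alias, out=alias)  # in-place transform through the alias
x_changed_via_alias = np.shares_memory(X, alias) and X[1, 1] == alias[1, 1]

# Explicit copy: the in-place transform leaves the original untouched.
X2 = np.array([[0.0, 1.0], [3.0, 7.0]])
independent = X2.copy()
np.log1p(independent, out=independent)
x2_unchanged = X2[1, 1] == 7.0
```

With the alias, X itself ends up log-transformed; with the copy, X2 keeps its original values.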

LuckyMD, Jul 28 '20 09:07

Guys, we should just keep the layer info here in log1p, changing:

data.uns['log1p'] = {'base': base}

to something like:

data.uns['log1p'][layer] = {'base': base}
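
A minimal sketch of how such per-layer bookkeeping could behave, using a plain dict as a stand-in for adata.uns. The helper name record_log1p and the choice of "X" as the key for the main matrix are hypothetical, not part of scanpy.

```python
from typing import Optional


def record_log1p(uns: dict, layer: Optional[str], base: Optional[float]) -> bool:
    """Record that `layer` (None means the main matrix) was log-transformed.

    Returns True if that same layer had already been recorded, which is
    the only case where a warning would be appropriate.
    """
    key = "X" if layer is None else layer  # hypothetical naming choice
    per_layer = uns.setdefault("log1p", {})
    already_logged = key in per_layer
    per_layer[key] = {"base": base}
    return already_logged


uns = {}  # stand-in for adata.uns
first_other = record_log1p(uns, layer="other", base=None)  # layer logged first
first_x = record_log1p(uns, layer=None, base=None)         # X logged afterwards
second_x = record_log1p(uns, layer=None, base=None)        # X logged again
```

Under this scheme, logging a layer and then adata.X would not warn; only the repeated transform of the same target would.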

gokceneraslan, Jul 28 '20 12:07

I've come across a strange behavior related to this issue. Depending on whether or not I save the object to disk, I get the same warning as OP.

This works as it should:

import scanpy as sc

adata=sc.read_h5ad(data_dir+'scanpy_QC_sexchrom.h5ad')
adata.raw=adata.copy() #data to save
sc.pp.log1p(adata) # logarithmize

### Test 1, no saving, works as it should
adata=adata.raw.to_adata()
sc.pp.log1p(adata)
##>>> no warning

Saving midway makes the warning unavoidable, even if I restart the kernel before reading the data:

import scanpy as sc

## same as above
adata=sc.read_h5ad(data_dir+'scanpy_QC_sexchrom.h5ad')
adata.raw=adata.copy() #data to save
sc.pp.log1p(adata) # logarithmize

### Test 2, saving and re-assigning from raw
### saving object, reading, testing again
### Doesn't work
adata.write_h5ad(tmp+'scanpy_test.h5ad')
adata=sc.read_h5ad(tmp+'scanpy_test.h5ad')
adata=adata.raw.to_adata()
sc.pp.log1p(adata)
###>>>WARNING: adata.X seems to be already log-transformed.

I'm on scanpy 1.9.1, if it matters.

Benfeitas, Aug 09 '22 14:08

I should also mention that, after reading the data back in:

  • running adata.uns['log1p'] returns {};
  • setting adata.uns['log1p']["base"] = None after reading doesn't help;
  • running del adata.uns['log1p'] solves the problem. On visual inspection, the expression values in adata.X do not appear to be log-transformed.
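
These three observations are consistent with the warning depending only on whether the 'log1p' key exists in adata.uns, not on what it contains. The sketch below simulates that with a plain dict; the would_warn function is a hypothetical stand-in inferred from the behavior reported here, not a copy of the scanpy check.

```python
def would_warn(uns: dict) -> bool:
    # Assumed check: warn whenever the key is present, whatever it holds.
    return "log1p" in uns


uns = {"log1p": {}}  # what was observed after reading the file back
warn_before = would_warn(uns)

uns["log1p"]["base"] = None  # setting a value keeps the key in place
warn_after_set = would_warn(uns)

uns.pop("log1p", None)  # like `del uns['log1p']`, but safe if the key is absent
warn_after_del = would_warn(uns)
```

On a real object the corresponding workaround would be adata.uns.pop('log1p', None) before calling sc.pp.log1p(adata), and only after confirming that the matrix in question really has not been log-transformed.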

Benfeitas, Aug 10 '22 11:08