sc.pp.scale changes adata.raw.X

Open jpagolia opened this issue 5 months ago • 4 comments

Please make sure these conditions are met

  • [x] I have checked that this issue has not already been reported.
  • [x] I have confirmed this bug exists on the latest version of scanpy.
  • [ ] (optional) I have confirmed this bug exists on the main branch of scanpy.

What happened?

Following this common workflow:

adata.layers["counts"] = adata.X.copy()
sc.pp.normalize_total(adata)
sc.pp.log1p(adata)
adata.layers['lognorm'] = adata.X.copy()
adata.raw = adata # full dimension lognormalized data
sc.pp.scale(adata, max_value=10)
adata

If you check adata.X, adata.layers['counts'], adata.layers['lognorm'], and adata.raw.X, you will find that adata.X and adata.raw.X are the same. The desired behavior would probably be for adata.raw.X to match adata.layers['lognorm']. It appears that sc.pp.scale is changing adata.raw. Why is that?

Minimal code sample

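import scanpy as sc

# adata: an AnnData object with raw counts in .X, loaded beforehand
adata.layers["counts"] = adata.X.copy()
sc.pp.normalize_total(adata)
sc.pp.log1p(adata)
adata.layers['lognorm'] = adata.X.copy()  # snapshot used for the comparison below
adata.raw = adata
sc.pp.scale(adata, max_value=10)
adata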
adata.layers['lognorm']
array([[0.       , 0.       , 0.       , ..., 0.       , 0.       ,
        0.       ],
       [0.       , 0.       , 1.4028237, ..., 0.       , 0.       ,
        0.       ],
       [0.       , 0.       , 0.       , ..., 0.       , 0.       ,
        0.       ],
       ...,
       [0.       , 0.       , 0.       , ..., 0.       , 0.       ,
        0.       ],
       [0.       , 0.       , 0.       , ..., 0.       , 0.       ,
        0.       ],
       [0.       , 0.       , 0.       , ..., 0.       , 0.       ,
        0.       ]], shape=(70499, 309), dtype=float32)
adata.X
array([[-0.2976397 , -0.35878736, -0.2131979 , ..., -0.14714538,
        -0.32566202, -0.3301082 ],
       [-0.2976397 , -0.35878736,  8.037066  , ..., -0.14714538,
        -0.32566202, -0.3301082 ],
       [-0.2976397 , -0.35878736, -0.2131979 , ..., -0.14714538,
        -0.32566202, -0.3301082 ],
       ...,
       [-0.2976397 , -0.35878736, -0.2131979 , ..., -0.14714538,
        -0.32566202, -0.3301082 ],
       [-0.2976397 , -0.35878736, -0.2131979 , ..., -0.14714538,
        -0.32566202, -0.3301082 ],
       [-0.2976397 , -0.35878736, -0.2131979 , ..., -0.14714538,
        -0.32566202, -0.3301082 ]], shape=(70499, 309), dtype=float32)
adata.raw.X
array([[-0.2976397 , -0.35878736, -0.2131979 , ..., -0.14714538,
        -0.32566202, -0.3301082 ],
       [-0.2976397 , -0.35878736,  8.037066  , ..., -0.14714538,
        -0.32566202, -0.3301082 ],
       [-0.2976397 , -0.35878736, -0.2131979 , ..., -0.14714538,
        -0.32566202, -0.3301082 ],
       ...,
       [-0.2976397 , -0.35878736, -0.2131979 , ..., -0.14714538,
        -0.32566202, -0.3301082 ],
       [-0.2976397 , -0.35878736, -0.2131979 , ..., -0.14714538,
        -0.32566202, -0.3301082 ],
       [-0.2976397 , -0.35878736, -0.2131979 , ..., -0.14714538,
        -0.32566202, -0.3301082 ]], shape=(70499, 309), dtype=float32)

Versions

scanpy: 1.11.4

jpagolia avatar Jul 31 '25 02:07 jpagolia

Hi, you checked “I have confirmed this bug exists on the latest version of scanpy.”, but 1.10.2 is far from the latest version of scanpy. Can you please check with 1.11.4?

flying-sheep avatar Jul 31 '25 07:07 flying-sheep

Sorry about that, @flying-sheep. I installed the latest version of scanpy and double-checked, and the issue still exists (pasted the output above). My workaround for now is just to use layers for everything, but I wanted to post this because the behavior is a bit unexpected and could silently affect analysis results.

jpagolia avatar Aug 02 '25 04:08 jpagolia

hello! i am looking to contribute to scanpy and thought this might be an active and useful issue to resolve. i've used scanpy quite a bit in the past. @flying-sheep please let me know if this issue is a good tackle as a first contribution to scanpy, and if not this one, then which other issue would be better! looking to start on this today or tomorrow

meermustafa avatar Aug 27 '25 17:08 meermustafa

I think this is actually not a scanpy but an anndata bug. When you create raw, you're only creating a link/pointer to the same array/matrix. So when you update .X, raw also updates because it points to the same array.
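
The kind of aliasing described here can be illustrated with plain NumPy, independent of anndata. This is only a sketch of the general mechanism with made-up variable names; it does not reproduce anndata's internals.

```python
import numpy as np

x = np.array([[0.0, 1.0], [2.0, 3.0]], dtype=np.float32)

alias = x            # second name for the same buffer, no data is copied
snapshot = x.copy()  # independent buffer with the same values

# In-place update, similar in spirit to scaling a matrix without copying it.
x -= x.mean(axis=0)

print(np.shares_memory(alias, x))     # True
print(np.shares_memory(snapshot, x))  # False
print(alias)     # follows the in-place update
print(snapshot)  # keeps the original values
```

Anything that holds a reference rather than a copy sees the in-place change, which matches the symptom reported above.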

To avoid this behavior: adata.raw = adata.copy()

Intron7 avatar Aug 29 '25 10:08 Intron7