STREAM icon indicating copy to clipboard operation
STREAM copied to clipboard

Seurat object

Open wajm opened this issue 4 years ago • 26 comments

Hi, I have precomputed seurat objects. Can I use these objects? or any example?

wajm avatar Dec 09 '20 07:12 wajm

Hi thanks for trying STREAM. To use Seurat object,

1)save seurat object as a loom file in R.

your_object.loom <- as.loom(your_object, filename = "./your_object.loom", verbose = FALSE)
pbmc.loom$close_all()

2)set necessary variables in STREAM :

import anndata as ad
import stream as st

adata = ad.read_loom('"./your_object.loom")

st.set_workdir(adata,'./stream_result')
st.add_cell_labels(adata)
st.add_cell_colors(adata)
adata.obsm['top_pcs'] = adata.obsm['pca_cell_embeddings']
adata.obsm['X_vis_umap'] = adata.obsm['umap_cell_embeddings']

then you can either learn graphical structure directly on UMAP from seurat:

st.plot_visualization_2D(adata,method='umap',n_neighbors=50,color=['label'])
st.seed_elastic_principal_graph(adata,n_clusters=10,use_vis=True)
st.elastic_principal_graph(adata,epg_alpha=0.01,epg_mu=0.05,epg_lambda=0.05)
st.plot_dimension_reduction(adata,color=['label'],n_components=2,show_graph=True,show_text=False)
st.plot_branches(adata,show_text=False)

or start from the step of dimension reduction as in STREAM tutorial

st.dimension_reduction(adata,method='se',feature='top_pcs',n_components=2,n_neighbors=15,n_jobs=4)
st.plot_dimension_reduction(adata,color=['label'],n_components=2,show_graph=False,show_text=False)

huidongchen avatar Dec 09 '20 15:12 huidongchen

this worked for me after changing adata.obsm['X_vis_umap'] to adata.obsm['X_dr']. Thanks! also the docker version lacks loompy installation.

crazyhottommy avatar Dec 16 '20 13:12 crazyhottommy

Thanks very much for the feedbacks, Ming! I will certainly keep them in mind.

Sorry about the issue related to adata.obsm['X_vis_umap'] . I made a mistake in the above code snippet.

Changing adata.obsm['X_vis_umap'] to adata.obsm['X_dr'] can be a potential solution. But if you want to infer trajectories directly on UMAP from seurat, you might need to do the following:

import anndata as ad
import stream as st

adata = ad.read_loom('"./your_object.loom")

###set necessary variables in STREAM :
st.set_workdir(adata,'./stream_result')
st.add_cell_labels(adata)
st.add_cell_colors(adata)
adata.obsm['top_pcs'] = adata.obsm['pca_cell_embeddings']
adata.obsm['X_dr'] = adata.obsm['umap_cell_embeddings']
adata.obsm['X_vis_umap'] = adata.obsm['umap_cell_embeddings'][:,:2]

###learn trajectories directly on UMAP from seurat
st.plot_visualization_2D(adata,method='umap',n_neighbors=50,color=['label'],use_precomputed=True)
st.seed_elastic_principal_graph(adata,n_clusters=10,use_vis=True)
st.elastic_principal_graph(adata,epg_alpha=0.01,epg_mu=0.05,epg_lambda=0.05)
st.plot_dimension_reduction(adata,color=['label'],n_components=2,show_graph=True,show_text=False)
st.plot_branches(adata,show_text=False)

huidongchen avatar Dec 16 '20 14:12 huidongchen

Thanks @huidongchen.

adata.obsm['X_dr'] = adata.obsm['umap_cell_embeddings']
adata.obsm['X_vis_umap'] = adata.obsm['umap_cell_embeddings'][:,:2]

X_dr is used for trajectory inference, X_vis_umap is for visualization right? I guessed from the names of the variables.

crazyhottommy avatar Dec 16 '20 16:12 crazyhottommy

X_dr indicates for the space after dimension reduction (the number of dimensions can be greater than 2) and X_vis indicates the 2D plane for UMAP/tSNE visualization.

But trajectory inference can be done in either space.

When you enabled use_vis=True in st.seed_elastic_principal_graph(), it will learn the trajectories using X_vis (in your case, it's the UMAP from Seurat. adata.obsm['X_dr'] will be ignored but to avoid the error from the step st.plot_visualization_2D(), you will need to specify it )

When use_vis=False(by default) in st.seed_elastic_principal_graph(), it will learn the trajectories using X_dr (adata.obsm['X_dr'] is obtained from the step st.dimension_reduction())

I hope the above is useful to you.

huidongchen avatar Dec 16 '20 16:12 huidongchen

thank you for kind information. btw, my seurat integration object make some problem. after making loom format,

adata.obsm['top_pcs'] = adata.obsm['pca_cell_embeddings'] adata.obsm['X_dr'] = adata.obsm['umap_cell_embeddings'] adata.obsm['X_vis_umap'] = adata.obsm['umap_cell_embeddings'][:,:2]

these are not working.

they cannot find both 'pca_cell_embeddings' and 'umap_cell_embeddings'. What are they?

wajm avatar Dec 17 '20 00:12 wajm

I just took a look. if you save the object coembed in R as follows:

coembed.loom <- as.loom(coembed, filename = "./coembed.loom", verbose = FALSE)
coembed.loom$close_all()

Both pca_cell_embeddings and umap_cell_embeddings should still be there and the same procedure above should still work.

huidongchen avatar Dec 17 '20 01:12 huidongchen

hi AnnData object with n_obs × n_vars = 51813 × 30644 obs: 'label', 'label_color' var: 'Selected', 'vst_mean', 'vst_variable', 'vst_variance', 'vst_variance_expected', 'vst_variance_standardized' uns: 'workdir', 'label_color' layers: 'norm_data' but, where are they?

wajm avatar Dec 17 '20 01:12 wajm

I assume the loom file you saved does not contain all the information needed.

Did you follow through the Seurat integration tutorial? If you did, when you save coembed to loom file, it should contain the fields pca_cell_embeddings and umap_cell_embeddings.

I tested it myself. It works for me.

huidongchen avatar Dec 17 '20 01:12 huidongchen

Sorry, I did reciprocal PCA integration. https://satijalab.org/seurat/v3.2/integration.html

wajm avatar Dec 17 '20 01:12 wajm

okay, I solve the problem. my integration object was changed. DefaultAssay(patient) was RNA. So I return DefaultAssay to "integrated". So sorry and Thank you very much.

wajm avatar Dec 17 '20 02:12 wajm

okay, I solve the problem. my integration object was changed. DefaultAssay(patient) was RNA. So I return DefaultAssay to "integrated". So sorry and Thank you very much.

No worries at all! Thanks for letting us know. I'm glad you figured it out.

huidongchen avatar Dec 17 '20 02:12 huidongchen

Hi Huidong, I went on to do marker detection at each leaf:

st.detect_leaf_markers(adata,cutoff_zscore=1.0,cutoff_pvalue=0.01,
                       root='S3',n_jobs=4)

I got this error

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/internals/managers.py in create_block_manager_from_blocks(blocks, axes)
   1670                 blocks = [
-> 1671                     make_block(values=blocks[0], placement=slice(0, len(axes[0])))
   1672                 ]

~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/internals/blocks.py in make_block(values, placement, klass, ndim, dtype)
   2743 
-> 2744     return klass(values, ndim=ndim, placement=placement)
   2745 

~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/internals/blocks.py in __init__(self, values, placement, ndim)
   2399 
-> 2400         super().__init__(values, ndim=ndim, placement=placement)
   2401 

~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/internals/blocks.py in __init__(self, values, placement, ndim)
    130             raise ValueError(
--> 131                 f"Wrong number of items passed {len(self.values)}, "
    132                 f"placement implies {len(self.mgr_locs)}"

ValueError: Wrong number of items passed 1, placement implies 33538

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-177-48ce301bf716> in <module>
      1 st.detect_leaf_markers(adata,cutoff_zscore=1.0,cutoff_pvalue=0.01,
----> 2                        root='S3',n_jobs=4)

~/anaconda3/envs/stream/lib/python3.6/site-packages/stream/core.py in detect_leaf_markers(adata, marker_list, cutoff_zscore, cutoff_pvalue, percentile_expr, n_jobs, min_num_cells, use_precomputed, root, preference)
   3796         df_sc = pd.DataFrame(index= adata.obs_names.tolist(),
   3797                              data = adata[:,input_markers].X,
-> 3798                              columns=input_markers)
   3799         #exclude markers that are expressed in fewer than min_num_cells cells
   3800         print("Filtering out markers that are expressed in less than " + str(min_num_cells) + " cells ...")

~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    521                     mgr = arrays_to_mgr(arrays, columns, index, columns, dtype=dtype)
    522                 else:
--> 523                     mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
    524             else:
    525                 mgr = init_dict({}, index, columns, dtype=dtype)

~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/internals/construction.py in init_ndarray(values, index, columns, dtype, copy)
    232         block_values = [values]
    233 
--> 234     return create_block_manager_from_blocks(block_values, [columns, index])
    235 
    236 

~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/internals/managers.py in create_block_manager_from_blocks(blocks, axes)
   1679         blocks = [getattr(b, "values", b) for b in blocks]
   1680         tot_items = sum(b.shape[0] for b in blocks)
-> 1681         raise construction_error(tot_items, blocks[0].shape[1:], axes, e)
   1682 
   1683 

ValueError: Shape of passed values is (2000, 1), indices imply (2000, 33538)

It seems it can not find the data matrix. How can I add that into adata?

crazyhottommy avatar Dec 17 '20 03:12 crazyhottommy

one more thing, in case of Importing precomputed seurat object, is it possible to make 3d plot? at the moment I can't.

wajm avatar Dec 17 '20 07:12 wajm

one more thing, in case of Importing precomputed seurat object, is it possible to make 3d plot? at the moment I can't.

To use the 3d umap result from Seurat, you simply need to set use_vis=False:

import anndata as ad
import stream as st

adata = ad.read_loom('"./your_object.loom")

###set necessary variables in STREAM :
st.set_workdir(adata,'./stream_result')
st.add_cell_labels(adata)
st.add_cell_colors(adata)
adata.obsm['top_pcs'] = adata.obsm['pca_cell_embeddings']
adata.obsm['X_dr'] = adata.obsm['umap_cell_embeddings']
adata.obsm['X_vis_umap'] = adata.obsm['umap_cell_embeddings'][:,:2]

###learn trajectories directly on 3D UMAP from seurat
st.seed_elastic_principal_graph(adata,n_clusters=10,use_vis=False)
st.elastic_principal_graph(adata,epg_alpha=0.01,epg_mu=0.05,epg_lambda=0.05)
st.plot_dimension_reduction(adata,color=['label'],n_components=3,show_graph=True,show_text=False,plotly=False)
st.plot_branches(adata,show_text=False)

huidongchen avatar Dec 17 '20 14:12 huidongchen

Hi Huidong, I went on to do marker detection at each leaf:

st.detect_leaf_markers(adata,cutoff_zscore=1.0,cutoff_pvalue=0.01,
                       root='S3',n_jobs=4)

I got this error

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/internals/managers.py in create_block_manager_from_blocks(blocks, axes)
   1670                 blocks = [
-> 1671                     make_block(values=blocks[0], placement=slice(0, len(axes[0])))
   1672                 ]

~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/internals/blocks.py in make_block(values, placement, klass, ndim, dtype)
   2743 
-> 2744     return klass(values, ndim=ndim, placement=placement)
   2745 

~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/internals/blocks.py in __init__(self, values, placement, ndim)
   2399 
-> 2400         super().__init__(values, ndim=ndim, placement=placement)
   2401 

~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/internals/blocks.py in __init__(self, values, placement, ndim)
    130             raise ValueError(
--> 131                 f"Wrong number of items passed {len(self.values)}, "
    132                 f"placement implies {len(self.mgr_locs)}"

ValueError: Wrong number of items passed 1, placement implies 33538

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-177-48ce301bf716> in <module>
      1 st.detect_leaf_markers(adata,cutoff_zscore=1.0,cutoff_pvalue=0.01,
----> 2                        root='S3',n_jobs=4)

~/anaconda3/envs/stream/lib/python3.6/site-packages/stream/core.py in detect_leaf_markers(adata, marker_list, cutoff_zscore, cutoff_pvalue, percentile_expr, n_jobs, min_num_cells, use_precomputed, root, preference)
   3796         df_sc = pd.DataFrame(index= adata.obs_names.tolist(),
   3797                              data = adata[:,input_markers].X,
-> 3798                              columns=input_markers)
   3799         #exclude markers that are expressed in fewer than min_num_cells cells
   3800         print("Filtering out markers that are expressed in less than " + str(min_num_cells) + " cells ...")

~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    521                     mgr = arrays_to_mgr(arrays, columns, index, columns, dtype=dtype)
    522                 else:
--> 523                     mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
    524             else:
    525                 mgr = init_dict({}, index, columns, dtype=dtype)

~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/internals/construction.py in init_ndarray(values, index, columns, dtype, copy)
    232         block_values = [values]
    233 
--> 234     return create_block_manager_from_blocks(block_values, [columns, index])
    235 
    236 

~/anaconda3/envs/stream/lib/python3.6/site-packages/pandas/core/internals/managers.py in create_block_manager_from_blocks(blocks, axes)
   1679         blocks = [getattr(b, "values", b) for b in blocks]
   1680         tot_items = sum(b.shape[0] for b in blocks)
-> 1681         raise construction_error(tot_items, blocks[0].shape[1:], axes, e)
   1682 
   1683 

ValueError: Shape of passed values is (2000, 1), indices imply (2000, 33538)

It seems it can not find the data matrix. How can I add that into adata?

Unfortunately, in the current version of STREAM, sparse matrix is not supported yet for the marker detection. You will have to densify it first,

adata.X = adata.X.todense()
st.detect_leaf_markers(adata,cutoff_zscore=1.0,cutoff_pvalue=0.01,
                       root='S3',n_jobs=4)

This step can be slow. To speed it up, alternatively, you can only scan the variable genes or any specified list of genes by specifying marker_list:

adata.X = adata.X.todense()
st.detect_leaf_markers(adata,marker_list=adata.var[adata.var['Selected']>0].index, cutoff_zscore=1.0,cutoff_pvalue=0.01,
                       root='S3',n_jobs=4)

This issue will be solved in STREAM v2.

huidongchen avatar Dec 17 '20 14:12 huidongchen

@huidongchen Thanks! this worked for me.

crazyhottommy avatar Dec 18 '20 01:12 crazyhottommy

one more thing, in case of Importing precomputed seurat object, is it possible to make 3d plot? at the moment I can't.

To use the 3d umap result from Seurat, you simply need to set use_vis=False:

import anndata as ad
import stream as st

adata = ad.read_loom('"./your_object.loom")

###set necessary variables in STREAM :
st.set_workdir(adata,'./stream_result')
st.add_cell_labels(adata)
st.add_cell_colors(adata)
adata.obsm['top_pcs'] = adata.obsm['pca_cell_embeddings']
adata.obsm['X_dr'] = adata.obsm['umap_cell_embeddings']
adata.obsm['X_vis_umap'] = adata.obsm['umap_cell_embeddings'][:,:2]

###learn trajectories directly on 3D UMAP from seurat
st.seed_elastic_principal_graph(adata,n_clusters=10,use_vis=False)
st.elastic_principal_graph(adata,epg_alpha=0.01,epg_mu=0.05,epg_lambda=0.05)
st.plot_dimension_reduction(adata,color=['label'],n_components=3,show_graph=True,show_text=False,plotly=False)
st.plot_branches(adata,show_text=False)

ummm, This is not working. something wrong?

wajm avatar Dec 18 '20 01:12 wajm

one more thing, in case of Importing precomputed seurat object, is it possible to make 3d plot? at the moment I can't.

To use the 3d umap result from Seurat, you simply need to set use_vis=False:

import anndata as ad
import stream as st

adata = ad.read_loom('"./your_object.loom")

###set necessary variables in STREAM :
st.set_workdir(adata,'./stream_result')
st.add_cell_labels(adata)
st.add_cell_colors(adata)
adata.obsm['top_pcs'] = adata.obsm['pca_cell_embeddings']
adata.obsm['X_dr'] = adata.obsm['umap_cell_embeddings']
adata.obsm['X_vis_umap'] = adata.obsm['umap_cell_embeddings'][:,:2]

###learn trajectories directly on 3D UMAP from seurat
st.seed_elastic_principal_graph(adata,n_clusters=10,use_vis=False)
st.elastic_principal_graph(adata,epg_alpha=0.01,epg_mu=0.05,epg_lambda=0.05)
st.plot_dimension_reduction(adata,color=['label'],n_components=3,show_graph=True,show_text=False,plotly=False)
st.plot_branches(adata,show_text=False)

ummm, This is not working. something wrong?

What error did you see?

huidongchen avatar Dec 18 '20 03:12 huidongchen

same result with before 2D plot.

wajm avatar Dec 18 '20 03:12 wajm

same result with before 2D plot.

Sorry but I can't really help much with the current info. I don't know what you mean by 'same result'.

huidongchen avatar Dec 18 '20 03:12 huidongchen

same result with before 2D plot.

Sorry but I can't really help much with the current info. I don't know what you mean by 'same result'.

before use "use_vis=False" , I got a nice 2d plot following these. st.seed_elastic_principal_graph(adata,n_clusters=100,use_vis=True) st.elastic_principal_graph(adata,epg_alpha=0.01,epg_mu=0.2,epg_lambda=0.01) st.plot_dimension_reduction(adata,color=['label'],n_components=2,show_graph=True,show_text=True,fig_size=(10,8))

but after use "use_vis=False" , I got same plot, not 3D.

wajm avatar Dec 18 '20 03:12 wajm

same result with before 2D plot.

Sorry but I can't really help much with the current info. I don't know what you mean by 'same result'.

before use "use_vis=False" , I got a nice 2d plot following these. st.seed_elastic_principal_graph(adata,n_clusters=100,use_vis=True) st.elastic_principal_graph(adata,epg_alpha=0.01,epg_mu=0.2,epg_lambda=0.01) st.plot_dimension_reduction(adata,color=['label'],n_components=2,show_graph=True,show_text=True,fig_size=(10,8))

but after use "use_vis=False" , I got same plot, not 3D.

You need to set n_component=3. If you copy the code I shared previously, it should work. I have tested it.

huidongchen avatar Dec 18 '20 03:12 huidongchen

same result with before 2D plot.

Sorry but I can't really help much with the current info. I don't know what you mean by 'same result'.

before use "use_vis=False" , I got a nice 2d plot following these. st.seed_elastic_principal_graph(adata,n_clusters=100,use_vis=True) st.elastic_principal_graph(adata,epg_alpha=0.01,epg_mu=0.2,epg_lambda=0.01) st.plot_dimension_reduction(adata,color=['label'],n_components=2,show_graph=True,show_text=True,fig_size=(10,8)) but after use "use_vis=False" , I got same plot, not 3D.

You need to set n_component=3. If you copy the code I shared previously, it should work. I have tested it.

yes, I already try that, but message like that. n_components is greater than the available dimension. It is corrected to 2

wajm avatar Dec 18 '20 03:12 wajm

same result with before 2D plot.

Sorry but I can't really help much with the current info. I don't know what you mean by 'same result'.

before use "use_vis=False" , I got a nice 2d plot following these. st.seed_elastic_principal_graph(adata,n_clusters=100,use_vis=True) st.elastic_principal_graph(adata,epg_alpha=0.01,epg_mu=0.2,epg_lambda=0.01) st.plot_dimension_reduction(adata,color=['label'],n_components=2,show_graph=True,show_text=True,fig_size=(10,8)) but after use "use_vis=False" , I got same plot, not 3D.

You need to set n_component=3. If you copy the code I shared previously, it should work. I have tested it.

yes, I already try that, but message like that. n_components is greater than the available dimension. It is corrected to 2

Were you using three dimensions for UMAP in Seurat? You can check the umap coordinates you saved from Seurat (stored in adata.obsm['umap_cell_embeddings']). It seems you did not have 3D UMAP in Seurat.

huidongchen avatar Dec 18 '20 03:12 huidongchen

same result with before 2D plot.

Sorry but I can't really help much with the current info. I don't know what you mean by 'same result'.

before use "use_vis=False" , I got a nice 2d plot following these. st.seed_elastic_principal_graph(adata,n_clusters=100,use_vis=True) st.elastic_principal_graph(adata,epg_alpha=0.01,epg_mu=0.2,epg_lambda=0.01) st.plot_dimension_reduction(adata,color=['label'],n_components=2,show_graph=True,show_text=True,fig_size=(10,8)) but after use "use_vis=False" , I got same plot, not 3D.

You need to set n_component=3. If you copy the code I shared previously, it should work. I have tested it.

yes, I already try that, but message like that. n_components is greater than the available dimension. It is corrected to 2

Were you using three dimensions for UMAP in Seurat? You can check the umap coordinates you saved from Seurat (stored in adata.obsm['umap_cell_embeddings']). It seems you did not have 3D UMAP in Seurat.

okay I'll check again. Thanks.

wajm avatar Dec 18 '20 03:12 wajm