MultiVelo
Integrating several samples, vanishing genes :)
Hi there. First, thanks for this very nice tool :) I have a small problem (it's probably trivial, but I'm very new to bioinformatics analysis): I have 4 different 10X Multiome ATAC+RNA samples: PG2, PG6, PG24 and PG13. I integrated them using Seurat/Signac first, then I tried to run MultiVelo. I treated the samples separately for the preprocessing steps. For example, for PG2:
adata_atacPG2 = sc.read_10x_mtx('/media/david/F/yard/apps/cellranger-arc-2.0.2/PG2/outs/filtered_feature_bc_matrix/', var_names='gene_symbols', cache=True, gex_only=False)
adata_atacPG2 = adata_atacPG2[:, adata_atacPG2.var['feature_types'] == "Peaks"]
adata_atacPG2 = mv.aggregate_peaks_10x(adata_atacPG2, '/media/david/F/yard/apps/cellranger-arc-2.0.2/PG2/outs/atac_peak_annotation.tsv', '/media/david/F/yard/apps/cellranger-arc-2.0.2/PG2/outs/analysis/feature_linkage/feature_linkage.bedpe', verbose=True)
mv.tfidf_norm(adata_atacPG2)
I renamed the cells with unique barcodes:
barcodes = adata_atacPG2.obs.index
barcodesnew = ['PG2_' + bc[0:len(bc)-2] for bc in barcodes]
adata_atacPG2.obs.index = barcodesnew
Having done that for the four samples, I generated a single object by concatenation:
adata_atacPG = adata_atacPG2.concatenate([adata_atacPG6, adata_atacPG24, adata_atacPG13])
Then I processed the RNA, and so on. Everything seems to work very nicely, BUT I lose some of the genes (including some important ones). After investigating, I realized that these genes are lost during the concatenation step, probably because they are present in some of the adata_atacPGXX objects but not in all of them.
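One way to confirm this kind of loss is to compare the gene names across the per-sample objects before concatenating. AnnData's `concatenate` defaults to `join='inner'`, which keeps only the variables present in every object, so this is consistent with the behaviour described above. Below is a minimal sketch that uses made-up gene names in place of the real `var_names`; the helper `genes_lost_on_inner_join` is hypothetical and is not part of MultiVelo or anndata.

```python
from typing import Dict, Iterable, Set


def genes_lost_on_inner_join(
    var_names_by_sample: Dict[str, Iterable[str]],
) -> Dict[str, Set[str]]:
    """For each sample, return the genes it contains that are NOT shared by all samples.

    These are the genes an inner join (intersection of variables) would drop.
    """
    gene_sets = {name: set(genes) for name, genes in var_names_by_sample.items()}
    if not gene_sets:
        return {}
    # Genes present in every sample: the only ones an inner join keeps.
    shared = set.intersection(*gene_sets.values())
    return {name: genes - shared for name, genes in gene_sets.items()}


# Toy gene lists standing in for adata_atacPG2.var_names etc. (made-up values).
toy = {
    "PG2": ["GeneA", "GeneB", "GeneC"],
    "PG6": ["GeneA", "GeneB"],
    "PG24": ["GeneA", "GeneB", "GeneD"],
    "PG13": ["GeneA", "GeneB", "GeneC"],
}

lost = genes_lost_on_inner_join(toy)
for sample, genes in lost.items():
    print(sample, sorted(genes))
```

With the real objects, the same helper could be called as `genes_lost_on_inner_join({"PG2": adata_atacPG2.var_names, "PG6": adata_atacPG6.var_names, ...})`, since `var_names` is iterable.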
How can I tackle this problem?
Best, David