
SaveSeuratRds(): Is there a way to alter directory address of 'on-disk' matrices in Seurat object

Open Dazcam opened this issue 10 months ago • 3 comments

Hello,

I have been running an analysis in Seurat 5 partly on a local machine and partly on a remote server, and I've been trying to work out how to change the stored path that records where the BPCells-generated count matrices are located. After moving Seurat objects generated remotely to my local machine, I encounter errors when running certain functions because the root directory for the count data is still set to the remote directory rather than the local one. See here for more details.

The SaveSeuratRds() function looks as if it may be able to change this directory address, but there does not appear to be functionality to just change it without moving the on-disk layers from their original location. I've tried running SaveSeuratRds() with move = F, but when you reload the object the layers are missing:

> seurat_object
An object of class Seurat 
53590 features across 144380 samples within 2 assays 
Active assay: RNA (26795 features, 2000 variable features)
 0 layers present: 
 1 other assay present: sketch
 6 dimensional reductions calculated: pca, umap, harmony, umap.harmony, harmony.full, umap.full

When running SaveSeuratRds() with move = T, obviously the original path is not found (this also occurs when setting relative = T or relative = F).

SaveSeuratRds(seurat_object, paste0(R_dir, '02seurat_', region, '_test.rds'))
Error:
! Can't find path:
...

I've also tried digging around the Seurat object, but I can't put my finger on where this address is stored. The best I could do was find a list of 28 identical items containing the following, but I'm not convinced this is worth changing as it looks like a log:

Matrix_list
seurat_object@assays$RNA@layers$counts@matrix@matrix@matrix_list[[1]]
26795 x 5706 IterableMatrix object with class RenameDims

Row names: TTR, LINC01821 ... PARVG
Col names: 10X356_4:GGTGAAGCAGGTGACA, 10X356_4:TGGATGTCACGACAAG ... 10X356_4:AGTGATCAGGCCCAAA

Data type: double
Storage order: column major

Queued Operations:
1. Load compressed matrix from directory /scratch/results/01R_objects/CBL_BP
2. Select rows: 1, 5 ... 59357 and cols: 1, 2 ... 28010
3. Reset dimnames
4. Reset dimnames
5. Reset dimnames
6. Reset dimnames
7. Reset dimnames
8. Reset dimnames
9. Reset dimnames
10. Reset dimnames
11. Reset dimnames

So a couple of questions then:

  1. Is there a way to alter the address of the root directory within Seurat, either using SaveSeuratRds() or otherwise?
  2. If not, could this functionality be added to SaveSeuratRds() to handle local / remote analyses?

Many thanks.

Dazcam avatar Apr 09 '24 10:04 Dazcam

Hi Dazcam, here's an example of where the directory address is stored in the Seurat V5 object. Here you can see an example path of 1 of 3 joined datasets in this BPCells Seurat object. It is stored in BP_object@assays[["RNA"]]@layers[["counts"]]@matrix@matrix_list[[1]]@matrix@dir, where 1 represents the first of the joined layers.

You can change the stored file path for each of the layers like this:

> BP_object@assays[["RNA"]]@layers[["counts"]]@matrix@matrix_list[[1]]@matrix@dir
[1] "/path/to/your/old/dir"
> BP_object@assays[["RNA"]]@layers[["counts"]]@matrix@matrix_list[[1]]@matrix@dir <- "/path/to/your/new/dir"
> BP_object@assays[["RNA"]]@layers[["counts"]]@matrix@matrix_list[[1]]@matrix@dir
[1] "/path/to/your/new/dir"

I'm also curious if you know how to save and then load the saved joined object as a Seurat object again?


jvelghe avatar Jul 04 '24 09:07 jvelghe

Hi @jvelghe,

Many thanks for this. I'll give it a go.

If I've understood your question correctly, I use the following for saving and loading data:

  • Save: saveRDS(seurat_object, paste0(R_dir, '02seurat_', region, '.rds'))
  • Read: seurat_object <- readRDS(paste0(R_dir, '02seurat_', region, '.rds'))

Dazcam avatar Jul 04 '24 10:07 Dazcam

@jvelghe The directory name of the BPCells object must be stored in multiple places. After changing the location as you describe, certain procedures, such as converting the on-disk matrix to an in-memory dgCMatrix, still report the old directory.

> seurat_obj[["RNA"]]$counts
#> 27379 x 66782 IterableMatrix object with class RenameDims

#> Row names: ABCA13, PENK-AS1 ... SLC7A7
#> Col names: 10X318_7:GGGTTTAGTTACGATC, 10X318_8:CCCGGAAGTGACTGAG ... 10X145_3:AACAGGGCAGCCGTCA

#> Data type: double
#> Storage order: column major

#> Queued Operations:
#> 1. Concatenate cols of 12 matrix objects with classes: RenameDims, RenameDims ... RenameDims (threads=0)
#> 2. Select rows: 1, 2 ... 27379 and cols: 1, 5345 ... 49485
#> 3. Reset dimnames

> as(object = seurat_obj[["RNA"]]$counts, Class = "dgCMatrix")
#> Error: Missing directory: /scratch/c.cXXXXXX/results/01R_objects/CaB_BP

> seurat_obj@assays[["RNA"]]@layers[["counts"]]@matrix@matrix@matrix_list[[1]]@matrix@matrix@matrix@matrix@matrix@matrix@matrix@matrix@matrix@matrix@dir
#> [1] "/scratch/c.cXXXXXX/results/01R_objects/CaB_BP"

> seurat_obj@assays[["RNA"]]@layers[["counts"]]@matrix@matrix@matrix_list[[1]]
#> 27379 x 5344 IterableMatrix object with class RenameDims

#> Row names: ABCA13, PENK-AS1 ... SLC7A7
#> Col names: 10X318_7:GGGTTTAGTTACGATC, 10X318_7:TGTGTGAGTTCCGCTT ... 10X318_7:GGGCTCATCCACAGGC

#> Data type: double
#> Storage order: column major

#> Queued Operations:
#> 1. Load compressed matrix from directory /scratch/c.cXXXXXX/results/01R_objects/CaB_BP
#> 2. Select rows: 1, 3 ... 59357 and cols: 1, 3 ... 32673
#> 3. Reset dimnames
#> 4. Reset dimnames
#> 5. Reset dimnames
#> 6. Reset dimnames
#> 7. Reset dimnames
#> 8. Reset dimnames
#> 9. Reset dimnames
#> 10. Reset dimnames
#> 11. Reset dimnames

> seurat_obj@assays[["RNA"]]@layers[["counts"]]@matrix@matrix@matrix_list[[1]]@matrix@matrix@matrix@matrix@matrix@matrix@matrix@matrix@matrix@matrix@dir <- '/Users/XXXXXX/Desktop/results/01R_objects/CaB_BP'

> seurat_obj@assays[["RNA"]]@layers[["counts"]]@matrix@matrix@matrix_list[[1]]
#> 27379 x 5344 IterableMatrix object with class RenameDims

#> Row names: ABCA13, PENK-AS1 ... SLC7A7
#> Col names: 10X318_7:GGGTTTAGTTACGATC, 10X318_7:TGTGTGAGTTCCGCTT ... 10X318_7:GGGCTCATCCACAGGC

#> Data type: double
#> Storage order: column major

#> Queued Operations:
#> 1. Load compressed matrix from directory /Users/XXXXXX/Desktop/results/01R_objects/CaB_BP
#> 2. Select rows: 1, 3 ... 59357 and cols: 1, 3 ... 32673
#> 3. Reset dimnames
#> 4. Reset dimnames
#> 5. Reset dimnames
#> 6. Reset dimnames
#> 7. Reset dimnames
#> 8. Reset dimnames
#> 9. Reset dimnames
#> 10. Reset dimnames
#> 11. Reset dimnames

> as(object = seurat_obj[["RNA"]]$counts, Class = "dgCMatrix")
#> Error: Missing directory: /scratch/c.cXXXXXX/results/01R_objects/CaB_BP
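
One thing worth checking (a guess, not confirmed in this thread): the counts layer above concatenates 12 matrix objects, and only matrix_list[[1]] was edited, so the remaining entries may still hold the old path. The sketch below lists every character slot named dir together with the accessor path that reaches it, which should show which entries are stale. The helper name find_dirs, the Mock* classes and the /old/root and /new/root paths are hypothetical placeholders so that the example runs with base R alone.

```r
library(methods)

# Hypothetical helper: collect every character slot named "dir" in an S4
# object, named by the accessor path that leads to it.
find_dirs <- function(x, path = "obj") {
  found <- character(0)
  if (is.list(x) && !isS4(x)) {
    for (i in seq_along(x)) {
      found <- c(found, find_dirs(x[[i]], paste0(path, "[[", i, "]]")))
    }
  } else if (isS4(x)) {
    for (nm in slotNames(x)) {
      value <- slot(x, nm)
      child <- paste0(path, "@", nm)
      if (nm == "dir" && is.character(value)) {
        found <- c(found, stats::setNames(value, rep(child, length(value))))
      } else {
        found <- c(found, find_dirs(value, child))
      }
    }
  }
  found
}

# Mock classes for illustration only (NOT the real BPCells classes)
setClass("MockDir", slots = c(dir = "character"))
setClass("MockWrap", slots = c(matrix = "ANY"))
setClass("MockBind", slots = c(matrix_list = "list"))

old_dir <- "/old/root/CaB_BP"  # placeholder path
new_dir <- "/new/root/CaB_BP"  # placeholder path
make_entry <- function() new("MockWrap", matrix = new("MockDir", dir = old_dir))
obj <- new("MockWrap", matrix = new("MockBind",
  matrix_list = list(make_entry(), make_entry(), make_entry())))

# Edit only the first entry, mirroring the session above
obj@matrix@matrix_list[[1]]@matrix@dir <- new_dir

dirs <- find_dirs(obj)
stale <- dirs[dirs == old_dir]
print(names(stale))  # accessor paths that still point at the old directory
```

Running find_dirs(seurat_obj, "seurat_obj") on the real object should, under the same assumption about dir slots, show whether entries 2 to 12 (or the sketch assay) still reference the /scratch path.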

Dazcam avatar Jul 25 '24 14:07 Dazcam