azimuth
RunAzimuth MappingScore(anchors = anchors, ndim = dims) error
Hi,
First of all, thanks for this package; it looks promising. However, with my data I get an error when running RunAzimuth. I used an integrated Seurat object that was SCTransformed and had doublets removed. When running ''
RunAzimuth(seurat.obj, reference = "bonemarrowref")
Error in `.rowNamesDF<-`(x, value = value) :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': 'AAAGCAAGTTGGTTTG-1_26', 'AACGTTGAGAAAGTGG-1_6', 'AACTCAGGTTTAGCTG-1_21', 'AACTTTCAGACTAGAT-1_13', 'AAGGAGCGTGCAACGA-1_26', 'AAGGAGCTCTTGCCGT-1_26', 'AAGTCTGAGCCAGTAG-1_13', 'AATCCAGGTACCCAAT-1_15', 'AATCGGTCAATCACAC-1_27', 'ACAGCTACACAGCCCA-1_8', 'ACATACGAGCACGCCT-1_26', 'ACATACGGTAGCCTCG-1_17', 'ACATCAGCACAGGTTT-1_24', 'ACATCAGGTTTAGCTG-1_13', 'ACCCACTAGCAGCGTA-1_26', 'ACGAGCCTCGCAGGCT-1_18', 'ACGAGGAGTAGCTGCC-1_27', 'ACGATACAGATCTGAA-1_11', 'ACGCCGATCGTTTGCC-1_8', 'ACGGAGAGTGCACGAA-1_9', 'ACGGCCATCTCATTCA-1_16', 'ACGGGCTAGACTGGGT-1_13', 'ACGGGCTGTGCAGTAG-1_23', 'ACGTCAATCATTCACT-1_25', 'ACGTCAATCTCGTTTA-1_24', 'ACTGAGTTCTAACTTC-1_25', 'ACTGCTCCACGTTGGC-1_15', 'ACTGCTCTCCTTGACC-1_25', 'ACTTACTGTAAATGTG-1_20', 'ACTTACTGTACGACCC-1_26', 'AGATTGCTCGGATGGA-1_9', 'AGCCTAATCGTTTAGG-1_22', 'AGCGGTCAGATGCCAG-1_21', 'AGCGGTCGTTGTCTTT-1_15', 'AGCGTCGTCCTCTAGC-1_13', 'AGCTCCTCAGGCTGAA-1_18', 'AGGCCGTGTATATGGA-1_6', 'AGGCCGTGTGTTTGGT-1_3', 'AGGGAGTAGTTG [... truncated]
''
The error comes from the "Calculate mapping score and add to metadata" step, specifically from the call MappingScore(anchors = anchors, ndim = dims), where dims is 50 and is computed as dims <- as.double(length(slot(reference, "reductions")$refDR))
When I look at the query object used inside the function, or at my original Seurat object, I find no duplicated row.names:
''
query
An object of class Seurat
64126 features across 66311 samples within 5 assays
Active assay: refAssay (21793 features, 3000 variable features)
 4 other assays present: RNA, SCT, prediction.score.celltype.l2, prediction.score.celltype.l1
 4 dimensional reductions calculated: pca, umap, integrated_dr, ref.umap
length(row.names(query@meta.data))
[1] 66311
length(unique(row.names(query@meta.data)))
[1] 66311
sum(duplicated(row.names(query@meta.data)))
[1] 0
''
''
length(unique(row.names(seurat.obj@meta.data)))
[1] 66311
sum(duplicated(row.names(seurat.obj@meta.data)))
[1] 0
''
Any idea how to solve this problem?
Thanks
Same question.
I found the cause. In the MappingScore function there is a RenameCells step that works on the combined query and reference cells. When a query cell barcode is identical to a reference cell barcode, the combined set ends up with duplicated cell barcodes, even though neither object has duplicates on its own.
I think that when an Azimuth reference is built, the raw reference cell barcodes should be marked with some watermark, for example "AZIMUTHPBMCACCTACCAGCCTATCA-1". This would prevent reference cell barcodes from colliding with new query cell barcodes.
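To make the mechanism concrete, here is a minimal base-R sketch. It does not call Seurat or Azimuth; the barcodes and the data frame are illustrative assumptions (one barcode is borrowed from the warning above, and it is only assumed here that it also occurs in the reference). It shows that two barcode sets can each be unique yet fail as row names once combined, and that a "REF:" prefix removes the collision.

```r
# Illustrative barcodes only: assume one query barcode also occurs in the reference.
query_cells <- c("AAAGCAAGTTGGTTTG-1_26", "AACGTTGAGAAAGTGG-1_6")
ref_cells   <- c("AAAGCAAGTTGGTTTG-1_26", "ACCTACCAGCCTATCA-1")

# Each set is unique on its own, which is why checking the query alone finds nothing.
stopifnot(anyDuplicated(query_cells) == 0, anyDuplicated(ref_cells) == 0)

# Combining them and using the result as row names fails.
combined <- c(query_cells, ref_cells)
meta <- data.frame(score = seq_along(combined))
err <- tryCatch(
  suppressWarnings({
    row.names(meta) <- combined
    NULL
  }),
  error = function(e) conditionMessage(e)
)
print(err)

# Prefixing the reference barcodes makes the combined names unique again.
ref_prefixed <- paste0("REF:", ref_cells)
combined_ok <- c(query_cells, ref_prefixed)
row.names(meta) <- combined_ok
print(anyDuplicated(combined_ok))
```

The prefix itself is arbitrary; any string that cannot appear at the start of a query barcode would serve the same purpose.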
Hi, I have the same problem. Thank you @chunjie-sam-liu for pointing out the cause. I worked around it with the code below:
# load the packages that provide the functions used below
library(Seurat)
library(SeuratData)
library(Azimuth)
# install reference data
options(timeout = 300) # see https://github.com/satijalab/seurat-data/issues/46
InstallData("bonemarrowref") # reference = "bonemarrowref"
# load reference data into R
library(bonemarrowref.SeuratData)
bonemarrowref <- LoadData("bonemarrowref", "azimuth")
# rename reference cells by prefixing every barcode with "REF:"
ref_map <- bonemarrowref$map
ref_map <- RenameCells(object = ref_map, new.names = paste0("REF:", Cells(x = ref_map)))
# create a local folder for ref.Rds and idx.annoy
dir.create("bonemarrowref_2")
saveRDS(ref_map, file = "./bonemarrowref_2/ref.Rds")
SaveAnnoyIndex(ref_map@neighbors$refdr.annoy.neighbors, file = "./bonemarrowref_2/idx.annoy")
After this, RunAzimuth can annotate the query using the local reference in ./bonemarrowref_2:
mydata <- RunAzimuth(mydata, reference = "bonemarrowref_2")
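If you want to confirm that shared barcodes are the cause in your own data before rebuilding the reference, a quick check is to intersect the two sets of cell names. The sketch below uses plain character vectors so that it runs without Seurat; `shared_barcodes` is a hypothetical helper, and the barcodes are illustrative rather than taken from the real reference. With Seurat objects the two vectors would presumably come from Cells(mydata) and Cells(ref_map).

```r
# Hypothetical helper: barcodes that occur in both character vectors.
shared_barcodes <- function(query_cells, ref_cells) {
  intersect(query_cells, ref_cells)
}

# Illustrative barcodes, not taken from the real reference.
query_cells <- c("AAAGCAAGTTGGTTTG-1_26", "AACGTTGAGAAAGTGG-1_6")
ref_cells   <- c("AAAGCAAGTTGGTTTG-1_26", "ACCTACCAGCCTATCA-1")

# Before renaming: any non-empty result means the collision can occur.
before <- shared_barcodes(query_cells, ref_cells)
print(before)

# After prefixing the reference barcodes, the overlap should be empty.
after <- shared_barcodes(query_cells, paste0("REF:", ref_cells))
print(length(after))
```

An empty intersection after renaming suggests the local reference is safe to use with that query.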