
IntegrateData error: number of items to replace is not a multiple of replacement length

Open xinmiaoyan opened this issue 2 years ago • 6 comments

I used the code below to integrate data by rPCA, but an error occurred. I've tried many approaches but still couldn't solve this problem. I'd appreciate it if you could help.

##### Perform integration #######
sc.anchors <- FindIntegrationAnchors(object.list = sc.list, anchor.features = features, reduction = "rpca")
# this command creates an 'integrated' data assay
sc.combined <- IntegrateData(anchorset = sc.anchors)

Merging dataset 2 into 4
Extracting anchors for merged samples
Finding integration vectors
Finding integration vector weights
Error in idx[i, ] <- res[[i]][[1]] :
  number of items to replace is not a multiple of replacement length

This error couldn't be solved by adjusting parameters such as l2.norm and k.filter. I'd appreciate it if you could help.

xinmiaoyan avatar Aug 24 '22 17:08 xinmiaoyan

Hi, what are the sizes of your datasets in sc.list? k.weight (the number of neighbors to consider when weighting anchors) in IntegrateData() defaults to 100 so if one of your datasets is smaller than 100 it will cause this error to occur. Have you tried adjusting the k.weight parameter to the size of your smallest dataset? Otherwise, you may want to remove very small datasets.
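For reference, a minimal sketch (not from the original thread) of checking each dataset's size and capping k.weight at the smallest one, assuming sc.list and sc.anchors from the post above:

# check the number of cells in each object in sc.list
sapply(sc.list, ncol)
# cap k.weight at the size of the smallest dataset (the default is 100)
min.cells <- min(sapply(sc.list, ncol))
sc.combined <- IntegrateData(anchorset = sc.anchors, k.weight = min(100, min.cells))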

Gesmira avatar Aug 26 '22 19:08 Gesmira

@Gesmira, yes, I tried adjusting k.weight. The smallest object in my sc.list has 130 cells; when I set k.weight to 50, the error was still there and wasn't resolved.

xinmiaoyan avatar Aug 26 '22 19:08 xinmiaoyan

Have you also tried setting k.filter in FindIntegrationAnchors() before you run IntegrateData()? For example:

sc.anchors <- FindIntegrationAnchors(object.list = sc.list, anchor.features = features, reduction = "rpca", k.filter = 100)
sc.combined <- IntegrateData(anchorset = sc.anchors, k.weight = 100)

Gesmira avatar Aug 26 '22 19:08 Gesmira

Yes, I tried k.filter = 30; it didn't fix this error.

xinmiaoyan avatar Aug 26 '22 20:08 xinmiaoyan

Just to see if the issue is due to the small dataset, are you able to run the integration through without it?
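A hedged sketch of that check, assuming sc.list and features from the original post; the 200-cell threshold is illustrative only:

# drop objects below an illustrative cell-count threshold, then re-run the workflow
sc.list.filtered <- sc.list[sapply(sc.list, ncol) >= 200]
sc.anchors <- FindIntegrationAnchors(object.list = sc.list.filtered, anchor.features = features, reduction = "rpca")
sc.combined <- IntegrateData(anchorset = sc.anchors)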

Gesmira avatar Aug 26 '22 20:08 Gesmira

I removed the smallest object (130 cells), and it works this time; the smallest remaining one has 205 cells. I'm wondering what the cutoff is for the smallest object that can be used in integration? Thanks.

xinmiaoyan avatar Aug 26 '22 20:08 xinmiaoyan

Yes, I did check the sizes in sc.list; the smallest one is 130 cells, which is higher than 100. I also adjusted k.weight to 50 and got the same error.


xinmiaoyan avatar Oct 11 '22 07:10 xinmiaoyan

My smallest object has 3770 cells and I get the same error. I tried playing with different parameters (k.filter, k.anchor, k.score), with no luck so far.

decodebiology avatar Oct 27 '22 20:10 decodebiology

I also got a similar error with my dataset integration. When I removed two samples with fewer than 120 cells and removed the old 'integrated' assay from the Seurat object, the error went away. Just noting it here.
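For readers hitting the same thing, a minimal sketch of dropping a stale 'integrated' assay; the object name sc is hypothetical, and the assay being removed must not be the default assay:

DefaultAssay(sc) <- "RNA"      # switch the default assay away from "integrated"
sc[["integrated"]] <- NULL     # remove the old integrated assay before re-running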

ajynair avatar Jan 12 '23 20:01 ajynair

I got the same error. My lowest cell number is 375. I removed that sample and the same error persists. I then removed all samples with fewer than 1,000 cells and the error still shows up. The remaining samples are in the 1k-3k+ range.

levinhein avatar Jun 20 '23 20:06 levinhein

@levinhein, I have also seen this problem occur when samples have large differences in cell numbers (10-fold). I was wondering if we could artificially split the large samples so that all samples have a comparable number of cells.
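A hedged sketch of that idea; split_large is a hypothetical helper, not a Seurat function, and whether splitting samples this way is appropriate for a given analysis is a separate question:

# split any object larger than max.cells into roughly equal random chunks
split_large <- function(obj, max.cells = 2000) {
  cells <- colnames(obj)
  if (length(cells) <= max.cells) return(list(obj))
  n.chunks <- ceiling(length(cells) / max.cells)
  groups <- rep(seq_len(n.chunks), length.out = length(cells))
  lapply(split(sample(cells), groups), function(x) subset(obj, cells = x))
}
sc.list.split <- unlist(lapply(sc.list, split_large), recursive = FALSE)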

ajynair avatar Jul 03 '23 16:07 ajynair

Hi, we are currently adding a more informative error message for this issue in the development branch. It seems to occur when the number of anchor cells is less than the k.weight parameter, which is the number of neighbors used when weighting anchors. For now, I would recommend reducing k.weight, combining samples if certain samples have few cells, or adjusting parameters to FindIntegrationAnchors (which can also be provided as inputs to IntegrateLayers), such as increasing k.anchor to increase the number of cells that act as anchors. If you change these parameters, we recommend checking the results of your integration to ensure they are satisfactory.
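A hedged sketch of those suggestions, assuming sc.list and features from earlier in the thread; the parameter values are illustrative, not recommendations:

# increase k.anchor (default 5) to obtain more anchors, and reduce k.weight (default 100)
sc.anchors <- FindIntegrationAnchors(object.list = sc.list, anchor.features = features,
                                     reduction = "rpca", k.anchor = 10)
sc.combined <- IntegrateData(anchorset = sc.anchors, k.weight = 50)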

Gesmira avatar Jul 07 '23 21:07 Gesmira