numbat
error in evaluating the argument 'subject' in selecting a method for function 'findOverlaps': 'seqnames' cannot contain NAs
Hi,
I ran Numbat with WGS CNV as input; however, it always returns the error message 'seqnames' cannot contain NAs.
I don't have any NAs in count_mat_ATC2, df_allele_ATC2, or the segs_consensus file.
Can somebody help me fix this problem?
out = run_numbat(
count_mat_ATC2, # gene x cell integer UMI count matrix
ref_hca, # reference expression profile, a gene x cell type normalized expression level matrix
df_allele_ATC2, # allele dataframe generated by pileup_and_phase script
genome = "hg38",
t = 1e-5,
ncores = 24,
plot = TRUE,
segs_consensus_fix=segs_consensus,
out_dir = paste0('../out/numbat_with_segs/',scid)
)
Numbat version: 1.3.3
Scistreer version: 1.2.0
Running under parameters:
t = 1e-05
alpha = 1e-04
gamma = 20
min_cells = 50
init_k = 3
max_cost = 1714.8
n_cut = 0
max_iter = 2
max_nni = 100
min_depth = 0
use_loh = auto
segs_loh = None
call_clonal_loh = FALSE
segs_consensus_fix = Given
multi_allelic = TRUE
min_LLR = 5
min_overlap = 0.45
max_entropy = 0.5
skip_nj = FALSE
diploid_chroms = None
ncores = 24
ncores_nni = 24
common_diploid = TRUE
tau = 0.3
check_convergence = FALSE
plot = TRUE
genome = hg38
Input metrics:
5716 cells
Mem used: 53.7Gb
Approximating initial clusters using smoothed expression ..
Mem used: 53.7Gb
number of genes left: 9575
running hclust...
! # Invaild edge matrix for <phylo>. A <tbl_df> is returned.
! # Invaild edge matrix for <phylo>. A <tbl_df> is returned.
Iteration 1
Mem used: 53.7Gb
Expression noise level (MSE): high (2.7). Consider using a custom expression reference profile.
Using fixed consensus CNVs
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'subject' in selecting a method for function 'findOverlaps': 'seqnames' cannot contain NAs
Thanks in advance.
Lijia
Can you print out segs_consensus?
Sure, the data frame is
CHROM seg seg_start seg_end cnv_state
chr1 seg1 1 820000 bdel
chr1 seg2 820001 1067000 neu
chr1 seg3 1067001 1688000 loh
chr1 seg4 1688001 1695000 bdel
chr1 seg5 1695001 1721000 bdel
chr1 seg6 1721001 1733000 bdel
chr1 seg7 1733001 1936000 loh
chr1 seg8 1936001 4063000 loh
chr1 seg9 4063001 4065000 bdel
chr1 seg10 4065001 4068000 del
chr1 seg11 4068001 6561000 loh
chr1 seg12 6561001 6567000 neu
chr1 seg13 6567001 7941000 loh
chr1 seg14 7941001 8947000 loh
chr1 seg15 8947001 9536000 loh
chr1 seg16 9536001 9537000 bdel
chr1 seg17 9537001 10894000 loh
chr1 seg18 10894001 10895000 bdel
chr1 seg19 10895001 10896000 bdel
chr1 seg20 10896001 15799000 loh
chr1 seg21 15799001 15953000 neu
chr1 seg22 15953001 16506000 loh
chr1 seg23 16506001 16728000 amp
chr1 seg24 16728001 17158000 loh
Hi, do you only expect aberration on chr1? Why not include all chromosomes?
Hi @teng-gao ,
This is only a small portion of my entire file. The dataset is in-house patient data, which I cannot expose to the public. Do you need the entire file? If yes, please let me know, and I will attempt to obtain permission to share it with you.
Hey @yulijia, as per the documentation of Numbat, in the section "Using existing CNV calls":

Sometimes users already have CNV calls from bulk WGS, WES, or array analysis. In this case, you can supply the existing CNV profile via segs_consensus_fix parameter to fix the CNV boundaries and states. To do so, you may provide a dataframe with the following columns:

CHROM: integer; chromosome (1-22)
seg: character; segment ID (e.g. 1a, 1b, 2a, 2b, etc.)
seg_start: integer; segment start position
seg_end: integer; segment end position
cnv_state: character; copy number state (neu, del, amp, loh, bamp, bdel)

Please note that diploid segments (cnv_state = "neu") should also be included (i.e. segs_consensus_fix should be a complete copy number profile including all chromosomes).

CHROM needs to be an integer, but your CHROM column contains character values such as chr1. Maybe try changing that?
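
A minimal sketch of that conversion, in case it helps. It assumes the chromosome labels look like chr1 ... chr22 as in the excerpt posted above; the toy data frame, chrom_int, keep, and segs_fixed are illustrative names only, and dropping anything outside 1-22 (e.g. chrX) is my assumption based on the documented range, not something confirmed in this thread.

```r
# Toy stand-in for the WGS-derived table (first rows copied from the excerpt above)
segs_consensus <- data.frame(
  CHROM     = c("chr1", "chr1", "chr1"),
  seg       = c("seg1", "seg2", "seg3"),
  seg_start = c(1L, 820001L, 1067001L),
  seg_end   = c(820000L, 1067000L, 1688000L),
  cnv_state = c("bdel", "neu", "loh"),
  stringsAsFactors = FALSE
)

# Strip the "chr" prefix and convert to integer.
# Labels that are not numeric (e.g. "chrX") become NA here.
chrom_int <- suppressWarnings(as.integer(sub("^chr", "", segs_consensus$CHROM)))

# Keep only rows that map to chromosomes 1-22 (assumption: other contigs are dropped)
keep <- !is.na(chrom_int) & chrom_int %in% 1:22
segs_fixed <- segs_consensus[keep, ]
segs_fixed$CHROM <- chrom_int[keep]

# Sanity checks before handing the table to Numbat
stopifnot(is.integer(segs_fixed$CHROM), !anyNA(segs_fixed$CHROM))
str(segs_fixed)
```

You would then pass segs_consensus_fix = segs_fixed in the run_numbat call. If the error persists, it may be worth checking how many rows keep removed, since any label that does not convert cleanly ends up as NA.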