bbimber

Results 95 comments of bbimber

@nalinigans Can we run consolidate on an existing workspace without appending new samples? Our OS is CentOS 7, and this is on a lustre filesystem.

@nalinigans We are getting another type of error now. I should add that most of these jobs (we run them per-chromosome) import 4/5 batches, and then die with no errors....

An update on this: I restarted with genomicsdb-segment-size as you suggested. Here's the timing thus far: ``` 26 Feb 2022 02:19:16,306 DEBUG: 02:19:16.301 INFO GenomicsDBImport - Callset Map JSON file...

@nalinigans No, not finished but also not dead. This is all running on a slurm cluster. I will see about connecting to the node. I've never done this, but I...

@nalinigans I'm afraid we're back to failing. You can see from the timestamp the time between Batch 4 and failure: ``` 26 Feb 2022 02:19:16,312 DEBUG: 02:19:16.301 INFO GenomicsDBImport -...
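For anyone wanting to measure the stall between batches rather than eyeball it, here is a minimal sketch that pulls the leading wrapper timestamp off each log line and reports the largest gap between consecutive stamped lines. It assumes every relevant line starts with a stamp shaped like `26 Feb 2022 02:19:16,312`, as in the excerpts in this thread. The function names `parse_stamp` and `largest_gap` are made up for illustration and are not part of GATK or GenomicsDB.

```python
import re
from datetime import datetime, timedelta

# Matches the leading wrapper timestamp, e.g. "26 Feb 2022 02:19:16,312".
# Assumption: English month abbreviations and millisecond precision.
STAMP = re.compile(r"^(\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2},\d{3})")


def parse_stamp(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%d %b %Y %H:%M:%S,%f")


def largest_gap(lines):
    """Return (gap, start, end) for the longest pause between stamped lines.

    Lines without a leading timestamp are skipped. With fewer than two
    stamped lines the result is (timedelta(0), None, None).
    """
    stamps = [s for s in map(parse_stamp, lines) if s is not None]
    best = (timedelta(0), None, None)
    for prev, cur in zip(stamps, stamps[1:]):
        if cur - prev > best[0]:
            best = (cur - prev, prev, cur)
    return best
```

Feeding it the lines of a job log (for example `largest_gap(open(path))`, where `path` is your own log file) gives the longest silent stretch, which should correspond to the time between the last completed batch and the failure.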

@nalinigans We really appreciate your help and suggestions on this. I'll try consolidate_genomicsdb_array. One question: does this modify the workspace in-place? If so, I assume we should clone the workspace,...
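In case it is useful, a rough sketch of the "clone first" precaution, assuming the tool does modify the workspace in place (that is the open question above, not something confirmed here). It only copies the directory; the helper name `clone_workspace` and the `_backup` suffix are hypothetical, and it does not call the consolidate tool itself.

```python
import shutil
from pathlib import Path


def clone_workspace(workspace, suffix="_backup"):
    """Copy a workspace directory to a sibling directory and return its path.

    Assumption: the workspace is an ordinary directory tree that can be
    copied with a plain recursive copy while no job is writing to it.
    """
    src = Path(workspace)
    if not src.is_dir():
        raise FileNotFoundError(f"workspace not found: {src}")
    dest = src.with_name(src.name + suffix)
    # copytree raises FileExistsError if dest already exists, so an
    # earlier backup is never silently overwritten.
    shutil.copytree(src, dest)
    return dest
```

On a large workspace the copy may itself take a long time on a shared filesystem, so it is probably worth timing it on one contig before adding it to every job.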

@nalinigans OK, I just added it to our code to optionally run this consolidate tool prior to running GenomicsDBImport. I'll start a trial later this morning.

@nalinigans OK, so most of these jobs are still going (we run per contig); however, one just died as follows: ``` 03 Mar 2022 12:49:28,390 INFO : /home/exacloud/gscratch/prime-seq/bin/consolidate_genomicsdb_array -w /home/exacloud/gscratch/prime-seq/workDir/0950f56c-7565-103a-a738-f8f3fc8675d2/Job3.work/WGS_1852_consolidated.gdb...

@nalinigans Another update. I've been running the standalone consolidate tool, per chromosome. Below is chr 9. As you can see, it seems to take nearly a full day per attribute....