Anomaly in resource usage

Open VendelboNM opened this issue 11 months ago • 4 comments

Hi there!

I am conducting whole-genome assembly of some barley cultivars using ~15x coverage ONT R10.4 sequence data. While the first 7 cultivars used 3500-7500 CPU hours and 400-550 GB RAM at peak, I have two cultivars that have run for more than 10000 CPU hours, and I've had to increase the max RAM to 1200 GB as they ran out of memory at 950 GB. Nothing in the processed data distinguishes these two assemblies. Any idea what could be going on here?

I am using very basic assembly parameters and have set the minimum overlap rather low, as our dataset is small for whole-genome assembly of such a large genome.

flye --nano-hq {input} --read-error 0.03 --min-overlap 1000 --keep-haplotypes -t {threads} -i 1 -g 5.1g -o {outdir}

VendelboNM avatar Mar 22 '24 07:03 VendelboNM

Update: Runtime of 16300 CPU hours now... stuck at the 02_repeat stage, with nothing new for more than a week.

VendelboNM avatar Mar 24 '24 20:03 VendelboNM

Could you please share the flye.log file? Sometimes super repetitive genomes may get stuck during the graph construction step.

mikolmogorov avatar Apr 03 '24 00:04 mikolmogorov

Hi!

Thank you for taking a look at it. I have attached the log file here as requested:

an_flye.log

VendelboNM avatar Apr 08 '24 16:04 VendelboNM

It is running out of memory. I recommend using the automatically selected minimum overlap parameter instead of 1000. This should substantially reduce the complexity of the repeat graph.
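
For readers following along, here is a minimal sketch of what that change amounts to in the command from the original post. It is an illustration only: the helper name `build_flye_command`, the file name `reads.fastq.gz`, the output directory `asm_out`, and the thread count are hypothetical placeholders, and the script only prints the two command lines rather than running Flye.

```python
import shlex


def build_flye_command(reads, outdir, threads, min_overlap=None):
    """Build the Flye argument list used in this thread.

    --min-overlap is added only when a value is passed; leaving it out
    lets Flye select the minimum overlap automatically.
    """
    cmd = ["flye", "--nano-hq", reads, "--read-error", "0.03"]
    if min_overlap is not None:
        cmd += ["--min-overlap", str(min_overlap)]
    cmd += [
        "--keep-haplotypes",
        "-t", str(threads),
        "-i", "1",
        "-g", "5.1g",
        "-o", outdir,
    ]
    return cmd


# Hypothetical inputs, for illustration only.
original = build_flye_command("reads.fastq.gz", "asm_out", 32, min_overlap=1000)
adjusted = build_flye_command("reads.fastq.gz", "asm_out", 32)

print(shlex.join(original))  # command as originally run
print(shlex.join(adjusted))  # same command without --min-overlap
```

The only difference between the two printed commands is the `--min-overlap 1000` pair; all other options are unchanged from the original post.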

mikolmogorov avatar Apr 12 '24 14:04 mikolmogorov

Hi Mikhail,

Thank you for the suggestion! I’ve removed the parameter so that Flye can select it automatically. Fingers crossed that this will do the trick.

Wish you a fantastic weekend,

Best regards, Nikolaj Vendelbo

VendelboNM avatar Apr 12 '24 18:04 VendelboNM

SOLVED

Thank you for the suggestion - the assembly ran with no issue, faster than the other samples and at a decent memory usage of 500 GB. Fantastic start to a new week!

VendelboNM avatar Apr 15 '24 07:04 VendelboNM

Glad it worked!

mikolmogorov avatar Apr 18 '24 21:04 mikolmogorov