KomaMRI.jl icon indicating copy to clipboard operation
KomaMRI.jl copied to clipboard

Load pulseq sequences faster and allow comparison

Open gabuzi opened this issue 7 months ago • 2 comments

This addresses #224. Currently a draft, because it hasn't been extensively tested on other pulseq versions.

The below snippet was used for comparison and also contains some further explanations. An example sequence file is here bSSFP_FA30deg_TE10ms_TR20ms_2D_(69x64)_pulseq.seq.zip. This small sequence loads quite fast, but for e.g. 3D sequences with many excitations, the performance issues and benefits from this diff are greatly increased.

Proper unit tests should of course be required. What I did here was to implement == operator for sequences (and constituent structs) to compare structurally, rather than by ref, and base the sequence equality on that. I assert that results of the existing implementation and this one are equal. There may be an existing julia package to do that equality through a macro (StructEquality.jl), rather than by hand, but I wanted to keep changes light.

There seemed to be a rather big refactor going on with KomaMRIFiles subpackage that threw my environment off. As a result, this is based on an earlier version.

using KomaMRI


seqfile = "bSSFP_FA30deg_TE10ms_TR20ms_2D_(69x64)_pulseq.seq"
# force compilation
read_seq(seqfile)
time_read_seq(f) = @time read_seq(f)
println("baseline")
seq = time_read_seq(seqfile)

# profiler says most time is spent in the sequence inner constructor.
# there's really lots of copying going on.
# the read function goes through pulseq blocks and creates a sequence object for the first block
# then, it creates a new seq object for the next block and adds them together with +
# This is implemented by creating yet another sequence object with the concatenation of first and second sequence'
# elements. Then, the sequence constructor itself contains quite some logic that may involve instantiation of
# sequence sub elements.

# One obvious choice is to implement some sort of cat operator for sequences to concatenate
# multiple blocks at once and then have the sequence constructor be called only once on the concatenated args.
# This helped, but then scanf was a new bottleneck (formatter created in the loop, etc.)
# Dictionaries, indexed via id are another source of redirection, which, according to the pulseq
# specification isn't necessary.
# I thus decided to just read the blocks section directly into one big int array
seq_blocks_no_id_sort = read_seq_via_blocks_as_int_array(seqfile)
time_read_seq_via_blocks_as_int_array(f) = @time read_seq_via_blocks_as_int_array(f)
println("optimized without scanf, without blocks id sorting and dicts")
time_read_seq_via_blocks_as_int_array(seqfile)
plot_seq(seq_blocks_no_id_sort)
@assert seq == seq_blocks_no_id_sort "different sequence results!"

# now, a large part of the time (can be up to 40%) is inferring the Nx Ny Nz that we could
# provide directly in the pulseq file...

gabuzi avatar Dec 03 '23 20:12 gabuzi