
Pulseq loading slow and potentially incorrect block ordering

gabuzi opened this issue 7 months ago · 8 comments

Hello, me again :)

Two aspects of Pulseq handling that I have noticed:

1. Speed

I have been toying a bit more with Koma, now moving to even larger pulseq files. Pulseq loading is extremely slow for those (it does not finish within 20 min), and @time read_seq(...) indicates a large number of allocations.

Let's take as an example a toy Cartesian 2D bSSFP sequence with 64 phase encode steps and 2 excitations per step. The result has 768 blocks; the pulseq file is 27 KB in size.

Reading this with the latest master (ae27d17326b7989d7fc1aceba01a051450e3d219) gives:

```julia
julia> time_read_seq(f) = @time read_seq(f)
       println("baseline")
       seq = time_read_seq(seqfile)
baseline
[ Info: Loading sequence bSSFP_FA30deg_TE10ms_TR20ms_2D_(69x64)_pulseq.seq ...
  0.752135 seconds (8.51 M allocations: 299.462 MiB, 11.12% gc time)
Sequence[ τ = 3539.2 ms | blocks: 768 | ADC: 128 | GR: 636 | RF: 128 | DEF: 11 ]

julia>
```

We can see 8.5 M allocations totaling 300 MB and roughly 0.5-1 s for loading (the function was warmed up beforehand to force compilation). A larger sequence file (6912 blocks, 220 KB pulseq file) comes in at ca. 50 s and 21 GiB of allocations. Extrapolating to, say, an MRF sequence, which can easily have two to three orders of magnitude more blocks, this could make pulseq sequence loading completely infeasible (I have tried that). This was on an M1 MacBook Pro.
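For reference, the warm-up-then-measure pattern I used can be sketched like this (the helper name and the use of BenchmarkTools are mine, not part of Koma):

```julia
using KomaMRICore     # provides read_seq
using BenchmarkTools  # more stable than a single @time call

# Run once so compilation is excluded from the measurement, then report
# the minimum time and total allocations over several benchmark samples.
function bench_read_seq(path::AbstractString)
    read_seq(path)                      # warm-up / force compilation
    b = @benchmark read_seq($path)
    println("min time: ", minimum(b).time / 1e9, " s, allocs: ", b.allocs)
end
```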

I have spent some time digging into the sequence loading code and optimizing it. I went through several iterations, and I can now load the small file above in 0.027 s with 300 k allocations totalling 11.6 MB.

The larger file now loads in 0.25 s with 2.59 M allocations for 98 MiB.

2. [BLOCKS] order

Along the way I realized that Koma seems to interpret the pulseq [BLOCKS] section as being reproduced on a scanner in incrementing-ID order instead of line by line in [BLOCKS]. The blocks data is read into dictionaries of (ID => block_data), and the sequence struct is then assembled by iterating over incrementing IDs. See https://github.com/cncastillo/KomaMRI.jl/blob/ae27d17326b7989d7fc1aceba01a051450e3d219/KomaMRICore/src/io/Pulseq.jl#L412 (which, coincidentally, is one of the main culprits for the slow load). Then, in get_block (https://github.com/cncastillo/KomaMRI.jl/blob/ae27d17326b7989d7fc1aceba01a051450e3d219/KomaMRICore/src/io/Pulseq.jl#L642), obj["blockEvents"][i] is the mapping from the block with ID == i to the data from the [BLOCKS] section for this ID.

Digging through the pulseq spec 1.4.1 (my seqs are 1.4.0, but there were no relevant changes between 1.4.0 and 1.4.1): https://pulseq.github.io/specification.pdf, we find in sec. 2.2:

There are no restrictions on the ordering of IDs, although sequential ordering is often implemented.

And in sec. 2.7, p. 10 (top):

The sequence must declare at least one block. Any non-zero number of blocks may be defined. The blocks are executed sequentially.
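As an illustration (a made-up fragment in the v1.4 column layout, not taken from the spec), a [BLOCKS] section may list IDs out of order; under the line-by-line reading, execution order is 3, 1, 2, not 1, 2, 3:

```
[BLOCKS]
# NUM DUR RF  GX  GY  GZ  ADC EXT
    3  30  1   0   0   2    0   0
    1 200  0   1   2   0    1   0
    2  30  1   0   0   2    0   0
```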

While this may be seen as slightly ambiguous ("sequentially line by line, or by ID?"), taken together with sec. 2.2 I would say it's line by line.

This is good news, because it means Koma doesn't need to store the [BLOCKS] data in a Dict and perform relatively expensive indirect lookups; instead, the entire [BLOCKS] section can be read into one big Int array that never has to be modified, which is significantly more efficient to create and to access.

This is more or less how I achieved the optimizations described above.
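To sketch the idea (variable names and the pre-extracted [BLOCKS] lines are mine, not Koma's actual internals):

```julia
# Hypothetical input: the lines of the [BLOCKS] section, already extracted.
block_lines = ["3 30 1 0 0 2 0 0", "1 200 0 1 2 0 1 0", "2 30 1 0 0 2 0 0"]

# Dict-based approach (roughly what the current reader does):
# one Vector allocated per block, later consumed by iterating over IDs 1..N.
block_events_dict = Dict{Int,Vector{Int}}()
for line in block_lines
    vals = parse.(Int, split(line))
    block_events_dict[vals[1]] = vals[2:end]   # ID => event IDs
end

# Array-based approach: one matrix, one row per *line* of [BLOCKS],
# preserving execution order and avoiding per-block allocations and lookups.
block_events = Matrix{Int}(undef, length(block_lines), 8)
for (i, line) in enumerate(block_lines)
    block_events[i, :] .= parse.(Int, split(line))
end
```

Besides being cheaper to build, the matrix also encodes the execution order directly in its row order, so the by-ID vs. line-by-line ambiguity disappears from the reader entirely.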

3. What's next?

Since these are rather invasive modifications (I had to define the == operator for sequences to ensure that the results I get from different reading functions are the same), and since I could only test on my own pulseq files, which are all v1.4.0, created via pypulseq, and thus probably lack the diversity allowed by the specification, the obvious question is: are there pulseq test files available in Koma already? I didn't find any.
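The equality check I mean is essentially a field-by-field comparison; a minimal sketch on a made-up type (KomaMRI's actual Sequence struct has different fields):

```julia
# Hypothetical minimal sequence-like type for illustration only.
struct MiniSeq
    GR::Matrix{Float64}
    RF::Matrix{Float64}
    ADC::Vector{Int}
end

# Compare every field, so two independently parsed sequences can be
# checked for exact agreement regardless of how they were constructed.
Base.:(==)(a::MiniSeq, b::MiniSeq) =
    all(getfield(a, f) == getfield(b, f) for f in fieldnames(MiniSeq))
```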

I'm happy to open a PR with the optimizations so that you can have a closer look. Let me know what you think.

gabuzi · Nov 20 '23 12:11