SIRF icon indicating copy to clipboard operation
SIRF copied to clipboard

array interface

Open casperdcl opened this issue 8 months ago • 13 comments

  • Expose {Image,Acquisition}Data.__array_interface__^np-interop & asarray()
  • related to #901 & #1305
  • [ ] depends on at least one of:
    • easy: #1317 <- #1316
    • hard: https://github.com/UCL/STIR/issues/1468
  • [ ] depends on https://github.com/UCL/STIR/pull/1592

open follow-up issues

👉 diff sans whitespace

To test, append after https://github.com/SyneRBI/SIRF/blob/1d7b1ba85efddd6044e4a267134e3ef53aef80f8/examples/Python/PET/acquisition_data.py#L83

img_data = acq_data.create_uniform_image(2.0)
new = img_data.asarray()  # zerocopy view
old = img_data.as_array() # deepcopy
assert (new == old).all()

new.flat[0] += 6
assert (new == img_data.as_array()).all()

  • [x] I have performed a self-review of my code
  • [x] I have added docstrings/doxygen in line with the guidance in the developer guide
  • [ ] I have implemented unit tests that cover any new or modified functionality
  • [x] The code builds and runs on my machine
  • [ ] CHANGES.md has been updated with any functionality change

casperdcl avatar Apr 14 '25 13:04 casperdcl

Tried the firs bit:

img_data = acq_data.create_uniform_image(2.0)
new = np.asarray(img_data)  # zerocopy view
old = img_data.as_array()   # deepcopy
assert (new == old).all()

got

Traceback (most recent call last):
  File "/home/sirfuser/devel/buildVM/sources/SIRF/examples/Python/PET/acquisition_data.py", line 177, in <module>
    main()
  File "/home/sirfuser/devel/buildVM/sources/SIRF/examples/Python/PET/acquisition_data.py", line 87, in main
    assert (new == old).all()
AssertionError

evgueni-ovtchinnikov avatar Apr 14 '25 14:04 evgueni-ovtchinnikov

added

for i in range(10):
        print(new[i,i,i], old[i,i,i])

got

2.0 2.0
7.9634825e-28 2.0
2.0704136e-19 2.0
0.0 2.0
4.8617305e+30 2.0
1.8062296e+28 2.0
8.1360236e+32 2.0
1.12572005e+24 2.0
5.67159e-11 2.0
1.1722607e-19 2.0

evgueni-ovtchinnikov avatar Apr 14 '25 15:04 evgueni-ovtchinnikov

Sorry, this cannot work with @evgueni-ovtchinnikov's quick-fix in https://github.com/UCL/STIR/pull/1588. It's no good to pretend the array is contiguous if it isn't. We need to fix https://github.com/UCL/STIR/issues/1468 first.

KrisThielemans avatar Apr 14 '25 21:04 KrisThielemans

Yup that explains why the first entry new[0,0,0] == old[0,0,0] but not the rest.

casperdcl avatar Apr 14 '25 22:04 casperdcl

As I have finally realised to my dismay, a DiscretisedDensity object is still an array of arrays of arrays - I was somehow under impression that it was recently re-implemented to have its data as a one-dimensional array, which caused my confusion about contiguity, sorry!

Perhaps the one-dimensional one is implemented on some experimental STIR branch?

If not, then our attempts to access STIRImageData data from Python are unlikely to succeed I am afraid...

evgueni-ovtchinnikov avatar Apr 15 '25 10:04 evgueni-ovtchinnikov

Stir arrays now can by continuous, and were expected to be in cases that it matters, but clearly in some conditions they aren't. So, a (hopefully small) STIR fix is indeed required

KrisThielemans avatar Apr 15 '25 11:04 KrisThielemans

The culprit appears to be the constructor of STIRImageData from STIRAcquisitionData, which calls the constructor of VoxelsOnCartesianGrid from const shared_ptr<const ExamInfo> and const ProjDataInfo - reading ImageData object from file creates a contiguous array. @casperdcl can you please try your approach with img_data read from a file, e.g.:

    img_data = pet.ImageData(existing_filepath(data_path, 'mMR/mu_map.hv'))

evgueni-ovtchinnikov avatar Apr 15 '25 16:04 evgueni-ovtchinnikov

great. @evgueni-ovtchinnikov can you add relevant STIR lines in the STIR issue? i.e. where it is contiguous and where it isn't.

KrisThielemans avatar Apr 16 '25 07:04 KrisThielemans

Big news - see my last comments on https://github.com/UCL/STIR/issues/1468!

evgueni-ovtchinnikov avatar Apr 29 '25 15:04 evgueni-ovtchinnikov

I have implemented a quick contiguity fix that we could use before VoxelsOnCartesianGrid(exam_info_sptr_v, proj_data_info,...)::grow(range) is sorted out for contiguity properly. I cannot push to Casper's branch, so I did my fix on my own branch array-ptr-qfix.

Have a nice long weekend!

evgueni-ovtchinnikov avatar May 02 '25 17:05 evgueni-ovtchinnikov

I pushed it here; it's a no-conflict rebase.

casperdcl avatar May 03 '25 03:05 casperdcl

Latest performance comparison (with #1317 + UCL/STIR#1592 + #1314) - the last column shows timings we can get with asarray.

test     sirf    as_array +   numpy  +   fill   =  total   asarray
x * 2  7.28e-02  5.21e-01 + 5.53e-03 + 5.18e-01 = 1.04e+00 8.37e-03
x + 2  7.87e-02  5.26e-01 + 5.76e-03 + 4.99e-01 = 1.03e+00 7.77e-03
x + y  9.43e-02  1.08e+00 + 6.31e-03 + 5.02e-01 = 1.59e+00 1.16e-02
x * y  1.11e-01  1.06e+00 + 6.85e-03 + 5.07e-01 = 1.57e+00 9.82e-03
x / y  1.28e-01  1.01e+00 + 7.28e-03 + 5.13e-01 = 1.53e+00 8.14e-03

-- @evgueni-ovtchinnikov from https://github.com/SyneRBI/SIRF/pull/1317#issuecomment-2859385588

casperdcl avatar May 08 '25 09:05 casperdcl

what is the minimum numpy version for this?

There is no minimum; the API is very old.

casperdcl avatar May 09 '25 08:05 casperdcl

Is this ready for merge @KrisThielemans @evgueni-ovtchinnikov?

Are we ok with the tests?

Also I'm unsure which line(s) need STIR_VERSION checks or if this PR really depends on https://github.com/UCL/STIR/pull/1592

casperdcl avatar Jun 24 '25 09:06 casperdcl

I'm a bit frustrated here unfortunately, as I'm sure both of you are as well.

  • I see a number of "done" and resolved comments, but I can't see any changes in the code. A rebase/whatever gone wrong?
  • I see no tests.
  • I see no reason for extra STIR_VERSION as the code builds on GHA.

KrisThielemans avatar Jun 24 '25 11:06 KrisThielemans

I see a number of "done" and resolved comments, but I can't see any changes in the code. A rebase/whatever gone wrong?

No idea. For instance, I definitely removed some code you suggested to remove. (Actually, I never delete a piece of code, just move it to a Notepad++ text file, and I do see the removed lines there:

		virtual size_t address() {
			THROW("data address defined only for data in memory");
		}

...

		virtual size_t address() {
                    auto *pd_ptr = dynamic_cast<stir::ProjDataInMemory*>(data().get());
		    return reinterpret_cast<size_t>(pd_ptr->get_data_ptr());
		}

...

		size_t address() {
		    return reinterpret_cast<size_t>(_data->get_full_data_ptr());
		}

but they are still in stir_data_containers.h!)

Perhaps the VM I was working on crashed before I pushed?

Have to go through your comments again then...

I see no tests.

They are not part of SIRF yet.

pet_algebra_timings.zip

evgueni-ovtchinnikov avatar Jun 24 '25 12:06 evgueni-ovtchinnikov

Luckily not too many comments...

I see no tests.

They are not part of SIRF yet.

pet_algebra_timings.zip

I wasn't talking about timing tests, but functionality tests, that will be run by ctest. Some simple things like

  • asarray(copy=True) gives identical results to as_array()
  • if supports_array_view
    • asarray() gives identical results to as_array()
    • modify result of asarray() and check if it is modified. (you have this in the demo, but really it needs to be a test).

Those tests could/should be run for every type of data container, really (certainly as I think you have enabled it for Reg already, but I might be wrong).

KrisThielemans avatar Jun 24 '25 13:06 KrisThielemans

Thanks @casperdcl. @evgueni-ovtchinnikov I believe only the CI tests remain then.

KrisThielemans avatar Jun 24 '25 14:06 KrisThielemans

I think we have problems now with pull after a rebase. @casperdcl can you try and resolve this and then tell @evgueni-ovtchinnikov how to reset (presumably git reset --hard origin/array-ptr). Then avoid rebasing until we're done (could then do some clean-up rebasing if you really want)

KrisThielemans avatar Jun 24 '25 16:06 KrisThielemans

@KrisThielemans I pulled array-ptr and found the changes you requested done! Weird...

evgueni-ovtchinnikov avatar Jun 24 '25 16:06 evgueni-ovtchinnikov

Careful both. conflicting updates and rebases again I believe

KrisThielemans avatar Jun 25 '25 13:06 KrisThielemans

Careful both. conflicting updates and rebases again I believe

We're on a call no worries. Think it's fixed now.

casperdcl avatar Jun 25 '25 13:06 casperdcl

We have https://github.com/SyneRBI/SIRF/blob/c4aec0ad43325a539437685e0efe0a00157b7b6f/src/xSTIR/pSTIR/tests/CMakeLists.txt#L24-L26 and https://github.com/SyneRBI/SIRF/blob/ac459af4fdc1c7fa85de6ac4a8eb4ef6251360dc/examples/Python/PET/run_all.py#L32-L36 so I guess it's running

python  asarray.py --non-interactive

which fails. (docopt should really print the failing command line)

KrisThielemans avatar Jun 25 '25 19:06 KrisThielemans

Argh missed --non-interactive in my debug commit :see_no_evil:

docopt should really print the failing command line

We should not be using os.system to run things. At a minimum we should use subprocess.check_output or even better just pytest.

casperdcl avatar Jun 26 '25 09:06 casperdcl

success! Clean-up time :-)

KrisThielemans avatar Jun 26 '25 09:06 KrisThielemans

We should not be using os.system to run things. At a minimum we should use subprocess.check_output or even better just pytest.

agreed. but separate PR/issue

KrisThielemans avatar Jun 26 '25 09:06 KrisThielemans

🥳

KrisThielemans avatar Jun 26 '25 14:06 KrisThielemans