JDLL
Confusion with axis order (F vs C order, ImgLib2 vs NumPy)
I think there is quite a bit of unnecessary back-and-forth transposition of axes currently.
Here is how it should work:
Assume Python wants a numpy.ndarray with axes CYX (in c order) and shape [2,3,4].
That is, 2 channels, height=3, width=4.
So in total 2 * 3 * 4 = 24 elements.
Let's make a flat buffer containing elements (0, 1, 2, ..., 23).
In Python, if we just reshape that buffer to [2,3,4], we get
    [[[ 0.  1.  2.  3.]
      [ 4.  5.  6.  7.]
      [ 8.  9. 10. 11.]]

     [[12. 13. 14. 15.]
      [16. 17. 18. 19.]
      [20. 21. 22. 23.]]]
as desired.
For ImgLib2, the same flat buffer containing elements (0, 1, 2, ..., 23) is wrapped in f order with axes XYC and dimensions [4,3,2].
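To make the equivalence concrete, here is a small NumPy-only sketch. It does not use ImgLib2 at all; the f-order reshape is just a stand-in for how ImgLib2 would index the same flat buffer, and the variable names are made up for illustration.

```python
import numpy as np

# The flat buffer shared by both sides.
flat = np.arange(24, dtype=np.float32)

# Python view: axes CYX, shape [2,3,4], c order (last axis varies fastest).
cyx = flat.reshape((2, 3, 4))

# Stand-in for the ImgLib2 wrap: axes XYC, dimensions [4,3,2],
# f order (first axis varies fastest).
xyc = flat.reshape((4, 3, 2), order="F")

# Both are views on the same memory, no data is copied or moved.
assert np.shares_memory(cyx, flat)
assert np.shares_memory(xyc, flat)

# Element (x, y, c) on the ImgLib2 side is element [c, y, x] on the Python side.
assert all(
    xyc[x, y, c] == cyx[c, y, x]
    for x in range(4)
    for y in range(3)
    for c in range(2)
)
```

So reversing the axis list and switching between f order and c order cancel out, and no transposition is needed on either side.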
Translating that to EfficientSamJ:
SAM wants [3,h,w], so CYX in c order.
We could just wrap a flat array as XYC with dimensions [w,h,3] in f order in ImgLib2.
Then reshape the same flat array in python as [3,h,w] and be done.
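A rough sketch of that direct route, again emulating the ImgLib2 side with an f-order NumPy view. The image size and the pixel values are hypothetical, and the write loop only stands in for whatever the Java side does when it fills the image.

```python
import numpy as np

h, w = 3, 4  # hypothetical small image
flat = np.zeros(3 * h * w, dtype=np.float32)

# Stand-in for the ImgLib2 wrap: XYC, dimensions [w,h,3], f order.
img_xyc = flat.reshape((w, h, 3), order="F")
assert np.shares_memory(img_xyc, flat)

# Write pixels "normally" at (x, y, c); the value encodes its own coordinates.
for c in range(3):
    for y in range(h):
        for x in range(w):
            img_xyc[x, y, c] = 100 * c + 10 * y + x

# Python side: plain c-order reshape of the same buffer, no np.transpose.
sam_input = flat.reshape((3, h, w))

assert sam_input.shape == (3, h, w)
# The pixel written at (x=3, y=1, c=2) is found at [c, y, x].
assert sam_input[2, 1, 3] == 213
```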
Instead, currently:
- We wrap the flat array as CYX (dimensions [3,h,w]) in f order in ImgLib2.
- Then we use Views to transpose the axes to XYC (dimensions [w,h,3]) to be able to write to it "normally".
- Because we wrapped as CYX in f order, that is then of course XYC (shape [w,h,3]) in c order in Python.
- We have to do the np.transpose(im.astype('float32'), (2, 0, 1)) in order to pass it to pytorch.
We could avoid both the Views transposition and the np.transpose.
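Also, in np.transpose(im.astype('float32'), (2, 0, 1)) the order (2,0,1) should probably be (2,1,0): applied to XYC with shape [w,h,3], (2,0,1) gives CXY with shape [3,w,h], whereas (2,1,0) gives the CYX with shape [3,h,w] that SAM wants.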
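The difference between the two permutations is easy to check on a non-square dummy array. This is only a sketch: the sizes are hypothetical and im stands in for the array that Python currently receives.

```python
import numpy as np

h, w = 5, 7  # hypothetical, deliberately non-square
flat = np.arange(3 * h * w, dtype=np.float32)

# What Python currently receives: XYC, shape [w,h,3], c order.
im = flat.reshape((w, h, 3))

cxy = np.transpose(im, (2, 0, 1))  # shape [3,w,h]: X and Y swapped relative to CYX
cyx = np.transpose(im, (2, 1, 0))  # shape [3,h,w]: CYX

assert cxy.shape == (3, w, h)
assert cyx.shape == (3, h, w)

# Pixel (x, y, c) lands at [c, y, x] only with (2, 1, 0).
x, y, c = 6, 4, 2
assert cyx[c, y, x] == im[x, y, c]
assert cxy[c, x, y] == im[x, y, c]
```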
I think we pass the images with the X and Y axes flipped. SAM doesn't care of course. Probably there is a corresponding flip in the coordinates of the prompt point list etc. (maybe even another explicit np.transpose?)