
Texture tiling

guiwitz opened this pull request 2 months ago • 29 comments

References and relevant issues

Aims to close https://github.com/napari/napari/issues/7149

Description

This PR is an attempt to at least partially fix the following issue: images that are not multiscale and that have one pixel dimension larger than GL_MAX_TEXTURE_SIZE are simply downscaled before being rendered in napari. The consequence is that, even when zooming in, one can never see the full resolution of such an image.

For the 2D case, one solution to this problem is to break the image into tiles smaller than GL_MAX_TEXTURE_SIZE. As napari doesn't have a tiling mechanism, this tiling needs to happen at the vispy node level. The solution proposed here is a new vispy node called TiledImageNode, which is composed of a set of Image nodes. The TiledImageNode is used in the same way as the ImageNode and VolumeNode, i.e. the ImageLayerNode picks the correct type of node given the dimensionality and data size. This choice is made in the get_node function of ImageLayerNode, which had to be modified to accept the data size as a parameter so that the size check can be performed.
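
To illustrate the idea, here is a minimal sketch, not the PR's actual implementation: the TiledImage class, the TILE_SIZE constant, and pick_node_class are illustrative assumptions standing in for the PR's TiledImageNode and the modified get_node.

from vispy import scene
from vispy.scene.visuals import Image
from vispy.visuals.transforms import STTransform

TILE_SIZE = 8192  # in practice derived from the GL_MAX_TEXTURE_SIZE of the GL context

class TiledImage(scene.Node):
    """Compound node that renders one Image child per tile of a large 2D array."""

    def __init__(self, data, parent=None):
        super().__init__(parent=parent)
        for y in range(0, data.shape[0], TILE_SIZE):
            for x in range(0, data.shape[1], TILE_SIZE):
                tile = Image(data[y:y + TILE_SIZE, x:x + TILE_SIZE], parent=self)
                # shift each tile to its position within the full image
                tile.transform = STTransform(translate=(x, y))

def pick_node_class(ndisplay, shape):
    # mirrors the kind of size check described above: only fall back to
    # tiling when a 2D slice exceeds the texture limit
    if ndisplay == 2 and max(shape[:2]) > TILE_SIZE:
        return TiledImage
    return Image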

Remaining issues:

  • The main issue is keeping the correct node when the viewer updates. At the moment the TiledImageNode eventually gets replaced by the default ImageLayerNode. The TiledImageNode children are preserved but reattached to the ImageLayerNode, so when changing e.g. the colormap, the change does not get propagated to the children, as the ImageLayerNode is not designed for that. The children can be set explicitly, but this is not a good solution. The problem happens somewhere here because of a bad update sequence in which a node update happens while the data is None (and hence its size is below the tiling threshold).
  • Currently the size check is done on the original data shape, but it should be done on the sliced data.

guiwitz avatar Oct 30 '25 06:10 guiwitz

I know it's WIP, but I wanted to stop in and say that this works! 🎉 Pretty awesome, congrats!

But... I still have major reservations regarding memory/VRAM usage of this approach vs. auto-multiscale. I'm on Apple Silicon, so unified memory, but OpenGL doesn't leverage that, so there is a copy involved.

Here's my test setup: set up a viewer and load a modest, but larger-than-GL-limits, image of shape (32914, 46000, 3) using tifffile.imread (just loading into numpy, data = imread..., no layer yet). This makes the Python interpreter's memory go from 0.4GB to 4.6GB. Then, with main:

  • add_image, auto downsampled: memory jumps to 5.3GB, so we can assume ~0.7GB of VRAM? (it took ~1s)
    • Note: deleting the layer doesn't free the memory; you need to close the viewer.
  • add_image([data, data[::2, ::2, :], data[::4, ::4, :]]), i.e. manual multiscale: memory jumps to 5.1GB, so assume ~0.5GB VRAM (it was basically instant)

With this PR:

  • add_image, no downsample/multiscale: memory jumps to 10.5GB, so we can assume ~6GB VRAM? (takes ~18s FYI)

So that's my concern, based on having used my script that implements something like this: the VRAM usage can quickly become excessive, I think. It feels to me like this would need to be optional, with auto-multiscale as the default?
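
For reference, here is a sketch of that setup (the file path is a placeholder, and as described above the two add_image variants were measured separately):

import tifffile
import napari

# loading a (32914, 46000, 3) image takes the interpreter from ~0.4GB to ~4.6GB
data = tifffile.imread('large_rgb_image.tif')

viewer = napari.Viewer()

# variant 1 (main: auto-downsampled; this PR: texture-tiled at full resolution):
viewer.add_image(data)

# variant 2 (manual multiscale, used for the middle measurement above):
# viewer.add_image([data, data[::2, ::2, :], data[::4, ::4, :]])

napari.run()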

psobolewskiPhD avatar Nov 01 '25 00:11 psobolewskiPhD

@psobolewskiPhD I'm excited it works for RGB images; we weren't consciously supporting that! 😂

I disagree that the VRAM issue is a blocker. I'm much more comfortable explaining to users "your data is huge compared to your machine specs, that's why you're running out of VRAM, here's how to multiscale it", than "even though your data is small compared to your machine specs, napari can't even display it accurately, here's how you multiscale it."

Multiscale performance still kinda sucks, too, so I like this as a pathway for folks with beefy computers — and I mean, not that beefy, plenty of folks have 16GB RAM now, and plenty of laptops have integrated graphics where the RAM is the VRAM. Not to mention that there are implementation issues with auto multiscale that we discussed in my original issue and that prompted you to suggest tiling instead!

In short, the order of magnitude of VRAM is usually similar to the order of magnitude of RAM, so if you have the data in RAM already, there's a fair chance you can chuck it in VRAM also, so we might as well try. When it fails, we can guide users to the right approach. Or, even, in a follow-up PR, we can fall back on downscaling if we get a memory error.
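
A rough sketch of that fallback idea (everything here is hypothetical: which exception a failed texture allocation actually raises depends on the driver and on how VisPy surfaces it):

import warnings

def make_image_node(data, make_tiled, make_downscaled):
    # try full-resolution texture tiling first; fall back to the current
    # downscaling behaviour if the texture allocation fails
    try:
        return make_tiled(data)
    except (MemoryError, RuntimeError) as err:
        warnings.warn(
            f"Texture tiling failed ({err}); falling back to a downscaled texture."
        )
        return make_downscaled(data)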

By the way, re the shared RAM and forced copy, is that really an OpenGL limitation, or OpenGL on Mac (iirc I didn't have this issue on my Lenovo with integrated graphics), or VisPy?

jni avatar Nov 02 '25 01:11 jni

@jni I don't want to derail, so maybe we should open a dedicated discussion somewhere for this topic. But I'm less concerned about Macs with Apple Silicon than about people with a discrete GPU. Until recently, many of those devices "for gaming" had as little as 2 or 4GB of VRAM, and even now my wife has a 4060 with just 8GB. How much of that can napari take? My experience using my script on my MacBook Pro (M1, 16GB Apple Silicon, so pretty long in the tooth I guess) has been that there is a limited window of image sizes where it was beneficial, so I've basically stopped using it. What I've found is that, for napari multiscale performance, coarser levels help. So with the above size, viewer.add_image([data, data[::4,::4,:], data[::12,::12,:]]) (or ::16) is much better than [data, data[::2, ::2, :], data[::4, ::4, :]], which is just enough to get it under the texture limit. Pan and zoom-in are totally smooth; the only real issue is with zooming out (https://github.com/napari/napari/issues/7391)

psobolewskiPhD avatar Nov 02 '25 17:11 psobolewskiPhD

@psobolewskiPhD I don't think "totally smooth" accurately describes the multiscale experience, even in the best case of all in-memory data. Here's a video using this PR, where I've loaded the same 50k x 25k dataset (available here) as either texture-tiled single scale or as multiscale. Notice the difference between pan-zoom (let alone zoom) in multiscale vs this auto-tiling:

import tifffile
import napari

# load the full-resolution image and build a pyramid by striding each level by 2
im0 = tifffile.imread('/Users/jni/data/noirlab/noirlab2027a.tif')
im1 = im0[::2, ::2]
im2 = im1[::2, ::2]
im3 = im2[::2, ::2]

pyramid = [im0, im1, im2, im3]

viewer = napari.Viewer()

# same data twice: single-scale (texture-tiled with this PR) vs multiscale,
# with the multiscale layer hidden initially so the two can be toggled
flat_image_layer = viewer.add_image(im0, rgb=True)
multiscale_layer = viewer.add_image(pyramid, multiscale=True, visible=False)

napari.run()

https://github.com/user-attachments/assets/7db04343-626a-414e-b792-0e2881602f15

I can't say that the multiscale isn't "responsive" or "useful", but it's jittery and unpleasant, while the tiled version gives you that napari "wow" feeling 🤩 which until this PR was actually impossible to achieve. I think that's so cool!

Yes, some people will have tiny graphics cards, and instead of a lossy downsampling they'll be forced to actually multiscale their data. That's annoying, but I can flip your argument about "limited use cases" around — I don't think many folks are on tiny machines expecting that loading a 4GB array should Just Work.

Even in the cases where they are, I think we should merge this (after much clean up, which is well-understood from the hackathon), then follow up with a try/except for whatever the failure is on a small-VRAM machine.

In summary:

  1. This PR lets me look at data in a smooth way that was previously impossible — there is no user-side code that can get me this behaviour.
  2. Auto-multiscale is instead convenient but does not fundamentally change what napari can do — and users can multiscale any time — even as a reader plugin.
  3. The users who actually want auto-downsampling are, I believe, very few, while those who want to Just Look at these images are many.
  4. We can add auto-multiscale for those users in the future, e.g. as a try/except when we run out of VRAM.

jni avatar Nov 06 '25 13:11 jni

@guiwitz let me know when you're back home and have a free morning — happy to work with you to push this over the line!

jni avatar Nov 06 '25 13:11 jni

Just a shower thought, submitted here for @brisvag's consideration:

  • the issue we are encountering is that the overlays are children of the main node.
  • So when we change nodes because we change the data, we have to transfer all the children over.
  • So our tiled node with Image nodes as children breaks this logic
  • So, proposed solution: overlays should be siblings of a main node that has the "data" node as one of its children (sketched below).
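
For concreteness, a minimal sketch of the two layouts (the visuals and names are illustrative, not napari's actual scenegraph code):

from vispy import scene
from vispy.scene.visuals import Image, Line

# current layout: overlays are children of the data node itself, so swapping
# the data node (e.g. Image -> TiledImageNode) means re-parenting every overlay
data_node = Image()
overlay = Line(parent=data_node)

# proposed layout: a plain root node owns both the data node and the overlays,
# so the data node can be swapped without touching its siblings
root = scene.Node()
data_node = Image(parent=root)
overlay = Line(parent=root)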

jni avatar Nov 07 '25 10:11 jni

I managed to avoid the issue with reparenting the tiled nodes by just checking the type of the children and skipping the tiles. I have now also removed the "forced" updates of the cmap and contrast to have a clean state. With these two changes, the implementation is functional except for the initialization: at the moment it is necessary to go through one step of data change, e.g. one round of transpose, for the data to become visible and the rendering changeable. I didn't manage to figure out exactly what on_data_change triggers that is not triggered before.

guiwitz avatar Nov 10 '25 11:11 guiwitz

So, proposed solution: overlays should be siblings of a main node that has the "data" node as one of its children.

Seems reasonable to me, if I understand correctly!

To follow up on the discussion above on multiscale vs tiling (and to annoy @jni by repeating this again xD): multiscale and tiling should be separate concepts, and right now they are entangled, which makes this more complicated. IMHO the ultimate version of this would work this way:

  • user passes an arbitrary image, with or without pyramid
  • napari takes this image and passes it to vispy glue code
  • vispy glue code chunks the image into tiles and creates a tiled node (like the one in this PR)
  • individual tiles are lazily loaded depending on camera position; tiles are also loaded if they are just outside the view, so we don't get those empty areas when moving the camera that we currently get with multiscale tiling (see the sketch below). This would also solve our current performance problems with multiscale tiling, because it would load pre-chunked tiles and not slice out a new tile every time the camera moves 1 pixel.
  • if the data is a pyramid, zoom may also affect what data level is used for the tile and the density of its pixels (but importantly this is independent/unrelated to everything above).

This should cover our needs for both our current tiling and what this PR is aiming to achieve. I think this PR is a good setup for all this.
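
To make the lazy-loading bullet above concrete, here is a minimal sketch of picking which tiles to load for a given camera view (the function and its arguments are hypothetical, not existing napari or vispy API):

import math

def tiles_to_load(view_rect, tile_size, grid_shape, pad=1):
    """Return (row, col) indices of tiles intersecting the view rectangle,
    padded by `pad` extra tiles so neighbours are ready before they scroll in."""
    (x0, y0), (x1, y1) = view_rect  # view corners in data coordinates
    col0 = max(math.floor(x0 / tile_size) - pad, 0)
    row0 = max(math.floor(y0 / tile_size) - pad, 0)
    col1 = min(math.ceil(x1 / tile_size) + pad, grid_shape[1])
    row1 = min(math.ceil(y1 / tile_size) + pad, grid_shape[0])
    return [(r, c) for r in range(row0, row1) for c in range(col0, col1)]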

brisvag avatar Nov 10 '25 15:11 brisvag

if the data is a pyramid, zoom may also affect what data level is used for the tile and the density of its pixels (but importantly this is independent/unrelated to everything above)

For the best experience, I think we may want these to work together. This is important when mixing chunks from different levels, as they will have different spatial coverage, particularly for blending across channels with different tiles and scales loaded. I might be missing some ideas, but for a given channel I would expect to need some tricks with the depth/stencil buffer or polygon offset to ensure the best (most appropriate) available level of detail is drawn.

aganders3 avatar Nov 10 '25 20:11 aganders3

Not sure I follow... All tiles should always be from the same LOD, you should never have mixed levels. Am I misunderstanding?

brisvag avatar Nov 11 '25 09:11 brisvag

Yeah mixed-LOD would only be a transient state while loading remote data, for example. If you zoom in I think it's better to continue showing the lower-res tiles until the higher-res data loads, and it will feel more responsive if you can see the higher-res tiles filling in as they are ready.

aganders3 avatar Nov 11 '25 15:11 aganders3

Just as a follow-up to our discussion @jni, I have now suppressed the check of whether the data have changed or not. On top of the fact that it prevented correct updates, after doing some timing tests I realized that the check itself was really slow: for a 30kx60k image, just checking data equality took 4s! What I did now instead is avoid recreating the tiles for every data update. Recreating them every time had two issues: it meant copying all the properties to the new tiles (colormap, intensity range, etc.), and it was slow. So now I check whether the number of tiles has changed, and if it has not, I only update the data on the existing tiles. Now even rotation is reasonably fast and everything else seems functional.
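
In rough pseudo-code, the update path now looks something like this (function, method, and attribute names are illustrative, not the PR's actual ones):

def update_tiles(node, data, tile_size):
    # number of tile rows/columns needed for this data (ceil division)
    grid = (-(-data.shape[0] // tile_size), -(-data.shape[1] // tile_size))
    if grid != node.tile_grid:
        # layout changed: recreate the tiles and re-apply colormap,
        # contrast limits, etc. (the slow path)
        node.rebuild_tiles(data)
        node.tile_grid = grid
    else:
        # layout unchanged: just push new data into the existing Image
        # children, which is much cheaper than recreating them
        for tile, (y, x) in zip(node.tiles, node.tile_offsets):
            tile.set_data(data[y:y + tile_size, x:x + tile_size])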

I realized however that this only covers images. If one tries to add a labels layer, one again gets some (other) GL error. I tried to create an equivalent TiledLabelNode and I'm slowly progressing, but before I continue I just wanted to check whether that's the right direction or if I'm overcomplicating things. It seems (unsurprisingly) that labels layers behave quite differently from image layers, so just recycling the TiledImageNode wasn't sufficient.

guiwitz avatar Nov 14 '25 11:11 guiwitz

I would say let's leave labels until a later PR because, as you have noticed, there's a lot more stuff going on with labels layers. Instead, try to maintain the existing behaviour when the layer is a labels layer.

jni avatar Nov 14 '25 12:11 jni

@guiwitz I didn't have time to work on this today but will in the next couple of days. Please let me know if you would like e.g. help getting the tests to pass, or with other things. Thanks for all your work! 🙏

jni avatar Nov 17 '25 13:11 jni

Codecov Report

Patch coverage is 98.48485% with 1 line in your changes missing coverage. Please review. Project coverage is 93.00%. Comparing base (70e5fcc) to head (4a67520).

Files with missing lines                   Patch %   Lines
src/napari/_vispy/layers/tiled_image.py    97.56%    1 Missing
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8395      +/-   ##
==========================================
+ Coverage   92.95%   93.00%   +0.04%     
==========================================
  Files         712      713       +1     
  Lines       63871    63927      +56     
==========================================
+ Hits        59374    59455      +81     
+ Misses       4497     4472      -25     

View full report in Codecov by Sentry.

codecov[bot] avatar Nov 17 '25 23:11 codecov[bot]

@jni I think I fixed most of the issues. Two tests are failing, but my impression is that the failures are unrelated to the PR itself; maybe you can have a look.

I modified the test that used to check that very large images were downscaled. I'm now just checking that in that case the node is indeed a TiledImageNode, but maybe we can do better.

I also wasn't sure how I should update my branch: with a rebase or with a merge commit?

Finally, I see now that, even though I didn't touch the labels layer, I get an error message when trying to just display a large labels image. I will fix that, and hopefully this PR will then be out of WIP.

guiwitz avatar Nov 18 '25 08:11 guiwitz

I fixed the issue with the labels layer. I had to add back the downscaling for that specific case since the Labels and Image layers share the scalar_field logic. All tests are passing, so @jni feel free to have a look and tell me if there's anything to change or fix.

guiwitz avatar Nov 18 '25 10:11 guiwitz

Amazing @guiwitz! Reviewing now! The code looks very close. I will spend more time on it, but I wanted to flag this issue in case I get interrupted: if you run python examples/viewer_fps_label.py and then turn on transform mode with this PR, the interaction box coordinates are shifted:

[screenshot: interaction box drawn offset from the image]

However, hovering shows that the actual location of the box is correct (i.e. the points are actually on the corners of the image pixels), and as soon as you move a vertex, the whole box snaps to the correct position. So it looks like there is an update to the node children that is missing in this PR.

jni avatar Nov 19 '25 06:11 jni

@jni I noticed I get the same behavior on main. I went a few versions back and it seems the problem appeared in ~~napari=0.6.1~~ napari=0.6.2, so it seems unrelated to this PR.

guiwitz avatar Nov 19 '25 08:11 guiwitz

Hmmm, I don't see the same behaviour on main... Are you sure??

Oh wait! I am on latest main, which includes #8423, which fixes the problem!!! Merging main into this branch also fixes it! Sorry for the noise! But that's great! 🤩

jni avatar Nov 19 '25 12:11 jni

(note: I've pulled main into this branch, so be sure to pull if you want to make further changes)

jni avatar Nov 19 '25 12:11 jni

My bloody mouse isn't dragging right, so I submitted the review prematurely. I'll take that opportunity to summarize some loose ends:

  1. Grid mode is broken. Images seem to be transformed in space but then escape the viewboxes, or something (I thought these were discrete and separate). Note how you can fully drag an image in the right grid cell visibly through the entire canvas. The colorbar is also broken in grid mode (but seems ok in stack mode): not all of the colorbars appear. This was also observed in the single-image script after adding a labels layer (downsampled, of course), where the transforms were also not correct. @brisvag, since you are the most recent person to work on both grid and overlays, @guiwitz agreed that it would be great if you could take a look and potentially submit fixes or work with him directly.
  2. Quiet crashing when RAM is exceeded. It would be nice if we could somehow prevent such a bad outcome; at minimum, a warning would be superb.
  3. Roll/transpose leads to a surprising result, but my brain doesn't wrap around this concept very well.

Otherwise, I think this is surprisingly smooth and performant, even when trying to paint a downsampled labels layer on top of a tiled image. I've run into this texture limit many times, especially on laptops without dedicated GPUs, which would nonetheless have plenty of RAM to service this texture tiling nicely. I tested various layer attributes (i.e. opacity, blending, gamma, scale, etc.) and all seem great, including with multiple layers. I also tried toggling to other modes, like 3D.

Follow-ups:

  1. Add API to query a tiled property of the layer
  2. Add documentation explaining a few things, like when tiling is triggered, how VRAM/max texture size differs between machines, and the limitations with respect to RAM

TimMonko avatar Nov 19 '25 17:11 TimMonko

The colorbar tiling is likely not an issue with this PR specifically, but a bug also present on main; see #8434

TimMonko avatar Nov 19 '25 19:11 TimMonko

Ok, I finally got a chance to play with the grid mode issue. It looks like, for some reason, the tiled image is not respecting the grid "limits", while normal images do. See this video:

https://github.com/user-attachments/assets/c581df6f-1a8a-47f8-ba39-fc3db4e8d9b3

Note: it was a complete coincidence that I picked "astronaut" to travel through the galaxy, but now I love it. 😂

Note also how the astronaut correctly disappears when it reaches the middle of the screen, while the galaxy spills over across the whole screen.

@brisvag I presume this is because we're setting something incorrectly (or rather, passing a getattr through to the first child node instead of returning the "aggregated" answer) in the parent node... Any immediate thoughts?

jni avatar Nov 24 '25 12:11 jni

I was having this issue as well with overlays, and I could only solve it properly by destroying them and recreating them, instead of reparenting (which seemed to not work properly).

I suspect this was ultimately the reason why I added the code that slows down the scenegraph shuffling so much, which partly caused https://github.com/napari/napari/issues/8421 :/

I need to investigate this whole thing further.

brisvag avatar Nov 24 '25 12:11 brisvag

Ok @brisvag @guiwitz, first step: here's a VisPy-only example where I parent the Images to the scene, not to an intermediate node, and then there's no spillage across viewboxes:

https://gist.github.com/jni/ba2923cef7f423c5906f7de1cad0d150

Going to try using a TiledImageNode like in this PR now and see how that plays out.

jni avatar Nov 26 '25 09:11 jni

... Success!!!!

[screenshot of the VisPy-only reproducer]

Of a sort: it's a VisPy-only reproducer. If we parent the images directly to the scene, it works fine; if we parent them to a Node and that Node to the scene, there is spillage.

https://gist.github.com/jni/978d2fa0ae1e0c33c0a5539a162ee692

@brisvag can you investigate on the VisPy side? On the napari side, we could add some machinery to directly parent all of the Images to the scene, but it would be best to avoid that, I think...
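
For anyone who doesn't want to open the gists, here is a minimal sketch of the two setups being compared (an illustrative reconstruction, not the gists' exact code):

import numpy as np
from vispy import app, scene

canvas = scene.SceneCanvas(keys='interactive', size=(800, 400), show=True)
grid = canvas.central_widget.add_grid()
view1 = grid.add_view(row=0, col=0, camera='panzoom')
view2 = grid.add_view(row=0, col=1, camera='panzoom')

data = np.random.rand(512, 512).astype(np.float32)

# case 1: Image parented directly to the viewbox scene -> clipped correctly
scene.visuals.Image(data, parent=view1.scene)

# case 2: Image parented to an intermediate Node -> reported to spill
# across viewboxes when panning, as discussed above
group = scene.Node(parent=view2.scene)
scene.visuals.Image(data, parent=group)

view1.camera.set_range()
view2.camera.set_range()

if __name__ == '__main__':
    app.run()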

jni avatar Nov 26 '25 09:11 jni

Looking into the machinery behind this, it does indeed look like there is some clipping based on bounds:

https://github.com/vispy/vispy/blob/25ef9915ae3affb499f2560fa6d73ebf0be20b06/vispy/scene/widgets/viewbox.py#L54-L57

We might indeed have an issue in vispy with reporting bounds correctly across the scenegraph. If it's not a Python issue, then we need to dig deeper into the shader implementation of this Clipper class...

brisvag avatar Nov 26 '25 11:11 brisvag

aha! Super interesting @brisvag! It's bedtime here but I'll try to look at that more closely tomorrow. "clip_children" looks like it should be taking care of it, but who knows...!

jni avatar Nov 26 '25 13:11 jni