
allow "rgb depth" view

Open machenmusik opened this issue 1 year ago • 13 comments

For methods that provide "rgb" to the viewer as well as "depth", this PR allows selecting an "rgb depth" output in the viewer to see both side by side.

machenmusik avatar Feb 06 '23 21:02 machenmusik

This will cause the scene to get squashed. Also, when you overlay the dataset, it won't align correctly. I'm not sure what the correct way to handle this is.

[image]

tancik avatar Feb 06 '23 23:02 tancik

Side by side "squashing" is OK in this case IMO since the user is specifically requesting it.

It is a good point that the image overlay may need to be presented on only one side, if that is desirable.

In theory the technique should allow seeing two views more generally; I just chose "rgb depth" as the two things I wanted to see at once.

machenmusik avatar Feb 07 '23 02:02 machenmusik

The change to make the overlay match would be pretty simple in ViewerWindow.jsx, but for some reason the "rgb depth" setting isn't correctly visible to handleResize:

  const output_choice = useSelector((state) => state.renderingState.output_choice);

  const handleResize = () => {
    // FIXME: output_choice isn't being detected as "rgb depth" here, but rather as "rgb"
    const widthDivisor = output_choice === 'rgb depth' ? 2 : 1;
    const viewportWidth = get_window_width() / widthDivisor;
    const viewportHeight = get_window_height();

machenmusik avatar Feb 07 '23 05:02 machenmusik

An alternative option would be to do "splitview", where we just crop the image, rather than squash, concat(rgb[:,:width/2,:], depth[:,width/2:,:]).
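
As a minimal NumPy sketch of the cropped splitview described above (an illustration only, not the viewer's actual code): it assumes both images are same-sized (H, W, 3) arrays and that depth has already been run through a colormap. The name `crop_splitview` is made up for this example. Note the integer division, since `width/2` is a float in Python 3 and can't be used as a slice index.

```python
import numpy as np


def crop_splitview(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Join the left half of `rgb` to the right half of `depth` (both H x W x 3)."""
    if rgb.shape != depth.shape:
        raise ValueError(f"shape mismatch: {rgb.shape} vs {depth.shape}")
    half = rgb.shape[1] // 2  # integer column index of the seam
    # Each source keeps its own columns, so nothing is squashed or shifted.
    return np.concatenate([rgb[:, :half, :], depth[:, half:, :]], axis=1)


# Tiny stand-in images: rgb is all ones, colormapped depth is all zeros.
rgb = np.ones((4, 6, 3), dtype=np.float32)
depth = np.zeros((4, 6, 3), dtype=np.float32)
split = crop_splitview(rgb, depth)
```

The output has the same shape as either input, which is why this variant leaves the FOV and the dataset overlay alignment untouched.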

tancik avatar Feb 07 '23 06:02 tancik

> An alternative option would be to do "splitview", where we just crop the image, rather than squash, concat(rgb[:,:width/2,:], depth[:,width/2:,:]).

I didn't want to change the viewer behavior with respect to the FOV it shows, etc. But another way to do it would be to shrink the webrtc view as well as the overlay, preserving aspect ratio.

machenmusik avatar Feb 07 '23 13:02 machenmusik

> But another way to do it would be to shrink the webrtc view as well as the overlay, preserving aspect ratio.

Implemented a version of this, which I think looks pretty good. Feel free to merge if you agree.
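
The implementation itself isn't shown in this thread. As a rough sketch of the arithmetic behind shrinking both views while preserving aspect ratio, assuming each output gets half the window width (`twin_panel_size` and its parameters are hypothetical names, not viewer code):

```python
from typing import Tuple


def twin_panel_size(window_w: int, window_h: int, render_w: int, render_h: int) -> Tuple[int, int]:
    """Size of one panel when two equal panels sit side by side in the window."""
    slot_w = window_w / 2  # each output gets half of the window width
    # Uniform scale: limited by whichever dimension runs out first.
    scale = min(slot_w / render_w, window_h / render_h)
    return round(render_w * scale), round(render_h * scale)


panel = twin_panel_size(1600, 900, 1600, 900)
```

With a 1600x900 window and a render of the same aspect ratio, each panel comes out at 800x450, so half of the window height is left unused.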

machenmusik avatar Feb 07 '23 13:02 machenmusik

I think it is a cool idea; I'm just not sure I'm ready to commit to it. My primary concerns are the following:

  • The logic for viewing should be general rather than specific. This is one reason I've started updating the colormap logic, so that it can render any data. Going forward, I'm thinking we add handling for higher-dimensional data (>3 dims) with preset colormaps like PCA and clustering. This means improving the interface for colormap message passing. I worry that the logic gets a bit muddled when you have an output that "half" uses a colormap. Currently, for example, you cannot change the colormap, so the depth isn't meaningful (in the screenshot below).
  • Similar to the above, this assumes the user has an output that is named "rgb" and "depth". It could be confusing to method developers why this one specific case produces a new output, but other combos don't work.
  • It is odd that the interactivity only works for the rgb output, not the depth.
  • The large black crop at the bottom is distracting; it kind of looks like a bug.
[image]

It's a bit of a divergence from what you had done so far, but one more general approach I can imagine is to set up the viewer like this, [image] where the user can set the left and right render choices. Each choice would also show the different colormap options.

tancik avatar Feb 07 '23 17:02 tancik

Unrelated (but I couldn't find a better place to message you), you have been adding a lot of great contributions to nerfstudio. If you shoot me an email to [email protected], we can open a more direct line of communication.

tancik avatar Feb 07 '23 17:02 tancik

Generally agree, some comments:

> It's a bit of a divergence from what you had done so far, but one more general approach I can imagine is to set up the viewer like this, where the user can set the left and right render choices. Each choice would also show the different colormap options.

Right, I think cropped splitview is a distinct but related idea, I was trying for more like video editor dual view where you want uncropped to see everything. (Specifically, so you can switch between single and dual without having to move the camera to see the same things.) I think both approaches have merit for slightly different use cases. What's here could probably be repurposed for either.

> this assumes the user has an output that is named "rgb" and "depth". It could be confusing to method developers why this one specific case produces a new output, but other combos don't work.

Agreed, although RGBD is specifically interesting to render IMO.

> The large black crop at the bottom is distracting; it kind of looks like a bug.

Centering vertically so it looks like proper letterboxing should be straightforward, if that would help.
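
For illustration, the centering amounts to splitting the spare height into two bars. A minimal sketch, with `letterbox_bars` as a hypothetical helper name rather than anything in the viewer:

```python
from typing import Tuple


def letterbox_bars(container_h: int, content_h: int) -> Tuple[int, int]:
    """Heights of the (top, bottom) bars that center the content vertically."""
    spare = max(container_h - content_h, 0)  # no bars if the content fills the container
    top = spare // 2
    return top, spare - top  # any odd leftover pixel goes to the bottom bar


bars = letterbox_bars(900, 450)  # → (225, 225)
```

The top value would be the vertical offset for both the rendered view and the overlay, so the two stay aligned.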

> It is odd that the interactivity only works for the rgb output, not the depth.

Replace "rgb" with "left" and "depth" with "right". I think splitview might have the same oddity. Stretching interactivity across full width when the overlay isn't might feel strange in some cases, but would probably be fine for drag to pan/move etc.

machenmusik avatar Feb 07 '23 21:02 machenmusik

> Right, I think cropped splitview is a distinct but related idea, I was trying for more like video editor dual view where you want uncropped to see everything. (Specifically, so you can switch between single and dual without having to move the camera to see the same things.) I think both approaches have merit for slightly different use cases. What's here could probably be repurposed for either.

What if we provide a slider that lets you swipe to control how much of each split you see? This would allow you to fairly easily see two different outputs from the same viewpoint.
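
A sketch of what the slider would do to the composited frame, assuming two same-sized (H, W, 3) images rendered from the same camera. `slider_splitview` and `fraction` are hypothetical names; this generalizes a fixed midpoint split to a movable seam.

```python
import numpy as np


def slider_splitview(left: np.ndarray, right: np.ndarray, fraction: float) -> np.ndarray:
    """Show `left` up to the slider position and `right` after it."""
    if left.shape != right.shape:
        raise ValueError(f"shape mismatch: {left.shape} vs {right.shape}")
    fraction = min(max(fraction, 0.0), 1.0)  # clamp the slider to [0, 1]
    seam = round(fraction * left.shape[1])  # seam column in pixels
    return np.concatenate([left[:, :seam], right[:, seam:]], axis=1)


left = np.ones((2, 8, 3))
right = np.zeros((2, 8, 3))
quarter = slider_splitview(left, right, 0.25)  # first 2 of 8 columns come from `left`
```

Any given pixel only ever shows one of the two outputs, which is the limitation to keep in mind for side-by-side comparison of the same region.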

tancik avatar Feb 07 '23 22:02 tancik

For my use case, I need to see RGB and depth for the same region at once, so a slider wouldn't work for that.

IMO the two most useful views are probably twin-scaled (what is currently implemented) and twin-cropped (which is what you first proposed).

May need some refactoring to support arbitrary twin views. Will give it more thought later.

machenmusik avatar Feb 07 '23 23:02 machenmusik

Upon further reflection... The PR changes already allow specifying a whole different set of

            reformatted_output: which output tensor to use
            output_type: which output type to use
            colormap_type: which colormap type to use
            colormap_normalize: whether to normalize the colormap
            colormap_invert: whether to invert the colormap
            colormap_range: colormap range to use, if any

so if someone wants to take a stab at UI for it and gets the values to the viewer, it should be trivial to wire up. I'd suggest starting with full splitview (just because it's here already), and maybe adding a checkbox for cropping instead.
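
To make the shape of that data concrete, here is a hedged sketch of what the UI would need to send for each side. The field names mirror the parameter list above; the `ViewSettings` and `SplitViewSettings` classes, the `crop` flag, and every string value are hypothetical and not taken from the PR.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ViewSettings:
    """Hypothetical per-view settings; field names mirror the list above."""

    reformatted_output: str  # which output tensor to use
    output_type: str  # which output type to use
    colormap_type: str  # which colormap type to use
    colormap_normalize: bool = False  # whether to normalize the colormap
    colormap_invert: bool = False  # whether to invert the colormap
    colormap_range: Optional[Tuple[float, float]] = None  # None: no explicit range


@dataclass
class SplitViewSettings:
    """Hypothetical pair of views plus the suggested cropping checkbox."""

    left: ViewSettings
    right: ViewSettings
    crop: bool = False  # False: both views shown in full; True: cropped splitview


# All string values below are placeholders, not confirmed viewer values.
settings = SplitViewSettings(
    left=ViewSettings("rgb", "placeholder-type", "placeholder-colormap"),
    right=ViewSettings("depth", "placeholder-type", "placeholder-colormap", colormap_normalize=True),
)
```

Keeping left and right as two instances of the same type avoids special-casing "rgb depth" and would let any pair of outputs be shown.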

machenmusik avatar Feb 08 '23 19:02 machenmusik

> Currently, for example, you cannot change the colormap, so the depth isn't meaningful (in the screenshot below).

What I can do for now is unlock colormap settings when in this mode, which I think makes it more useful per discussion.

machenmusik avatar Feb 09 '23 15:02 machenmusik

Closing in favor of #1898

tancik avatar May 10 '23 18:05 tancik