rerun Implement graph components and archetypes

This implements basic graph primitives in the Rerun data model. This is a first step towards visualizing node-link diagrams in Rerun (related issue: #6898).

In addition to the changes to the data model, this PR adds two example crates, node_link_graph and graph_view to the Rust examples that show how these primitives can be used.

Design Decisions

Nodes and edges are stored as components and can be batched. To have a single node per entity we can use Rerun’s [clamping mechanism](https://rerun.io/docs/concepts/batches#component-clamping).
GraphNodeId is modeled as ~u32 to improve performance when using petgraph~ strings for better user experience. Internally we need to map to petgraph indices anyways.
Labels of the nodes can be set via the labels component and toggled via show_labels
Hierarchical graphs can be modeled through entity paths. For edges that cross entity boundaries we can insert dummy nodes to properly render subparts of the hierarchy.
By default, node ids will be local to the current entity. We provide functionality to link nodes of different entities by specifying the entity path of a node as a namespace.

TODOs

[x] ~Get rid of the Default derive for GraphNodeId and GraphEdge in the flatbuffer definitions.~
[x] Improve ergonomics for generating graph edges during logging.
[ ] Ensure that logging works from Python and C++ too.
[x] Fix remaining lints.

Checklist

[x] I have read and agree to Contributor Guide and the Code of Conduct
[ ] I've included a screenshot or gif (if applicable)
[ ] I have tested the web demo (if applicable):
- Using examples from latest main build: rerun.io/viewer
- Using full set of examples from nightly build: rerun.io/viewer
[x] The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
[ ] If applicable, add a new check to the release checklist!
[ ] If have noted any breaking changes to the log API in CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.

Sep 24 '24 11:09 grtlr

In this design, are the node id's global?

Sep 24 '24 12:09 nikolausWest

@nikolausWest Good point! So far I've made the assumption that all node IDs are global and can be referenced across entities to allow edges that connect nodes from different entities (similar to ClassId).

We could probably use a namespaced approach (e.g. based on entity path) by changing the way we gather the nodes from the entities that are currently being visualized. Then, we would have to store this entity information in the edges though.

However, it's probably simpler for the user to encode this information in the node IDs themselves. In the case of String IDs, this could be done by appending the entity path as a suffix. If we stick with IDs based on integers users could for example use factorizations.

Do you have a particular use case for namespaced node IDs in mind? How would you expect Rerun to behave in that case?

Sep 24 '24 13:09 grtlr

So far I've made the assumption that all node IDs are global and can be referenced across entities to allow edges that connect nodes from different entities (similar to ClassId).

Class Ids aren't actually global in the sense that to resolve a class id into e.g. a color, we walk up the graph to the first Annotation Context and use that to look up the value. That means these are actually scoped.

Do you have a particular use case for namespaced node IDs in mind? How would you expect Rerun to behave in that case?

I'm rather thinking about what happens if a user has multiple graphs. It would be a bit strange if edges started going between them because the user didn't correctly partition their node ids.

Sep 24 '24 15:09 nikolausWest

One option to consider here could be to have two types of edges. One that is within-entity edges (just a pair of node ids). The other could be between entity edges (destination entity, optional(source id), optional(destination id)). There might be several ways to model that but that would at least be the general idea

Sep 24 '24 15:09 nikolausWest

@nikolausWest Brief update:

Edges now have two additional attributes source_entity and target_entity as you described above. That way we can:

Make edges local to an entity by default to avoid collisions.
Allow linking between nodes of different entities.

The following shows a new toy example of the new logging API, which I also streamlined: https://github.com/rerun-io/rerun/pull/7500/files#diff-d054c306b388fcc1e8daf9f0477735519df3eeb486030979c62478b5d43404dcR36-R66.

I've also improved the debug graph viewer to show lists of nodes, edges, and their corresponding entity path. I'm currently working on getting overrides in the UI to work so that we can color nodes and edges to more easily understand the data model. This also helps me better understand the visualizer concepts.

I have one outstanding design decision:

The current systems allows for edges to live in completely unrelated parts of the entity hierarchy. This means that when the user choses to visualize a certain sub-entity not all edges are retrieved by default and the user has to manually specify the additional entities that contain the edges using Entity Path Filters:

+ /doors/**                <- contains the global edges
+ /hallway/areas/**        <- the actual entity that the user wants to visualize

Since this can be confusing due to the lack of discoverability, I think we should pull in global edges from outside the current hierarchy and visualizing them as dummy nodes. In the current design this forces us to iterate over all edges starting from the root. Should we mitigate this by introducing a new GraphCrossEntityEdge component, similar to what you described above?

Sep 25 '24 15:09 grtlr

@grtlr I actually think having edges on different entities than nodes is the main reason to go with your proposal. That allows users to put different meta data on edges and nodes which is a very important feature. If you can think of a way to still keep it more local that might be worth while.

Since this can be confusing due to the lack of discoverability, I think we should pull in global edges from outside the current hierarchy and visualizing them as dummy nodes. In the current design this forces us to iterate over all edges starting from the root.

I'm not completely sure about this. Maybe @Wumpf has some thoughts?

Sep 25 '24 19:09 nikolausWest

Haven't been following the entire discussion, but there's a lot of issues with pulling data from outside of a viewer's query/entity-pathfilter:

changes the query mechanism from going through a higher level abstraction to direct store queries
- we actually want to get to a place where we can predict the query of a view perfectly just from looking at how its configured, not knowing its specific type
- we're quite far from that and there's existing violations of that rule, 3d transforms being the most prominent one
blueprint can't be applied to everything outside the path filter
- blueprint overrides have no effect
- blueprint defaults on the View have no effect
if we pull data outside of the path filter, how to stop certain data from being ingested?
- control for that is blueprint
- does this cause a huge query?
- how do I display independent graphs in independent views?

Sep 26 '24 12:09 Wumpf

I think it's pretty clear we shouldn't include data from outside the entity path query. I think one question that leaves though is how to handle edges that are within the included entities but refer to nodes that are not. Perhaps some kind of greyed out nodes could make sense there (maybe even an option to include or exclude those)

Sep 26 '24 12:09 nikolausWest

Thank you @Wumpf for the clarification—that makes a lot of sense!

@nikolausWest I agree that we should show those as edges to some dummy nodes. In fact this is how I stumbled upon the above problem in the first place (undirected edges).

Sep 26 '24 14:09 grtlr

@emilk, @Wumpf: @nikolausWest asked me to post this here to open up the discussion:

I wanted to sync with you to map out the next steps of the graph implementation. It would be awesome if we could settle on a definition-of-done (DoD) for the initial version. I'm happy to add features and improve my work going forward, but I think having a milestone to work against will help a lot.

There are several areas that I've worked on and that will require future work:

Data model

Overall I think the data model is finished. Becaue of the use of flatbuffers we can easily add fields like edge and node weights in the future. There are currently some fields in the model like position that are not used, I would simply remove them before we merge the changes (depending on the DoD). If the current design is approved, I would start writing some helper functions for the Python and C++ logging libraries.

Graph Viewer

The basic graph viewer is working correctly. It has the following features:

Dragging of nodes
Can show directed/undirected edges (for now only straight lines)
Highlights nodes and selections
Toggle between node labels on/off.
Shows dummy nodes (nodes that are part of edges that live outside of the current selected entity hierarchy)

The code for the graph viewer still requires quite a bit of cleanup as I was still experimenting with the different layout algorithms, but I think I have a pretty good idea on what needs to be done—and I'm currently working on that.

Graph layouts

This is the biggest outstanding area, as the existing crates in the Rust ecosystem are all lacking—I implemented layout providers for all apart from graphviz-rust:

layout-rs (GraphViz-like layouts) does only expose the node positions and not the control points for the edges. We would have to fork this project.
graphviz-rust requires the GraphViz CLI to be present, which adds complexity to the build for native releases, and would require us to mainain a separate version for the web viewer. There is a WASM version of GraphViz but I still think this solution would have too many moving parts.
fdg and fdg-sim (force-based layouts): These are the most promising—but they still have some significant limitations for our use cases. The currently released version, fdg-sim is missing some flexibility (e.g. does not allow adding custom forces). There is an unreleased version, fdg, that fixes some of these shortcomings. However, dynamically adding and removing edges will force us to modify some of that libraries internal state. It is also missing a collision force as well as positioning forces (which we need for properly visualizing disjoint graphs). Another risk is, that there are not indications that the unreleased version will be released in the next time, as development has slowed down.

Unfortunately, I think the features provided by the existing crates don't really suffice for our needs.

Ideally we would want to have a Rust version of: https://d3js.org/d3-force. Porting it over should be pretty straight forward, as it is not too much code. To efficiently find collisions, we would also need a quadtree. As per @Wumpf we don't yet have an implementation in Rerun, and I don't know how good the existing implementations on crates.io are—do you have experience with quadtree crates in Rust?

Summary

Overall, I don't think it makes sense to merge the graph primitives without having at least one basic (but robust) layout algorithm implemented to visualize them—we just need to consider if it is worth adding this additional complexity to Rerun.

Oct 14 '24 13:10 grtlr

We extensively discussed the data model and the next steps with @grtlr this morning. Here is our proposal.

edit: proposal heavily discussed, amended, and ✨ decided ✨ with @jleibs

Data model

table GraphNode {

        // required

    /// List of nodes identified by a string.
    nodes: [rerun.components.Node]; // datatype: Utf8

    // Additional motivation for Utf8: could in the future encode "foreign" node (e.g.
    // "/other/graph#node1), enabling some kind of "hierarchy of graphs" display modelled after
    // the entity hierarchy.

    // Rejected alternative: use integer id instead of string: would be better performance-wise,
    // but that can be alleviated by using interned string, would force using a `Label` component
    // which would have to be unique and couldn't then be used for display purposes, decision made
    // based on feedback from ROSBAG person

        // future

    /// Override auto-layout position (great to debug force-based auto-layout algorithms!)
    positions: [rerun.components.Position2d];

    /// Used by (e.g. force-based) layout engine.
    ///
    /// May also be used for node size display (unless overridden by `node_radius`)
    weights: [rerun.components.Scalar];


    // Bunch of styling fields (blocked on tagged components)
    radii: [rerun.components.Radius];
    colors: [rerun.components.Color];
    labels: [rerun.components.Text];
    show_labels: rerun.components.ShowLabel;
}

table GraphEdge {

        // required

    /// Links between nodes (node will be auto-spawned if not explicitly logged in a `GraphNode` archetype.
	edges: [rerun.components.Edge]; // rerun.datatypes.StringPair { strings: [str; 2]; }

    // Rejected alternatives:
    // - Using two `directed_edges` and `undirected_edges` `Edge` fields: would allow heterogenous
    //   graphs, but very rare in the real world, complicated for most layout engine, for the rare case were
    //   it's needed, can easily be worked around on the display side using `edge_styles` (see below).
    // - Referring to node by index: very painful on logging end when you have dynamic graphs, where
    //   any node can appear/disappear.
    // - Any use of entity reference: unnecessary complexity, addressed by future, generic entity reference
    //   mechanism.
    // - Using adjacency list instead of edge: has advantage for large/dense graphs, but less ergonomic
    //   to log and can possibly be added in the future as an alternative recommended component
    // - Edge which include a `directed: bool` flag: we must be able to know very easily about homogeneity
    //   and graph kind because not all auto-layout are compatible with all kinds for graph

        // recommended

    /// Is the graph directed or undirected
    edge_type: rerun.components.GraphType: // enum { Directed, Undirected }

        // future

    /// How to draw the edge.
    ///
    /// Default to either `--` or `->` depending on `graph_type`.
    styles: [rerun.components.EdgeStyle]; // enum { --, ->, <-, in the future: o-, -o, ... }

    /// Used by force-based layout engine (blocked on tagged components)
    weight: [rerun.components.Scalar];

    colors: [rerun.components.Color];
    labels: [rerun.components.Text];
    show_labels: rerun.components.ShowLabel;   
}

Motivation for the 2-archetypes approach

So many were raised that I'm not sure I kept track of all of them.

Each archetype has a much saner instance join semantics
Each archetype can be usefully logged on its own (Node with weight -> bubble graph, Edge only: nodes with matching IDs are auto-spawned)
Implementation as of today is closer to this version.
Explicit, archetype grouping provides a much better avenue for displaying components (and their instances) in a sane manner in the UI.

Next steps

Validate internally the above data model
Scope down the present PR to a MVP:
- Implement the view as a first-class, in-repo view crate, but must be enabled using an experimental flag.
- Graph archetype only includes fields above the // Future line (critically, doesn't depend on tagged component)
- No node auto-layout: nodes to be arranged on a circle.
- No blueprint-API object/snippets/etc.
Open issues for everything else, improve iteratively in small PRs.

Oct 22 '24 08:10 abey79

Couldn't help myself from having a quick look - it looks promising!

Thanks for the comments on the PR! I'm currently incorporating your feedback and fixing some last things. If you want to try it out, you will need to active the Graph Space View experimental feature in the UI and then you can call cargo run -p node_link_graph -- --connect --example=lattice which is the most complete example for now.

Oct 24 '24 09:10 grtlr

@emilk I was not aware of nohash-hasher before—thank you for pointing it out!

I've implemented nohash_hasher::IsEnabled for NodeIndex:

https://github.com/grtlr/rerun/blob/111b0f58fb50b25d6c282f3453762724855fe795/crates/viewer/re_space_view_graph/src/graph/index.rs#L6-L20

We could also swich to write_usize for less risk for collisions, but I think this should be fine.

Oct 24 '24 10:10 grtlr

@abey79 @jleibs Just a heads up: The only real deviation from our design that I had to do was storing the edges in a table rather than a struct. Unfortunately I couldn't find a way to get the following to work:

namespace rerun.datatypes;

struct StringPair (
    "attr.rust.derive": "Default, PartialEq, Eq, PartialOrd, Ord",
    "attr.rust.custom_clause":
        'cfg_attr(feature = "serde", derive(::serde::Serialize, ::serde::Deserialize))'
) {
  /// The id of the source node.
  pair: [string: 2] (order: 100);
}

Which produces the following error:

error: structs may contain only scalar or struct fields

Oct 24 '24 13:10 grtlr

@abey79 @jleibs Just a heads up: The only real deviation from our design that I had to do was storing the edges in a table rather than a struct.

That's fine, my bad for seeding the wrong idea. This flatbuffer limitation actually has no impact whatsoever, since we are no actually using any flatbuffer. We use the language and flatc because it's convenient, but everything else is custom. The codegen'd stuff has no difference between struct and table (for those cases where flatc actually accepts struct).

Oct 24 '24 14:10 abey79

🥳

Nov 26 '24 08:11 nikolausWest