Objective

Remove MeasureFunc and make custom measurements faster and mode intuitive

Fixed #57

Left:

[ ] add code docs and update changelog
[ ] make sure that names, implementation etc are aligned and agreed upon

Context

I decided to use the approach of storing custom data per measured element and providing an implementation of new Measure trait that has a method measure(&self, data: &StoredData, node: Node, constraint:Size<Option<f32>>) -> Size<f32>

Note that StoredData (just T in code) is going to be owned by Taffy and kept in SparseSecondaryMap. Usually it will be an id of some sort Measure trait implementation can understand.

For example in my project I need to provide measurements for text. Thus Measure trait will be implemented by Font struct and StoredData will be probably a tuple of (TextHandle, FontSize)

Note that StoredData can be an enum in case there are more than 1 type of objects that need to be measured

Feedback wanted

Names
- I'm not a native speaker and not sure that Measure.measure makes sense from lang point of view, would be awesome if somebody more competent would say "totally" or "no, here is a better name"
- I'm not happy with SImpleTaffy as an alias for Taffy<NoMeasure>. That might be important for demos and examples where majority of the cases do not involve custom measurements (thus, just Taffy might be better), on the other hand, in all more or less practical use cases there is a need for custom measurements (especially when dealing with text). Hence, it is possible that Taffy is a better alias for "unmeasured" case and something like MeasuredTaffy for "measured" case.
Architecture
- I don't see a problem with the current approach when it comes to multithreading, but maybe I'm missing something critical. StoredData is going to be owned by Taffy and it is up to the caller to ensure threadsafety for Measure trait implementation if compute_measured_layout needs to be called from a different thread, in other words, multithreading is possible and it is completely outside of the library's API surface area.
- I think with that approach there is no need for alloc feature, unless I'm missing something it was only needed for Boxed variant of MeasureFunc

Jul 10 '22 03:07 twop

cc @alice-i-cecile and @jkelleyrtp

Jul 10 '22 03:07 twop

If the direction feels right, I'm happy to polish the PR (add code docs, changelog and maybe more tests)

Jul 10 '22 03:07 twop

When I was thinking about ripping out measure_func, the API I had in mind was passing in a closure that returned a Size when requested of a Node.

fn compute_layout(&mut self, mut measure: impl FnMut(Node) -> Option<Size>);

The downside here is that we would be querying every node size, while here you have some memoization when the node is built. We could combine both, and only query the measure func when a node has been marked as "should be measured".

I personally am not a fan of more traits and making the top level Taffy instance generic. But I do really like the memoization thing you came up with.

Would a more passive API like this fit your use case? Are you able to link up the Taffy Node back to the Node from your tree?

Edit: Generally, Taffy shouldn't really be storing any of the Node data, and the fact that a measure func needs to be provided is evidence of the shortcoming with this method. In reality, we should let the developer provide their own tree and we just layout it for them.

Jul 10 '22 04:07 jkelleyrtp

Generally, Taffy shouldn't really be storing any of the Node data, and the fact that a measure func needs to be provided is evidence of the shortcoming with this method. In reality, we should let the developer provide their own tree and we just layout it for them.

Agreed here. This is discussed in #182.

Jul 10 '22 04:07 alice-i-cecile

Would a more passive API like this fit your use case? Are you able to link up the Taffy Node back to the Node from your tree?

Of course! because it will be a strictly better than passing all the context into a MeasureFunc::Boxed, however that will require keeping mapping of Node -> DataNeededToMeasure manually for all users. So I personally think it that will negatively impact ergonomics of the library and make it more error prone. For example: each users is responsible to remove Node from their map when element is removed from Taffy.

One dealbreaker for me though: users must be able to mix different measure function approaches within a single tree.

I'm not sure how this approach (keeping StoredData) will hinder that capability? Could you elaborate?

Jul 10 '22 04:07 twop

Generally, Taffy shouldn't really be storing any of the Node data

From my personal experience (including using Yoga) I always wanted to store some small metadata with the nodes I need to measure (usually that support Copy + Clone). For example: (TextHandle, FontSize, CanWrap) in some(most?) cases it can eliminate having to have custom mapping from Node -> (TextHandle, FontSize, CanWrap).

BUT, if this idea is controversial, I'm happy to rework the PR to remove storing StoredData. I think the ergonomics difference will be very observable as a diff in test files, so maybe it will be a good data point for the conversation

Jul 10 '22 04:07 twop

Would a more passive API like this fit your use case? Are you able to link up the Taffy Node back to the Node from your tree?

Of course! because it will be a strictly better than passing all the context into a MeasureFunc::Boxed, however that will require keeping mapping of Node -> DataNeededToMeasure manually for all users. So I personally think it that will negatively impact ergonomics of the library and make it more error prone. For example: each users is responsible to remove Node from their map when element is removed from Taffy.

One dealbreaker for me though: users must be able to mix different measure function approaches within a single tree.

I'm not sure how this approach (keeping StoredData) will hinder that capability? Could you elaborate?

I think an 80% answer would be to allow storing a u64-sized key on NodeData. If your node keys can fit into a pointer, then you can get the original key back when measuring.

For all of my projects, the keys are u64 sized. However, not everyone uses arenas - would storing a u64 key on NodeData be enough information to track down the original node from your tree?

Jul 10 '22 04:07 jkelleyrtp

My first read of the changes is that Taffy is now generic over T, where T: Measure. Because T can only ever be one concrete type at once, how can we store multiple different measure functions within the same structure?

Jul 10 '22 04:07 alice-i-cecile

My first read of the changes is that Taffy is now generic over T, where T: Measure. Because T can only ever be one concrete type at once, how can we store multiple different measure functions within the same structure?

Hm, let me try to to express what I had in mind.

Taffy<T> can be probably better named as Taffy<DataNeededForMeasurement>. E.g. when creating a tree the user defines which data they want to store with the leaf, as @jkelleyrtp mentioned it can be something as simple as u64.

Then when computing layout the user needs to provide a way to measure this node. Hence the measure method accepts (Node, Constraint, DataNeededForMeasurement).

My understanding that you see an issue when you need to measure several conceptually different types of nodes. For example text and images (for whatever reason you don't know the size of them in advance). Then you have several options.

There is one that I would probably use

enum DataNeededForMeasurement {
 Text(TextMetadata),
 Image(ImageMetadata)
}

or if you want to keep two separate HashMaps: Map<Node, TextMetadata> and Map<Node, ImageMetadata> and then you can just store the type with no extra metadata

enum NodeType {
 Text,
 Image
}

then you measure implementation might look like this:

fn measure(&self, node:Node, type: &NodeType, constraint:Size<Option<f32>>) -> Size<f32> {
  match type {
     NodeType::Text => measure_text(self.text_map.get(node)),
     NodeType::Image => measure_img(self.img_map.get(node))
  }
}

or

fn measure(&self, node:Node, type: &NodeType, constraint:Size<Option<f32>>) -> Size<f32> {
  match type {
     // note that you might not need to keep your own Map in this case
     NodeType::Text(text_metadata)=> measure_text(text_metadata),
     NodeType::Image(img_metadata) => measure_img(img_metadata)
  }
}

I hope that I understood you concern correctly.

Jul 10 '22 04:07 twop

Do you think this paradigm could fit your app?

So, instead of Taffy worrying about the DataToBeMeasured, it just lets you defer to your implementation through the supplied measurefunc?

It seems common that people want top-level structs to be generic over some type deep in the implementation. In reality, I've found that generic methods tend to be much easier to deal with than generic structs.

struct MyTree {
    my_nodes: HashMap<NodeId, NodeData>
}

fn main() {

    let ui = MyTree::new();
    let taffy = Taffy::new();

    loop {
        ui.tick();

        taffy.compute_measured_layout(

            // we get the taffy instance and the node id currently being measured
            |taffy, node| {
                // get the taffy's data for this node
                let data = taffy.get_data(node);

                // retrive the ID we use to track this node in our tree
                let my_ui_id = data.id.unwrap();

                // Chase down the node from our UI tree
                let my_ui_node = &ui.my_nodes[my_ui_id];

                // Measure it
                Some(my_ui_node.measure())
            }
        );

        ui.redraw();
    }
}

Jul 10 '22 04:07 jkelleyrtp

Do you think this paradigm could fit your app? So, instead of Taffy worrying about the DataToBeMeasured, it just lets you defer to your implementation through the supplied measurefunc?

Yes, of course! Although I hope I expressed my concern about having to force bookkeeping of my_nodes: HashMap<NodeId, NodeData> in a constructive way above.

Jul 10 '22 04:07 twop

Do you think this paradigm could fit your app? So, instead of Taffy worrying about the DataToBeMeasured, it just lets you defer to your implementation through the supplied measurefunc?

Yes, of course! Although I hoped I expressed my concern about having to force bookkeeping of my_nodes: HashMap<NodeId, NodeData> in a constructive way above

I think we're not on the same page here.

In this paradigm Taffy holds your id attached to Taffy's NodeData struct.

When taffy requests a specific node to be measured, it gives you your id back for you to find it within your tree. So Taffy is doing all the bookkeeping passively - we don't need the HashMap because your ID is already there when the node is requesting measurement.

Of course, this only works if either we 1) make Taffy generic over your node ID or 2) just assume everyone's ID fits into a u64.

The three most common allocators (Vec, Slab, Slotmap) all produce usize keys. Only generational arena produces u128 keys, I think.

Jul 10 '22 05:07 jkelleyrtp

Bevy would use Entity here, which can be stored as a u64; although we may need to make some changes downstream to make sure that the coercion required is public enough.

Jul 10 '22 05:07 alice-i-cecile

I think we're not on the same page here. In this paradigm Taffy holds your id attached to Taffy's NodeData struct.

You are right! I misunderstood you example, my apologies.

I think I would be personally rate this approach the last (although higher than MeasureFunc ^_^ ). For my project it won't give me a benefit over storing Node -> MyData myself. Partially because I want to be able to store more that 8 bytes and partially I don't want to do decoding/encoding my metadata into u64. Again, in my particular project it is not an id but a collection of small properties (id is one of them though). For example I keep font size separate from a string. Meaning that two strings with different styling, word wraps, colors, sizes will share the id.

Jul 10 '22 05:07 twop

Bevy would use Entity here, which can be stored as a u64; although we may need to make some changes downstream to make sure that the coercion required is public enough.

I think we're not on the same page here. In this paradigm Taffy holds your id attached to Taffy's NodeData struct.

You are right! I misunderstood you example, my apologies.

I think I would be personally rate this approach the last (although higher than MeasureFunc ^_^ ). For my project it won't give me a benefit over storing Node -> MyData myself. Partially because I want to be able to store more that 8 bytes and partially I don't want to do decoding/encoding my metadata into u64. Again, in my particular project it is not an id but a collection of small properties (id is one of them though).

How are your properties stored? It sounds like you don't have any Nodes - or at least they're not accessible from a key?

Jul 10 '22 05:07 jkelleyrtp

How are your properties stored? It sounds like you don't have any Nodes - or at least they're not accessible from a key?

I have the vdom/reconciler loosely inspired by https://github.com/fitzgen/dodrio. Thus I detect atomic changes in attributes and treat the last vdom as a source of truth for some properties(not all though). So I know exactly when they change, thus when rendering the tree I can skip traversing purely layout properties and focus only on rendering. I hope that makes sense

Jul 10 '22 05:07 twop

I'm sorry if I'm being too pedantic here, happy to change the code they way you think is best.

Jul 10 '22 05:07 twop

How are your properties stored? It sounds like you don't have any Nodes - or at least they're not accessible from a key?

Oh, maybe one more relevant data point for the conversation: I don't use taffy as a source of truth for my UI hierarchy. I have my own stateful UI tree that has both Rect and Node fields, also some UI nodes don't have their own layout (such as lines between elements). I traverse the UI tree 60 times a second but relayout happens relatively infrequently. Thus, when I compute layout I store the outcome(rects) in my data structure that is optimized for traversal in Rect field (fewer CPU cache misses).

Jul 10 '22 05:07 twop

How are your properties stored? It sounds like you don't have any Nodes - or at least they're not accessible from a key?

I have the vdom/reconciler loosely inspired by https://github.com/fitzgen/dodrio. Thus I detect atomic changes in attributes and treat the last vdom as a source of truth for some properties(not all though). So I know exactly when they change, thus when rendering the tree I can skip traversing purely layout properties and focus only on rendering. I hope that makes sense

Well that sounds a lot like the internals of Dioxus!

https://github.com/DioxusLabs/dioxus/tree/master/packages/core

Internally we use a slab that maps ID to raw pointer (which admittedly takes some unsafe). You might eventually do the same since it's useful to keep track of nodes across reconciliation.

But, I do understand the problem - your IDs aren't stable between VDom cycles since it sounds like you're using references directly, and old references become invalid after reconciliation.

I would imagine a stable ID system being more performant than Boxing measurefuncs.

For this type of usecase a "visitor" taffy would make the most sense. IE taffy doesn't store any data and defers to your implementation instead.

I think ultimately we would prefer to have a visitor approach than to make the Taffy instance generic over an external type.

Would it be reasonable to provide the SparseSecondaryMap on create/remove methods rather than built into Taffy directly?

This way, generics could be moved to methods and Taffy would optionally fill in the sparse map for you?

Edit:

Now that I think about it, what's stopping you from maintaining the SecondarySparseMap yourself?

When you add or remove nodes from taffy, update your secondary map, and then pass the secondarymap into the measure callback? It really sounds like the SecondaryMap is doing all the bookkeeping.

Jul 10 '22 05:07 jkelleyrtp

Now that I think about it, what's stopping you from maintaining the SecondarySparseMap yourself? When you add or remove nodes from taffy, update your secondary map, and then pass the secondarymap into the measure callback? It really sounds like the SecondaryMap is doing all the bookkeeping.

You are absolutely right. I tried to outline my thinking process about pros/cons of this approach in the comment above (https://github.com/DioxusLabs/taffy/pull/198#issuecomment-1179653586)

It seems you both perceive the cost of having Taffy to be generic is significantly higher than potential benefits. Let me try to iterate having

// note no "T" or any stored data in the signature
pub trait Measure {
    fn measure(&self, node: Node, constraint: Size<Option<f32>>) -> Size<f32>;
}

And see where it leads us.

Jul 10 '22 16:07 twop

Sounds good, I'm looking forward to seeing the results of the experiment!

Jul 10 '22 16:07 alice-i-cecile

Here is a draft how it might look like. I will outline a couple of moments in code inline comments

Jul 10 '22 17:07 twop

How is the measure data stored? I'm not sure I properly understand how or if it is stored at all. Does it not need to be stored?

With this approach measure data is managed completely outside of the library. E.g. a user has to store the mapping Map<Node, MeasureData> somewhere outside of the library.

The generic approach allowed to store data inside the library, hence the generic T that represents the type of the stored data.

My primary concern atm is the has_measure field. It does not seem to be tied to any kind of data, so we would have to manually keep it in sync. Is there any way we can remove the has_measure by checking if the node stores any kind of measure data?

In the current iteration the data is managed outside of the library, thus there is a need to mark the node that some measurements need to be provided (has_measure flag). On the other hand, if we do decided to store measure data inside the library, then there are only few nodes that require measurement. Let's assume that we decided to store u64 and represent it as Option<u64> then 1000 nodes will have at least 8Kb of memory overhead storing mostly None, that's why I decided to keep the measure data as a separate map.

Hope that context helps

Jul 12 '22 17:07 twop

Hope that context helps

That helps a lot! Now it makes sense 😄

Jul 13 '22 07:07 Weibye

Incidentally this is exactly the approach we use in morphorm for what I believe is the same thing (we call it content_size). It's just a method on a Node trait which queries for the content size. https://github.com/vizia/morphorm/blob/c14e888b1d66719fc3f13ccdb1ea01a10368ef9e/src/node.rs#L69

Jul 23 '22 23:07 geom3trik

I strongly prefer the content_size name; I'd like to align on that (and put in a doc alias for MeasureFunc and the JS equivalent).

Jul 23 '22 23:07 alice-i-cecile

I would like to raise the issue of caching here. I imagine a lot of measure_funcs will want to cache not just the overall result of the computation, but intermediate computations too (for example, a text layout engine may want to cache word-breaking or text shaping data). Does this single function approach still work if the called function will internally need to mutate? Are we just expecting the measure_func to do it's own internal synchronisation here if necessary?

Aug 19 '22 22:08 nicoburns

I would like to raise the issue of caching here. I imagine a lot of measure_funcs will want to cache not just the overall result of the computation, but intermediate computations too (for example, a text layout engine may want to cache word-breaking or text shaping data). Does this single function approach still work if the called function will internally need to mutate? Are we just expecting the measure_func to do it's own internal synchronisation here if necessary?

That is a very good point, my sketch does allow it conceptually but the type signature will needed to be slightly tweaked

// note adding "mut" keyword
pub trait Measure {
    fn measure(&mut self, node: Node, constraint: Size<Option<f32>>) -> Size<f32>;
}

Then the caching can be done inside the implementation of Measure trait

Aug 23 '22 17:08 twop

Hi all, I just took a look at the latest sources, and wanted to surface some thoughts/questions.

I like the Tree trait idea, seems like it can be useful to "embed" layout engine into a tree structure that most UI frameworks already have.
Taffy as a default implementation of it totally makes sense
I wonder what and if should be done with MeasureFunc given the new architectural advancements. On the one hand now a potential perf loss is more scoped and isolated (I can make my own Tree implementation), on the other hand, most usages (I assume) would just use default Taffy implementation.
So I wonder if the library should have a first class support for a different approach of custom measuring. It seems that the default implementation is not that much code to copy, but the api for manipulating children is robust, thus "copy pasting" it to make my own Tree implementation kinda feels off to me.

If there is an appetite for "4" (first class support of different measurement api for custom nodes), I'm happy to contribute, not exactly sure how the api might look like but happy to sketch out ideas if you think that there is value in that effort.

Jan 09 '23 22:01 twop

I'd be happy to see what those ideas look like :) The existing custom measurement stuff has always felt unclear and poorly designed to me, so I'd be very keen to see if you can come up with something better.

Jan 09 '23 22:01 alice-i-cecile

taffy
taffy copied to clipboard

Reworked custom measurements

Objective

Context

Feedback wanted

taffy taffy copied to clipboard

Reworked custom measurements

Objective

Context

Feedback wanted

taffy
taffy copied to clipboard