taffy
taffy copied to clipboard
Reworked custom measurements
Objective
Remove MeasureFunc and make custom measurements faster and mode intuitive
Fixed #57
Left:
- [ ] add code docs and update changelog
- [ ] make sure that names, implementation etc are aligned and agreed upon
Context
I decided to use the approach of storing custom data per measured element and providing an implementation of new Measure trait that has a method measure(&self, data: &StoredData, node: Node, constraint:Size<Option<f32>>) -> Size<f32>
Note that StoredData (just T in code) is going to be owned by Taffy and kept in SparseSecondaryMap. Usually it will be an id of some sort Measure trait implementation can understand.
For example in my project I need to provide measurements for text. Thus Measure trait will be implemented by Font struct and StoredData will be probably a tuple of (TextHandle, FontSize)
Note that StoredData can be an enum in case there are more than 1 type of objects that need to be measured
Feedback wanted
-
Names
- I'm not a native speaker and not sure that
Measure.measuremakes sense from lang point of view, would be awesome if somebody more competent would say "totally" or "no, here is a better name" - I'm not happy with
SImpleTaffyas an alias forTaffy<NoMeasure>. That might be important for demos and examples where majority of the cases do not involve custom measurements (thus, just Taffy might be better), on the other hand, in all more or less practical use cases there is a need for custom measurements (especially when dealing with text). Hence, it is possible thatTaffyis a better alias for "unmeasured" case and something likeMeasuredTaffyfor "measured" case.
- I'm not a native speaker and not sure that
-
Architecture
- I don't see a problem with the current approach when it comes to multithreading, but maybe I'm missing something critical.
StoredDatais going to be owned by Taffy and it is up to the caller to ensure threadsafety forMeasuretrait implementation ifcompute_measured_layoutneeds to be called from a different thread, in other words, multithreading is possible and it is completely outside of the library's API surface area. - I think with that approach there is no need for
allocfeature, unless I'm missing something it was only needed forBoxedvariant ofMeasureFunc
- I don't see a problem with the current approach when it comes to multithreading, but maybe I'm missing something critical.
cc @alice-i-cecile and @jkelleyrtp
If the direction feels right, I'm happy to polish the PR (add code docs, changelog and maybe more tests)
When I was thinking about ripping out measure_func, the API I had in mind was passing in a closure that returned a Size when requested of a Node.
fn compute_layout(&mut self, mut measure: impl FnMut(Node) -> Option<Size>);
The downside here is that we would be querying every node size, while here you have some memoization when the node is built. We could combine both, and only query the measure func when a node has been marked as "should be measured".
I personally am not a fan of more traits and making the top level Taffy instance generic. But I do really like the memoization thing you came up with.
Would a more passive API like this fit your use case? Are you able to link up the Taffy Node back to the Node from your tree?
Edit: Generally, Taffy shouldn't really be storing any of the Node data, and the fact that a measure func needs to be provided is evidence of the shortcoming with this method. In reality, we should let the developer provide their own tree and we just layout it for them.
Generally, Taffy shouldn't really be storing any of the Node data, and the fact that a measure func needs to be provided is evidence of the shortcoming with this method. In reality, we should let the developer provide their own tree and we just layout it for them.
Agreed here. This is discussed in #182.
Would a more passive API like this fit your use case? Are you able to link up the Taffy Node back to the Node from your tree?
Of course! because it will be a strictly better than passing all the context into a MeasureFunc::Boxed, however that will require keeping mapping of Node -> DataNeededToMeasure manually for all users. So I personally think it that will negatively impact ergonomics of the library and make it more error prone. For example: each users is responsible to remove Node from their map when element is removed from Taffy.
One dealbreaker for me though: users must be able to mix different measure function approaches within a single tree.
I'm not sure how this approach (keeping StoredData) will hinder that capability? Could you elaborate?
Generally, Taffy shouldn't really be storing any of the Node data
From my personal experience (including using Yoga) I always wanted to store some small metadata with the nodes I need to measure (usually that support Copy + Clone). For example: (TextHandle, FontSize, CanWrap) in some(most?) cases it can eliminate having to have custom mapping from Node -> (TextHandle, FontSize, CanWrap).
BUT, if this idea is controversial, I'm happy to rework the PR to remove storing StoredData. I think the ergonomics difference will be very observable as a diff in test files, so maybe it will be a good data point for the conversation
Would a more passive API like this fit your use case? Are you able to link up the Taffy Node back to the Node from your tree?
Of course! because it will be a strictly better than passing all the context into a
MeasureFunc::Boxed, however that will require keeping mapping ofNode -> DataNeededToMeasuremanually for all users. So I personally think it that will negatively impact ergonomics of the library and make it more error prone. For example: each users is responsible to removeNodefrom their map when element is removed from Taffy.One dealbreaker for me though: users must be able to mix different measure function approaches within a single tree.
I'm not sure how this approach (keeping
StoredData) will hinder that capability? Could you elaborate?
I think an 80% answer would be to allow storing a u64-sized key on NodeData. If your node keys can fit into a pointer, then you can get the original key back when measuring.
For all of my projects, the keys are u64 sized. However, not everyone uses arenas - would storing a u64 key on NodeData be enough information to track down the original node from your tree?
My first read of the changes is that Taffy is now generic over T, where T: Measure. Because T can only ever be one concrete type at once, how can we store multiple different measure functions within the same structure?
My first read of the changes is that
Taffyis now generic overT, whereT: Measure. BecauseTcan only ever be one concrete type at once, how can we store multiple different measure functions within the same structure?
Hm, let me try to to express what I had in mind.
Taffy<T> can be probably better named as Taffy<DataNeededForMeasurement>. E.g. when creating a tree the user defines which data they want to store with the leaf, as @jkelleyrtp mentioned it can be something as simple as u64.
Then when computing layout the user needs to provide a way to measure this node. Hence the measure method accepts (Node, Constraint, DataNeededForMeasurement).
My understanding that you see an issue when you need to measure several conceptually different types of nodes. For example text and images (for whatever reason you don't know the size of them in advance). Then you have several options.
There is one that I would probably use
enum DataNeededForMeasurement {
Text(TextMetadata),
Image(ImageMetadata)
}
or if you want to keep two separate HashMaps: Map<Node, TextMetadata> and Map<Node, ImageMetadata>
and then you can just store the type with no extra metadata
enum NodeType {
Text,
Image
}
then you measure implementation might look like this:
fn measure(&self, node:Node, type: &NodeType, constraint:Size<Option<f32>>) -> Size<f32> {
match type {
NodeType::Text => measure_text(self.text_map.get(node)),
NodeType::Image => measure_img(self.img_map.get(node))
}
}
or
fn measure(&self, node:Node, type: &NodeType, constraint:Size<Option<f32>>) -> Size<f32> {
match type {
// note that you might not need to keep your own Map in this case
NodeType::Text(text_metadata)=> measure_text(text_metadata),
NodeType::Image(img_metadata) => measure_img(img_metadata)
}
}
I hope that I understood you concern correctly.
Do you think this paradigm could fit your app?
So, instead of Taffy worrying about the DataToBeMeasured, it just lets you defer to your implementation through the supplied measurefunc?
It seems common that people want top-level structs to be generic over some type deep in the implementation. In reality, I've found that generic methods tend to be much easier to deal with than generic structs.
struct MyTree {
my_nodes: HashMap<NodeId, NodeData>
}
fn main() {
let ui = MyTree::new();
let taffy = Taffy::new();
loop {
ui.tick();
taffy.compute_measured_layout(
// we get the taffy instance and the node id currently being measured
|taffy, node| {
// get the taffy's data for this node
let data = taffy.get_data(node);
// retrive the ID we use to track this node in our tree
let my_ui_id = data.id.unwrap();
// Chase down the node from our UI tree
let my_ui_node = &ui.my_nodes[my_ui_id];
// Measure it
Some(my_ui_node.measure())
}
);
ui.redraw();
}
}
Do you think this paradigm could fit your app? So, instead of Taffy worrying about the DataToBeMeasured, it just lets you defer to your implementation through the supplied measurefunc?
Yes, of course! Although I hope I expressed my concern about having to force bookkeeping of my_nodes: HashMap<NodeId, NodeData> in a constructive way above.
Do you think this paradigm could fit your app? So, instead of Taffy worrying about the DataToBeMeasured, it just lets you defer to your implementation through the supplied measurefunc?
Yes, of course! Although I hoped I expressed my concern about having to force bookkeeping of
my_nodes: HashMap<NodeId, NodeData>in a constructive way above
I think we're not on the same page here.
In this paradigm Taffy holds your id attached to Taffy's NodeData struct.
When taffy requests a specific node to be measured, it gives you your id back for you to find it within your tree. So Taffy is doing all the bookkeeping passively - we don't need the HashMap because your ID is already there when the node is requesting measurement.
Of course, this only works if either we 1) make Taffy generic over your node ID or 2) just assume everyone's ID fits into a u64.
The three most common allocators (Vec, Slab, Slotmap) all produce usize keys. Only generational arena produces u128 keys, I think.
Bevy would use Entity here, which can be stored as a u64; although we may need to make some changes downstream to make sure that the coercion required is public enough.
I think we're not on the same page here. In this paradigm Taffy holds your id attached to Taffy's NodeData struct.
You are right! I misunderstood you example, my apologies.
I think I would be personally rate this approach the last (although higher than MeasureFunc ^_^ ). For my project it won't give me a benefit over storing Node -> MyData myself. Partially because I want to be able to store more that 8 bytes and partially I don't want to do decoding/encoding my metadata into u64. Again, in my particular project it is not an id but a collection of small properties (id is one of them though). For example I keep font size separate from a string. Meaning that two strings with different styling, word wraps, colors, sizes will share the id.
Bevy would use
Entityhere, which can be stored as au64; although we may need to make some changes downstream to make sure that the coercion required is public enough.
I think we're not on the same page here. In this paradigm Taffy holds your id attached to Taffy's NodeData struct.
You are right! I misunderstood you example, my apologies.
I think I would be personally rate this approach the last (although higher than
MeasureFunc^_^ ). For my project it won't give me a benefit over storingNode -> MyDatamyself. Partially because I want to be able to store more that 8 bytes and partially I don't want to do decoding/encoding my metadata intou64. Again, in my particular project it is not an id but a collection of small properties (id is one of them though).
How are your properties stored? It sounds like you don't have any Nodes - or at least they're not accessible from a key?
How are your properties stored? It sounds like you don't have any Nodes - or at least they're not accessible from a key?
I have the vdom/reconciler loosely inspired by https://github.com/fitzgen/dodrio. Thus I detect atomic changes in attributes and treat the last vdom as a source of truth for some properties(not all though). So I know exactly when they change, thus when rendering the tree I can skip traversing purely layout properties and focus only on rendering. I hope that makes sense
I'm sorry if I'm being too pedantic here, happy to change the code they way you think is best.
How are your properties stored? It sounds like you don't have any Nodes - or at least they're not accessible from a key?
Oh, maybe one more relevant data point for the conversation: I don't use taffy as a source of truth for my UI hierarchy. I have my own stateful UI tree that has both Rect and Node fields, also some UI nodes don't have their own layout (such as lines between elements). I traverse the UI tree 60 times a second but relayout happens relatively infrequently. Thus, when I compute layout I store the outcome(rects) in my data structure that is optimized for traversal in Rect field (fewer CPU cache misses).
How are your properties stored? It sounds like you don't have any Nodes - or at least they're not accessible from a key?
I have the vdom/reconciler loosely inspired by https://github.com/fitzgen/dodrio. Thus I detect atomic changes in attributes and treat the last vdom as a source of truth for some properties(not all though). So I know exactly when they change, thus when rendering the tree I can skip traversing purely layout properties and focus only on rendering. I hope that makes sense
Well that sounds a lot like the internals of Dioxus!
https://github.com/DioxusLabs/dioxus/tree/master/packages/core
Internally we use a slab that maps ID to raw pointer (which admittedly takes some unsafe). You might eventually do the same since it's useful to keep track of nodes across reconciliation.
But, I do understand the problem - your IDs aren't stable between VDom cycles since it sounds like you're using references directly, and old references become invalid after reconciliation.
I would imagine a stable ID system being more performant than Boxing measurefuncs.
For this type of usecase a "visitor" taffy would make the most sense. IE taffy doesn't store any data and defers to your implementation instead.
I think ultimately we would prefer to have a visitor approach than to make the Taffy instance generic over an external type.
Would it be reasonable to provide the SparseSecondaryMap on create/remove methods rather than built into Taffy directly?
This way, generics could be moved to methods and Taffy would optionally fill in the sparse map for you?
Edit:
Now that I think about it, what's stopping you from maintaining the SecondarySparseMap yourself?
When you add or remove nodes from taffy, update your secondary map, and then pass the secondarymap into the measure callback? It really sounds like the SecondaryMap is doing all the bookkeeping.
Now that I think about it, what's stopping you from maintaining the SecondarySparseMap yourself? When you add or remove nodes from taffy, update your secondary map, and then pass the secondarymap into the measure callback? It really sounds like the SecondaryMap is doing all the bookkeeping.
You are absolutely right. I tried to outline my thinking process about pros/cons of this approach in the comment above (https://github.com/DioxusLabs/taffy/pull/198#issuecomment-1179653586)
It seems you both perceive the cost of having Taffy to be generic is significantly higher than potential benefits. Let me try to iterate having
// note no "T" or any stored data in the signature
pub trait Measure {
fn measure(&self, node: Node, constraint: Size<Option<f32>>) -> Size<f32>;
}
And see where it leads us.
Sounds good, I'm looking forward to seeing the results of the experiment!
Here is a draft how it might look like. I will outline a couple of moments in code inline comments
How is the measure data stored? I'm not sure I properly understand how or if it is stored at all. Does it not need to be stored?
With this approach measure data is managed completely outside of the library. E.g. a user has to store the mapping Map<Node, MeasureData> somewhere outside of the library.
The generic approach allowed to store data inside the library, hence the generic T that represents the type of the stored data.
My primary concern atm is the
has_measurefield. It does not seem to be tied to any kind of data, so we would have to manually keep it in sync. Is there any way we can remove thehas_measureby checking if the node stores any kind of measure data?
In the current iteration the data is managed outside of the library, thus there is a need to mark the node that some measurements need to be provided (has_measure flag). On the other hand, if we do decided to store measure data inside the library, then there are only few nodes that require measurement. Let's assume that we decided to store u64 and represent it as Option<u64> then 1000 nodes will have at least 8Kb of memory overhead storing mostly None, that's why I decided to keep the measure data as a separate map.
Hope that context helps
Hope that context helps
That helps a lot! Now it makes sense 😄
Incidentally this is exactly the approach we use in morphorm for what I believe is the same thing (we call it content_size). It's just a method on a Node trait which queries for the content size. https://github.com/vizia/morphorm/blob/c14e888b1d66719fc3f13ccdb1ea01a10368ef9e/src/node.rs#L69
I strongly prefer the content_size name; I'd like to align on that (and put in a doc alias for MeasureFunc and the JS equivalent).
I would like to raise the issue of caching here. I imagine a lot of measure_funcs will want to cache not just the overall result of the computation, but intermediate computations too (for example, a text layout engine may want to cache word-breaking or text shaping data). Does this single function approach still work if the called function will internally need to mutate? Are we just expecting the measure_func to do it's own internal synchronisation here if necessary?
I would like to raise the issue of caching here. I imagine a lot of
measure_funcs will want to cache not just the overall result of the computation, but intermediate computations too (for example, a text layout engine may want to cache word-breaking or text shaping data). Does this single function approach still work if the called function will internally need to mutate? Are we just expecting themeasure_functo do it's own internal synchronisation here if necessary?
That is a very good point, my sketch does allow it conceptually but the type signature will needed to be slightly tweaked
// note adding "mut" keyword
pub trait Measure {
fn measure(&mut self, node: Node, constraint: Size<Option<f32>>) -> Size<f32>;
}
Then the caching can be done inside the implementation of Measure trait
Hi all, I just took a look at the latest sources, and wanted to surface some thoughts/questions.
- I like the Tree trait idea, seems like it can be useful to "embed" layout engine into a tree structure that most UI frameworks already have.
Taffyas a default implementation of it totally makes sense- I wonder what and if should be done with
MeasureFuncgiven the new architectural advancements. On the one hand now a potential perf loss is more scoped and isolated (I can make my own Tree implementation), on the other hand, most usages (I assume) would just use defaultTaffyimplementation. - So I wonder if the library should have a first class support for a different approach of custom measuring. It seems that the default implementation is not that much code to copy, but the api for manipulating children is robust, thus "copy pasting" it to make my own
Treeimplementation kinda feels off to me.
If there is an appetite for "4" (first class support of different measurement api for custom nodes), I'm happy to contribute, not exactly sure how the api might look like but happy to sketch out ideas if you think that there is value in that effort.
I'd be happy to see what those ideas look like :) The existing custom measurement stuff has always felt unclear and poorly designed to me, so I'd be very keen to see if you can come up with something better.