[DRAFT] Abstract Class for PopMember
This adds an abstract type AbstractPopMember. Right now, there is no clean way to store additional metadata that is relevant for calculating statistics of how each population is evolving. This PR serves as a first step towards a provenance framework. It enables subtyping the pop-member object to track additional information about each population member node.
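As a rough illustration of the idea (not the code in this PR), the sketch below shows an abstract member type with one plain subtype and one subtype that carries provenance fields. Every name other than `AbstractPopMember` is hypothetical, including the `parent_id` and `mutation` fields and the helper functions; the expression tree is replaced by a `String` stand-in.

```julia
# Hypothetical sketch: these definitions are NOT copied from SR.jl.
abstract type AbstractPopMember{T} end

# A plain member holding only what the search itself needs.
struct BasicMember{T} <: AbstractPopMember{T}
    tree::String        # stand-in for an expression tree
    loss::T
end

# A subtype that additionally records provenance metadata.
struct TrackedMember{T} <: AbstractPopMember{T}
    tree::String
    loss::T
    parent_id::Int      # hypothetical: index of the member this one came from
    mutation::Symbol    # hypothetical: label of the mutation that produced it
end

# Generic search code is written against the abstract type ...
best(members::Vector{<:AbstractPopMember}) =
    members[argmin(map(m -> m.loss, members))]

# ... while statistics can dispatch on the richer subtype.
function mutation_counts(members::Vector{<:TrackedMember})
    counts = Dict{Symbol,Int}()
    for m in members
        counts[m.mutation] = get(counts, m.mutation, 0) + 1
    end
    return counts
end

pop = [
    TrackedMember("x + 1", 0.5, 0, :init),
    TrackedMember("x * x", 0.2, 1, :swap),
    TrackedMember("x * x + 1", 0.1, 2, :swap),
]
best_member = best(pop)
counts = mutation_counts(pop)
```

Code that only needs `loss` keeps working for any subtype, while `mutation_counts` is available only where the extra fields exist.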
Some issues (this PR should NOT be merged until we have concrete answers for these questions):
- AbstractPopMember: Is this functionality desired in SR.jl?
- The current code does not yield concrete types for many SR.jl functions (`get_pareto_frontier`, `extract_from_worker`, etc.). I've marked them as unstable right now. I'm hoping they don't have a big hit in benchmarking but if they do, then we'll need to take a closer look.
- Alternatively, we can use the `metadata` entry in `SymbolicExpression` objects instead. This is slightly unwieldy as I'll need to unpack and repack the PopMember objects during the `mutate!` calls when the metadata is updated. I want to consider this before moving forward.
Benchmark Results
| | master | e1feb44d83bdfb... | master / e1feb44d83bdfb... |
|---|---|---|---|
| search/multithreading | 14.9 ± 0.51 s | 19 ± 0.51 s | 0.784 |
| search/serial | 26.7 ± 0.39 s | 33.6 ± 0.13 s | 0.794 |
| utils/best_of_sample | 1.54 ± 0.29 μs | 1.88 ± 0.34 μs | 0.819 |
| utils/check_constraints_x10 | 11.7 ± 3.1 μs | 11.8 ± 3.2 μs | 0.991 |
| utils/compute_complexity_x10/Float64 | 2.11 ± 0.14 μs | 2.15 ± 0.15 μs | 0.981 |
| utils/compute_complexity_x10/Int64 | 2.07 ± 0.13 μs | 2.06 ± 0.14 μs | 1 |
| utils/compute_complexity_x10/nothing | 1.54 ± 0.14 μs | 1.48 ± 0.16 μs | 1.04 |
| utils/insert_random_op_x10 | 4.93 ± 1.9 μs | 5.64 ± 1.9 μs | 0.874 |
| utils/next_generation_x100 | 0.346 ± 0.018 ms | 0.35 ± 0.026 ms | 0.988 |
| utils/optimize_constants_x10 | 0.0339 ± 0.0085 s | 0.0347 ± 0.008 s | 0.976 |
| utils/randomly_rotate_tree_x10 | 5.29 ± 0.66 μs | 5.36 ± 0.62 μs | 0.987 |
| time_to_load | 2.24 ± 0.0055 s | 2.35 ± 0.012 s | 0.952 |
Benchmark Plots
A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR. Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).
> AbstractPopMember: Is this functionality desired in SR.jl?
Sounds good to me!
> The current code does not yield concrete types for many SR.jl functions (`get_pareto_frontier`, `extract_from_worker`, etc.). I've marked them as unstable right now. I'm hoping they don't have a big hit in benchmarking but if they do, then we'll need to take a closer look.
Hm. There's probably a way to fix this. @unstable should only be used as a last resort.
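One common way to keep inference concrete in this kind of situation is to parametrize containers on the member type rather than storing the abstract type directly. The sketch below is only an assumption about what such a fix could look like; all type and function names are hypothetical and none of them come from SR.jl.

```julia
# Hypothetical sketch; illustrative names, not SR.jl's API.
abstract type AbstractMember end

struct PlainMember <: AbstractMember
    loss::Float64
end

# Field typed on the abstract type: the element type is not concrete,
# so the result of reading an element cannot be inferred concretely.
struct LoosePopulation
    members::Vector{AbstractMember}
end

# Field typed on a parameter: each instance has a concrete element type.
struct TypedPopulation{PM<:AbstractMember}
    members::Vector{PM}
end

first_member(pop) = first(pop.members)

loose = LoosePopulation(AbstractMember[PlainMember(1.0), PlainMember(0.5)])
typed = TypedPopulation([PlainMember(1.0), PlainMember(0.5)])

# Ask the compiler what `first_member` is inferred to return in each case.
loose_ret = only(Base.return_types(first_member, (typeof(loose),)))
typed_ret = only(Base.return_types(first_member, (typeof(typed),)))

println("loose: ", loose_ret, "  concrete? ", isconcretetype(loose_ret))
println("typed: ", typed_ret, "  concrete? ", isconcretetype(typed_ret))
```

With the parametric version the inferred return type should be concrete, so functions built on top of it would not need to be marked unstable.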
> Alternatively, we can use the `metadata` entry in `SymbolicExpression` objects instead. This is slightly unwieldy as I'll need to unpack and repack the PopMember objects during the `mutate!` calls when the metadata is updated. I want to consider this before moving forward.
I think we should just pick whatever is most semantically correct, and make it work. For AbstractExpression objects, they tend to only hold information that is independent of the dataset and search. So this is why the loss and cost are stored in the PopMember - because they are properties of the dataset and search, and should be updated if you pass a new dataset in, or restart the search. But expression objects are kind of independent of that sorta thing.
So with this in mind, I guess PopMember makes more sense for the types of metadata you'd want to attach?
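To make that split concrete, here is a toy sketch under simplified assumptions: the expression holds only the functional form, and the member pairs it with a loss that is valid for one dataset only. `ToyExpression`, `ToyDataset`, `ToyMember`, `score`, and `rescore` are all hypothetical stand-ins, not SR.jl types or functions.

```julia
# Hypothetical sketch; simplified stand-ins, not SR.jl's types.
struct ToyExpression{F}
    f::F                # dataset-independent: just the functional form
end

struct ToyDataset
    X::Vector{Float64}
    y::Vector{Float64}
end

struct ToyMember{E}
    expr::E
    loss::Float64       # only meaningful relative to one dataset
end

# Mean squared error of the expression on a dataset.
mse(ex::ToyExpression, d::ToyDataset) =
    sum(abs2, ex.f.(d.X) .- d.y) / length(d.y)

# Scoring builds a new member; the expression itself is reused unchanged.
score(ex::ToyExpression, d::ToyDataset) = ToyMember(ex, mse(ex, d))
rescore(m::ToyMember, d::ToyDataset) = score(m.expr, d)

ex = ToyExpression(x -> 2x)
d1 = ToyDataset([1.0, 2.0], [2.0, 4.0])   # y = 2x exactly
d2 = ToyDataset([1.0, 2.0], [3.0, 5.0])   # y = 2x + 1
m1 = score(ex, d1)
m2 = rescore(m1, d2)
```

Swapping the dataset changes the member's loss but leaves the expression untouched, which is why search-dependent metadata fits more naturally on the member.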
Sounds good. I'll make the necessary changes and push again. I wasn't impressed with the benchmarking results either so I definitely need to rework this to remove the @unstable's.
Ping on this; if you want any help, just let me know.
Hey, sorry for the delay on this. I'm going to restart work on this and hope to have it in a mergeable state by the end of the week.
@MilesCranmer, apologies, but I don't think I'll have the bandwidth to finish this PR until the end of September (hopefully in time for v2.1, if I can't make v2.0).
closing this in favor of #505