SymbolicRegression.jl

[DRAFT] Abstract Class for PopMember

Open atharvas opened this issue 8 months ago • 4 comments

This adds an abstract type AbstractPopMember. Right now, there is no clean way to store additional metadata that is relevant for calculating statistics of how each population is evolving. This PR serves as a first step towards a provenance framework. It enables subtyping the pop-member object to track additional information about each population member node.

Some issues (this PR should NOT be merged until we have concrete answers for these questions)

  • AbstractPopMember: Is this functionality desired in SR.jl?
  • The current code does not yield concrete types for many SR.jl functions (get_pareto_frontier, extract_from_worker, etc.). I've marked them as unstable right now. I'm hoping they don't have a big hit in benchmarking but if they do, then we'll need to take a closer look.
  • Alternatively, we can use the metadata entry in SymbolicExpression objects instead. This is slightly unwieldy as I'll need to unpack and repack the PopMember objects during the mutate! calls when the metadata is updated. I want to consider this before moving forward.
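For illustration, the subtyping idea described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code in this PR: `BasicMember`, `TrackedMember`, `get_loss`, `best_member`, and all field names are hypothetical, and only the name `AbstractPopMember` comes from the description.

```julia
# Hypothetical sketch: only the name AbstractPopMember comes from the PR text.
abstract type AbstractPopMember{T} end

# A plain member: an expression and its loss.
struct BasicMember{T,E} <: AbstractPopMember{T}
    tree::E
    loss::T
end

# A member that also carries provenance metadata (assumed fields).
struct TrackedMember{T,E} <: AbstractPopMember{T}
    tree::E
    loss::T
    parent_id::Int      # which member this one was derived from
    mutation::Symbol    # which mutation produced it
end

# Generic code is written against the abstract type via an accessor,
# so it works for any subtype that has a `loss` field.
get_loss(m::AbstractPopMember) = m.loss

# Return the member with the smallest loss.
best_member(pop::Vector{<:AbstractPopMember}) = pop[argmin(map(get_loss, pop))]

pop = [
    TrackedMember(:(x + 1), 0.3, 0, :init),
    TrackedMember(:(x * x), 0.1, 1, :mutate_operator),
]
best = best_member(pop)
println(best.mutation, " from parent ", best.parent_id)
```

Because the search code only calls `get_loss`, the extra provenance fields ride along without the generic code needing to know about them.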

atharvas, Apr 21 '25 06:04

Benchmark Results

| Benchmark | master | e1feb44d83bdfb... | master / e1feb44d83bdfb... |
|---|---|---|---|
| search/multithreading | 14.9 ± 0.51 s | 19 ± 0.51 s | 0.784 |
| search/serial | 26.7 ± 0.39 s | 33.6 ± 0.13 s | 0.794 |
| utils/best_of_sample | 1.54 ± 0.29 μs | 1.88 ± 0.34 μs | 0.819 |
| utils/check_constraints_x10 | 11.7 ± 3.1 μs | 11.8 ± 3.2 μs | 0.991 |
| utils/compute_complexity_x10/Float64 | 2.11 ± 0.14 μs | 2.15 ± 0.15 μs | 0.981 |
| utils/compute_complexity_x10/Int64 | 2.07 ± 0.13 μs | 2.06 ± 0.14 μs | 1 |
| utils/compute_complexity_x10/nothing | 1.54 ± 0.14 μs | 1.48 ± 0.16 μs | 1.04 |
| utils/insert_random_op_x10 | 4.93 ± 1.9 μs | 5.64 ± 1.9 μs | 0.874 |
| utils/next_generation_x100 | 0.346 ± 0.018 ms | 0.35 ± 0.026 ms | 0.988 |
| utils/optimize_constants_x10 | 0.0339 ± 0.0085 s | 0.0347 ± 0.008 s | 0.976 |
| utils/randomly_rotate_tree_x10 | 5.29 ± 0.66 μs | 5.36 ± 0.62 μs | 0.987 |
| time_to_load | 2.24 ± 0.0055 s | 2.35 ± 0.012 s | 0.952 |

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR. Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

github-actions[bot], Apr 21 '25 07:04

  • AbstractPopMember: Is this functionality desired in SR.jl?

Sounds good to me!

  • The current code does not yield concrete types for many SR.jl functions (get_pareto_frontier, extract_from_worker, etc.). I've marked them as unstable right now. I'm hoping they don't have a big hit in benchmarking but if they do, then we'll need to take a closer look.

Hm. There's probably a way to fix this. @unstable should only be used as a last resort.
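As a rough illustration of one common way to avoid this kind of instability in Julia (an assumption about a possible approach, not what this PR does): parametrize containers on the concrete member type instead of storing the abstract type. All names below (`AbstractMember`, `SimpleMember`, `LoosePopulation`, `TypedPopulation`, `first_member`) are hypothetical.

```julia
# Hypothetical sketch; none of these types exist in SymbolicRegression.jl.
abstract type AbstractMember end

struct SimpleMember <: AbstractMember
    loss::Float64
end

# Field typed with the abstract type: the compiler only knows that
# each element is *some* AbstractMember.
struct LoosePopulation
    members::Vector{AbstractMember}
end

# Field typed with a parameter: the concrete member type is part of
# the population's own type, so it is known at compile time.
struct TypedPopulation{M<:AbstractMember}
    members::Vector{M}
end

first_member(p) = first(p.members)

# Ask the compiler what return type it infers for each container.
loose_ret = only(Base.return_types(first_member, (LoosePopulation,)))
typed_ret = only(Base.return_types(first_member, (TypedPopulation{SimpleMember},)))

println("loose: ", loose_ret, " (concrete? ", isconcretetype(loose_ret), ")")
println("typed: ", typed_ret, " (concrete? ", isconcretetype(typed_ret), ")")
```

With the parametrized container, functions that pull members out of a population can infer a concrete return type without needing an annotation such as `@unstable`.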

  • Alternatively, we can use the metadata entry in SymbolicExpression objects instead. This is slightly unwieldy as I'll need to unpack and repack the PopMember objects during the mutate! calls when the metadata is updated. I want to consider this before moving forward.

I think we should just pick whatever is most semantically correct, and make it work. AbstractExpression objects tend to hold only information that is independent of the dataset and search. This is why the loss and cost are stored in the PopMember: they are properties of the dataset and search, and should be updated if you pass a new dataset in or restart the search. Expression objects are more or less independent of that sort of thing.

So with this in mind, I guess PopMember makes more sense for the types of metadata you'd want to attach?

MilesCranmer, Apr 23 '25 16:04

Sounds good. I'll make the necessary changes and push again. I wasn't impressed with the benchmarking results either so I definitely need to rework this to remove the @unstable's.

atharvas, Apr 25 '25 02:04

Ping on this. If you want any help, just let me know.

MilesCranmer, Jun 12 '25 16:06

Hey, sorry for the delay on this. I'm going to restart work on this and hope to have it in a mergeable state by the end of the week.

atharvas, Aug 05 '25 19:08

@MilesCranmer, apologies, but I don't think I'll have the bandwidth to finish this PR until the end of September (hopefully in time for v2.1, if I can't make v2.0).

atharvas, Aug 29 '25 21:08

Closing this in favor of #505.

atharvas, Oct 10 '25 21:10