SymbolicRegression.jl icon indicating copy to clipboard operation
SymbolicRegression.jl copied to clipboard

Evaluating trees with features not in dataset

Open Jgmedina95 opened this issue 3 years ago • 1 comments
trafficstars

Hi, im working on a process that uses the equation returned in the Pareto Frontier. I was playing around with it and found the following: Setting a tree like: image and evaluating with a dataset with 6 features, gives no problem, as expected. image

but if i create a tree that uses more features than in the original dataset like: image

I expected an error, but the function eval_tree_array gave an output and completion ==true. image I've realized this will not impact what I'm doing as trees made by the program will never (right?) have more features than the dataset, but I suppose its interesting enough of a bug to share.

Jgmedina95 avatar Sep 14 '22 18:09 Jgmedina95

Hi @Jgmedina95,

Thanks for raising this! So this should be because @inbounds is used throughout the evaluation to have faster compute. I make an in-bounds assumption since trees made during the search should never have different features than available in the dataset. However, as you raise, I don't think it hurts to check before running the evaluation, so maybe we could do that. (it could be that the user could save their search state, then re-run with a different dataset, and be surprised when trees access undefined memory?)

I just looked at the code, and it seems like sometimes @inbounds is actually not used, such as: https://github.com/MilesCranmer/SymbolicRegression.jl/blob/6075f13c3e8fbb8b16686a6f7c1157f0235174ee/src/EvaluateEquation.jl#L163

Whereas, @inbounds is used in the functions which fuse operators together, like here: https://github.com/MilesCranmer/SymbolicRegression.jl/blob/6075f13c3e8fbb8b16686a6f7c1157f0235174ee/src/EvaluateEquation.jl#L189

So perhaps for some trees, this would raise an error, and sometimes it would not. Will think more whether we should add a bounds check.

Best, Miles

MilesCranmer avatar Sep 14 '22 20:09 MilesCranmer