PySR
Add Constant(s) Declaration to the PySRRegressor?
I experimented with PySR on Feynman equation 76 (i.e. q*v/(2*pi*r)) to see if it could 'learn' the constant Pi. Using only the first 1,000 rows of the full data file, PySR generated:
import numpy as np
from pysr import PySRRegressor

feynmanTable = np.loadtxt("/Users/davidlaxer/AI-Feynman/example_data/example_II.34.2a.txt")
input = feynmanTable[:, :3]   # first three columns: the input variables
output = feynmanTable[:, -1]  # last column: the target value
model = PySRRegressor(
    loss="loss(x, y) = (x - y)^2",
    #loss="L1DistLoss()",
    niterations=1000,
    #niterations=10,
    binary_operators=["+", "*", "^", "-", "/"],
    unary_operators=["sin", "cos", "square", "log", "exp", "sqrt", "abs"],
    extra_sympy_mappings={},
)
# Fit on the first 1,000 rows only
model.fit(X=input[:1000], y=output[:1000])
...
PySRRegressor.equations_ = [
   pick  score      equation                                           loss          complexity
0        0.000000   0.56806177                                         1.959071e-01  1
1        0.232499   (1.3781261 / x2)                                   1.230565e-01  3
2        0.333228   (sqrt(x0) / x2)                                    8.818312e-02  4
3        0.314004   ((x0 / x2) * 0.47050688)                           6.441921e-02  5
4  >>>>  15.328202  ((0.15915495 / (x2 / x1)) * x0)                    3.126860e-15  7
5        0.063520   ((0.38872972 / abs((x2 / -0.40942314) / x1)) *...  2.584335e-15  10
6        0.033199   abs((abs(abs(0.17996888) / (x2 / x1)) * x0) / ...  2.418315e-15  12
]
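As a quick sanity check on the selected equation (index 4), its constant can be compared against 1/(2π), which is what the target form q*v/(2*pi*r) implies. This is a small illustrative sketch using only Python's math module; it was not part of the original run:

```python
import math

# Constant reported by PySR in the selected (complexity 7) equation.
found = 0.15915495
# Constant implied by the target formula q*v/(2*pi*r).
expected = 1 / (2 * math.pi)

relative_error = abs(found - expected) / expected
print(f"1/(2*pi)       = {expected:.10f}")
print(f"PySR constant  = {found:.10f}")
print(f"relative error = {relative_error:.1e}")
```

If the relative error is tiny, PySR has effectively recovered 1/(2π) as a fitted scalar, even though it has no symbolic notion of Pi.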
With the feynman_problem.py interface:
model = PySRRegressor(
    loss="loss(x, y) = (x - y)^2",
    niterations=1000,
    binary_operators=["+", "*", "^", "-", "/"],
    unary_operators=["sin", "cos", "square", "log", "exp", "sqrt", "abs"],
    extra_sympy_mappings={},
)
model.fit(problem.X, problem.y)
...
problem = problem_list[74]
problem
Feynman Equation: II.34.2a|Form: q*v/(2*pi*r)
run_on_problem(problem)
...
Cycles per second: 6.180e+04
Head worker occupation: 2.1%
Progress: 14908 / 15000 total iterations (99.387%)
==============================
Hall of Fame:
-----------------------------------------
Complexity  Loss       Score      Equation
1           1.443e+02  4.228e-07  5.0480013
2           1.433e+02  7.515e-03  sqrt(x1)
3           6.734e+01  7.548e-01  (x0 / x2)
4           6.425e+01  4.697e-02  exp(4.2469196 - x2)
5           5.363e+01  1.808e-01  ((1.2044919 ^ x0) / x2)
6           2.026e+01  9.733e-01  abs(x1 / (x2 + -0.32283887))
7           5.886e-13  2.303e+01  (((x0 * 0.15915494) / x2) * x1)
18          3.238e-13  5.434e-02  abs((abs(abs(abs(abs(abs(abs(x0) * -0.39923882))) * -0.39864594)) / abs(abs(x2))) * x1)
('complexity 18\n
loss 0.0\n
score 0.054342\n
equation abs((abs(abs(abs(abs(abs(abs(x0) * -0.39923882...\n
sympy_format 0.159154934683391*Abs(x1*Abs(Abs(Abs(Abs(Abs(A...\n
lambda_format PySRFunction(X=>0.159154934683391*Abs(x1*Abs(A...\n
Name: 7, dtype: object',
'q*v/(2*pi*r)',
{'time': 467.58346700668335,
'problem': Feynman Equation: II.34.2a|Form: q*v/(2*pi*r)})
I know I can add the number Pi as an additional column to the input data file. Do you think there would be any advantage(s) to allowing constants to be specified in the PySRRegressor?
I like that idea! Feel free to make a PR. Maybe something of the following form?
model = PySRRegressor(
    complexity_of_constants=100,  # to prevent PySR finding scalars
)
model.fit(X, y, constants={"pi": 3.14, "one": 1, "two": 2})
If not, I could eventually work on this, but it might take some time.
I think when passing constants like this, they would basically be added as additional columns to the input data X. Previously you would have to manually add columns to X with the constant value and set the variable_names.
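The manual workaround described above could look roughly like the following. This is only a sketch: `add_constant_columns` is a hypothetical helper written for illustration (it is not part of PySR), and the random array is a toy stand-in for the three Feynman input columns, not the real data file.

```python
import numpy as np


def add_constant_columns(X, variable_names, constants):
    """Return (X_aug, names) with one extra column per named constant.

    Hypothetical helper for illustration; not a PySR function.
    """
    X = np.asarray(X, dtype=float)
    n_rows = X.shape[0]
    # One column per constant, each filled with that constant's value.
    const_cols = [np.full(n_rows, value, dtype=float) for value in constants.values()]
    X_aug = np.column_stack([X, *const_cols])
    names = list(variable_names) + list(constants.keys())
    return X_aug, names


# Toy stand-in for the three input columns; not the real Feynman data.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 5.0, size=(5, 3))

X_aug, names = add_constant_columns(X, ["x0", "x1", "x2"], {"pi": np.pi})
print(names)        # ['x0', 'x1', 'x2', 'pi']
print(X_aug.shape)  # (5, 4)

# Assuming PySR is installed and `model` and `y` are defined as earlier in the
# thread, the augmented inputs would then be passed along the lines of:
# model.fit(X_aug, y, variable_names=names)
```

A `constants=` argument to `fit`, as proposed above, would essentially do this column bookkeeping for the user.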