PySR
Distributed PySR not working on PBS cluster
Hello Miles, thank you for your excellent work. I am interested in searching for SR models in many fields of my research, and in the past I used Eureqa (https://link.springer.com/article/10.1007/s10710-010-9124-z). Recently I have been using your PySR (e.g. https://www.mdpi.com/2075-1680/11/9/463). I read your tips for running PySR on a cluster and tried it on our cluster Barbora (https://www.it4i.cz/en), but I ran into several errors:
- Permission denied for .julia/environments/v1.8/Project.toml: solved by exporting JULIA_DEPOT_PATH, JULIA_PROJECT and JULIA_LOAD_PATH, pointing them to a local directory in scratch.
- I couldn't run pysr.install() from Python: solved by installing manually, directly in Julia, with import Pkg; Pkg.add("SymbolicRegression").
- Now I get the following error, which I couldn't resolve (I don't have any experience with Julia):
Error launching workers
ErrorException("")
Activating environment on workers.
Importing installed module on workers...Finished!
Testing module on workers...Finished!
Testing entire pipeline on workers...Finished!
Traceback (most recent call last):
File "/home/myname/myscript.py", line 49, in init
to the reducer
Please, do you have any tips to solve this error? What am I doing wrong? Thank you in advance! Best regards, Renata
Version: OS: Red Hat Enterprise Linux 8.4 (Ootpa) Julia 1.8.5 Python 3.10.8 PySR 0.12.1
I used these option settings, inspired by your advice:
model = PySRRegressor(
    niterations=500000,
    population_size=108,
    binary_operators=["+", "*", "/", "^", "-"],
    unary_operators=["abs", "cos", "log", "exp", "sin"],
    loss="L1DistLoss()",
    procs=36,
    cluster_manager="pbs",
    ncyclesperiteration=5000,
    turbo=True,
    maxdepth=7,
    parsimony=0.0001,
    weight_optimize=0.001,
    adaptive_parsimony_scaling=1000,
    nested_constraints={"sin": {"sin": 0, "cos": 0}, "cos": {"sin": 0, "cos": 0}},
)
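The first workaround in the list above (pointing Julia at a writable scratch directory) can also be applied from inside the Python script, before PySR is imported. The following is only a sketch: the helper names and the subdirectory layout are hypothetical, and it sets only JULIA_DEPOT_PATH and JULIA_PROJECT, because the post does not say which value was used for JULIA_LOAD_PATH.

```python
import os


def julia_env_for_scratch(scratch_dir):
    """Build Julia environment variables that point at a writable scratch dir.

    The subdirectory names are hypothetical; any writable location works.
    """
    return {
        "JULIA_DEPOT_PATH": os.path.join(scratch_dir, "julia_depot"),
        "JULIA_PROJECT": os.path.join(scratch_dir, "julia_project"),
    }


def apply_julia_env(scratch_dir, environ=None):
    """Write the variables into `environ` (defaults to os.environ)."""
    if environ is None:
        environ = os.environ
    environ.update(julia_env_for_scratch(scratch_dir))
    return environ


# Usage (assumption: the scratch path below is a placeholder for your own):
#   apply_julia_env("/path/to/your/scratch")
#   import pysr  # import only after the variables are set
```

The variables have to be set before Julia starts, which is why the call would need to come before the PySR import.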
Hi @praksovar,
Everything looks good to me in your options.
- Can you share the full error message? If it is long perhaps you could put it in a gist.github.com?
- Is procs=36 the number of cores over your entire allocation, or the number of cores per node? (It should be the number of cores over the entire allocation, i.e., num_nodes * num_cores_per_node.)
- How are you launching this script: from the head node, or once per node? (It should be launched only from the head node; Julia will be able to create workers across the allocation.)
Cheers, Miles
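As a rough illustration of the num_nodes * num_cores_per_node rule in the reply above, the value for procs could be computed inside the job rather than hard-coded. This sketch assumes the scheduler exports PBS_NODEFILE and that the file lists the hostnames of the allocated nodes (possibly repeated); the helper name total_procs is hypothetical, and the cores-per-node figure has to be supplied by the user.

```python
import os


def total_procs(cores_per_node, nodefile=None):
    """Return num_nodes * cores_per_node for the current allocation.

    num_nodes is taken as the number of distinct hostnames in the PBS node
    file. If no node file is available, a single node is assumed.
    """
    nodefile = nodefile or os.environ.get("PBS_NODEFILE")
    num_nodes = 1
    if nodefile and os.path.isfile(nodefile):
        with open(nodefile) as fh:
            hosts = {line.strip() for line in fh if line.strip()}
        num_nodes = max(len(hosts), 1)
    return num_nodes * cores_per_node
```

For the single 36-core node described in this thread, this would give 36; for two such nodes, 72.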
Hi @praksovar,
Just wanted to ping you on this. Please provide more details if possible so I can help fix it.
Cheers, Miles
Hi Miles, thank you for your reply. The error I asked about previously was solved by our support. The code now runs correctly, but only on 15-16 of the 36 cores, i.e. at about 50% load. I am using one node with 36 cores (ncpus=36).
The settings are as follows:
model = PySRRegressor(
    niterations=50000,
    population_size=216,
    binary_operators=["+", "*", "/", "^", "-"],
    unary_operators=["exp", "log", "abs"],
    loss="L1DistLoss()",
    multithreading=True,
    procs=36,
    cluster_manager="pbs",
    ncyclesperiteration=5000,
    turbo=True,
    maxdepth=7,
    parsimony=0.0001,
    weight_optimize=0.001,
    adaptive_parsimony_scaling=1000,
)
So I took your example and ran it on our cluster both in Python with PySR and in Julia with SymbolicRegression.jl. I found that Julia uses all 36 cores, whereas Python uses only 15-16.
Python:
import numpy as np
from pysr import PySRRegressor

X = np.random.random((5, 100))
y = 2 * np.cos(X[4, :]) + X[1, :] ** 2 - 2
model = PySRRegressor(
    binary_operators=["+", "*", "/", "^", "-"],
    unary_operators=["cos", "exp"],
    population_size=540,
    niterations=400,
    ncyclesperiteration=5000,
    turbo=True,
    multithreading=True,
)
model.fit(X.T, y)
Julia:
using SymbolicRegression

X = randn(Float32, 5, 100)
y = 2 * cos.(X[4, :]) + X[1, :] .^ 2 .- 2
options = SymbolicRegression.Options(
    binary_operators=[+, *, /, -],
    unary_operators=[cos, exp],
    npopulations=540,
    ncyclesperiteration=5000,
    turbo=true,
)
hall_of_fame = EquationSearch(
    X, y, niterations=40, options=options, parallelism=:multithreading
)
Do you have any idea why? Thank you. Cheers, Renata
Hi @praksovar,
Sorry for the late reply. The issue is that you are using multithreading=True. You need to have multithreading=False for multiprocessing mode to be enabled.
Likewise in the pure Julia mode, you need to use parallelism=:multiprocessing, addprocs_function=addprocs_pbs, rather than parallelism=:multithreading.
Cheers, Miles
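Applied to the Python settings posted earlier in the thread, the fix amounts to flipping a single flag. The sketch below collects the three parallelism-related options in one place; the helper names are hypothetical, the parameter names are the ones already used in this thread, and the remaining search options are left to the caller.

```python
def pbs_multiprocessing_kwargs(total_cores):
    """Parallelism options for running PySR workers across a PBS allocation.

    multithreading must be False, otherwise threads on a single process are
    used instead of separate worker processes.
    """
    return {
        "procs": total_cores,       # cores over the entire allocation
        "cluster_manager": "pbs",
        "multithreading": False,
    }


def build_model(total_cores, **search_options):
    """Create a PySRRegressor; PySR is imported lazily, only when called."""
    from pysr import PySRRegressor

    return PySRRegressor(**search_options, **pbs_multiprocessing_kwargs(total_cores))


# Usage (launched once, from the head node of the allocation):
#   model = build_model(36, niterations=400, binary_operators=["+", "*"])
#   model.fit(X.T, y)
```

Keeping the parallelism options in one helper makes it harder to leave multithreading=True behind when copying settings between scripts.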