How to configure n_jobs in SVC or RandomizedSearchCV?

Open ZYSK0 opened this issue 1 year ago • 1 comments

I'm using CPU-only ThunderSVM on a CentOS cluster where each compute node has 28 cores. I ran into a problem when optimizing my parameters with RandomizedSearchCV, because I do not know how to set the 'n_jobs' value in SVC() and in RandomizedSearchCV(). The former seems to let each SVM model use n_jobs threads, and the latter seems to let the parameter combinations be evaluated in n_jobs parallel processes.

My question is how to set these two n_jobs values so that they respect the core limit of my compute nodes. Do the two n_jobs parameters have different meanings? By the way, I noticed another issue saying that ThunderSVM does not support GridSearchCV's n_jobs in scikit-learn. Is this true?

My code is as follows:

# ThunderSVM
from sklearn.model_selection import RandomizedSearchCV
from thundersvm import SVC

parameters = {'C': [1, 5, 9], 'gamma': [0.00001, 0.0001, 0.001, 0.1], 'kernel': ['rbf']}

# I just set both of them to 5
svm_model = SVC(kernel='rbf', probability=False, random_state=42, max_iter=1000000, n_jobs=5, verbose=1)

rdsearch = RandomizedSearchCV(estimator=svm_model, param_distributions=parameters, n_iter=10, cv=3, n_jobs=5, verbose=1, random_state=42)

# train model
rdsearch.fit(X_train, y_train)

print(f"Best parameters: {rdsearch.best_params_}")
print(f"Best score: {rdsearch.best_score_}")

# save model
import joblib
joblib.dump(rdsearch, "ThunderSVM_model.joblib")

ZYSK0, Aug 04 '24 13:08

I used to set it to the number of threads on my system (26 cores, 56 threads), and it maxed out every core on my computer. The same happened when I set n_jobs to the number of cores. Ultimately, it depends on how well your system handles heavy workloads: setting n_jobs to the maximum number of threads is likely to cause an I/O bandwidth bottleneck and then CPU throttling, unless your system is capable enough. Setting it to the number of cores is the safer option.
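For the case where both n_jobs values are set at once, here is a minimal sketch of how the core budget could be split. It assumes the two settings multiply, i.e. that RandomizedSearchCV runs up to n_jobs fits in parallel and each fit's SVC uses up to its own n_jobs threads, as described in the question. The helper split_core_budget is hypothetical and is not part of ThunderSVM or scikit-learn.

```python
def split_core_budget(total_cores, outer_jobs):
    """Hypothetical helper: choose (outer_jobs, inner_jobs) so that
    outer_jobs * inner_jobs never exceeds total_cores.

    outer_jobs -> value intended for RandomizedSearchCV(n_jobs=...)
    inner_jobs -> value intended for SVC(n_jobs=...)
    """
    if total_cores < 1 or outer_jobs < 1:
        raise ValueError("total_cores and outer_jobs must be positive")
    # Never run more parallel fits than there are cores.
    outer_jobs = min(outer_jobs, total_cores)
    # Give each fit an equal share of the remaining budget, at least 1.
    inner_jobs = max(1, total_cores // outer_jobs)
    return outer_jobs, inner_jobs


total_cores = 28  # cores per node, taken from the question
search_jobs, svm_jobs = split_core_budget(total_cores, outer_jobs=4)
print(search_jobs, svm_jobs, search_jobs * svm_jobs)

# Assumed usage, with the estimator and search from the question:
#   svm_model = SVC(..., n_jobs=svm_jobs)
#   rdsearch = RandomizedSearchCV(..., n_jobs=search_jobs)
```

Under the same assumption, the values in the question (5 and 5) would use at most 25 threads, which fits on a 28-core node. If the search-level n_jobs turns out not to work with ThunderSVM, as the other issue mentioned in the question suggests, the fallback would be split_core_budget(28, 1), which gives all 28 cores to SVC and runs the fits one at a time.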

DeltaGa, Jan 09 '25 23:01