thundersvm
How to configure n_jobs in SVC or RandomizedSearchCV?
I'm using CPU-only ThunderSVM on a CentOS cluster where each compute node has 28 cores. I ran into a problem when trying to optimize my parameters with RandomizedSearchCV, because I do not know how to set the 'n_jobs' value in SVC() and in RandomizedSearchCV(). The former seems to let each SVM model use n_jobs threads, and the latter seems to spread the parameter combinations across n_jobs processes (PIDs).
My question is how to set these two n_jobs values so that they fit the core limit of my compute node. Do the two n_jobs have different meanings? By the way, I noticed another issue saying that ThunderSVM does not support GridSearchCV's n_jobs in scikit-learn. Is this true?
My code is as follows:
# ThunderSVM
from thundersvm import SVC
from sklearn.model_selection import RandomizedSearchCV
import joblib

parameters = {
    'C': [1, 5, 9],
    'gamma': [0.00001, 0.0001, 0.001, 0.1],
    'kernel': ['rbf']
}

# I just set both of them to 5
svm_model = SVC(kernel='rbf', probability=False, random_state=42, max_iter=1000000, n_jobs=5, verbose=1)
rdsearch = RandomizedSearchCV(estimator=svm_model, param_distributions=parameters, n_iter=10, cv=3, n_jobs=5, verbose=1, random_state=42)

# train model
rdsearch.fit(X_train, y_train)
print(f"Best parameters: {rdsearch.best_params_}")
print(f"Best score: {rdsearch.best_score_}")

# save model
joblib.dump(rdsearch, "ThunderSVM_model.joblib")
I used to set it to the number of threads on my system (26 cores, 56 threads), and it maxed out every core on my computer. The same happened when I set n_jobs to the number of cores. Ultimately, it depends on how well your system handles heavy workloads: setting n_jobs to the maximum number of threads is likely to cause an I/O bandwidth bottleneck and then CPU throttling, unless your system is capable enough. Setting it to the number of cores is the safer option.
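One way to stay within a node's core limit when both n_jobs values are in play is to treat them as multiplying: if the search runs several fits at once and each fit uses several threads, the total is roughly their product. The sketch below is only a rule of thumb under that assumption, not something confirmed by the ThunderSVM documentation, and it does not settle whether ThunderSVM works with the search's n_jobs at all. The helper name split_core_budget and its parameters are hypothetical.

```python
import os


def split_core_budget(total_cores, search_jobs):
    """Split a core budget between the search and each model (hypothetical helper).

    Returns (search_jobs, model_threads) such that
    search_jobs * model_threads <= total_cores.
    """
    if total_cores < 1 or search_jobs < 1:
        raise ValueError("total_cores and search_jobs must be >= 1")
    # Never run more parallel fits than there are cores.
    search_jobs = min(search_jobs, total_cores)
    # Give each fit an equal share of the remaining budget, at least 1 thread.
    model_threads = max(1, total_cores // search_jobs)
    return search_jobs, model_threads


if __name__ == "__main__":
    # os.cpu_count() reports logical CPUs (threads), which may exceed
    # the number of physical cores; pass the core count explicitly if known.
    total = os.cpu_count() or 1
    outer, inner = split_core_budget(total, 4)
    print(f"RandomizedSearchCV n_jobs={outer}, SVC n_jobs={inner}")
```

On a 28-core node, split_core_budget(28, 4) gives 4 parallel fits with 7 threads each. The first value would go to RandomizedSearchCV(n_jobs=...) and the second to SVC(n_jobs=...). If the search-level n_jobs turns out not to be usable with ThunderSVM, setting it to 1 and giving all cores to SVC keeps the same budget.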