libsvm
svm_train() in svmutil.py is not thread-safe
To be more exact, svm_set_print_string_function() in svm.cpp is not thread-safe, because it modifies the global function pointer svm_print_string. If you call svm_train() from multiple Python threads with the '-q' option, each call converts print_null() to a C-style function pointer and assigns it to svm_print_string in order to suppress the output.
However, the function pointer generated by each call seems to point to a different address, so concurrent calls can corrupt svm_print_string and crash the program.
I suggest making the printing function pointer independent across function calls. A global print option is neither thread-safe nor practical.
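Until the pointer is made per-call, one possible stopgap is to serialize every call that touches it behind a single lock. The sketch below is only an illustration: `locked_call` and `_train_lock` are hypothetical names, not part of libsvm, and it assumes that all code paths which set or use svm_print_string go through this same wrapper. It gives up parallel training in exchange for avoiding the race.

```python
import threading

# Hypothetical module-level lock shared by every thread that trains a model.
_train_lock = threading.Lock()

def locked_call(train_fn, *args, **kwargs):
    """Run train_fn(*args, **kwargs) while holding the shared lock.

    Only one thread at a time can be inside train_fn, so the global
    print pointer is never reassigned while another call is using it.
    """
    with _train_lock:
        return train_fn(*args, **kwargs)
```

With libsvm's Python interface this would be used as `locked_call(svm_train, y, x, '-q')` instead of calling `svm_train(y, x, '-q')` directly.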
There are a few other places where libsvm isn't thread-safe, some of them for historical reasons. We may handle the important ones in the near future, though the plan is still uncertain.
On 2020-04-30 19:03, wyykak wrote:
> To be more exact, svm_set_print_string_function() in svm.cpp is not thread-safe, as it modifies a global function pointer svm_print_string. If you call svm_train() in multiple Python threads with '-q' option, it will try to convert print_null() to a C-style function pointer and assign it to svm_print_string in order to suppress the output.
> However, the function pointer generated each time seems to point to different addresses, which makes it possible to corrupt svm_print_string and crash the program.
> I suggest to make the printing function pointer independent in different function calls. A global print option is neither thread-safe nor practical.
Thanks for your reply! In fact, I have tried to use libsvm together with multiprocessing in Python, but SVM models, being C structs, cannot be serialized and transferred between processes. Currently libsvm only supports saving models as ASCII files. Maybe you could add an interface to save models as strings? That would make multiprocessing a viable way to parallelize libsvm.
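In the meantime, a rough workaround is to round-trip the model through a temporary file and pass the file's text between processes. The helpers below are a sketch under assumptions: `model_to_string` and `model_from_string` are hypothetical names, and the `save_fn` and `load_fn` arguments are assumed to behave like svmutil's `svm_save_model(path, model)` and `svm_load_model(path)`.

```python
import os
import tempfile

def model_to_string(model, save_fn):
    """Save the model to a temporary file with save_fn(path, model)
    and return the file's contents as a str (which is picklable)."""
    fd, path = tempfile.mkstemp(suffix=".model")
    os.close(fd)  # save_fn opens the path itself
    try:
        save_fn(path, model)
        with open(path, "r") as f:
            return f.read()
    finally:
        os.remove(path)

def model_from_string(text, load_fn):
    """Write text to a temporary file and rebuild the model with load_fn(path)."""
    fd, path = tempfile.mkstemp(suffix=".model")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
        return load_fn(path)
    finally:
        os.remove(path)
```

A worker process could then return `model_to_string(model, svm_save_model)`, and the parent could call `model_from_string(text, svm_load_model)` on the result. This costs extra disk I/O per model, so it is a stopgap and not a replacement for a real string interface.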