WolframClientForPython
WolframClientForPython copied to clipboard
SocketAborted problem
My Environment: Activated MMA 12.0 works correctly on my Ubuntu 16.04 system.
I install WolframClient with following code:
pip install wolframclient -i https://pypi.doubanio.com/simple
Then I execute following codes in python 3.6.10:
from wolframclient.evaluation import WolframLanguageSession
sess = WolframLanguageSession(path)
sess.start()
where, the path I try different values like the result of :
$ which WolframKernel
in my case, it's '/usr/local/bin/WolframKernel' , or the output of
First@$CommandLine
which is '/usr/local/Wolfram/Mathematica/12.0/SystemFiles/Kernel/Binaries/Linux-x86-64/WolframKernel' or use this path '/usr/local/Wolfram/Mathematica/12.0/Executables/WolframKernel' which is default path of WolframLanguageSession().
Then errors are vary when I choose different paths, but can be concluded into 2 types.
- When path = '/usr/local/Wolfram/Mathematica/12.0/SystemFiles/Kernel/Binaries/Linux-x86-64/WolframKernel' , it immediately returns following errors info:
Socket exception: Socket operation aborted.
Failed to start.
---------------------------------------------------------------------------
SocketAborted Traceback (most recent call last)
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/kernelcontroller.py in _kernel_start(self)
436 timeout=self.get_parameter("STARTUP_TIMEOUT"),
--> 437 abort_event=_StartEvent(self.kernel_proc, self.trigger_termination_requested),
438 )
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/zmqsocket.py in recv_abortable(socket, timeout, abort_check_period, abort_event, **kwargs)
49 if abort_event.is_set():
---> 50 raise SocketAborted("Socket operation aborted.")
51 if timeout and (time.perf_counter() - start > timeout):
SocketAborted: Socket operation aborted.
During handling of the above exception, another exception occurred:
WolframKernelException Traceback (most recent call last)
<ipython-input-4-84c234a8332d> in <module>
----> 1 sess.start()
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/localsession.py in start(self, block, timeout)
155 self.terminate()
156 finally:
--> 157 raise e
158
159 def start_future(self):
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/localsession.py in start(self, block, timeout)
150 future = self.start_future()
151 if future and block:
--> 152 future.result(timeout=timeout)
153 except Exception as e:
154 try:
~/anaconda3/envs/detectron2/lib/python3.6/concurrent/futures/_base.py in result(self, timeout)
430 raise CancelledError()
431 elif self._state == FINISHED:
--> 432 return self.__get_result()
433 else:
434 raise TimeoutError()
~/anaconda3/envs/detectron2/lib/python3.6/concurrent/futures/_base.py in __get_result(self)
382 def __get_result(self):
383 if self._exception:
--> 384 raise self._exception
385 else:
386 return self._result
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/kernelcontroller.py in run(self)
525 # Kernel start requested.
526 if payload is self.START:
--> 527 self._safe_kernel_start()
528 future.set_result(True)
529 future = None
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/kernelcontroller.py in _safe_kernel_start(self)
364 self._kernel_terminate()
365 finally:
--> 366 raise e
367
368 _KERNEL_OK = b"OK"
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/kernelcontroller.py in _safe_kernel_start(self)
358 """ Start a kernel. If something went wrong, clean-up resources that may have been created. """
359 try:
--> 360 self._kernel_start()
361 except Exception as e:
362 logger.warning("Failed to start.")
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/kernelcontroller.py in _kernel_start(self)
454 logger.warning("Socket exception: %s", se)
455 raise WolframKernelException(
--> 456 "Failed to communicate with kernel: %s." % self.kernel
457 )
458
WolframKernelException: Failed to communicate with kernel: [path].
- When path != '/usr/local/Wolfram/Mathematica/12.0/SystemFiles/Kernel/Binaries/Linux-x86-64/WolframKernel', it will wait for several seconds and return following errors:
Socket exception: Failed to read any message from socket tcp://127.0.0.1:39963 after 20.1 seconds and 200 retries.
Failed to start.
---------------------------------------------------------------------------
SocketOperationTimeout Traceback (most recent call last)
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/kernelcontroller.py in _kernel_start(self)
436 timeout=self.get_parameter("STARTUP_TIMEOUT"),
--> 437 abort_event=_StartEvent(self.kernel_proc, self.trigger_termination_requested),
438 )
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/zmqsocket.py in recv_abortable(socket, timeout, abort_check_period, abort_event, **kwargs)
54 "Failed to read any message from socket %s after %.1f seconds and %i retries."
---> 55 % (socket.uri, time.perf_counter() - start, retry)
56 )
SocketOperationTimeout: Failed to read any message from socket tcp://127.0.0.1:39963 after 20.1 seconds and 200 retries.
During handling of the above exception, another exception occurred:
WolframKernelException Traceback (most recent call last)
<ipython-input-3-84c234a8332d> in <module>
----> 1 sess.start()
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/localsession.py in start(self, block, timeout)
155 self.terminate()
156 finally:
--> 157 raise e
158
159 def start_future(self):
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/localsession.py in start(self, block, timeout)
150 future = self.start_future()
151 if future and block:
--> 152 future.result(timeout=timeout)
153 except Exception as e:
154 try:
~/anaconda3/envs/detectron2/lib/python3.6/concurrent/futures/_base.py in result(self, timeout)
430 raise CancelledError()
431 elif self._state == FINISHED:
--> 432 return self.__get_result()
433 else:
434 raise TimeoutError()
~/anaconda3/envs/detectron2/lib/python3.6/concurrent/futures/_base.py in __get_result(self)
382 def __get_result(self):
383 if self._exception:
--> 384 raise self._exception
385 else:
386 return self._result
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/kernelcontroller.py in run(self)
525 # Kernel start requested.
526 if payload is self.START:
--> 527 self._safe_kernel_start()
528 future.set_result(True)
529 future = None
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/kernelcontroller.py in _safe_kernel_start(self)
364 self._kernel_terminate()
365 finally:
--> 366 raise e
367
368 _KERNEL_OK = b"OK"
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/kernelcontroller.py in _safe_kernel_start(self)
358 """ Start a kernel. If something went wrong, clean-up resources that may have been created. """
359 try:
--> 360 self._kernel_start()
361 except Exception as e:
362 logger.warning("Failed to start.")
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/kernelcontroller.py in _kernel_start(self)
454 logger.warning("Socket exception: %s", se)
455 raise WolframKernelException(
--> 456 "Failed to communicate with kernel: %s." % self.kernel
457 )
458
WolframKernelException: Failed to communicate with kernel: [path]
Some preliminary questions:
- What version of the library do you grab from the package repository you specified?
- Can you run the executables from a terminal and get a prompt? (
In[1]:=
)
Answer:
- The version of wolframclient library is 1.1.2.
- Yes. Every executable works well from all different paths.
By the way, my windows10_x86_64 laptop works well with MMA 11.3 and 1.1.2 library. On my laptop, I executed wl.NSolve('- 2*x + 3 == 0') expecting a "x=1.5" result, but only got a empty tuple. What's the usage of this NSolve function in python?
Can you provide a minimal example script that trigger the errors. From what I see above no code snippet is a working example (e.g. it is missing path).
Also make sure to close session or use with
when initializing sessions otherwise orphan kernels are spawned.
Below two examples will give errors in one of 2 types depending on the path I choose. minimal example1:
with WolframLanguageSession(path) as sess:
sess.start()
minimal example2:
with WolframLanguageSession(path) as sess:
sess.evaluate(wl.range(5))
Interestingly and strangely, I install MMA 12.0 and 1.1.2 library in docker container. In the container, minimal example scripts work well.
These are not minimal working example scripts. Think of a piece of code that could work if stored in a file and run using:
$ python3 example.py
For example they both miss the import statements.
Thanks.
These are not minimal working example scripts. Think of a piece of code that could work if stored in a file and run using
It's difficult to find the core part that generates the problem. But I tried to figure out it and found some clues.
Type 1 error occurs because abort_event.is_set() is True
,where is_set()
function will return self.subprocess.poll() is not None or self.abort_event.is_set()
. I found that self.abort_event.is_set()
is False, self.subprocess.poll()
is 127. So it is self.subprocess
that causes the problem.
Then I tried to find where the self.subprocess
value is assigned. I found that in the 419 line of file
~/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/kernelcontroller.py
in function _kernel_start()
, there is a code block which assigns value to self.kernel_proc first and assgins value to self.subprocess later:
self.kernel_proc = Popen(cmd,
stdin=self._stdin,
stdout=self._stdout,
stderr=self._stderr,
startupinfo=startupinfo,)
so, I print out arguments that convey to the Popen(): cmd:
['/usr/local/Wolfram/Mathematica/12.0/SystemFiles/Kernel/Binaries/Linux-x86-64/WolframKernel', '-noprompt', '-initfile', '/home/[my_user_name]/anaconda3/envs/detectron2/lib/python3.6/site-packages/wolframclient/evaluation/kernel/initkernel.m', '-run', 'ClientLibrary`Private`SlaveKernelPrivateStart["tcp://127.0.0.1:41535", "tcp://127.0.0.1:40675"];']
stdin: -1, stdout: -1, stderr: -1, startupinfo: None
Then, 437 line in file kernelcontroller.py, self.kernel_proc
will convey to self.subprocess
by following codes:
abort_event=_StartEvent(self.kernel_proc, self.trigger_termination_requested)
, where self
refers to class _StartEvent
. But strangely, I found that the value of self.kernel_proc
here is still None before conveying!
Therefore, I dug into function recv_abortable
in file zmqsocket.py, to find out exactly when self.subprocess.poll()
changes to 127. I changed the codes block of recv_abortable
into
def recv_abortable(
socket, timeout=None, abort_check_period=0.1, abort_event=None, **kwargs
):
print(abort_event.is_set()) # print False
if not socket.bound:
raise SocketException("ZMQ socket not bound.")
print(abort_event.is_set()) # print True !!
if timeout and timeout < 0:
raise ValueError("Timeout must be a positive number.")
It seems that after executing socket.bound
, self.subprocess.poll()
changes to 127!
Hope it will help!
I think I have the same or similar issue on Wolfram Client Library version 1.1.4. I described this in this StackExchange post. My technical knowledge is limited but if there is a way to solve this please let me know.
It's not possible to debug issues without a minimal runnable script which hasn't been provided. As stated on the above mentioned StackExchange post, it's possible that some orphan kernels exist.
Use ps -eaf | grep "WolframKernel.*SlaveKernelPrivateStart" | grep -v grep
to find them. If you have none, then please provide a working minimal example so I can try to understand what could go wrong.
Thanks.
Facing the same issue on Ubuntu 20.04 with wolframclient==1.1.6 installed in a conda environment.
I cannot provide a minimal working example since nothing works, so here is my minimal crashing example:
from wolframclient.evaluation import WolframLanguageSession
with WolframLanguageSession() as session:
pass
This prints
Socket exception: Socket operation aborted.
Failed to start.
and the handling of that exception throws
Failed to communicate with kernel: /usr/local/Wolfram/Mathematica/12.1/Executables/WolframKernel.
Your explanation with the orphan kernels seems plausible since it worked once in the beginning, but the problem persists after reboot.
ps -eaf | grep "WolframKernel.*SlaveKernelPrivateStart" | grep -v grep
does not find anything.
Here are some investigations to try narrow the source of bugs. The problem you're facing is that the kernel process ends abruptly. @MaxKinny that's why self.subprocess.poll()
has a value. Whereas it should be None
as long as the process is running (see doc).
Could you setup Python logging to see if there is anything useful printed using:
import logging
logging.basicConfig(level=logging.DEBUG)
It'll print to stdout. You should see the command used to start the WolframLanguage kernel. On my machine I get something like this:
DEBUG:wolframclient.evaluation.kernel.kernelcontroller:Kernel called using command: /Applications/Mathematica.app/Contents/MacOS/WolframKernel -noprompt -initfile /Users/dorianb/.../WolframClientForPython/wolframclient/evaluation/kernel/initkernel.m -run ClientLibrary`Private`KernelPrivateStart["tcp://127.0.0.1:57055", "tcp://127.0.0.1:57057"];.
INFO:wolframclient.evaluation.kernel.kernelcontroller:Kernel process started with PID: 33202
Note that this is the code for wolframclient 1.1.6
. I'll use it later, and you'd need to use the one printed on your machine, if running an older version of the library.
Using this command, you can try to mock the python side with Wolfram Language and see if the problem remains.
In a fresh kernel, open two ZMQ sockets with appropriate types:
Needs["ZeroMQLink`"];
out = SocketOpen[Automatic, "ZMQ_PUSH"]
in = SocketOpen[Automatic, "ZMQ_PULL"]
portout = out["DestinationPort"]
portin = in["DestinationPort"]
Now edit the starting command with the above ports. In my case the command becomes:
/Applications/Mathematica.app/Contents/MacOS/WolframKernel -noprompt -initfile /Users/dorianb/.../WolframClientForPython/wolframclient/evaluation/kernel/initkernel.m -run 'ClientLibrary`Private`KernelPrivateStart["tcp://127.0.0.1:53004", "tcp://127.0.0.1:53008"];'
And run it in a terminal. Hopefully you'll see an error in your terminal that will help debugging your issue.
While the above runs, go back to the kernel and run the following. First read the "OK" string returned when the process has started:
ByteArrayToString[SocketReadMessage[in]] === "OK"
If you get this it means that the kernel initialization was successful. I doubt you'll receive the message. But if you do, you can also try to evaluate something using:
ZMQSocketWriteMessage[out, BinarySerialize[{$MachineName, Now}]]
wxf = SocketReadMessage[in]
Also for the record, I just was able to run a 12.0
kernel using the version 1.1.6
of the library, but on MacOS, not Ubuntu.
import logging logging.basicConfig(level=logging.DEBUG)
Used this code to help debug and realized that one of my Wolfram folders had the wrong permissions on Ubuntu. Must have accidentally run as root somewhere. Thanks for that tip!
I'm glad you were able to track this down. Do you mind telling me what type of folder it was? I may add some pre-flight checks to the library so that it provides better error messages.
I'm glad you were able to track this down. Do you mind telling me what type of folder it was? I may add some pre-flight checks to the library so that it provides better error messages.
/home/ubuntu/.cache/Wolfram/WolframScript
accidentally belonged to root.