pymc-bart
Broken pipe failures during sampling on macOS
Describe the bug
When sampling BART models on macOS, I frequently (but not always) get broken pipe errors towards the end of sampling runs, presumably due to multiprocessing.
PMB version: 0.5.7
PyMC version: 5.10.3
Python version: 3.10
Additional context
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/sampling/parallel.py", line 122, in run
    self._start_loop()
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/sampling/parallel.py", line 174, in _start_loop
    point, stats = self._step_method.step(self._point)
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/step_methods/compound.py", line 231, in step
    point, sts = method.step(point)
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/step_methods/arraystep.py", line 100, in step
    apoint, stats = self.astep(q)
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/site-packages/pymc_bart/pgbart.py", line 293, in astep
    self.bart.all_trees.append(self.all_trees)
  File "<string>", line 2, in append
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/multiprocessing/managers.py", line 817, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/multiprocessing/connection.py", line 211, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/multiprocessing/connection.py", line 410, in _send_bytes
    self._send(buf)
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
"""
The above exception was the direct cause of the following exception:
BrokenPipeError Traceback (most recent call last)
File ~/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/sampling/parallel.py:122, in run()
    121 self._point = self._make_numpy_refs()
--> 122 self._start_loop()
    123 except KeyboardInterrupt:
File ~/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/sampling/parallel.py:174, in _start_loop()
    173 try:
--> 174 point, stats = self._step_method.step(self._point)
    175 except SamplingError as e:
File ~/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/step_methods/compound.py:231, in step()
    230 for method in self.methods:
--> 231 point, sts = method.step(point)
    232 stats.extend(sts)
File ~/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/step_methods/arraystep.py:100, in step()
     98 q = DictToArrayBijection.map(var_dict)
--> 100 apoint, stats = self.astep(q)
    102 if not isinstance(apoint, RaveledVars):
    103 # We assume that the mapping has stayed the same
File ~/mambaforge/envs/pie/lib/python3.10/site-packages/pymc_bart/pgbart.py:293, in astep()
    292 if not self.tune:
--> 293 self.bart.all_trees.append(self.all_trees)
    295 stats = {"variable_inclusion": variable_inclusion, "tune": self.tune}
File <string>:2, in append()
File ~/mambaforge/envs/pie/lib/python3.10/multiprocessing/managers.py:817, in _callmethod()
    815 conn = self._tls.connection
--> 817 conn.send((self._id, methodname, args, kwds))
    818 kind, result = conn.recv()
File ~/mambaforge/envs/pie/lib/python3.10/multiprocessing/connection.py:211, in send()
    210 self._check_writable()
--> 211 self._send_bytes(_ForkingPickler.dumps(obj))
File ~/mambaforge/envs/pie/lib/python3.10/multiprocessing/connection.py:410, in _send_bytes()
    409 self._send(header)
--> 410 self._send(buf)
    411 else:
    412 # Issue #20540: concatenate before sending, to avoid delays due
    413 # to Nagle's algorithm on a TCP socket.
    414 # Also note we want to avoid sending a 0-length buffer separately,
    415 # to avoid "broken pipe" errors if the other end closed the pipe.
File ~/mambaforge/envs/pie/lib/python3.10/multiprocessing/connection.py:373, in _send()
    372 while True:
--> 373 n = write(self._handle, buf)
    374 remaining -= n
BrokenPipeError: [Errno 32] Broken pipe
Note that this occurs even when running single chains, which is odd since there should be no multiprocessing going on. It appears that CompoundStep uses multiprocessing even when there is a single chain.
This also occurs on Python 3.11.
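For anyone reading the traceback: the failing call is `conn.send(...)` inside `multiprocessing/managers.py`, which means the `append` on `all_trees` goes through a multiprocessing connection rather than a plain list. The sketch below is not a reproduction of the pymc-bart failure and does not establish its cause. It only shows, using the standard library alone, that `Connection.send` raises `BrokenPipeError: [Errno 32]` when the other end of the connection has already been closed. The function name `send_to_closed_peer` and the payload are made up for illustration, and the idea that the receiving end goes away before the last `append` is an assumption. It assumes a POSIX system such as macOS or Linux.

```python
import errno
from multiprocessing import Pipe


def send_to_closed_peer():
    """Send on a Connection whose other end is closed; return the error raised."""
    # Pipe() returns two connected multiprocessing.connection.Connection objects.
    receiver, sender = Pipe()
    # Closing the receiver stands in for a peer process that has gone away
    # (an assumption about the issue, not something the traceback proves).
    receiver.close()
    try:
        # Placeholder payload; it is pickled and written to the connection.
        sender.send(["placeholder", "payload"])
    except OSError as exc:
        return exc
    finally:
        sender.close()
    return None


if __name__ == "__main__":
    err = send_to_closed_peer()
    # Report the exception type and whether its errno is EPIPE (broken pipe).
    print(type(err).__name__, err is not None and err.errno == errno.EPIPE)
```

If the same thing is happening here, the interesting question is which process owns the other end of the connection used by `all_trees` and why it exits before sampling finishes.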