mojo
mojo copied to clipboard
[BUG] `print(os.environ)` segfaults in Mojo code
Bug description
SIGSEGV address 0x0 when trying to print the contents of `os.environ`
Steps to reproduce
run the following in a mojo cell
import os
print(os.environ)
note that the str-ified print(os.environ.__str__())
works fine
### System information
```shell
%%python
import subprocess
subprocess.Popen(["cat", "/proc/self/cgroup"])
11:hugetlb:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfa054673_2713_47b6_8007_15ab6c54e2f1.slice/cri-containerd-c5cc2ca6a9beff3089d578a7b9f79aeb77b51eb069c7a95b118e5a8c71887e3c.scope
10:memory:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfa054673_2713_47b6_8007_15ab6c54e2f1.slice/cri-containerd-c5cc2ca6a9beff3089d578a7b9f79aeb77b51eb069c7a95b118e5a8c71887e3c.scope
9:cpu,cpuacct:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfa054673_2713_47b6_8007_15ab6c54e2f1.slice/cri-containerd-c5cc2ca6a9beff3089d578a7b9f79aeb77b51eb069c7a95b118e5a8c71887e3c.scope
8:perf_event:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfa054673_2713_47b6_8007_15ab6c54e2f1.slice/cri-containerd-c5cc2ca6a9beff3089d578a7b9f79aeb77b51eb069c7a95b118e5a8c71887e3c.scope
7:devices:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfa054673_2713_47b6_8007_15ab6c54e2f1.slice/cri-containerd-c5cc2ca6a9beff3089d578a7b9f79aeb77b51eb069c7a95b118e5a8c71887e3c.scope
6:freezer:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfa054673_2713_47b6_8007_15ab6c54e2f1.slice/cri-containerd-c5cc2ca6a9beff3089d578a7b9f79aeb77b51eb069c7a95b118e5a8c71887e3c.scope
5:blkio:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfa054673_2713_47b6_8007_15ab6c54e2f1.slice/cri-containerd-c5cc2ca6a9beff3089d578a7b9f79aeb77b51eb069c7a95b118e5a8c71887e3c.scope
4:net_cls,net_prio:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfa054673_2713_47b6_8007_15ab6c54e2f1.slice/cri-containerd-c5cc2ca6a9beff3089d578a7b9f79aeb77b51eb069c7a95b118e5a8c71887e3c.scope
3:cpuset:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfa054673_2713_47b6_8007_15ab6c54e2f1.slice/cri-containerd-c5cc2ca6a9beff3089d578a7b9f79aeb77b51eb069c7a95b118e5a8c71887e3c.scope
2:pids:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfa054673_2713_47b6_8007_15ab6c54e2f1.slice/cri-containerd-c5cc2ca6a9beff3089d578a7b9f79aeb77b51eb069c7a95b118e5a8c71887e3c.scope
1:name=systemd:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podfa054673_2713_47b6_8007_15ab6c54e2f1.slice/cri-containerd-c5cc2ca6a9beff3089d578a7b9f79aeb77b51eb069c7a95b118e5a8c71887e3c.scope
0::/
This is a notebook-specific problem. I can repro with two cells:
%%python
import os
print(os.environ)
The second cell is Mojo. FYI @walter-erquinigo @River707
I couldn't repro the issue on top of tree. Let's see if this is still valid when staging is out
@walter-erquinigo Awesome. Thanks for taking a look. If it can't be repro'd after today's release, please close this issue. @strangemonad feel free to reopen if you're still hitting it afterwards
Just confirmed in the the release of the playground that this error doesn't happen anymore. I'll add an entry in the changelog for the next version
@walter-erquinigo confirmed it works in this morning's release in both python and mojo cells (with the expected exception that in a mojo cell you get <module 'os' from '/opt/conda/lib/python3.8/os.py'>
since there's no __str__
trait handling yet.
nit: I don't see it in the bug fixes in the release notes
I'll include a fix bug entry in the next release :)
Could you send me a repro for the issue you see with <module 'os'...>
?
hmm I just hit the segfault again but it's intermittent if I restart the kernel sometimes I hit it and sometimes I don't
Forget what I said about printing <module ...
I typo'ed print(os)
before having my morning coffee.
Here's what I'm running, sometimes it's working and sometimes it's hitting the null pointer. I restart the kernel each time.
#%%
%%python
import os
#%%
print(os.environ)
This is the contents of os.environ
environ({'JUPYTERHUB_API_TOKEN': '62e3b01e57f649b698641688ea4aca53', 'PROXY_API_PORT': 'tcp://172.20.41.207:8001', 'PROXY_API_PORT_8001_TCP_PORT': '8001', 'JUPYTERHUB_API_URL': 'http://hub:8081/hub/api', 'HUB_SERVICE_HOST': '172.20.23.124', 'JUPYTERHUB_OAUTH_ACCESS_SCOPES': '["access:[email protected]/", "access:[email protected]"]', 'MAINTENANCE_PORT': 'tcp://172.20.207.204:80', 'MAINTENANCE_SERVICE_PORT': '80', 'JUPYTERHUB_SERVICE_PREFIX': '/user/[email protected]/', 'KUBERNETES_SERVICE_HOST': '172.20.0.1', 'HUB_PORT_8081_TCP': 'tcp://172.20.23.124:8081', 'KUBERNETES_SERVICE_PORT': '443', 'PROXY_API_PORT_8001_TCP_ADDR': '172.20.41.207', 'MAINTENANCE_PORT_80_TCP': 'tcp://172.20.207.204:80', 'DEBIAN_FRONTEND': 'noninteractive', 'CPU_GUARANTEE': '1.5', 'KUBERNETES_PORT': 'tcp://172.20.0.1:443', 'HUB_PORT': 'tcp://172.20.23.124:8081', 'LANGUAGE': 'en_US.UTF-8', 'JUPYTERHUB_OAUTH_SCOPES': '["access:[email protected]/", "access:[email protected]"]', 'XDG_CACHE_HOME': '/home/jovyan/.cache/', 'PROXY_PUBLIC_SERVICE_HOST': '172.20.78.241', 'HUB_SERVICE_PORT_HUB': '8081', 'KUBERNETES_PORT_443_TCP_ADDR': '172.20.0.1', 'JPY_SESSION_NAME': 'c82da021-b145-4d32-84dd-7ca1e258b0c8', 'JUPYTERHUB_SERVICE_URL': 'http://0.0.0.0:8888/user/[email protected]/', 'JUPYTERHUB_HOST': '', 'MAINTENANCE_PORT_80_TCP_PORT': '80', 'PATH': '/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/.modular/.derived/build/lib:/.modular/.derived/third-party/llvm-project/build/bin:/.modular/.derived/build/lib', 'PROXY_PUBLIC_PORT_8002_TCP': 'tcp://172.20.78.241:8002', 'PROXY_PUBLIC_PORT_80_TCP': 'tcp://172.20.78.241:80', 'PROXY_PUBLIC_PORT': 'tcp://172.20.78.241:80', 'PROXY_PUBLIC_SERVICE_PORT_HTTP': '80', 'AWS_REGION': 'us-east-1', 'HUB_SERVICE_PORT': '8081', 'AWS_STS_REGIONAL_ENDPOINTS': 'regional', 'JUPYTERHUB_ACTIVITY_URL': 'http://hub:8081/hub/api/users/[email protected]/activity', 'JPY_PARENT_PID': '7', 'MODULAR_PATH': '/.modular', 'CONDA_DIR': '/opt/conda', 'AWS_WEB_IDENTITY_TOKEN_FILE': '/var/run/secrets/eks.amazonaws.com/serviceaccount/token', 'HUB_PORT_8081_TCP_PROTO': 'tcp', 'MEM_LIMIT': '4294967296', 'HOSTNAME': 'jupyter-shawnmorel-40gmail-2ecom', 'AWS_ROLE_ARN': 'arn:aws:iam::466483404629:role/mojo_hub-prod-us-east-1-user-pod-role-role-d7b68cc', 'PROXY_API_SERVICE_PORT': '8001', 'JUPYTER_IMAGE_SPEC': '466483404629.dkr.ecr.us-east-1.amazonaws.com/mojo_hub-dev-us-east-1-notebook:0fa4611c2cb3083dcf9c722d3ef22328a8583e84', 'PROXY_PUBLIC_PORT_8002_TCP_ADDR': '172.20.78.241', 'MAINTENANCE_PORT_80_TCP_ADDR': '172.20.207.204', 'PROXY_API_PORT_8001_TCP': 'tcp://172.20.41.207:8001', 'KUBERNETES_SERVICE_PORT_HTTPS': '443', 'SHELL': '/bin/bash', 'LANG': 'en_US.UTF-8', 'AWS_DEFAULT_REGION': 'us-east-1', 'MOJO_JUPYTER_LOG_FILE': '/mojo_logs.log', 'JUPYTER_IMAGE': '466483404629.dkr.ecr.us-east-1.amazonaws.com/mojo_hub-dev-us-east-1-notebook:0fa4611c2cb3083dcf9c722d3ef22328a8583e84', 'JUPYTERHUB_SERVER_NAME': '', 'LLVM_SYMBOLIZER_PATH': '/.modular/.derived/third-party/llvm-project/build/bin/llvm-symbolizer', 'PROXY_PUBLIC_SERVICE_PORT': '80', 'PROXY_PUBLIC_SERVICE_PORT_METRICS': '8002', 'KUBERNETES_PORT_443_TCP_PORT': '443', 'LC_ALL': 'en_US.UTF-8', 'MAINTENANCE_SERVICE_HOST': '172.20.207.204', 'HUB_PORT_8081_TCP_ADDR': '172.20.23.124', 'KUBERNETES_PORT_443_TCP_PROTO': 'tcp', 'KUBERNETES_PORT_443_TCP': 'tcp://172.20.0.1:443', 'PROXY_PUBLIC_PORT_80_TCP_PORT': '80', 'JUPYTERHUB_BASE_URL': '/', 'NB_USER': 'jovyan', 'HOME': '/home/jovyan', 'PROXY_API_SERVICE_HOST': '172.20.41.207', 'JUPYTERHUB_OAUTH_CALLBACK_URL': '/user/[email protected]/oauth_callback', 'MOJO_REPL_VERBOSE_LOG': '1', 'MAINTENANCE_PORT_80_TCP_PROTO': 'tcp', 'JUPYTERHUB_DEBUG': '1', 'JUPYTERHUB_CLIENT_ID': 'jupyterhub-user-shawnmorel%40gmail.com', 'MOJO_JUPYTER_JSON_LOGS': '1', 'PROXY_PUBLIC_PORT_8002_TCP_PROTO': 'tcp', 'PROXY_PUBLIC_PORT_8002_TCP_PORT': '8002', 'CPU_LIMIT': '30.0', 'NB_UID': '1000', 'PROXY_API_PORT_8001_TCP_PROTO': 'tcp', 'JUPYTERHUB_ADMIN_ACCESS': '1', 'NB_GID': '100', 'JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES': '[]', 'JPY_API_TOKEN': '62e3b01e57f649b698641688ea4aca53', 'JUPYTERHUB_USER': '[email protected]', 'JUPYTERHUB_DEFAULT_URL': '/lab/tree/HelloMojo.ipynb', 'HUB_PORT_8081_TCP_PORT': '8081', 'JUPYTERHUB_SINGLEUSER_APP': 'jupyter_server.serverapp.ServerApp', 'MEM_GUARANTEE': '1073741824', 'PROXY_PUBLIC_PORT_80_TCP_ADDR': '172.20.78.241', 'LD_LIBRARY_PATH': ':/.modular/.derived/third-party/llvm-project/build/lib:/opt/conda/lib/python3.8/site-packages:/opt/conda/lib:/.modular/.derived/build/lib', 'PROXY_PUBLIC_PORT_80_TCP_PROTO': 'tcp', 'PYTHONPATH': '/opt/conda/lib/python3.8/site-packages'})
Note that I'm only restarting the kernel, not the playground instance so I don't think it's flapping between new and old release. Seems like a legit race condition somewhere.
This is the pod that I'm on
11:hugetlb:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb2df5ea3_cdf1_4ecb_ad7a_930b5f5d762c.slice/cri-containerd-ca0fd58a69baced8f15d7533ba0d12c9a4f73f0173302ff650d35a1cfcf892e2.scope
10:memory:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb2df5ea3_cdf1_4ecb_ad7a_930b5f5d762c.slice/cri-containerd-ca0fd58a69baced8f15d7533ba0d12c9a4f73f0173302ff650d35a1cfcf892e2.scope
9:cpu,cpuacct:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb2df5ea3_cdf1_4ecb_ad7a_930b5f5d762c.slice/cri-containerd-ca0fd58a69baced8f15d7533ba0d12c9a4f73f0173302ff650d35a1cfcf892e2.scope
8:perf_event:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb2df5ea3_cdf1_4ecb_ad7a_930b5f5d762c.slice/cri-containerd-ca0fd58a69baced8f15d7533ba0d12c9a4f73f0173302ff650d35a1cfcf892e2.scope
7:devices:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb2df5ea3_cdf1_4ecb_ad7a_930b5f5d762c.slice/cri-containerd-ca0fd58a69baced8f15d7533ba0d12c9a4f73f0173302ff650d35a1cfcf892e2.scope
6:freezer:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb2df5ea3_cdf1_4ecb_ad7a_930b5f5d762c.slice/cri-containerd-ca0fd58a69baced8f15d7533ba0d12c9a4f73f0173302ff650d35a1cfcf892e2.scope
5:blkio:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb2df5ea3_cdf1_4ecb_ad7a_930b5f5d762c.slice/cri-containerd-ca0fd58a69baced8f15d7533ba0d12c9a4f73f0173302ff650d35a1cfcf892e2.scope
4:net_cls,net_prio:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb2df5ea3_cdf1_4ecb_ad7a_930b5f5d762c.slice/cri-containerd-ca0fd58a69baced8f15d7533ba0d12c9a4f73f0173302ff650d35a1cfcf892e2.scope
3:cpuset:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb2df5ea3_cdf1_4ecb_ad7a_930b5f5d762c.slice/cri-containerd-ca0fd58a69baced8f15d7533ba0d12c9a4f73f0173302ff650d35a1cfcf892e2.scope
2:pids:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb2df5ea3_cdf1_4ecb_ad7a_930b5f5d762c.slice/cri-containerd-ca0fd58a69baced8f15d7533ba0d12c9a4f73f0173302ff650d35a1cfcf892e2.scope
1:name=systemd:/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podb2df5ea3_cdf1_4ecb_ad7a_930b5f5d762c.slice/cri-containerd-ca0fd58a69baced8f15d7533ba0d12c9a4f73f0173302ff650d35a1cfcf892e2.scope
0::/
and the plot thickens a little. I now just hit a segfault for 0x2020
(the others had been null pointers). This definitely seems to point to a memory lifetime issue with print.
Thanks for the info. I'll take a look at it later
I was able to isolate this to the mojo compiler itself, including the python interop, and excluding the REPL. This is a minimal repro main.mojo file that fails
from IO import print
from PythonInterface import Python
fn main() raises:
let os = Python.import_module("os")
print(os.environ)
# works: print(os.environ.__str__())
I was able to repro this only on linux, not mac. It fails around 70% of the time.
@Mogball , could you please help retriaging this issue?
@walter-erquinigo I might be off on this but I believe macos terminal stdout is unbuffered by default which ends up being handled by this pseudo-buffered-io code path which forces a synchronous flush while you're still in the print call https://github.com/apple-open-source/macos/blob/5f8bb48acd35f4c7dff5ec06e521d324bdd93713/Libc/stdio/FreeBSD/xprintf.c#L999
I obviously can't try running this myself on macos yet but I'd try
- running the sample you gave and redirect stdout / stderr to a file
- That might still trip through the line-buffered handling logic though (most stdout flush on newlines) running through
unbuffered
or one of the suggestions here might help https://superuser.com/questions/191080/how-to-log-stdout-of-a-background-process-w-o-buffering-on-mac
My hunch is that something like the following is happening (wild conjecture on my part since I don't have src access)
- environ is converted to a string representation
- some variant of libc printf is called passing the strings underlying buffer (or a llvm string ref?)
- mojo does it's standard thing and puts a dealloc as soon as possible. In this case that would probably we right after the printf call.
- In the async case you've now got a race between printf and dealloc.
@stumpOS , you should check this out
@stumpOS bump