mojo2austin expecting utf-8, found latin-1

Open · dooferlad opened this issue 5 months ago · 1 comment

Description

Running mojo2austin on a file I just generated gives an error:

Traceback (most recent call last):
  File "/home/dooferlad/.venv/bin/mojo2austin", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/dooferlad/.venv/lib/python3.11/site-packages/austin/format/mojo.py", line 541, in main
    for event in MojoFile(mojo).parse():
  File "/home/dooferlad/.venv/lib/python3.11/site-packages/austin/format/mojo.py", line 507, in parse
    for e in self.parse_event():
  File "/home/dooferlad/.venv/lib/python3.11/site-packages/austin/format/mojo.py", line 492, in parse_event
    for event in t.cast(dict, self.__handlers__)[event_id](self):
  File "/home/dooferlad/.venv/lib/python3.11/site-packages/austin/format/mojo.py", line 469, in parse_string
    value = self.read_string()
            ^^^^^^^^^^^^^^^^^^
  File "/home/dooferlad/.venv/lib/python3.11/site-packages/austin/format/mojo.py", line 331, in read_string
    return self.read_until().decode()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 1: invalid start byte
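
The error itself can be reproduced in isolation. The snippet below is a minimal sketch, not taken from the MOJO file: the byte string is a made-up example chosen only so that 0xfc sits at position 1, as in the traceback above.

```python
# Hypothetical bytes: 0xfc is "ü" in latin-1 but is never a valid
# UTF-8 start byte, so the default decode fails at position 1.
raw = b"M\xfcller"

try:
    raw.decode()  # bytes.decode() defaults to UTF-8
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)

# The same bytes decode cleanly as latin-1.
latin1_text = raw.decode("latin-1")
print(latin1_text)
```

This only shows that the failing byte is consistent with latin-1 encoded text; it does not show where that text comes from.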

Steps to Reproduce

  1. Profile a Django project: austin --output /home/dooferlad/supportsite.austin --binary --heap=2048 ./manage.py runserver 8001 --noreload --skip-checks
  2. Convert the output: mojo2austin /home/dooferlad/supportsite.austin /home/dooferlad/supportsite.austin-txt

Versions

  • Python 3.11.9
  • austin 3.6.0
  • austin-python==1.7.1

My environment is set up with:

LANGUAGE=en_GB.UTF-8
LANG=en_GB.UTF-8

Additional Information

To get mojo2austin and austin2speedscope to work, I made the following changes. In austin/format/mojo.py at line 331:

    def read_string(self) -> str:
        """Read a string from the MOJO file."""
        return self.read_until().decode(encoding="latin-1")

And also in austin/stats.py at line 419:

    def __enter__(self) -> "AustinFileReader":
        """Open the Austin file and read the metadata."""
        self._stream = open(self.file, encoding="latin-1")

I assume that the string in the Mojo file is from the Python application, but I don't actually know. I am not sure if the above change is actually a fix or just masking the real bug!
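
One reason to suspect masking: latin-1 maps every byte value to a character, so decoding with it can never raise, even when the bytes are really UTF-8. The sketch below illustrates that, along with one possible alternative that keeps UTF-8 and makes bad bytes visible. `decode_lenient` is a hypothetical helper for illustration, not part of austin-python, and I have not tested it against a real MOJO file.

```python
# Every possible byte decodes under latin-1, so it cannot signal bad input.
all_bytes_text = bytes(range(256)).decode("latin-1")

# Valid UTF-8 read as latin-1 silently turns into mojibake.
utf8_bytes = "M\u00fcller".encode("utf-8")  # b"M\xc3\xbcller"
mojibake = utf8_bytes.decode("latin-1")     # "MÃ¼ller", no error raised


def decode_lenient(raw: bytes) -> str:
    """Hypothetical helper: prefer UTF-8, but escape undecodable bytes."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Invalid bytes become literal \xNN escapes instead of raising.
        return raw.decode("utf-8", errors="backslashreplace")


print(len(all_bytes_text))
print(mojibake)
print(decode_lenient(utf8_bytes))
print(decode_lenient(b"M\xfcller"))
```

With this approach, strings that are valid UTF-8 stay intact and the offending bytes remain visible in the output, which might help track down where the non-UTF-8 string originates.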

dooferlad · Sep 24 '24 10:09