Dashboard WebSocket server crashing with `asyncio.streams.LimitOverrunError`

Open • JustAnotherArchivist opened this issue 2 years ago • 1 comment

This crash happened twice today:

Traceback (most recent call last):
  File ".../python3.6/asyncio/streams.py", line 488, in readline
    line = yield from self.readuntil(sep)
  File ".../python3.6/asyncio/streams.py", line 569, in readuntil
    offset)
asyncio.streams.LimitOverrunError: Separator is not found, and chunk exceed the limit

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "dashboard/websocket.py", line 85, in <module>
    main()
  File "dashboard/websocket.py", line 81, in main
    loop.run_until_complete(asyncio.gather(stdin_to_amplifier(amplifier, loop), print_status(amplifier)))
  File ".../python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "dashboard/websocket.py", line 28, in stdin_to_amplifier
    amplifier.send((await reader.readline()).decode('utf-8').strip())
  File ".../python3.6/asyncio/streams.py", line 497, in readline
    raise ValueError(e.args[0])
ValueError: Separator is not found, and chunk exceed the limit

It looks like overlong lines from the firehose cause this crash: `readline()` fails once a line exceeds the stream reader's limit. A job with very long URLs is a likely source.

JustAnotherArchivist, Jan 12 '23 08:01

That is indeed what causes these crashes. One job in particular produced lines of up to 1.7 MiB, while the buffer is only 1 MiB. The fix here is probably to drop lines that exceed some limit. I'm not sure whether that limit should be 1 MiB or larger, but 1 MiB really ought to be sufficient.

JustAnotherArchivist, Jan 12 '23 16:01
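
The fix proposed above (dropping lines that exceed some limit) could look roughly like the sketch below. It is not the actual ArchiveBot change: the helper name `read_lines_dropping_overlong` is hypothetical, and it assumes the dashboard reads from a standard `asyncio.StreamReader` whose `limit` is the maximum line size to accept. It uses `readuntil()` rather than `readline()` so that `LimitOverrunError.consumed` says how many bytes to throw away, and it also discards the tail of an overlong line so the remainder is not mistaken for a separate line.

```python
import asyncio


async def read_lines_dropping_overlong(reader):
    """Yield newline-terminated lines from an asyncio.StreamReader.

    Lines longer than the reader's limit are dropped instead of raising.
    Hypothetical helper for illustration; not part of ArchiveBot.
    """
    discarding = False  # True while we are inside an overlong line
    while True:
        try:
            line = await reader.readuntil(b'\n')
        except asyncio.LimitOverrunError as e:
            # The buffered chunk exceeds the limit. Throw away the bytes
            # scanned so far and keep skipping until the line ends.
            await reader.readexactly(e.consumed)
            discarding = True
            continue
        except asyncio.IncompleteReadError as e:
            # EOF before a newline: emit the trailing partial line, unless
            # it is the remainder of an overlong line.
            if e.partial and not discarding:
                yield e.partial
            return
        if discarding:
            # This is the end of the overlong line; drop it as well.
            discarding = False
            continue
        yield line
```

A loop such as `async for line in read_lines_dropping_overlong(reader): amplifier.send(line.decode('utf-8').strip())` could then stand in for the bare `reader.readline()` call seen in the traceback, assuming `reader` and `amplifier` are set up as in `dashboard/websocket.py`. Whether dropped lines should also be logged or counted is left open here.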