TypeError crash on worker when attempting to transcode

Open albertsj1 opened this issue 1 year ago • 4 comments

Firstly, thank you for your work on this project. I truly appreciate the time and effort you've put into this to offer it publicly for free.

I am trying to get this working on Docker Swarm. I have a 5-node Raspberry Pi cluster running the latest version of DietPi. Each node has 4 GB of memory.

Some backstory, in case it's part of the problem: Issue #317 was fixed yesterday, which resolved the missing CLUSTERPLEX_PLEX_CODECS_VERSION. However, I was still getting an error whenever it tried to download any codecs. The wget command that downloads the codecs was failing, and it appeared that the variables were not being interpreted properly.

Codec libzerocodec_decoder.so does not exist. Downloading...
/usr/lib/plexmediaserver/Plex Media Server: line 111:   564 Bus error               wget https://downloads.plex.tv/codecs/${CLUSTERPLEX_PLEX_CODECS_VERSION}/${CLUSTERPLEX_PLEX_CODEC_ARCH}/${codec}.so
Codec libzlib_decoder.so does not exist. Downloading...
/usr/lib/plexmediaserver/Plex Media Server: line 111:   565 Segmentation fault      wget https://downloads.plex.tv/codecs/${CLUSTERPLEX_PLEX_CODECS_VERSION}/${CLUSTERPLEX_PLEX_CODEC_ARCH}/${codec}.so
Codec libzmbv_decoder.so does not exist. Downloading...
/usr/lib/plexmediaserver/Plex Media Server: line 111:   566 Segmentation fault      (core dumped) wget https://downloads.plex.tv/codecs/${CLUSTERPLEX_PLEX_CODECS_VERSION}/${CLUSTERPLEX_PLEX_CODEC_ARCH}/${codec}.so
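A note on reading the log above: when bash reports a command killed by a signal, it prints the command text as written in the script, so the literal `${...}` placeholders do not by themselves show that the variables were unset. The lines do show wget itself dying with a bus error or segmentation fault. As a minimal sketch of what the expanded URL would look like, the snippet below rebuilds it from the template in the log. `codecUrl` is a hypothetical helper, not part of ClusterPlex, and the fallback values are the ones reported in the worker startup output later in this thread.

```javascript
// Hypothetical helper (not ClusterPlex code): rebuilds the download URL
// from the template that appears in the log lines above.
function codecUrl(codecsVersion, codecArch, codec) {
  const parts = { codecsVersion, codecArch, codec };
  for (const [name, value] of Object.entries(parts)) {
    // An empty or unset variable would produce a malformed URL, so fail loudly.
    if (!value) {
      throw new Error(`${name} is empty or unset`);
    }
  }
  return `https://downloads.plex.tv/codecs/${codecsVersion}/${codecArch}/${codec}.so`;
}

// Prefer the environment variables named in the log; the fallbacks are the
// values printed in the worker startup output quoted later in this thread.
const version =
  process.env.CLUSTERPLEX_PLEX_CODECS_VERSION || 'ad47460-ffe81d9cd51bd27cb3fbbe09';
const arch = process.env.CLUSTERPLEX_PLEX_CODEC_ARCH || 'linux-aarch64-standard';

console.log(codecUrl(version, arch, 'libzerocodec_decoder'));
```

If the printed URL looks well formed inside the worker container, the variables are probably fine and the crash is more likely in wget or its environment.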

I used docker exec -it to open a bash shell in one of the worker containers. I copied start.sh to tmp_start.sh with the last line removed so that it wouldn't start the app again, then ran ./tmp_start.sh. It downloaded all of the codecs and exited without error. After that, I redeployed the plex stack. The workers detected that the codecs already existed and appeared to be ready for jobs, with no errors.

Now... the current problem. Any time I try to watch a video, the transcode job is sent to the worker and the worker crashes with the following error:

Received task request
Setting hwaccel to mmal
EAE_ROOT => "/tmp/pms-3a9dbb6b-c249-4e68-bb49-206e1342974d/EasyAudioEncoder"
EAE Support - Spawning EasyAudioEncoder from "/codecs/ad47460-ffe81d9cd51bd27cb3fbbe09-linux-aarch64-standard/EasyAudioEncoder/EasyAudioEncoder/EasyAudioEncoder", cwd => /tmp/pms-3a9dbb6b-c249-4e68-bb49-206e1342974d/EasyAudioEncoder
/app/worker.js:156
				createEAE_PID(childEAE.pid.toString());
				                           ^

TypeError: Cannot read properties of undefined (reading 'toString')
    at Socket.<anonymous> (/app/worker.js:156:32)
    at Emitter.emit (/app/node_modules/@socket.io/component-emitter/index.js:143:20)
    at Socket.emitEvent (/app/node_modules/socket.io-client/build/cjs/socket.js:559:20)
    at Socket.onevent (/app/node_modules/socket.io-client/build/cjs/socket.js:546:18)
    at Socket.onpacket (/app/node_modules/socket.io-client/build/cjs/socket.js:514:22)
    at Emitter.emit (/app/node_modules/@socket.io/component-emitter/index.js:143:20)
    at /app/node_modules/socket.io-client/build/cjs/manager.js:237:18
    at process.processTicksAndRejections (node:internal/process/task_queues:81:21)

Node.js v20.15.0

NOTE: I have hwaccel set to mmal; however, I get the exact same error without it.
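For context on the TypeError above: in Node.js, `child_process.spawn()` returns a `ChildProcess` whose `pid` is `undefined` when the process fails to spawn, and the failure is delivered through an `'error'` event. Whether that is what happens at worker.js:156 is an assumption, since the surrounding code is not shown in this thread. The sketch below only illustrates the general pattern of guarding the pid. `spawnWithPidCheck` and the `/nonexistent/EasyAudioEncoder` path are hypothetical names used for illustration.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical helper (not taken from worker.js): spawn a binary and hand the
// pid to a callback only when the spawn actually succeeded.
function spawnWithPidCheck(command, args, options, onPid) {
  const child = spawn(command, args, options);

  // A failed spawn (for example ENOENT for a missing binary) is reported here.
  // Without a listener, the 'error' event would crash the process instead.
  child.on('error', (err) => {
    console.error(`Failed to spawn ${command}: ${err.code || err.message}`);
  });

  if (child.pid !== undefined) {
    onPid(child.pid.toString());
  } else {
    // pid is undefined when the child could not be started.
    console.error(`No pid for ${command}; the process did not start`);
  }
  return child;
}

// Demonstration with a path that is assumed not to exist.
spawnWithPidCheck('/nonexistent/EasyAudioEncoder', [], {}, (pid) => {
  console.log(`EAE started with pid ${pid}`);
});
```

With a guard like this, the error code from the `'error'` event (such as ENOENT or EACCES) would show up in the worker log instead of the TypeError, which may help narrow down why the binary could not be started.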

The permissions of the video the worker was trying to play as seen from inside the worker node:

-rwxr-xr-x 1 1000 65534 5.2G Sep 25  2023 '/data/media/tv_shows/<redacted>/Season 1/<redacted>.mkv'

My full docker-swarm.yaml file.

albertsj1 • Jun 22 '24 15:06

Sounds like both issues may be related (the codecs download and the EAE pid). Any chance you could try one of those Pi's with the official raspberry pi OS 64-bit?

pabloromeo • Jun 22 '24 20:06

I'll give it a shot and respond back. Probably won't get a chance to do that for a couple of days.

albertsj1 • Jun 22 '24 21:06

No worries. In a few days I'll try to run dietpi on a vm in proxmox to see if I can reproduce the issue, too. It sounds like it might be a networking issue, or a permissions issue, or both. Not sure which yet.

pabloromeo • Jun 23 '24 22:06

I finally had a chance to run this on a fresh Raspberry Pi OS install. This is on the same Raspberry Pi 4; I put a new OS install on a USB drive and booted from that drive.

I have the exact same error messages and issue.

# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
# docker version
Client: Docker Engine - Community
 Version:           27.0.3
 API version:       1.46
 Go version:        go1.21.11
 Git commit:        7d4bcd8
 Built:             Sat Jun 29 00:02:44 2024
 OS/Arch:           linux/arm64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          27.0.3
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.11
  Git commit:       662f78c
  Built:            Sat Jun 29 00:02:44 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.7.18
  GitCommit:        ae71819c4f5e67bb4d5ae76a6b735f29cc25774e
 runc:
  Version:          1.1.13
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Here is a clip from the startup of the worker:

**** Move shim to destination ****
**** Make the shim executable ****
**** Setting up codecs directory ****
Directory already present
**** Changing ownership for /codecs ****
Temporarily starting Plex Media Server.
Waiting for Plex to generate its config
CLUSTERPLEX_PLEX_VERSION => '1.40.3.8555-fef15d30c'
CLUSTERPLEX_PLEX_CODECS_VERSION => 'ad47460-ffe81d9cd51bd27cb3fbbe09'
CLUSTERPLEX_PLEX_EAE_VERSION (extracted) => 'eae--42'
PLEX_ARCH => 'arm64'
EAE_VERSION => '2001'
CLUSTERPLEX_PLEX_CODEC_ARCH => linux-aarch64-standard
Codec location => /codecs/ad47460-ffe81d9cd51bd27cb3fbbe09-linux-aarch64-standard
Found EAE_VERSION.txt => 2001
EAE is up to date

albertsj1 • Jul 01 '24 16:07

I was playing with this some more this morning. I deleted the contents of the codecs directory so the new container could fetch them again, and it's still having trouble fetching the codecs, as mentioned in the first post. So I'm guessing that whatever causes the codec download failure is also the root cause of the worker crash.

albertsj1 • Jul 04 '24 14:07

I had similar problems in the past. When testing and analysing the error, I also noticed that the codec download was not working. I then switched to the ClusterPlex images and it has worked ever since.

Azathoth88 • Jul 23 '24 19:07

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] • Aug 23 '24 02:08

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] • Sep 06 '24 02:09