clusterplex
Per Worker Stream Limits
Is your feature request related to a problem? Please describe.
No.
Describe the solution you'd like
It would be great to be able to set per-worker limits via an environment variable or something. This would be great to have if nodes have varying amounts of resources and are not identical hardware.
Describe alternatives you've considered
None of the existing worker selection strategies quite fit the use case of nodes with different hardware. All of the existing methods assume you are always using CPU transcodes instead of hardware transcodes (my k8s cluster and my existing Plex server all have GPUs, and Plex primarily uses direct play / hardware transcode).
That's a great idea! Yeah, the strategies were initially focused on CPU-bound workers, since that's what I had to work with at the time :). So in your case, your workers also have a variety of GPUs and capabilities? I'm trying to think of what an adequate strategy would look like that could accommodate non-uniform workers. I take it the LOAD_TASKS strategy wouldn't be adequate, as that would distribute jobs proportionally across all of them, when one worker might be twice as powerful as another.

Were you looking for something like hard limits on simultaneous transcoding jobs and node priorities, in order to fill up the highest-priority nodes first and then move on to the next ones? Or maybe something like weighted round-robin, where one node might get 70% of jobs and another might get 30%? I'm just thinking off the top of my head about the options we might have, to see which one is easier to set up and gives useful results.
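For the weighted round-robin option, I'm picturing something roughly like this, just as a sketch off the top of my head (the per-worker `weight` setting and the `pickWorker` helper are hypothetical, nothing like this exists in ClusterPlex today):

```js
// Hypothetical weighted pick: each worker gets a configurable weight,
// and each new job goes to whichever worker is furthest below its
// "fair share" of the total job count.
function pickWorker(workers) {
  // workers: [{ id, weight, activeJobs }]
  const totalWeight = workers.reduce((sum, w) => sum + w.weight, 0);
  const totalJobs = workers.reduce((sum, w) => sum + w.activeJobs, 0);
  return workers.reduce((best, w) => {
    const deficit = (w.weight / totalWeight) * (totalJobs + 1) - w.activeJobs;
    const bestDeficit = (best.weight / totalWeight) * (totalJobs + 1) - best.activeJobs;
    return deficit > bestDeficit ? w : best;
  });
}

// Example: a worker weighted 70 should end up with ~70% of the jobs over time.
console.log(pickWorker([
  { id: 'plex-server', weight: 70, activeJobs: 3 },
  { id: 'k8s-node',    weight: 30, activeJobs: 0 },
]));
```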
So basically I have my existing Plex server, which has way more CPU power and a Quadro P2200. Then I just finished building a Kubernetes cluster that has less CPU power per node and a Quadro P1000 for each node. The P1000 is about half as powerful as the P2200. If/when I set this up, I will probably have Plex in "remote only" mode in my k8s cluster, and then 1 worker per k8s cluster and 1 worker on my old/existing Plex server.
I would imagine being able to set absolute limits on transcodes, just like you can in Plex (or like you can in Tdarr), for both CPU and GPU. Have a setting to prioritize GPU transcode over CPU transcode (or vice versa, for some reason). Then scheduling would basically just be based on whichever server has the "most free capacity", starting with GPU first and going to CPU if the GPU slots run out.
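Roughly what I have in mind, purely as a sketch (the `maxGpu`/`maxCpu` limits are made-up settings, not anything ClusterPlex has today):

```js
// Hypothetical per-worker limits, similar to the transcode caps in Plex/Tdarr.
const workers = [
  { id: 'plex-server', maxGpu: 4, maxCpu: 2, gpuJobs: 0, cpuJobs: 0 },
  { id: 'k8s-node-1',  maxGpu: 2, maxCpu: 0, gpuJobs: 0, cpuJobs: 0 },
];

// Prefer the worker with the most free GPU slots; only fall back to
// CPU slots once every GPU slot is in use.
function selectWorker(workers) {
  const byFreeGpu = workers
    .filter(w => w.maxGpu - w.gpuJobs > 0)
    .sort((a, b) => (b.maxGpu - b.gpuJobs) - (a.maxGpu - a.gpuJobs));
  if (byFreeGpu.length > 0) return { worker: byFreeGpu[0], type: 'gpu' };

  const byFreeCpu = workers
    .filter(w => w.maxCpu - w.cpuJobs > 0)
    .sort((a, b) => (b.maxCpu - b.cpuJobs) - (a.maxCpu - a.cpuJobs));
  if (byFreeCpu.length > 0) return { worker: byFreeCpu[0], type: 'cpu' };

  return null; // everything is full
}

console.log(selectWorker(workers)); // -> plex-server, gpu
```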
I do think it might be cool to get a float/weighted count of how "expensive" the job is, but that is probably for a completely separate issue. As an example, a 720p -> 720p transcode would be "1 transcode", a 1080p -> 1080p transcode would be "2 transcodes", and a 4K -> 1080p transcode would be "4 transcodes".
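As an illustration only (the weights and resolution buckets here are just examples):

```js
// Rough cost weights per resolution bucket (illustrative only).
const COST = { '720p': 1, '1080p': 2, '4k': 4 };

function jobCost(sourceRes, targetRes) {
  // Charge for the more expensive side of the transcode.
  return Math.max(COST[sourceRes] || 1, COST[targetRes] || 1);
}

console.log(jobCost('720p', '720p'));   // 1
console.log(jobCost('1080p', '1080p')); // 2
console.log(jobCost('4k', '1080p'));    // 4
```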
As a side note, it would be amazing if this project and Tdarr collaborated. I feel like there is a lot of overlap here. For example, the Orchestrator could connect to Tdarr and use Tdarr nodes for transcoding. It could potentially even handle all of the limits and everything, since Tdarr can already control the limits.
I think a lot of folks with a large enough Plex instance to look into this project may already be running Tdarr to process 4K movies.
I take it the LOAD_TASKS strategy wouldn't be adequate, as that would distribute jobs proportionally across all of them, when one worker might be twice as powerful as another.
In theory yes, it could, but it is hard to know the limits for GPUs. The Quadro/professional GPUs have unlimited transcode sessions, so you could just use something like the % utilization, but consumer/GTX GPUs have built-in session limits unless you flash custom firmware.
The limits also assume the whole system is dedicated to being a transcode node. For my k8s clusters, I would love to dedicate the whole GPU to Plex but then have it not touch the CPU at all.
Yeah, I don't know if there's a way from node.js to determine the load the GPU is currently under. Hard to see how Tdarr does it since it's not open-source. I'll keep looking around to see if I find any way to determine GPU load programmatically. Might be useful for an additional selection strategy.
I do think it might be cool to get a float/weighted count of how "expensive" the job is, but that is probably for a completely separate issue. As an example, a 720p -> 720p transcode would be "1 transcode", a 1080p -> 1080p transcode would be "2 transcodes", and a 4K -> 1080p transcode would be "4 transcodes".
Along these lines, I'd like to see if we can have multiple worker pools. My k8s cluster has a mix of 6th and 8th gen Intel CPUs. I believe transcoding from <=1080p is okay on the 6th gen, but I'd need 7th gen or newer for anything 4K and/or HDR. I'm not sure what all comes into the orchestrator in the "payload" when a transcode request comes in, to know if it's feasible to implement a worker selection process based on the original media type.
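If the payload does include the source resolution/HDR info, I'd picture routing roughly like this (the pool names and `media` fields are just assumptions on my part, not anything ClusterPlex exposes today):

```js
// Hypothetical worker pools keyed by capability.
const pools = {
  'uhd-hdr': ['k8s-node-8thgen-1', 'k8s-node-8thgen-2'], // can handle 4K / HDR
  'default': ['k8s-node-6thgen-1', 'k8s-node-6thgen-2'], // <=1080p SDR
};

function poolFor(media) {
  // media: { width, height, hdr } -- assumed to come from the transcode payload
  const is4k = media.width >= 3840 || media.height >= 2160;
  return (is4k || media.hdr) ? pools['uhd-hdr'] : pools['default'];
}

console.log(poolFor({ width: 3840, height: 2160, hdr: true }));  // uhd-hdr pool
console.log(poolFor({ width: 1920, height: 1080, hdr: false })); // default pool
```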
This issue is stale because it has been open for 30 days with no activity.
You could call nvidia-smi, rocm-smi, and intel_gpu_top to get load:
nvidia-smi -q -x    # query, XML output
rocm-smi --json
intel_gpu_top -J    # JSON output
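For the Node.js side, a minimal sketch of shelling out to nvidia-smi and reading utilization (assuming nvidia-smi is on the worker's PATH; rocm-smi --json and intel_gpu_top -J would need similar parsers of their own):

```js
const { execFile } = require('child_process');

// Query NVIDIA GPU utilization as CSV so no XML parsing is needed.
function getNvidiaUtilization(callback) {
  execFile(
    'nvidia-smi',
    ['--query-gpu=utilization.gpu,utilization.memory', '--format=csv,noheader,nounits'],
    (err, stdout) => {
      if (err) return callback(err);
      // Output looks like "35, 12" (percentages, one line per GPU).
      const [gpu, mem] = stdout.trim().split(',').map(s => parseInt(s, 10));
      callback(null, { gpuPercent: gpu, memPercent: mem });
    }
  );
}

getNvidiaUtilization((err, load) => {
  if (err) console.error('nvidia-smi not available:', err.message);
  else console.log('GPU load:', load);
});
```

This could feed an additional selection strategy that looks at actual GPU load instead of task counts.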
This issue was closed because it has been inactive for 14 days since being marked as stale.
I would love to be able to set a priority too. I have multiple workers on the one machine, one with access to NVIDIA transcoding and one without.
I'd like transcoding to always happen on the NVIDIA worker unless it is not available, in which case I'd like it to use the CPU worker. Sometimes the NVIDIA GPU is offline because it's being used in a different VM, but with the current system it will sometimes do CPU rendering even when the NVIDIA worker is available.
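Something as simple as an ordered priority list with an availability check would cover my case, e.g. (settings entirely hypothetical):

```js
// Hypothetical: workers sorted by priority, skipping ones that are offline.
const workers = [
  { id: 'nvenc-worker', priority: 1, online: false }, // GPU passed through to another VM
  { id: 'cpu-worker',   priority: 2, online: true },
];

function pickByPriority(workers) {
  return workers
    .filter(w => w.online)
    .sort((a, b) => a.priority - b.priority)[0] || null;
}

console.log(pickByPriority(workers)); // -> cpu-worker (nvenc-worker is offline)
```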
Great work on this project btw, I love it!
Good feedback @noaho, yeah, that's another valid scenario where something like priorities, max-jobs, custom weights, and such can help schedule jobs better. I just haven't had time unfortunately to tackle this, which is why I keep reopening it, to make sure it is not forgotten.
Unsure how you might integrate this into node otoh, but it might be useful GPU info: for Intel you can use intel_gpu_top as a helper. It needs setcap cap_perfmon=+ep on the binary, or it can access the Linux kernel's internal perf tools directly if the app has perfmon.
This issue was closed because it has been inactive for 14 days since being marked as stale.