Per Worker Stream Limits

AngellusMortis opened this issue 2 years ago • 29 comments

Is your feature request related to a problem? Please describe.

No.

Describe the solution you'd like

It would be great to be able to set per-worker limits via an environment variable or something. This is especially useful when nodes have varying amounts of resources and are not identical hardware.

Describe alternatives you've considered

None of the existing worker selection strategies quite fit the use case of nodes with different hardware. All of the existing methods assume you are always using CPU transcodes instead of hardware transcodes (my k8s cluster and my existing Plex server all have GPUs, and Plex primarily uses direct play / hardware transcode).

AngellusMortis avatar Jul 19 '22 00:07 AngellusMortis

That's a great idea! Yeah, the strategies were initially focused on CPU-bound workers, since that's what I had to work with at the time :). So in your case, your workers also have a variety of GPUs and capabilities? I'm trying to think of what an adequate strategy would look like that could accommodate non-uniform workers. I take it that the LOAD_TASKS strategy wouldn't be adequate, as it would distribute jobs proportionally across all of them, when one worker might be twice as powerful as another. Were you looking for something like hard limits on simultaneous transcoding jobs and node priorities, in order to fill up the highest-priority nodes first and then move on to the next ones? Or maybe something like weighted round-robin, where one node might get 70% of jobs and another might get 30%? I'm just thinking off the top of my head about the options we might have, to see which one is easier to set up and gives useful results.
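Thinking out loud, a weighted pick could be something as simple as the sketch below (purely illustrative, none of these names exist in the codebase today): each worker gets a configured weight and receives roughly that share of jobs.

```ts
// Hypothetical weighted selection: each worker advertises a numeric weight
// (e.g. set via an env var on the worker) and is picked in proportion to it,
// so weights of 70 and 30 approximate a 70%/30% split over time.
interface WorkerInfo {
  id: string;
  weight: number;
}

function pickWeighted(workers: WorkerInfo[]): WorkerInfo {
  const total = workers.reduce((sum, w) => sum + w.weight, 0);
  let roll = Math.random() * total; // assumes a non-empty worker list
  for (const w of workers) {
    roll -= w.weight;
    if (roll <= 0) return w;
  }
  return workers[workers.length - 1]; // guard against floating-point edge cases
}
```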

pabloromeo avatar Jul 19 '22 21:07 pabloromeo

So basically I have my existing Plex server, which has way more CPU power and a Quadro P2200. Then I just finished building a Kubernetes cluster that has less CPU power per node and a Quadro P1000 in each node. The P1000 is about half as powerful as the P2200. If/when I set this up, I will probably have Plex in "remote only" mode in my k8s cluster, and then 1 worker per k8s cluster and 1 worker on my old/existing Plex server.

I would imagine being able to set absolute limits on transcodes, just like you can in Plex (or in Tdarr), for both CPU and GPU. Have a setting to prioritize GPU transcode over CPU transcode (or vice versa, for some reason). Then scheduling would basically just be based on whichever server has the most "free capacity", starting with GPU first and falling back to CPU if the GPU slots run out.
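Purely as a sketch of what I mean (the env vars, fields, and function here are all made up, not anything ClusterPlex exposes today):

```ts
// Hypothetical per-worker limits, e.g. each worker reporting values derived
// from env vars like MAX_GPU_TRANSCODES / MAX_CPU_TRANSCODES.
interface WorkerCapacity {
  id: string;
  gpuActive: number; // transcodes currently running on the GPU
  gpuLimit: number;  // e.g. Number(process.env.MAX_GPU_TRANSCODES ?? 2)
  cpuActive: number;
  cpuLimit: number;  // e.g. Number(process.env.MAX_CPU_TRANSCODES ?? 1)
}

// Pick the worker with the most free GPU capacity; only when every GPU slot
// is taken, fall back to the worker with the most free CPU capacity.
function pickByFreeCapacity(workers: WorkerCapacity[]): WorkerCapacity | undefined {
  const freeGpu = (w: WorkerCapacity) => w.gpuLimit - w.gpuActive;
  const freeCpu = (w: WorkerCapacity) => w.cpuLimit - w.cpuActive;

  const gpuReady = workers.filter((w) => freeGpu(w) > 0);
  if (gpuReady.length > 0) {
    return gpuReady.reduce((a, b) => (freeGpu(b) > freeGpu(a) ? b : a));
  }

  const cpuReady = workers.filter((w) => freeCpu(w) > 0);
  if (cpuReady.length > 0) {
    return cpuReady.reduce((a, b) => (freeCpu(b) > freeCpu(a) ? b : a));
  }

  return undefined; // every worker is at its configured limit
}
```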

I do think it would be cool to get a float/weighted count of how "expensive" a job is, but that is probably for a completely separate issue. As an example, a 720p -> 720p transcode would be "1 transcode", a 1080p -> 1080p transcode would be "2 transcodes", and a 4K -> 1080p transcode would be "4 transcodes".
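Just to illustrate the idea, something like the following (the buckets and weights are made up, not a real cost model):

```ts
// Hypothetical job cost based on source resolution, so a 4K job counts
// against a worker's limit more heavily than a 720p one.
function jobCost(sourceHeight: number): number {
  if (sourceHeight >= 2160) return 4; // 4K
  if (sourceHeight >= 1080) return 2; // 1080p
  return 1;                           // 720p and below
}
```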

AngellusMortis avatar Jul 19 '22 21:07 AngellusMortis

As a side note, it would be amazing if this project and Tdarr collaborated. I feel like there is a lot of overlap here. For example, the Orchestrator could connect to Tdarr and use Tdarr nodes for transcoding. It could potentially even handle all of the limits, since Tdarr can already control them.

I think a lot of folks with Plex instances large enough to look into this project may already be running Tdarr to process 4K movies.

AngellusMortis avatar Jul 19 '22 21:07 AngellusMortis

I take it that the LOAD_TASKS strategy wouldn't be adequate, as it would distribute jobs proportionally across all of them, when one worker might be twice as powerful as another.

In theory yes, it could, but it is hard to know the limits for GPUs. The Quadro/professional GPUs have unlimited transcode sessions, so you could just use something like the % used, but consumer/GTX GPUs have built-in limits unless you flash custom firmware.

The limits also assume the whole system is dedicated to being a transcode node. For my k8s clusters, I would love to dedicate the whole GPU to Plex but then have it not touch the CPU at all.

AngellusMortis avatar Jul 19 '22 21:07 AngellusMortis

Yeah, I don't know if there's a way from node.js to determine the load the GPU is currently under. Hard to see how Tdarr does it since it's not open-source. I'll keep looking around to see if I find any way to determine GPU load programmatically. Might be useful for an additional selection strategy.

pabloromeo avatar Jul 21 '22 17:07 pabloromeo

I do think it would be cool to get a float/weighted count of how "expensive" a job is, but that is probably for a completely separate issue. As an example, a 720p -> 720p transcode would be "1 transcode", a 1080p -> 1080p transcode would be "2 transcodes", and a 4K -> 1080p transcode would be "4 transcodes".

Along these lines, I'd like to see if we can have multiple worker pools. My k8s cluster has a mix of 6th and 8th gen Intel CPUs. I believe transcoding from <=1080p sources is okay on the 6th gen, but anything 4K and/or HDR needs 7th gen or newer. I'm not sure what comes into the orchestrator in the "payload" when a transcode request arrives, so I don't know if it's feasible to implement a worker selection process based on the original media type.
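If that info is in the payload, I'd imagine something along these lines (the field names and pool names are completely hypothetical, since I don't know what the orchestrator actually receives):

```ts
// Hypothetical routing: send 4K and/or HDR sources to a pool of workers
// that can handle them, and everything else to the general pool.
interface TranscodeRequest {
  sourceHeight: number; // assumed to be present in the incoming payload
  sourceIsHdr: boolean; // assumed to be present in the incoming payload
}

function selectPool(req: TranscodeRequest): "hdr-capable" | "general" {
  if (req.sourceHeight >= 2160 || req.sourceIsHdr) {
    return "hdr-capable"; // e.g. the 8th gen nodes
  }
  return "general"; // e.g. the 6th gen nodes
}
```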

ammmze avatar Aug 19 '22 04:08 ammmze

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Sep 19 '22 04:09 github-actions[bot]

You could call nvidia-smi, rocm-smi, and intel-gpu-top to get load

nvidia-smi -q -x #query, xml output
rocm-smi --json
intel-gpu-top -J #json output 
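From node it could be as simple as shelling out and parsing the output. A rough sketch for the nvidia case (assumes nvidia-smi is on the PATH and a reasonably recent driver; rocm-smi --json and intel-gpu-top -J could be handled similarly by parsing their JSON):

```ts
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Ask nvidia-smi for GPU and NVENC utilization as plain CSV and return the
// numbers for the first GPU, e.g. { gpu: 12, encoder: 35 }.
async function nvidiaUtilization(): Promise<{ gpu: number; encoder: number }> {
  const { stdout } = await execFileAsync("nvidia-smi", [
    "--query-gpu=utilization.gpu,utilization.encoder",
    "--format=csv,noheader,nounits",
  ]);
  const [gpu, encoder] = stdout
    .trim()
    .split("\n")[0] // first GPU only
    .split(",")
    .map((v) => Number(v.trim()));
  return { gpu, encoder };
}
```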

kageurufu avatar Nov 13 '22 17:11 kageurufu

I would love to be able to set a priority too.

I have multiple workers on the one machine, one with access to nvidia transcoding and one without.

I'd like transcoding to always happen on the nvidia worker unless it is not available, in which case I'd like it to use the CPU worker. Sometimes the nvidia GPU is offline because it's being used by a different VM, but with the current system it will sometimes do CPU transcoding even when the nvidia is available.
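Something like a strict priority order would cover my case, I think (sketch only, made-up names):

```ts
// Hypothetical strict priority: always use the online worker with the lowest
// priority number, e.g. nvidia worker = 1, CPU worker = 2.
interface PrioritizedWorker {
  id: string;
  priority: number;
  online: boolean;
}

function pickByPriority(workers: PrioritizedWorker[]): PrioritizedWorker | undefined {
  return workers
    .filter((w) => w.online)
    .sort((a, b) => a.priority - b.priority)[0];
}
```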

Great work on this project btw, I love it!

noaho avatar Jun 19 '23 09:06 noaho

Good feedback @noaho. Yeah, that's another valid scenario where something like priorities, max-jobs, custom weights, and such could help schedule jobs better. Unfortunately I just haven't had time to tackle this, which is why I keep reopening it, to make sure it is not forgotten.

pabloromeo avatar Jun 20 '23 23:06 pabloromeo

GPU: unsure how you might integrate this into node off the top of my head, but it might be useful info.

For Intel you can use intel_gpu_top as a helper. It needs setcap cap_perfmon=+ep applied to it, or the app needs the perfmon capability to access the Linux kernel's internal perf tools directly.

duskmoss avatar Sep 22 '23 03:09 duskmoss

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Mar 24 '24 02:03 github-actions[bot]