
v1.6.0 Rust dataplane has less performance than Golang one

Open tmenjo opened this issue 5 months ago • 2 comments

Describe the bug

I ran an extended numaflow-dra demo on Numaflow v1.6.0 and found that the new Rust dataplane delivers lower throughput than the existing Golang one: about 15 fps on the Golang dataplane versus 12-13 fps on the Rust dataplane.
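To put the gap in relative terms, here is a small arithmetic sketch using only the frame rates quoted above. The helper names `throughput_drop` and `frame_time_ms` are made up for illustration and are not part of Numaflow or pynumaflow.

```python
def throughput_drop(baseline_fps: float, observed_fps: float) -> float:
    """Fractional throughput loss relative to the baseline."""
    return (baseline_fps - observed_fps) / baseline_fps


def frame_time_ms(fps: float) -> float:
    """Average end-to-end time budget per frame, in milliseconds."""
    return 1000.0 / fps


golang_fps = 15.0              # approximate figure from the Golang run
rust_fps_range = (12.0, 13.0)  # observed range from the Rust run

for fps in rust_fps_range:
    print(
        f"{fps:.0f} fps: {throughput_drop(golang_fps, fps):.1%} lower throughput, "
        f"{frame_time_ms(fps):.1f} ms/frame vs {frame_time_ms(golang_fps):.1f} ms/frame"
    )
```

With these numbers the Rust dataplane is roughly 13-20% slower, which corresponds to about 10-17 ms of extra time per frame.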

The demo has a straight pipeline which consists of the following 4 vertices:

  1. In (Source) -- Slice a source video into frames then send each of them
  2. Filter Resize (Map UDF) -- Shrink the frames as preprocessing for the inference
  3. Inference (Map UDF) -- Perform object detection on the frames
  4. Out (Sink) -- Log the results of the inference into a text file

The performance bottleneck is at the Filter Resize vertex.

Expected behavior

The Rust dataplane performs at least as well as the Golang one.

Screenshots

Here are some Grafana screenshots, three per dataplane: the first shows the average latency of each vertex, the second shows "Filter Resize" throughput, and the third shows "Out" throughput.

On the Golang dataplane. Approx. 15 fps.

Image Image Image

On the Rust dataplane. 12-13 fps.

Image Image Image

Environment:

  • Kubernetes: 1.33.2
  • NVIDIA DRA driver: v25.3.0-rc.5
  • Numaflow: v1.6.0
  • pynumaflow: 0.9.1

Postscript

I also found that some metrics were missing or had unexpected values on the Rust dataplane. I will open a separate issue for that.



tmenjo avatar Aug 12 '25 05:08 tmenjo

Could you please share the pipeline spec?

yhl25 avatar Aug 13 '25 06:08 yhl25

@yhl25 Here are my specs and their diff.

For Golang dataplane:

apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: dci-poc-step2
spec:
  vertices:
    - name: in
      containerTemplate:
        env:
          - name: NUMAFLOW_RUNTIME
            value: golang
      scale:
        min: 1
        max: 1
      source:
        udsource:
          container:
            image: 192.168.21.17:30003/dci-community-general/menjo-step2:develop
            imagePullPolicy: IfNotPresent
            env:
              - name: SCRIPT
                value: "source"
              - name: VIDEO_SRC
                value: "/mnt/whitebox/video.mp4"
            volumeMounts:
              - mountPath: /var/tmp/logs/numaflow-dra/step2
                name: log-volume
              - mountPath: /mnt/whitebox
                name: whitebox-volume
      limits:
        readBatchSize: 1
      volumes:
        - name: log-volume
          hostPath:
            path: /var/tmp/logs/numaflow-dra/step2
            type: DirectoryOrCreate
        - name: whitebox-volume
          hostPath:
            path: /mnt/whitebox
            type: DirectoryOrCreate
    - name: filter-resize
      containerTemplate:
        env:
          - name: NUMAFLOW_RUNTIME
            value: golang
      scale:
        min: 1
        max: 1
      udf:
        container:
          image: 192.168.21.17:30003/dci-community-general/menjo-step2:develop
          imagePullPolicy: IfNotPresent
          env:
            - name: SCRIPT
              value: "fr-stream"
            - name: OUTPUT_WIDTH
              value: "416"
            - name: OUTPUT_HEIGHT
              value: "416"
          volumeMounts:
            - mountPath: /var/tmp/logs/numaflow-dra/step2
              name: log-volume
      limits:
        readBatchSize: 1
      volumes:
        - name: log-volume
          hostPath:
            path: /var/tmp/logs/numaflow-dra/step2
            type: DirectoryOrCreate
    - name: inference
      containerTemplate:
        env:
          - name: NUMAFLOW_RUNTIME
            value: golang
      scale:
        min: 1
        max: 1
      udf:
        container:
          image: 192.168.21.17:30003/dci-community-general/menjo-step2-gpu:develop
          imagePullPolicy: IfNotPresent
          env:
            - name: SCRIPT
              value: "inf-stream"
          volumeMounts:
            - mountPath: /var/tmp/logs/numaflow-dra/step2
              name: log-volume
          resources:
            claims:
              - name: gpu
      resourceClaims:
       - name: gpu
         resourceClaimTemplateName: numaflow-dra-single-gpu # numaflow-dra/config_yaml/resource-claim-template.yaml
      limits:
        readBatchSize: 1
      volumes:
        - name: log-volume
          hostPath:
            path: /var/tmp/logs/numaflow-dra/step2
            type: DirectoryOrCreate
    - name: out
      containerTemplate:
        env:
          - name: NUMAFLOW_RUNTIME
            value: golang
      scale:
        min: 1
        max: 1
      sink:
        udsink:
          container:
            image: 192.168.21.17:30003/dci-community-general/menjo-step2:develop
            imagePullPolicy: IfNotPresent
            volumeMounts:
              - name: log-volume
                mountPath: /var/tmp/logs/numaflow-dra/step2
            env:
              - name: SCRIPT
                value: "sink"
      volumes:
        - name: log-volume
          hostPath:
            path: /var/tmp/logs/numaflow-dra/step2
            type: DirectoryOrCreate
  edges:
    - from: in
      to: filter-resize
    - from: filter-resize
      to: inference
    - from: inference
      to: out

For Rust dataplane:

apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: dci-poc-step2
spec:
  vertices:
    - name: in
      scale:
        min: 1
        max: 1
      source:
        udsource:
          container:
            image: 192.168.21.17:30003/dci-community-general/menjo-step2:develop
            imagePullPolicy: IfNotPresent
            env:
              - name: SCRIPT
                value: "source"
              - name: VIDEO_SRC
                value: "/mnt/whitebox/video.mp4"
            volumeMounts:
              - mountPath: /var/tmp/logs/numaflow-dra/step2
                name: log-volume
              - mountPath: /mnt/whitebox
                name: whitebox-volume
      limits:
        readBatchSize: 1
      volumes:
        - name: log-volume
          hostPath:
            path: /var/tmp/logs/numaflow-dra/step2
            type: DirectoryOrCreate
        - name: whitebox-volume
          hostPath:
            path: /mnt/whitebox
            type: DirectoryOrCreate
    - name: filter-resize
      scale:
        min: 1
        max: 1
      udf:
        container:
          image: 192.168.21.17:30003/dci-community-general/menjo-step2:develop
          imagePullPolicy: IfNotPresent
          env:
            - name: SCRIPT
              value: "fr-stream"
            - name: OUTPUT_WIDTH
              value: "416"
            - name: OUTPUT_HEIGHT
              value: "416"
          volumeMounts:
            - mountPath: /var/tmp/logs/numaflow-dra/step2
              name: log-volume
      limits:
        readBatchSize: 1
      volumes:
        - name: log-volume
          hostPath:
            path: /var/tmp/logs/numaflow-dra/step2
            type: DirectoryOrCreate
    - name: inference
      scale:
        min: 1
        max: 1
      udf:
        container:
          image: 192.168.21.17:30003/dci-community-general/menjo-step2-gpu:develop
          imagePullPolicy: IfNotPresent
          env:
            - name: SCRIPT
              value: "inf-stream"
          volumeMounts:
            - mountPath: /var/tmp/logs/numaflow-dra/step2
              name: log-volume
          resources:
            claims:
              - name: gpu
      resourceClaims:
       - name: gpu
         resourceClaimTemplateName: numaflow-dra-single-gpu # numaflow-dra/config_yaml/resource-claim-template.yaml
      limits:
        readBatchSize: 1
      volumes:
        - name: log-volume
          hostPath:
            path: /var/tmp/logs/numaflow-dra/step2
            type: DirectoryOrCreate
    - name: out
      scale:
        min: 1
        max: 1
      sink:
        udsink:
          container:
            image: 192.168.21.17:30003/dci-community-general/menjo-step2:develop
            imagePullPolicy: IfNotPresent
            volumeMounts:
              - name: log-volume
                mountPath: /var/tmp/logs/numaflow-dra/step2
            env:
              - name: SCRIPT
                value: "sink"
      volumes:
        - name: log-volume
          hostPath:
            path: /var/tmp/logs/numaflow-dra/step2
            type: DirectoryOrCreate
  edges:
    - from: in
      to: filter-resize
    - from: filter-resize
      to: inference
    - from: inference
      to: out

Diff of the two:

--- pipeline-golang-develop.yaml	2025-08-13 14:44:22.330397458 +0900
+++ pipeline-rust-develop.yaml	2025-08-13 14:44:27.465376835 +0900
@@ -5,10 +5,6 @@
 spec:
   vertices:
     - name: in
-      containerTemplate:
-        env:
-          - name: NUMAFLOW_RUNTIME
-            value: golang
       scale:
         min: 1
         max: 1
@@ -39,10 +35,6 @@
             path: /mnt/whitebox
             type: DirectoryOrCreate
     - name: filter-resize
-      containerTemplate:
-        env:
-          - name: NUMAFLOW_RUNTIME
-            value: golang
       scale:
         min: 1
         max: 1
@@ -68,10 +60,6 @@
             path: /var/tmp/logs/numaflow-dra/step2
             type: DirectoryOrCreate
     - name: inference
-      containerTemplate:
-        env:
-          - name: NUMAFLOW_RUNTIME
-            value: golang
       scale:
         min: 1
         max: 1
@@ -99,10 +87,6 @@
             path: /var/tmp/logs/numaflow-dra/step2
             type: DirectoryOrCreate
     - name: out
-      containerTemplate:
-        env:
-          - name: NUMAFLOW_RUNTIME
-            value: golang
       scale:
         min: 1
         max: 1

tmenjo avatar Aug 13 '25 06:08 tmenjo