serving Only one pod is created when multiple requests are sent

What version of Knative?

1.13

Expected Behavior

I followed the Knative docs to build the helloworld image and created a service with the following config.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-java-spring
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target: "1"
        autoscaling.knative.dev/target-utilization-percentage: "100"
        autoscaling.knative.dev/target-burst-capacity: "1"
        autoscaling.knative.dev/metric: "concurrency"
    spec:
      containerConcurrency: 1
      containers:
        - image: docker.io/xxx/helloworld-java-spring:v2
          imagePullPolicy: IfNotPresent
          env:
            - name: TARGET
              value: "Spring Boot Sample v1"

I use the following script to invoke it.

#! /bin/bash

set -ex

kn service apply -f service.yaml
sleep 5

export APP=$(kubectl get service.serving.knative.dev/helloworld-java-spring | grep http | awk '{print $2}')

echo "Wait for pods to be terminated"
while [ $(kubectl get pods 2>/dev/null | wc -l) -ne 0 ];
do
  sleep 5;
done

echo "hit the autoscaler with burst of requests"
for i in `seq 7`; do
    curl -s "$APP" 1>/dev/null &
done

echo "wait for the autoscaler to kick in and the bursty requests to finish"
sleep 30

echo "send longer requests"
for i in `seq 5`; do
    curl "$APP"&
    sleep 1;
done

Five helloworld responses should be returned and five pods should be created.

Actual Behavior

I get five helloworld responses, but only one pod is created.
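To confirm the pod count while the script runs, something like the following sketch could be used. The function name count_service_pods is hypothetical. It relies on the serving.knative.dev/service label that Knative puts on revision pods, and it also counts pods that are still starting or terminating.

```shell
#!/bin/bash

# Hypothetical helper: print how many pods currently belong to a Knative Service.
# Arguments: service name, optional namespace (defaults to "default").
count_service_pods() {
  local service="$1"
  local namespace="${2:-default}"
  # --no-headers drops the column header so every output line is one pod.
  # "No resources found" goes to stderr, so an empty result counts as 0.
  kubectl get pods -n "$namespace" \
    -l "serving.knative.dev/service=${service}" \
    --no-headers 2>/dev/null | wc -l | tr -d ' '
}
```

Calling `count_service_pods helloworld-java-spring` once per second during the second loop of the script would show whether the pod count ever rises above one.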

Steps to Reproduce the Problem

Install Kubernetes v1.29 with containerd. Then install calico.yaml and metallb.yaml. Use istioctl v1.20.2 to install the Istio operator. Install serving-crds.yaml and serving-core.yaml. Install DNS and Istio. Then follow the Knative docs to create the service and invoke it with the aforementioned bash script.

May 28 '24 15:05 huasiy

Hi, I experienced a similar issue because the processing time was too short. When I just returned "hello world" from simple Python code, a single pod could handle 10000 concurrent requests. I recommend adding some thread-blocking code such as Thread.sleep(); then it will scale out.
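A rough way to see why short requests do not trigger scale-out is Little's law: average in-flight requests is approximately arrival rate times time per request. The sketch below only does that arithmetic. The function name avg_concurrency is hypothetical, and the 10 ms and 5 s durations are illustrative assumptions, not measurements from this cluster.

```shell
#!/bin/bash

# Hypothetical helper: estimate average concurrency via Little's law.
# Arguments: arrival rate (requests per second), time per request (seconds).
avg_concurrency() {
  awk -v rate="$1" -v dur="$2" 'BEGIN { printf "%.2f\n", rate * dur }'
}

# One request per second, each finishing in about 10 ms (assumed).
avg_concurrency 1 0.01   # prints 0.01

# Same rate, but the handler blocks for 5 s (assumed).
avg_concurrency 1 5      # prints 5.00
```

With a concurrency target of 1, an average concurrency near 0.01 fits in a single pod, while a value near 5 would be expected to ask for roughly five pods, subject to the autoscaler's averaging windows.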

Jun 26 '24 06:06 vividcloudpark

That is one reason. I also found that when the Knative activator proxies requests through the ClusterIP, the Kubernetes Service does not distribute the proxied requests evenly. So, in addition to increasing the processing time, I made the activator use pod IPs.

Jun 28 '24 09:06 huasiy

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

Sep 28 '24 01:09 github-actions[bot]