azure-functions-java-worker
Investigate throughput issue
Issue: When two HTTP requests are made at the same time to two separate HTTP Trigger Java functions, the requests get handled sequentially in the order of creation, instead of concurrently.
Reproduction Steps:
Using the following two functions:
LongResponse:
package com.fabrikam.functions;

import java.util.*;
import java.util.concurrent.TimeUnit;
import com.microsoft.azure.serverless.functions.annotation.*;
import com.microsoft.azure.serverless.functions.*;

/**
 * Azure Functions with HTTP Trigger.
 */
public class Function {
    /**
     * This function listens at endpoint "/api/LongResponse". Two ways to invoke it using "curl" command in bash:
     * 1. curl -d "HTTP Body" {your host}/api/LongResponse
     * 2. curl {your host}/api/LongResponse?name=HTTP%20Query
     */
    @FunctionName("LongResponse")
    public HttpResponseMessage<String> hello(
            @HttpTrigger(name = "req", methods = {"get", "post"}, authLevel = AuthorizationLevel.ANONYMOUS) HttpRequestMessage<Optional<String>> request,
            final ExecutionContext context) {
        try {
            context.getLogger().info("Java HTTP trigger processed a request.");
            // Simulate a long-running request by blocking the handler thread for 15 seconds.
            TimeUnit.SECONDS.sleep(15);
            return request.createResponse(200, "finally returned.");
        } catch (Exception ex) {
            return request.createResponse(500, ex.toString());
        }
    }
}
QuickResponse:
package com.fabrikam.functions;

import java.util.*;
import com.microsoft.azure.serverless.functions.annotation.*;
import com.microsoft.azure.serverless.functions.*;

/**
 * Azure Functions with HTTP trigger.
 */
public class Quickresponse {
    /**
     * This function will listen at HTTP endpoint "/api/Quickresponse". Two approaches to invoke it using "curl" command in bash:
     * 1. curl -d "Http Body" {your host}/api/Quickresponse
     * 2. curl {your host}/api/Quickresponse?name=HTTP%20Query
     */
    @FunctionName("Quickresponse")
    public HttpResponseMessage<String> httpHandler(
            @HttpTrigger(name = "req", methods = { "get", "post" }, authLevel = AuthorizationLevel.ANONYMOUS) HttpRequestMessage<Optional<String>> request,
            final ExecutionContext context) {
        return request.createResponse(200, "Hello");
    }
}
When calling LongResponse and QuickResponse in quick succession (first LongResponse, then QuickResponse), we would expect LongResponse to finish in ~15 seconds and QuickResponse to finish in ~200ms.
The actual behavior is that QuickResponse is finishing after LongResponse finishes, so LongResponse finishes after ~15 seconds, and QuickResponse finishes after ~15.2 seconds.
The worker uses the default WorkStealingPool to process requests, so the parallelism level depends on your CPU count (the number of processors available to the JVM).
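To illustrate why a blocked thread stalls the next request, here is a minimal standalone sketch. It is not the worker's actual code: the class `PoolBlockingDemo` and the method `quickTaskDelayMillis` are hypothetical names, and the single-processor case is simulated with `Executors.newWorkStealingPool(1)`. It starts a blocking "long" task, then submits a "quick" task and measures how long the quick task waited before it began running.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of azure-functions-java-worker.
class PoolBlockingDemo {

    /**
     * Starts a blocking "long" task, then submits a "quick" task, and returns
     * how many milliseconds the quick task waited before it started running.
     * The pool is shut down before returning.
     */
    static long quickTaskDelayMillis(ExecutorService pool, long longTaskMillis) {
        try {
            CountDownLatch longStarted = new CountDownLatch(1);
            Runnable longTask = () -> {
                longStarted.countDown();
                try {
                    // Stands in for the sleep inside LongResponse.
                    TimeUnit.MILLISECONDS.sleep(longTaskMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            };
            Future<?> slow = pool.submit(longTask);

            // Wait until the long task is actually occupying a pool thread.
            longStarted.await();

            long submitted = System.nanoTime();
            Callable<Long> quickTask =
                () -> TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - submitted);
            long delay = pool.submit(quickTask).get();

            slow.get();
            return delay;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Parallelism 1 simulates a single-processor machine (an assumption of this sketch).
        long singleWorker = quickTaskDelayMillis(Executors.newWorkStealingPool(1), 500);
        long cached = quickTaskDelayMillis(Executors.newCachedThreadPool(), 500);
        System.out.println("work-stealing pool (parallelism 1): quick task waited " + singleWorker + " ms");
        System.out.println("cached thread pool: quick task waited " + cached + " ms");
    }
}
```

With a parallelism of 1, the quick task should only start once the sleeping task has released the single worker thread, which matches the sequential behavior described in this issue.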
This behavior is being experienced when hosting in Azure (I have not actually tested locally yet). Wouldn't we expect this to scale up to ~200 instances?
Probably because the Azure environment is using a single-processor machine to run your function. The worker needs to replace the WorkStealingPool with some other thread pool strategy. I will discuss this with the Functions host team (@pragnagopa) to see which strategy (fixed thread pool, unlimited thread pool, processor_count * n, or other) they prefer.
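For reference, the strategies named above map onto standard `java.util.concurrent.Executors` factory methods roughly as follows. This is only a sketch: `WorkerPoolCandidates`, `Strategy`, and the parameter `n` are hypothetical names introduced for illustration, and the worker's real configuration may look different.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper, not part of azure-functions-java-worker.
class WorkerPoolCandidates {

    enum Strategy { WORK_STEALING, FIXED, UNLIMITED, PROCESSORS_TIMES_N }

    /**
     * Creates an executor for the given strategy.
     * For FIXED, n is the thread count; for PROCESSORS_TIMES_N, n is the
     * multiplier applied to the available processor count; otherwise n is ignored.
     */
    static ExecutorService create(Strategy strategy, int n) {
        int processors = Runtime.getRuntime().availableProcessors();
        switch (strategy) {
            case FIXED:
                // Fixed thread pool: exactly n threads, extra tasks queue up.
                return Executors.newFixedThreadPool(n);
            case UNLIMITED:
                // Unlimited thread pool: grows on demand, idle threads are reclaimed.
                return Executors.newCachedThreadPool();
            case PROCESSORS_TIMES_N:
                // processor_count * n: leaves headroom for blocking, IO-bound tasks.
                return Executors.newFixedThreadPool(processors * n);
            case WORK_STEALING:
            default:
                // Current behavior: parallelism equals the available processor count.
                return Executors.newWorkStealingPool();
        }
    }
}
```

On a single-processor machine, only the last option is limited to one thread at a time; the others allow a blocked handler to coexist with other running requests, at the cost of more threads.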
cc: @fabiocav @JunyiYi - Are you blocked on this?
@pragnagopa, not blocking. But we could discuss it offline to choose the most suitable multi-threading strategy.
@pragnagopa, once you have decided which threading strategy to use, you can just update this line. Right now I'm using a thread pool that contains as many threads as there are physical CPUs, which may not handle a lot of long-running, IO-intensive tasks and may also not be suitable for small Azure machines with only 1 CPU. And here are the candidates.
@JonathanGiles fyi
@brunoborges / @JonathanGiles - Do you have any recommendation on which thread pooling strategy to use from the ones listed here?
Changing the Executor to newCachedThreadPool helped a little, but there was no significant improvement. RPS went from 60. Let me know if you have any other ideas.
@pragnagopa can we decompose this into smaller issues we can track in sprints?
cc @FabianMeiswinkel
Closing this issue as I verified that concurrent requests with different delays work as expected. The Java worker has been improved with several changes. Please reopen if this issue persists and we can help you.