spring-ai icon indicating copy to clipboard operation
spring-ai copied to clipboard

Provide the ability to configure client timeouts

Open thesurlydev opened this issue 1 year ago • 8 comments

Expected Behavior

Spring properties are exposed to control timeouts used by the clients.

Current Behavior

There's no documentation or properties to control things like connect, read, and write timeouts.

Context

The default timeout configuration results in some requests timing out. Here's an example:

org.springframework.web.client.ResourceAccessException: I/O error on POST request for "https://api.openai.com/v1/chat/completions": timeout
	at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.createResourceAccessException(DefaultRestClient.java:549) ~[spring-web-6.1.3.jar:6.1.3]
	at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchangeInternal(DefaultRestClient.java:474) ~[spring-web-6.1.3.jar:6.1.3]
	at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.retrieve(DefaultRestClient.java:439) ~[spring-web-6.1.3.jar:6.1.3]
	at org.springframework.ai.openai.api.OpenAiApi.chatCompletionEntity(OpenAiApi.java:655) ~[spring-ai-openai-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
	at org.springframework.ai.openai.OpenAiChatClient.chatCompletionWithTools(OpenAiChatClient.java:337) ~[spring-ai-openai-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
	at org.springframework.ai.openai.OpenAiChatClient.lambda$call$1(OpenAiChatClient.java:151) ~[spring-ai-openai-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
	at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:335) ~[spring-retry-2.0.5.jar:na]
	at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:211) ~[spring-retry-2.0.5.jar:na]
	at org.springframework.ai.openai.OpenAiChatClient.call(OpenAiChatClient.java:147) ~[spring-ai-openai-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
...
Caused by: java.net.SocketTimeoutException: timeout
	at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.kt:675) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.kt:684) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.http2.Http2Stream.takeHeaders(Http2Stream.kt:143) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.http2.Http2ExchangeCodec.readResponseHeaders(Http2ExchangeCodec.kt:97) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.kt:110) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.kt:93) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:34) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201) ~[okhttp-4.12.0.jar:na]
	at okhttp3.internal.connection.RealCall.execute(RealCall.kt:154) ~[okhttp-4.12.0.jar:na]
	at org.springframework.http.client.OkHttp3ClientHttpRequest.executeInternal(OkHttp3ClientHttpRequest.java:95) ~[spring-web-6.1.3.jar:6.1.3]
	at org.springframework.http.client.AbstractStreamingClientHttpRequest.executeInternal(AbstractStreamingClientHttpRequest.java:70) ~[spring-web-6.1.3.jar:6.1.3]
	at org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:66) ~[spring-web-6.1.3.jar:6.1.3]
	at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchangeInternal(DefaultRestClient.java:468) ~[spring-web-6.1.3.jar:6.1.3]

thesurlydev avatar Feb 22 '24 19:02 thesurlydev

Quick temporary workaround:

        ClientHttpRequestFactorySettings requestFactorySettings = new ClientHttpRequestFactorySettings(
                connectTimeout , readTimeout , SslBundle.of(null));
        ClientHttpRequestFactory requestFactory = ClientHttpRequestFactories.get(requestFactorySettings);
        Builder clientBuilder = RestClient.builder().requestFactory(requestFactory);
        OpenAiApi openAiApi = new OpenAiApi("https://api.openai.com", openAiToken, clientBuilder);
        openAiChatClient = new OpenAiChatClient(openAiApi);

thingersoft avatar Feb 25 '24 20:02 thingersoft

you can configure timeout settings easier with RestClientCustomizer.

	@Bean
	public RestClientCustomizer restClientCustomizer() {
		return restClientBuilder -> restClientBuilder
				.requestFactory(ClientHttpRequestFactories.get(ClientHttpRequestFactorySettings.DEFAULTS
						.withConnectTimeout(Duration.ofSeconds(1))
						.withReadTimeout(Duration.ofSeconds(2))));
	}

This works not only for OpenAI but also for clients that use RestClient under the hood.

making avatar Feb 27 '24 02:02 making

@making Acting on client at configuration level would affect the context-provided RestClientBuilder. So customization would apply wherever you injected a RestClientBuilder in your application and for every spring boot starter using it. That's why I proposed a programmatic approach as a workaround.

With bigger models like GPT-4 response to non-streamed requests often takes more than 1 minute to come. So I think that read timeout deserves a dedicated configuration property, avoiding the need for workarounds.

Anyway any comments or criticism is appreciated.

thingersoft avatar Feb 27 '24 05:02 thingersoft

@thingersoft Kind of agree. In the app side, timeouts can also be overridden via the builder. Timeouts are a concern not only for OpenAI but also across clients, so if the dedicated property is created at Spring AI level, I would expect that the design be common to all clients.

making avatar Feb 27 '24 06:02 making

@making Indeeed my first thought was to introduce the new property at an higher level.

Unfortunately the current codebase lacks infrastructure to define properties commons to all ModelClients. A few architectural changes was needed but being my first contribution to the project I didn't want to be too intrusive, so I falled back to an OpenAI dedicated property.

Anyway if a reviewer tells me it's ok I could open a new pull request with a more generic approach.

thingersoft avatar Feb 27 '24 19:02 thingersoft

We are looking into a more generic approach, perhaps based on the generic model, perhaps based on aspects.

0.9 is when I thought would be a good point to tackle this problem. In the meantime, we can expose a few retry options for what we have now as an initial step. spring.ai.openai.chat.retry.XYZ property, where the @ConfigurationProperty for retry would be reusable across multiple implementation eventually.

Also, in the meantime we can provide an additional constructor that takes a RetryTemplate and in the autoconfiguraiton we would pass in a default implementation if one wasn't provided in the application context already.

markpollack avatar Feb 29 '24 16:02 markpollack

@markpollack Hello Mark, I may be missing something here since I didn't use Spring Retry before but I can't see how retry options and RetryTemplate in particular relates to client read timeout configuration.

thingersoft avatar Feb 29 '24 19:02 thingersoft

@markpollack Hello Mark, I may be missing something here since I didn't use Spring Retry before but I can't see how retry options and RetryTemplate in particular relates to client read timeout configuration.

@markpollack I would be happy to contribute if you can tell me something about my quoted remark.

thingersoft avatar Mar 06 '24 16:03 thingersoft

Getting back to these issues now. Just saying that retry is in the same family of concerns around connectivty.

markpollack avatar Jul 22 '24 14:07 markpollack

Closing in favor of issue #512

I'd like to think that there is a general solution for this not related to AI. For example, an app is supposed to be able to talk to say 10 microservices, each with different timeouts.

markpollack avatar Jul 22 '24 14:07 markpollack