claude-agent-sdk-python Feature: Max turns before compact

I've seen --max-turns, but that's not what I want. I want to be able to control autoCompact so that I can set how many messages are loaded into the history before compacting. I don't need it to reach 95% of the context before compacting.
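To illustrate the kind of control being asked for, here is a minimal sketch in plain Python. Everything in it is hypothetical: `MessageCountCompactor`, `max_messages`, and `keep_recent` are made-up names and not part of claude-agent-sdk-python or the Claude Code CLI, and the "summary" is a placeholder string rather than a real model-generated compaction.

```python
from dataclasses import dataclass, field


@dataclass
class MessageCountCompactor:
    """Hypothetical helper: compact history once it exceeds max_messages."""

    max_messages: int      # hypothetical setting requested in this issue
    keep_recent: int = 2   # number of newest messages kept verbatim
    history: list = field(default_factory=list)
    compactions: int = 0

    def __post_init__(self) -> None:
        # keep_recent must leave room for the summary entry plus headroom
        if not 1 <= self.keep_recent < self.max_messages:
            raise ValueError("need 1 <= keep_recent < max_messages")

    def add(self, message: str) -> None:
        """Append a message and compact if the count limit is exceeded."""
        self.history.append(message)
        if len(self.history) > self.max_messages:
            self._compact()

    def _compact(self) -> None:
        # Stand-in for real summarisation: older messages collapse
        # into a single placeholder entry.
        older = self.history[:-self.keep_recent]
        recent = self.history[-self.keep_recent:]
        summary = f"[summary of {len(older)} earlier messages]"
        self.history = [summary, *recent]
        self.compactions += 1


compactor = MessageCountCompactor(max_messages=5, keep_recent=2)
for i in range(6):
    compactor.add(f"m{i}")
print(compactor.history)
```

The point is only that the trigger is a message count chosen by the caller, instead of a fixed percentage of the context window.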

Today I noticed the response from one of my sessions: num_turns=231

[CLAUDE RAW MESSAGE] ResultMessage(
  subtype='success',
  duration_ms=213207,
  duration_api_ms=162411,
  is_error=False,
  num_turns=231,
  total_cost_usd=0.6639560999999999,
  usage={
    input_tokens: 156,
    cache_creation_input_tokens: 55539,
    cache_read_input_tokens: 679910,
    output_tokens: 2136,
    server_tool_use: {
        web_search_requests: 0
    },
    service_tier: 'standard'
  },
  result='No response requested.'
)

I was surprised, as that is far more than I had expected. Unfortunately, I have no insight into when auto-compacts happened. It would be great to be able to control this to reduce the resources used and speed up responses.
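For a rough sense of scale, the numbers in the ResultMessage above can be summarised with a few lines of Python. This is a sketch that assumes the three input counters in `usage` are additive and cover the whole session; `summarize_usage` is a hypothetical helper, not an SDK function.

```python
def summarize_usage(usage: dict, num_turns: int) -> dict:
    """Derive totals from a usage dict shaped like the one above."""
    # Assumption: uncached, cache-creation, and cache-read tokens add up
    # to the total input processed over the session.
    total_input = (
        usage["input_tokens"]
        + usage["cache_creation_input_tokens"]
        + usage["cache_read_input_tokens"]
    )
    return {
        "total_input_tokens": total_input,
        "cache_read_share": usage["cache_read_input_tokens"] / total_input,
        "input_tokens_per_turn": total_input / num_turns,
    }


# Values copied from the ResultMessage in this issue
usage = {
    "input_tokens": 156,
    "cache_creation_input_tokens": 55539,
    "cache_read_input_tokens": 679910,
    "output_tokens": 2136,
}
summary = summarize_usage(usage, num_turns=231)
print(summary["total_input_tokens"])  # 735605
```

Under that assumption, the session processed about 735k input tokens, roughly 92% of them cache reads, which averages to around 3,200 input tokens per turn.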

Jul 29 '25 15:07 gerrywastaken

It looks like a similar/related feature was asked for on the claude-code repo: https://github.com/anthropics/claude-code/issues/3351

Jul 29 '25 16:07 gerrywastaken

Follow

Jul 31 '25 11:07 dimsky