langextract chunking does not recover from 503 error "The model is overloaded"
Gemini is frequently returning 503 errors ("The model is overloaded"). I have deliberately broken the langextract task down into chunks of 4000, so presumably the 503 occurred on a single chunk. In that scenario, langextract should internally retry the failing chunk and continue processing the rest of the document; it makes no sense to abandon all of the completed work because of a transient load issue. Having to reprocess entire documents burns through quota and then triggers 429 errors.
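For illustration, here is a minimal sketch of the behavior I'd expect: retry only the failing chunk with exponential backoff, keeping the results of chunks that already succeeded. This is not langextract's actual API; the `retry_chunk` helper and `ModelOverloadedError` stand-in are hypothetical names for this example.

```python
import random
import time


class ModelOverloadedError(Exception):
    """Stand-in for the provider's 503 'model is overloaded' error."""


def retry_chunk(process, chunk, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run `process(chunk)`, retrying on overload instead of failing the whole document."""
    for attempt in range(max_attempts):
        try:
            return process(chunk)
        except ModelOverloadedError:
            if attempt == max_attempts - 1:
                raise  # give up only after exhausting retries for this one chunk
            # Exponential backoff with jitter: 1s, 2s, 4s, ... plus up to 1s of noise.
            sleep(base_delay * (2 ** attempt) + random.random())


def process_document(chunks, process, sleep=time.sleep):
    """Process chunks in order; a transient 503 on one chunk never discards prior results."""
    return [retry_chunk(process, chunk, sleep=sleep) for chunk in chunks]
```

With this shape, a 503 on chunk N costs a few seconds of backoff rather than a full re-run of chunks 1..N-1, which is what currently eats the quota and leads to the 429s.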
Thanks for sharing this @ianstorrs-cd. This is definitely a priority for improvement and should be fixed soon.