Plugin support for custom LLM providers now available
Hi all,
LangExtract now supports third-party model providers through a new plugin and registry infrastructure. You can integrate custom LLM backends (Azure OpenAI, AWS Bedrock, custom inference servers, etc.) without modifying core LangExtract code.
Please check out the example and documentation: https://github.com/google/langextract/tree/main/examples/custom_provider_plugin
List of providers available: https://github.com/google/langextract/blob/main/COMMUNITY_PROVIDERS.md
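Roughly, a provider plugin boils down to a model class registered with the provider registry, shipped as its own package. A minimal sketch of the shape (names are indicative only; the linked example has the exact decorator, base class, and return types):

import langextract as lx

# Illustrative sketch -- lx.providers.registry.register, lx.inference.BaseLanguageModel,
# and lx.inference.ScoredOutput are indicative names; see the linked example for the real API.
@lx.providers.registry.register(r"^my-model")  # model-ID pattern this provider claims
class MyCustomProvider(lx.inference.BaseLanguageModel):
    """Forwards prompts to a custom backend and returns raw model output."""

    def __init__(self, model_id, api_key=None, **kwargs):
        self.model_id = model_id
        self.api_key = api_key
        super().__init__()

    def infer(self, batch_prompts, **kwargs):
        # Yield one list of scored outputs per input prompt.
        for prompt in batch_prompts:
            text = call_my_backend(prompt, api_key=self.api_key)  # hypothetical helper
            yield [lx.inference.ScoredOutput(score=1.0, output=text)]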
Feel free to provide feedback or report any issues. Thank you!
@YunfanGoForIt @mariano @zc1175 @praneeth999 @JustStas @jkitaok
@aksg87 thanks for the feature! Do you think it would make sense to include some of the more popular providers (Azure, Bedrock) as core providers? Happy to provide a PR in the new format, especially considering Azure OpenAI will be quite similar to the already supported OpenAI.
@aksg87 I have updated #38 with the new approach and switched to inheriting the bulk of the functionality from the openai.py provider. I think this way it doesn't generate much additional effort for long-term support while covering a recurrent need for many users (especially in corporate projects).
Otherwise, we could also add a LiteLLM core provider to cover many LLMs with minimum long-term support overhead - please tell me if you think this makes sense.
@aksg87 Looks great. Question: currently extract() builds the model config and instance from params. Shouldn't it also be possible to pass either a ModelConfig or a factory-instantiated model? So instead of:
result = extract(
    prompt_description=settings.specs.header.prompt,
    examples=settings.specs.header.examples,
    text_or_documents=content,
    language_model_type=OpenAILanguageModel,
    language_model_params={
        "base_url": "http://localhost:9001/v1",
        "api_key": environ["OPENAI_API_KEY"]
    },
    model_id="gpt-4o-mini",
    use_schema_constraints=False,
    fence_output=True,
    debug=False
)
We could do:
model_config = ModelConfig(
    model_id="gpt-4o-mini",
    provider=OpenAILanguageModel,
    provider_kwargs={
        "base_url": "http://localhost:9001/v1",
        "api_key": environ["OPENAI_API_KEY"]
    }
)
result = extract(
    prompt_description=settings.specs.header.prompt,
    examples=settings.specs.header.examples,
    text_or_documents=content,
    config=model_config,
    use_schema_constraints=False,
    fence_output=True,
    debug=False
)
Or even:
model_config = ModelConfig(
    model_id="gpt-4o-mini",
    provider=OpenAILanguageModel,
    provider_kwargs={
        "base_url": "http://localhost:9001/v1",
        "api_key": environ["OPENAI_API_KEY"]
    }
)
model = factory.create_model(model_config)
result = extract(
    prompt_description=settings.specs.header.prompt,
    examples=settings.specs.header.examples,
    text_or_documents=content,
    model=model,
    use_schema_constraints=False,
    fence_output=True,
    debug=False
)
Thoughts? I can create a PR for this if it sounds good.
Hey @JustStas, thanks for the PR offer!
Given that we just landed a provider registry + plugin system (#97), I think the ideal path for Azure OpenAI and Bedrock would be as external provider plugins. We actually added OpenAI as a stopgap given high demand, but even that might move to a plugin eventually - right now I want to keep the core focused on extraction features rather than managing the provider ecosystem.
Would you be up for creating langextract-azure-openai and/or langextract-bedrock as separate packages? Others like LiteLLM would also work great as a plugin.
Once you publish and a few folks validate it works well, we'll add a community providers section to feature it. This way the community can iterate quickly on provider support while we focus on improving the core extraction capabilities.
@mariano great idea on the config and model parameters - moved it to #106!
I would like to confirm whether there is a bug in custom LLM support. First, I tried it by writing my own custom LLM class and ran into errors. Then I downloaded the source code again and ran the example from https://github.com/google/langextract/tree/main/examples/custom_provider_plugin, but I still get an error.
python .\test_example_provider.py
Traceback (most recent call last):
File "C:\Users\Administrator\Desktop\langextract-main\examples\custom_provider_plugin\test_example_provider.py", line 22, in
===========================================================================
Meanwhile, there are other errors. For example, with just the following code:

import langextract as lx

def main():
    """Test the custom provider."""
    api_key = "eesd"
    config = lx.factory.ModelConfig(
        model_id="gemini-2.5-flash",
        provider="CustomGeminiProvider",
        provider_kwargs={"api_key": api_key},
    )

if __name__ == "__main__":
    main()
===========================================================================
It's just these few lines of code, and it still reports an error.
python .\test_example_provider.py
Traceback (most recent call last):
File "C:\Users\Administrator\Desktop\langextract-main\examples\custom_provider_plugin\test_example_provider.py", line 9, in
============================================================================
Here is the langextract package information:

pip show langextract
Name: langextract
Version: 1.0.5
Summary: LangExtract: A library for extracting structured data from language models
Home-page: https://github.com/google/langextract
Author:
Author-email: Akshay Goel [email protected]
License-Expression: Apache-2.0
Location: D:\ProgramData\miniconda3\envs\demo\Lib\site-packages
Requires: absl-py, aiohttp, async_timeout, exceptiongroup, google-genai, ml-collections, more-itertools, numpy, openai, pandas, pydantic, python-dotenv, PyYAML, requests, tqdm, typing-extensions
Required-by:
Update: The plugin system is now in main but not yet on PyPI (v1.0.5 was released ~10 hours before the PR merged).
To use it now, install from source:
git clone https://github.com/google/langextract.git
cd langextract
pip install -e .
Then check out examples/custom_provider_plugin/ for the implementation pattern. The next PyPI release will include the plugin system.
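Roughly, exercising the example provider looks something like the sketch below (the provider name and model ID mirror the example plugin; its test script has the exact, up-to-date version):

import langextract as lx

# Resolve the example provider explicitly through the factory.
# "CustomGeminiProvider" and the model ID mirror the example plugin; substitute
# your own provider name and a real API key.
config = lx.factory.ModelConfig(
    model_id="gemini-2.5-flash",
    provider="CustomGeminiProvider",
    provider_kwargs={"api_key": "YOUR_API_KEY"},
)
model = lx.factory.create_model(config)
print(type(model).__name__)  # should print the plugin's provider class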
Accidentally removed a reference to a related plugin request. Reposting:
https://github.com/google/langextract/issues/109
Hi @JustStas, any interest in converting your PR into a plugin? I want to start a page on community model plugins, so if you do, please let me know. Thanks!
@aksg87 sure, happy to build the plugin. Probably makes sense to implement both OpenAI and Azure OpenAI in the same plugin (and later deprecate the built-in openai module)?
Hi @JustStas, that sounds great! Plugins are very flexible so you're welcome to implement what you think fits best. When a plugin becomes popular, we can add it to a central reference area.
Also, I just added support for plugins to define their own schemas for controlled generation (see #130), so you should be able to implement nearly the entire surface area of the model-to-LangExtract interaction.
I'd really appreciate it if someone in the community could make a llamacpp-langextract provider. I have been testing with Ollama personally, but I'd really prefer llama.cpp because it can then be bundled automatically.
A LiteLLM core provider would be fantastic, especially with OpenAI-compatible API endpoint support.
Outlines support would be a tremendous value-add alongside this: https://github.com/google/langextract/issues/101
Hi all,
There's now a one-step script that generates a complete plugin template with all the boilerplate:
python scripts/create_provider_plugin.py MyProvider --with-schema
This should make it much easier to create custom providers. See PR #144 for details.
I've created a langextract-bedrock plugin (initial MR, issue) following the new custom provider guidance. Please leave feedback, or let me know if I should be approaching this differently. Thank you!
Thanks @andyxhadji! Great to see langextract-bedrock as a working example of the plugin system. This will help others creating their own providers. Looking forward to feedback from anyone who tries it out.
@aksg87 @andyxhadji Is there a plan to add LiteLLM as a package? Since LiteLLM technically supports all providers, how do we manage the provider-specific configs, such as the Gemini schema or the OpenAI fenced outputs? Do we need to track this somehow? @aksg87 it would be great if you could provide some insight; we could build on the great open example from @andyxhadji.
@aksg87 I have the openai plugin up and running (covers OpenAI + Azure OpenAI): https://pypi.org/project/langextract-openai/ https://github.com/JustStas/langextract-openai
Please tell me if there is anything there that is not in line with what you expect community plugins to look like.
I can try to make the LiteLLM one work as well - will report back on that.
Does this mean I can now comfortably serve any LLM from Hugging Face with vLLM through this openai plugin?
@aksg87 Also created a plugin for LiteLLM - https://github.com/JustStas/langextract-litellm. @torchss I would advise testing this one out - it should work with Hugging Face models provided they are covered by LiteLLM. @EliasLumer please check it out and share feedback :)
@JustStas This is phenomenal!
If I use vLLM, SGLang, or llama.cpp to serve OpenAI-compatible API endpoints, are you suggesting I try your litellm plugin over your openai one? If so:
- Why? (For example: are you passing more options through to litellm, and is your openai plugin literally for OpenAI only rather than for any OpenAI-compatible API server?)
- I will be using the structured-generation backends in vLLM and SGLang, which are both custom options to the LLM - would your answer change?
Again, thank you for doing all this!
@torchss
- the openai plugin is indeed designed only for openai + azure openai
- tbh, if the litellm one works well, I don't see any reason to use the separate openai one (it may be worth deprecating it to avoid maintaining multiple packages). I would expect the functionality and performance to be identical; a rough sketch for the OpenAI-compatible case follows below. IMHO, the litellm plugin could be included in the main library, but that's above my pay grade and for @aksg87 to decide :)
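For the OpenAI-compatible endpoint case (vLLM, SGLang, llama.cpp), the rough shape would be along these lines - a sketch only, assuming LiteLLM's usual "openai/" model prefix and api_base/api_key options are passed straight through provider_kwargs; the plugin README has the exact format:

import langextract as lx

# Assumed wiring: LiteLLM reaches OpenAI-compatible servers via an "openai/" model
# prefix plus api_base/api_key. How langextract-litellm expects these to be spelled
# is indicative here -- check the plugin's README.
config = lx.factory.ModelConfig(
    model_id="openai/my-hosted-model",  # model name exposed by the vLLM/SGLang server
    provider_kwargs={
        "api_base": "http://localhost:8000/v1",
        "api_key": "EMPTY",  # many local servers accept any placeholder key
    },
)
model = lx.factory.create_model(config)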
@aksg87 would it make sense to feature a list of community plugins on the core Readme?
Hi @JustStas, you read my mind :)
I’m working on a template and setup for collecting community plugins that will be linked in the README. I’ll try to set that up very soon, and excited to centralize this in an organized way.
Hey, will definitely look into this soon! Thanks for the suggestions
@aksg87 thanks for adding support for Azure OpenAI. When I run with this config:
# Extract with Azure OpenAI
result = lx.extract(
    text_or_documents=input_text,
    model_id=deployment_id,
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    prompt_description=prompt,
)
I'm receiving this error: TypeError: extract() got an unexpected keyword argument 'azure_endpoint'. All variables are defined and pulled from environment variables.
@Beenjamming
- I believe if you have questions about one of the community plugins, it makes sense to ask them on the repo of the respective plugin, not here.
- Please check the examples and documentation in either the openai or litellm plugin - both should let you use Azure OpenAI models. You will also notice that both of them use a different format from what you entered; see the sketch below for the general shape.
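For orientation, the general idea is to route provider-specific options through a model config rather than through extract() itself - a sketch only, with indicative parameter, provider, and environment-variable names (the plugin README has the exact ones) and assuming the model= path discussed in #106:

import os
import langextract as lx

# Indicative names throughout -- the langextract-openai plugin's README is
# authoritative for the real provider name and kwargs.
input_text = "Patient reports mild headache after treatment."  # sample input
prompt = "Extract symptoms and their attributes."              # sample prompt description

config = lx.factory.ModelConfig(
    model_id=os.environ["AZURE_OPENAI_DEPLOYMENT"],
    provider="AzureOpenAILanguageModel",  # placeholder provider name
    provider_kwargs={
        "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
        "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    },
)
model = lx.factory.create_model(config)
result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    model=model,  # assumes the model= parameter from #106; examples= omitted for brevity
)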
@JustStas The community plugins table is done now in #182 - please take a look and let me know if you think the table and documentation work well for referencing community plugins. Also, please feel free to add yours to the table. Thanks!
Thank you, will do!
@aksg87 Added in #186. I suppose it doesn't make sense to add providers that duplicate functionality already covered by LiteLLM, to avoid user confusion? I think I will skip adding my openai provider there for now.
Thanks! See my response here: https://github.com/JustStas/langextract-litellm/issues/1