
Add Provider Abstraction with support for Hosted API Calls

ieaves opened this pull request 1 month ago • 2 comments

Summary by Sourcery

Introduce a provider-agnostic chat abstraction and hosted API transport support while updating file loading and testing to use structured chat messages.

New Features:

  • Add generic chat provider interfaces and OpenAI-specific implementations for both local-compatible and hosted APIs.
  • Introduce an API transport for routing chat calls directly to hosted providers based on model URL schemes (e.g., openai://); a minimal routing sketch follows this list.
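
As a rough illustration of the scheme-based routing described above, a hypothetical helper (not the PR's actual code) might detect hosted-API model references like this:

```python
from urllib.parse import urlparse

# Hypothetical set of URL schemes that map to hosted API providers.
HOSTED_SCHEMES = {"openai"}


def is_hosted_model(model_ref: str) -> bool:
    """Return True for references like 'openai://gpt-4o-mini'."""
    return urlparse(model_ref).scheme in HOSTED_SCHEMES


assert is_hosted_model("openai://gpt-4o-mini")
assert not is_hosted_model("ollama://llama3")
```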

Enhancements:

  • Refactor chat handling to use structured ChatMessage/parts, provider-based request creation, and unified streaming response parsing.
  • Extend configuration with provider-specific settings (such as OpenAI API key resolution) and wire them into API transports.
  • Adapt file loader and chat integration logic to emit and consume structured message parts for text and images.
  • Update transports and CLI to recognize API-based models, disallow incompatible rag/serve modes, and route run commands through the new API transport.

Tests:

  • Add and update unit tests for file loaders, chat integration, transport factory behavior, and the new API transport and provider abstractions.

ieaves avatar Nov 25 '25 23:11 ieaves

Reviewer's Guide

Introduces a provider-agnostic chat abstraction and hosted API transport (starting with OpenAI), refactors chat message handling to a structured parts-based model, and updates file loaders, CLI, and tests to use the new abstractions and support hosted API models like openai://gpt-4o-mini.

File-Level Changes

Each entry below lists the change, its details, and the affected files.
Refactor the chat shell to use provider-based API calls and a structured ChatMessage history instead of raw dicts; a provider-interface sketch follows the file list below.
  • Add ChatProvider injection into RamaLamaShell and chat(), defaulting to OpenAIChatProvider using args.url and api_key.
  • Replace direct urllib payload construction in _make_api_request and _make_request_data with provider.create_request using ChatRequestOptions.
  • Switch conversation_history and summarization logic to operate on ChatMessage objects and helper formatting methods.
  • Update MCP integration to use a serializable history snapshot derived from ChatMessage instances.
  • Replace legacy streaming helper with stream_response that delegates parsing to the provider.
ramalama/chat.py
ramalama/chat_utils.py
ramalama/chat_providers/base.py
ramalama/chat_providers/openai.py
ramalama/chat_providers/__init__.py
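
A minimal sketch of what this provider shape could look like. ChatProvider, ChatRequestOptions, and OpenAIChatProvider are names taken from the summary; the field names, method signatures, and request layout below are assumptions for illustration, not the PR's actual API:

```python
import json
import urllib.request
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Iterator


@dataclass
class ChatRequestOptions:
    # Illustrative fields; the real dataclass may differ.
    model: str
    stream: bool = True


class ChatProvider(ABC):
    """Builds provider-specific requests and parses streamed responses."""

    @abstractmethod
    def create_request(self, messages: list[dict[str, Any]],
                       options: ChatRequestOptions) -> urllib.request.Request: ...

    @abstractmethod
    def parse_stream(self, raw: Iterator[bytes]) -> Iterator[str]: ...


class OpenAIChatProvider(ChatProvider):
    def __init__(self, base_url: str, api_key: str | None = None):
        self.base_url = base_url.rstrip("/")
        self.api_key = api_key

    def create_request(self, messages, options):
        body = {"model": options.model, "messages": messages, "stream": options.stream}
        headers = {"Content-Type": "application/json"}
        if self.api_key:
            headers["Authorization"] = f"Bearer {self.api_key}"
        return urllib.request.Request(
            f"{self.base_url}/v1/chat/completions",
            data=json.dumps(body).encode(), headers=headers)

    def parse_stream(self, raw):
        # Parse OpenAI-style SSE lines and yield content deltas.
        for line in raw:
            line = line.strip()
            if line.startswith(b"data: ") and line != b"data: [DONE]":
                chunk = json.loads(line[len(b"data: "):])
                delta = chunk["choices"][0]["delta"].get("content")
                if delta:
                    yield delta
```
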
Introduce hosted API transport support (e.g., openai:// models) and integrate it with the CLI run/serve flows; a detection sketch follows the file list below.
  • Add APITransport and APIProviderSpec to proxy chat to hosted providers without starting a local server.
  • Extend TransportFactory to detect API schemes, create APITransport, and prune model inputs for API models.
  • Wire CLI run_cli and serve_cli to disallow rag/serve for API transports and to avoid building server commands for them.
  • Expose API transport symbols from transports package and extend MODEL_TYPES list to include openai.
  • Add unit tests for APITransport behavior and transport factory detection/pruning for openai:// models.
ramalama/transports/api.py
ramalama/transports/transport_factory.py
ramalama/transports/base.py
ramalama/transports/__init__.py
ramalama/cli.py
test/unit/test_api_transport.py
test/unit/test_transport_factory.py
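
A rough sketch of the detection and pruning behavior, assuming the factory keys off the URL scheme; the names below are illustrative, not the PR's actual symbols:

```python
from urllib.parse import urlparse

API_SCHEMES = {"openai"}  # hypothetical; the PR extends MODEL_TYPES similarly


class APITransport:
    """Proxies chat calls straight to a hosted provider; no local server."""

    def __init__(self, model: str, scheme: str):
        self.model = model    # e.g. "gpt-4o-mini"
        self.scheme = scheme  # e.g. "openai"


def create_transport(model_ref: str) -> APITransport:
    parsed = urlparse(model_ref)
    if parsed.scheme in API_SCHEMES:
        # Prune the scheme prefix so only the bare model name goes upstream.
        return APITransport(model=parsed.netloc + parsed.path, scheme=parsed.scheme)
    raise ValueError(f"no API transport for {model_ref!r}")


t = create_transport("openai://gpt-4o-mini")
assert (t.scheme, t.model) == ("openai", "gpt-4o-mini")
```
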
Adopt the ChatMessage and message-part model across file loaders and chat/file integration tests; a sketch of the message model follows the file list below.
  • Change OpanAIChatAPIMessageBuilder.load to return ChatMessage objects with TextPart and ImageURLPart parts instead of dicts.
  • Update FileManager to build system ChatMessage instances for text and image content using TextPart/ImageURLPart.
  • Refactor file loader unit tests to assert on ChatMessage.role, parts, and helper extractors instead of raw content dicts.
  • Adjust chat/file integration tests to inspect ChatMessage-based conversation history and content extraction helpers.
ramalama/file_loaders/file_manager.py
test/unit/test_file_loader.py
test/unit/test_file_loader_with_data.py
test/unit/test_file_loader_integration.py
ramalama/chat_utils.py
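
A simplified sketch of the parts-based message model. ChatMessage, TextPart, and ImageURLPart are the names used above; the field names and the text() helper are assumptions:

```python
from dataclasses import dataclass, field


@dataclass
class TextPart:
    text: str


@dataclass
class ImageURLPart:
    url: str  # may carry a data: URI for inlined images


@dataclass
class ChatMessage:
    role: str  # "system", "user", or "assistant"
    parts: list = field(default_factory=list)

    def text(self) -> str:
        """Helper extractor: concatenate only the text parts."""
        return "".join(p.text for p in self.parts if isinstance(p, TextPart))


# A file loader might emit a system message mixing text and image parts:
msg = ChatMessage(role="system", parts=[
    TextPart(text="Contents of diagram.png follow."),
    ImageURLPart(url="data:image/png;base64,AAAA"),
])
assert msg.text() == "Contents of diagram.png follow."
```
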
Add shared chat utilities and provider configuration, with minor config cleanups; a configuration sketch follows the file list below.
  • Introduce chat_utils module with ChatMessage, typed message parts, stream_response, default_prefix, add_api_key, and serialization helpers.
  • Add ProviderConfig to BaseConfig to hold provider-specific settings like openai_api_key, and wire it into CONFIG.
  • Make minor config code cleanups (list literal formatting, remove extraneous blank lines in loops).
ramalama/chat_utils.py
ramalama/config.py
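
A minimal sketch of how such a ProviderConfig might resolve an API key. The fallback to an OPENAI_API_KEY environment variable is an assumption, not something the summary confirms:

```python
import os
from dataclasses import dataclass, field


@dataclass
class ProviderConfig:
    # Hypothetical resolution order: explicit config value, then environment.
    openai_api_key: str | None = None

    def resolve_openai_api_key(self) -> str | None:
        return self.openai_api_key or os.environ.get("OPENAI_API_KEY")


@dataclass
class BaseConfig:
    provider: ProviderConfig = field(default_factory=ProviderConfig)


CONFIG = BaseConfig()
key = CONFIG.provider.resolve_openai_api_key()  # None unless configured or set in env
```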

sourcery-ai[bot] avatar Nov 25 '25 23:11 sourcery-ai[bot]

Summary of Changes

Hello @ieaves, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the chat and API interaction components of the application. The primary goal is to introduce a robust and extensible abstraction for handling various chat API providers, making it easier to add support for new services and maintain existing ones. This involves creating a new set of structured data models for chat messages and a dedicated API transport layer, which streamlines how the application communicates with external language models.

Highlights

  • Abstracted Chat Providers: Introduced a new abstraction layer for chat providers, allowing for easier integration of different API services (e.g., OpenAI) by standardizing request and response handling. This includes base classes for ChatProvider, ChatRequestOptions, and ChatStreamEvent.
  • Structured Chat Messages: Implemented new ChatMessage and MessagePart dataclasses to represent conversation history and message content in a more structured and type-safe manner, replacing previous dictionary-based message formats.
  • API Transport Integration: Added an APITransport mechanism that enables direct connection to hosted API providers, bypassing local server setup for supported services. This is integrated into the CLI and transport factory.
  • Refactored API Calling Logic: The core API request and response streaming logic in ramalama/chat.py has been refactored to leverage the new chat provider abstraction, leading to cleaner and more modular code.
  • Configuration Updates: The configuration system (ramalama/config.py) now includes a ProviderConfig to manage API keys and settings for different chat providers, such as openai_api_key.
  • Utility Function Relocation: Common utility functions like stream_response, default_prefix, and add_api_key have been moved from ramalama/chat.py to a new ramalama/chat_utils.py module for better organization and reusability; a sketch of the streaming helper follows this list.
  • File Loader Modernization: The OpanAIChatAPIMessageBuilder in ramalama/file_loaders/file_manager.py has been updated to produce ChatMessage objects, aligning with the new structured message format.
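
A small sketch of what a provider-delegating stream_response could look like; the signature and the stand-in provider below are assumptions, not the module's actual API:

```python
from typing import Iterable, Iterator


def stream_response(provider, response_lines: Iterable[bytes]) -> str:
    """Delegate chunk parsing to the provider, echo deltas as they arrive,
    and return the assembled assistant reply."""
    pieces = []
    for delta in provider.parse_stream(iter(response_lines)):
        print(delta, end="", flush=True)
        pieces.append(delta)
    print()
    return "".join(pieces)


class EchoProvider:
    # Stand-in for demonstration; a real provider parses SSE chunks.
    def parse_stream(self, raw: Iterator[bytes]) -> Iterator[str]:
        for line in raw:
            yield line.decode()


assert stream_response(EchoProvider(), [b"Hel", b"lo"]) == "Hello"
```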

gemini-code-assist[bot] avatar Nov 25 '25 23:11 gemini-code-assist[bot]

@rhatdan would you be willing to take a look at this? The primary goal was supporting multiple API backends with different chat implementations, but the core improvements include:

  • Generic chat provider interfaces for compatibility with API surfaces beyond a fixed /v1/chat/completions endpoint
  • An implementation of the Responses interface, with an example of calling a model on a remote API (OpenAI, in this case)
  • A new Spinner class that moves UI updates onto a separate thread; the previous implementation coupled the loading animation to API calls (see the sketch after this list)
  • Updated documentation and associated tests
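
A minimal sketch of the threaded-spinner idea; the PR's actual Spinner class may differ in API and rendering:

```python
import itertools
import sys
import threading
import time


class Spinner:
    """Animate on a daemon thread so a blocking API call never stalls
    the loading indicator (and vice versa)."""

    def __init__(self, interval: float = 0.1):
        self.interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._spin, daemon=True)

    def _spin(self):
        for frame in itertools.cycle("|/-\\"):
            if self._stop.is_set():
                break
            sys.stderr.write(f"\r{frame}")
            sys.stderr.flush()
            time.sleep(self.interval)
        sys.stderr.write("\r \r")  # clear the frame

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()


with Spinner():
    time.sleep(0.5)  # stand-in for a blocking API call
```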

ieaves avatar Dec 02 '25 06:12 ieaves

@olliewalsh PTAL

rhatdan avatar Dec 03 '25 14:12 rhatdan

Hey @olliewalsh is there any chance you had an opportunity to look at this?

ieaves avatar Dec 09 '25 21:12 ieaves

Thanks so much @olliewalsh! I've pushed changes for all of the comments except run. My instinct is that, from a user-journey perspective, allowing users to "run" a hosted API is still the right path, but I understand the potential for confusion depending on how we read the term. I'm not adamant at all, I just wanted to raise the point and get your thoughts.

ieaves avatar Dec 10 '25 19:12 ieaves

/retest bats-on-pull-request

mikebonnet avatar Dec 11 '25 15:12 mikebonnet