Please provide feedback on your OpenAPI experience with BAML
Please leave any and all feedback about your experience using baml via our OpenAPI generators on this issue!
We're actively working on this and expect to do further work to address pain points that users run into (including, say, if you're having trouble installing npx and would rather we provide a universal installer). We'll prioritize work based on what people actually run into, and will update this issue when we do!
Re stability: we're actively working on the OpenAPI generator and may need to make backwards-breaking changes to stabilize the API.
Open questions:
- reserved namespace members (e.g. exposing the streaming API may cause potential namespace conflicts)
- semantics of optional fields (we currently don't use OpenAPI 3.0.x's nullable, nor do we use 3.1.x's oneOf: [type, 'null'])
- how we'll expose raw API responses
- how we'll expose dynamic types or tracing
- exceptions, including user-caused exceptions, are currently returned as 5xx and not 4xx
If we have to make a backwards-breaking change to stabilize our OpenAPI support, we'll be sure to update this issue and work with any affected users to make sure that their use case will be supported moving forward.
At the moment the LLM API key is read from an env variable. Would it be possible to add support for passing this key as a header on the call to the OpenAPI RESTful endpoint?
We have a requirement where there is a proxy in front of our LLM accounts that uses OAuth authentication, so the token is only valid for 1 hour.
@k-brady-sap that's a great idea. I think we can likely just add support for our ClientRegistry and that will solve it: https://docs.boundaryml.com/docs/calling-baml/client-registry
Would that work for you?
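Roughly, the idea would be something like this over the HTTP endpoint (just a sketch -- the endpoint path, function name, and exact field names here are illustrative and may differ from the generated openapi.yaml):

# Hypothetical sketch: pass a per-request LLM client (with a short-lived key) via __baml_options__
curl -X POST http://localhost:2024/call/ExtractResume \
  -H 'Content-Type: application/json' \
  -d '{
    "resume": "...",
    "__baml_options__": {
      "client_registry": {
        "clients": [
          {
            "name": "ProxiedOpenAI",
            "provider": "openai",
            "options": {
              "model": "gpt-4o-mini",
              "api_key": "<short-lived OAuth token>"
            }
          }
        ],
        "primary": "ProxiedOpenAI"
      }
    }
  }'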
@hellovai Yes, that approach will probably work all right. Will it be ok to create a new LLM client for every request? We might also need to pass additional headers that contain request-specific information (e.g. tenantID), so we would end up creating a new client for each request. Would we need to delete the client after the LLM call to ensure that the client registry doesn't fill up with obsolete clients?
EDIT: Solved by installing Maven with brew install maven. It turns out the mvn command was not on my PATH, and I was using the OpenAPI Maven dependency that assumes it is.
Hi @hellovai and team. I always get a message that says Error generating clients: Client generation failed (screenshot below). Any ideas on what I can try? I am running it with npx @boundaryml/baml dev --preview
I have also tried changing the version to 0.56.1 while troubleshooting it with a colleague but saw no changes.
Contents of my generators.baml:
// This helps us auto-generate libraries you can use in the language of
// your choice. You can have multiple generators if you use multiple languages.
// Just ensure that the output_dir is different for each generator.
generator target {
// Valid values: "python/pydantic", "typescript", "ruby/sorbet", "rest/openapi"
output_type "rest/openapi"
// Where the generated code will be saved (relative to baml_src/)
output_dir "../"
// The version of the BAML package you have installed (e.g. same version as your baml-py or @boundaryml/baml).
// The BAML VSCode extension version should also match this version.
version "0.55.3"
// 'baml-cli generate' will run this after generating openapi.yaml, to generate your OpenAPI client
// This command will be run from within $output_dir
on_generate "openapi-generator generate -i openapi.yaml -g java -o . --additional-properties invokerPackage=com.boundaryml.baml_client,modelPackage=com.boundaryml.baml_client.model,apiPackage=com.boundaryml.baml_client.api,java8=true && cd ../baml_client && mvn clean install"
// Valid values: "sync", "async"
default_client_mode "sync"
}
Some additional info on the version of my openapi-generator and npm:
@k-brady-sap yep, that's completely ok! And our clients allow you to pass in headers: see the headers property here https://docs.boundaryml.com/docs/snippets/clients/providers/openai
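So for the per-request + tenant-header case, the sketch would be a client entry inside client_registry carrying the extra headers in its options (again, illustrative field names -- the supported option keys are documented per provider at the link above):

{
  "name": "TenantScopedOpenAI",
  "provider": "openai",
  "options": {
    "model": "gpt-4o-mini",
    "api_key": "<short-lived OAuth token>",
    "headers": {
      "X-Tenant-Id": "<tenant id for this request>"
    }
  }
}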
EDIT: Solved
Glad you were able to solve this @lily-sap !
Would it be possible to allow the log output of the BAML service to be logged as JSON instead of plain text? The BAML service will be running in a pod in Kubernetes, and the logs of all the pods are sent to Kibana. Logging in JSON format would make them much more searchable in Kibana. Thanks.
@k-brady-sap just curious, which language are you using BAML with? That's a good idea -- perhaps we can add a flag to configure this.
We're using Java. Yes a flag or env variable will work to configure this.
ok, I'll get a release out by Monday to enable a preview of this feature
@k-brady-sap can you try version 0.66.0?
You should be able to see baml logs as json using BAML_LOG_JSON=1
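For example, combined with the serve command used elsewhere in this thread:

# Emit BAML logs as JSON (BAML_LOG/BAML_LOG_JSON as discussed above)
BAML_LOG=INFO BAML_LOG_JSON=1 baml-cli serve --preview --port 2024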
This is the schema we emit on each request as baml_event in the log:
struct BamlEventJson {
// Metadata
start_time: String,
num_tries: usize,
total_tries: usize,
// LLM Info
client: String,
model: String,
latency_ms: u128,
stop_reason: Option<String>,
// Content
prompt: Option<String>,
llm_reply: Option<String>,
// JSON string
request_options_json: Option<String>,
// Token Usage
tokens: Option<TokenUsage>,
// Response/Error Info
parsed_response_type: Option<String>,
parsed_response: Option<String>,
error: Option<String>,
}
#[derive(Valuable)]
struct TokenUsage {
prompt_tokens: Option<u64>,
completion_tokens: Option<u64>,
total_tokens: Option<u64>,
}
some notes:
- Expect this interface to remain unstable for a bit -- we will give you as much advance notice as we can before we change the schema.
- Let us know if this schema works, or if we need to iterate on it
@aaronvg Thanks for such a fast turnaround on this topic. I'll update our code to use the latest version and test it out. Cheers.
@aaronvg This is what the JSON output looks like. It looks like the JSON object is getting serialized as part of a plain-text log output.
I have set two env variables like this in docker-compose file
BAML_LOG: "INFO"
BAML_LOG_JSON: 1
FROM node:22
WORKDIR /app
COPY baml_src/ baml_src/
# If you want to pin to a specific version (which we recommend):
# RUN npm install -g @boundaryml/baml@VERSION
RUN npm install -g @boundaryml/baml@<version>
USER node
CMD ["baml-cli", "serve", "--preview", "--port", "2024"]
2024-11-08 15:25:28 [2024-11-08T15:25:28Z WARN baml_runtime::cli::serve] BAML-over-HTTP is a preview feature.
2024-11-08 15:25:28
2024-11-08 15:25:28 Please provide feedback and let us know if you run into any issues:
2024-11-08 15:25:28
2024-11-08 15:25:28 - join our Discord at https://docs.boundaryml.com/discord, or
2024-11-08 15:25:28 - comment on https://github.com/BoundaryML/baml/issues/892
2024-11-08 15:25:28
2024-11-08 15:25:28 We expect to stabilize this feature over the next few weeks, but we need
2024-11-08 15:25:28 your feedback to do so.
2024-11-08 15:25:28
2024-11-08 15:25:28 Thanks for trying out BAML!
2024-11-08 15:25:28
2024-11-08 15:25:28 [2024-11-08T15:25:28Z INFO baml_runtime::cli::serve] BAML-over-HTTP listening on port 2024, serving from ./baml_src
2024-11-08 15:25:28
2024-11-08 15:25:28 Tip: test that the server is up using `curl http://localhost:2024/_debug/ping`
2024-11-08 15:25:28
2024-11-08 15:25:28 (You may need to replace "localhost" with the container hostname as appropriate.)
2024-11-08 15:25:28
2024-11-08 15:26:13 [2024-11-08T15:26:13Z WARN baml_runtime::cli::serve] BAML_PASSWORD not set, skipping auth check
2024-11-08 15:26:19 [2024-11-08T15:26:19Z INFO baml_events] baml_event=BamlEventJson { start_time: "2024-11-08T15:26:13.507Z", num_tries: 1, total_tries: 1, client: "OpenAI", model: "gpt-4o-mini", latency_ms: 6177, stop_reason: "stop", prompt: "......", llm_reply: ".....", request_options_json: "{\"max_tokens\":4096}", tokens: TokenUsage { prompt_tokens: 1055, completion_tokens: 743, total_tokens: 1798 }, parsed_response_type: ".....", parsed_response: ".....", error: () }
ah I missed one configuration on our end for the release -- let me re-test and release a patch. Will also be released by Monday
I just released another patch in 0.68.0 to fix this
Thanks - yes the latest version works great and the logs are displaying correctly in Kibana.
Is there a way to set temperature per request without modifying whole client via Client Registry?
Was there any breaking change after version 0.68.0 that would affect the OpenAPI endpoints?
Our OpenAPI code works correctly with 0.68.0, but if I try to upgrade to any version later than that, I just get an error back from the BAML OpenAPI endpoint:
{
"error": "invalid_argument",
"message": "Failed to parse __baml_options__",
"documentation_url": "https://docs.boundaryml.com/get-started/debugging/exception-handling"
}
Nothing really gets logged in the BAML logs to indicate what is causing the parsing error for the __baml_options__ object.
Thanks for flagging this @k-brady-sap! I don't think we had any breaking changes. I'll investigate to figure out what it might be.
Can you share the __baml_options__ object you're passing in? (Regardless, there's a clear action item for us to provide a better error message here.)
Meanwhile I'll dig through the commits and see if I spot any diffs here.
Perfect, Bug found! @k-brady-sap https://github.com/BoundaryML/baml/pull/1428
This will likely go out in Monday's release (unless you need it sooner?).
Awesome, thanks a mill. No, there is no rush on the fix. We are still on 0.68.0 and everything works correctly, so we're not blocked. Once this is released, we can upgrade to the latest version and validate that everything works ok.
Thanks for the quick turnaround on investigating the issue.
Hi - is there a flag/setting so that the user prompt is not logged to the BAML logs when BAML encounters an error? I.e. if the user prompt is sensitive, we don't want it to appear in the logs, but we still want details that BAML encountered an error (e.g. the LLM service was not available). BAML would log an error, but no user input information would appear in the logs.
from @hellovai:
Yes! You can use BAML_LOG=off
Sorry! Misread earlier:
BAML_LOG_MAX_MESSAGE_LENGTH
This env var may work for this scenario: https://docs.boundaryml.com/ref/baml_client/config
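For example, when launching the server (the value here is just an illustration -- see the config reference above for the exact truncation semantics):

# Sketch: cap the length of logged messages such as prompts and LLM replies
BAML_LOG=WARN BAML_LOG_MAX_MESSAGE_LENGTH=500 baml-cli serve --preview --port 2024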
Thanks for the quick reply.
Unfortunately it doesn't appear to be working for me.
BAML_LOG_MAX_MESSAGE_LENGTH: 1
I'm using version 0.90.2
To trigger a BAML error, I'm passing an invalid URL for the OpenAI endpoint.
I then get a WARN message in the BAML console:
{"timestamp":"2025-07-18T19:03:40.106","level":"WARN","function_name":"callLLM","start_time":"2025-07-18T19:03:24.713Z","num_tries":4,"total_tries":4,"client":"OpenAI","model":"","latency_ms":220,"stop_reason":null,"prompt":{"Chat":[{"role":"system","allow_duplicate_role":false,"parts":[{"Text":"Given.......
If I try to upgrade to version 0.200.0 or higher, I'm getting an error generating the Java client from the openapi.yaml file. We're using the Maven openapi-generator-maven-plugin instead of generating it through the BAML library. (Need to check if anything is wrong on our side, but the same config has worked great for all previous BAML versions.)
does 0.202.1 also have the openapi generator issue? We had a bug on 0.200 and potentially 0.201 with openapi generator
Yes, I'm also getting the OpenAPI issue with the latest 0.202.1 version.
It seems to be related to this snippet in the generated openapi.yaml:
BamlOptions:
type: object
properties:
client_registry:
$ref: '#/components/schemas/ClientRegistry'
required:
We're taking a look now! It will be patched soon. The bug where log truncation isn't working is also something we'll look at.
Could you try 0.203.1? We fixed an issue with environment variables and OpenAPI: https://github.com/BoundaryML/baml/pull/2170
Hi, I currently get the following error when trying to re-generate the OpenAPI spec:
Failed to generate BAML client: output directory contains a file that BAML did not generate Please remove it and re-run codegen. File: <repo>/rest/baml_client/.gitignore
Even though the .gitignore is generated by BAML (it re-appears on generate if I remove it manually beforehand). My generator config:
generator rest {
output_type "rest/openapi"
output_dir "../rest"
version "0.204.0"
default_client_mode async
}