feat: Playground enhancement: Ability to play with structured output from /llm endpoint
Pardon the PR w/out a corresponding Issue -- I hacked together these changes to scratch my own itch for a project I started this evening, and figured I'd throw up a PR to see if there was interest in getting these changes merged into the project.
Summary
I checked out crawl4ai for the first time tonight. The first thing I tried was the playground and was surprised when my cleverly crafted question ("... respond in JSON format") didn't do anything. At first I thought I broke something. But I dug into the source only to find out that my calls to the /llm endpoint were working but the fetch response wasn't being used to update the response output UI. On further inspection, I noticed that the fetch response didn't quite return a structured output -- rather it was in the form of a markdown code block.
Upon investigation, I found that the /llm playground was missing functionality I had expected to be available.
This PR updates the frontend and backend code for the /llm endpoint, adding the ability to request structured output. It preserves the existing simple question/answer functionality and now also updates the UI with the response from the call.
Along the way, I tweaked a few things that caused problems for me while developing these changes.
I'm open to feedback on any of these changes and am willing to tweak, polish, or in some cases undo them to get this into a desirable state for acceptance.
List of files changed and why
- deploy/docker/static/playground/index.html
- [x] Updated the UI for the /llm option with a new toggle that enables structured output. When enabled, a text input for a sample JSON is presented to the user.
- [x] Updated the UI to set the contents of the response code element to the response content when an /llm response is received (applies to both structured and unstructured output modes)
- deploy/docker/server.py
- [x] Changed the /llm endpoint from GET to POST so sample_json can be included in the payload (a rough sketch of the new endpoint shape appears after this list)
- Hoping someone can verify whether the endpoint was used by anything other than the playground, as those callers would potentially need to be updated. I didn't notice any @MCP decorator on the endpoint, so I'm hoping that's an indicator it was only serving the playground
- deploy/docker/api.py
- [x] Updated handle_llm_qa to accommodate the new payload shape and to run either the new structured-output path or the old Q&A path, depending on whether sample JSON data is present
- [x] The provided sample_json is parsed and a JSON schema is generated from it for use in requesting structured output (a sketch of this step appears after this list)
- [x] handle_llm_qa now returns a tuple, exposing the generated JSON schema back to the playground (not currently utilized, but it could potentially be exposed to help generate call config)
- [x] deploy/docker/requirements.txt - added genson to generate a JSON schema from just a sample JSON
- Dockerfile
- [x] Quality-of-life change to reduce turnaround time when testing changes; `docker compose up --build -d` was my go-to for testing changes
- docker-compose.yml
- [x] Removed the environment config block, as it was preventing the env_file '.llm.env' from actually being respected (values set in environment: override those loaded from env_file:, so the block was masking the file)
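For reviewers who want the gist without opening the diff, here is a simplified sketch of the new POST endpoint shape. The field names and the handle_llm_qa signature are paraphrased for illustration, not the literal code in server.py.

```python
# Simplified sketch of the /llm endpoint after the GET-to-POST change.
# Field names and the handle_llm_qa signature are paraphrased; see
# deploy/docker/server.py and deploy/docker/api.py for the real code.
from typing import Optional

from fastapi import FastAPI
from pydantic import BaseModel

from api import handle_llm_qa  # defined in deploy/docker/api.py

app = FastAPI()


class LLMRequest(BaseModel):
    url: str                           # page to run the question against
    q: str                             # the user's question
    sample_json: Optional[str] = None  # when present, enables structured output


@app.post("/llm")
async def llm_endpoint(payload: LLMRequest):
    # handle_llm_qa now returns (answer, schema); the schema is surfaced
    # so the playground could eventually display or reuse it.
    answer, schema = await handle_llm_qa(payload.url, payload.q, payload.sample_json)
    return {"answer": answer, "json_schema": schema}
```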
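And the schema-generation step in a nutshell: genson's SchemaBuilder infers a JSON schema from the user's sample. The helper below is an illustrative stand-in, not the actual api.py code.

```python
# Gist of turning a sample JSON string into a JSON schema with genson.
# schema_from_sample is an illustrative helper, not the actual api.py code.
import json

from genson import SchemaBuilder


def schema_from_sample(sample_json: str) -> dict:
    builder = SchemaBuilder()
    builder.add_object(json.loads(sample_json))
    return builder.to_schema()


sample = '{"title": "Example", "tags": ["a", "b"], "views": 42}'
schema = schema_from_sample(sample)
# schema now describes title (string), tags (array of strings), views (integer)
# and can be passed along with the LLM call to request structured output.
print(json.dumps(schema, indent=2))
```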
How Has This Been Tested?
Just a bunch of manual testing of the playground and the docker build workflow. Below are some screenshots showing the impact of the changes.
Checklist:
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have made corresponding changes to the documentation
- I'm willing to add documentation if my changes are deemed desirable
- ~~I have added/updated unit tests that prove my fix is effective or that my feature works~~
- ~~New and existing unit tests pass locally with my changes~~
- 🤷 I didn't see any existing tests related to the /llm endpoint or the playground