feat: notebook tester + CLI + report summary
Overview
This PR introduces automated testing of Jupyter notebooks within the repository. It adds a shell entrypoint (cookbook) and a Python test runner (test_nbclient.py) that:
- Automatically discovers and executes notebooks.
- Patches Colab-specific code for local execution.
- Summarizes cell outputs and errors.
- Optionally compares notebook outputs using Gemini AI for regression detection.
- Generates detailed JSON reports for each notebook test run.
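The discovery step can be sketched with the standard library alone. This is a minimal illustration, not the code in test_nbclient.py: the function name discover_notebooks and the decision to skip .ipynb_checkpoints directories are assumptions.

```python
from pathlib import Path


def discover_notebooks(root: Path) -> list[Path]:
    """Return every .ipynb file under root, sorted for a stable run order.

    Skipping Jupyter's .ipynb_checkpoints directories is an assumption of
    this sketch; the PR does not say how checkpoint copies are handled.
    """
    return sorted(
        path
        for path in root.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )
```

Calling discover_notebooks(Path(".")) from the repository root would return the list of notebooks to execute.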
Features
- Notebook Discovery: Automatically finds all .ipynb files in the repo, or runs specific notebooks.
- Colab Compatibility: Patches google.colab.userdata.get calls to use environment variables for seamless local runs.
- Cell Output Summarization: Captures and summarizes outputs, including errors, for each code cell.
- Progress UI: Displays real-time progress and summary of test results in the terminal.
- AI Output Comparison: When enabled, uses Gemini AI to classify output changes as ok_cells, slightly_changed, or wrong.
- Reporting: Outputs results to reports/*.compare.json for further analysis.
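One plausible way to implement the Colab patch is a text rewrite of each code cell's source before execution. The sketch below assumes that approach. The function name patch_colab_source and the regular expressions are hypothetical, and it only handles calls with a literal string key.

```python
import re

# Matches userdata.get("NAME") or userdata.get('NAME') with a literal key.
_USERDATA_GET = re.compile(
    r"""userdata\.get\(\s*(['"])([A-Za-z_][A-Za-z0-9_]*)\1\s*\)"""
)
# Matches the Colab import line, keeping its indentation in group 1.
_COLAB_IMPORT = re.compile(
    r"^([ \t]*)from google\.colab import userdata[ \t]*$", re.MULTILINE
)
_IMPORT_OS = re.compile(r"^[ \t]*import os[ \t]*$", re.MULTILINE)


def patch_colab_source(source: str) -> str:
    """Rewrite Colab secret lookups so they read environment variables."""
    # Swap the Colab-only import for the standard-library one.
    patched = _COLAB_IMPORT.sub(r"\1import os", source)
    # Replace each lookup with the equivalent environment read.
    patched, count = _USERDATA_GET.subn(r'os.environ.get("\2")', patched)
    # If a cell used userdata without importing it, make sure os is available.
    if count and not _IMPORT_OS.search(patched):
        patched = "import os\n" + patched
    return patched
```

A text rewrite keeps the notebook file on disk unchanged as long as the patch is applied to an in-memory copy. Injecting a stub google.colab module into the kernel would be an alternative design.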
Usage
Entrypoint
Use the cookbook script to run notebook tests:
# Run all notebooks
./cookbook test
# Run a specific notebook
./cookbook test examples/Book_illustration.ipynb
# Run multiple files
./cookbook test "quickstarts/Models.ipynb","quickstarts/Audio.ipynb"
# Run with AI output comparison
./cookbook test examples/Book_illustration.ipynb --ai-compare
# Set a custom timeout (seconds)
./cookbook test examples/Book_illustration.ipynb --timeout=1200
# Specify a kernel (default: python3)
./cookbook test examples/Book_illustration.ipynb --kernel=python3
Options
- --ai-compare: Enables AI-based output comparison.
- --timeout=<seconds>: Sets cell execution timeout (default: 900).
- --kernel=<name>: Specifies Jupyter kernel (default: python3).
- [notebook.ipynb]: Path to a specific notebook. If omitted, all notebooks are tested.
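For reference, the options above map naturally onto argparse. This is only a sketch of how a Python runner could parse them; whether the cookbook script or test_nbclient.py actually uses argparse is not stated in this PR, and build_parser and split_notebooks are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented options and their defaults."""
    parser = argparse.ArgumentParser(prog="cookbook test")
    parser.add_argument(
        "notebooks",
        nargs="?",
        default="",
        help="comma-separated notebook paths; empty means all notebooks",
    )
    parser.add_argument("--ai-compare", action="store_true")
    parser.add_argument("--timeout", type=int, default=900)
    parser.add_argument("--kernel", default="python3")
    return parser


def split_notebooks(arg: str) -> list[str]:
    """Split the comma-separated positional argument into paths."""
    return [part.strip() for part in arg.split(",") if part.strip()]
```

Note that the shell strips the quotes in "quickstarts/Models.ipynb","quickstarts/Audio.ipynb", so the runner receives a single comma-joined argument, which is why the sketch splits on commas.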
Output
- Progress and summary are displayed in the terminal.
- Detailed results are saved as JSON in the reports/ directory, e.g., reports/examples__Book_illustration.ipynb.compare.json.
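The report file name in the example appears to be the notebook path with directory separators replaced by a double underscore. The sketch below reproduces that mapping. The rule is inferred from the single example above, and report_path is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def report_path(notebook: str, reports_dir: str = "reports") -> str:
    """Map a repo-relative notebook path to its report file path.

    Assumption: path components are joined with "__", as suggested by the
    one example in the PR description.
    """
    flat_name = "__".join(PurePosixPath(notebook).parts)
    return f"{reports_dir}/{flat_name}.compare.json"
```

Flattening the path keeps all reports in one directory while avoiding name clashes between notebooks that share a file name in different folders.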
Example Report
Each report includes:
- File path
- Duration
- Status (passed/failed)
- Buckets for cell output comparison (ok_cells, slightly_changed, wrong)
- AI notes (if enabled)
- Saved and test run outputs for
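To make the report shape concrete, here is a rough sketch of a record that could be serialized to JSON. Only the bucket names ok_cells, slightly_changed, and wrong and the passed/failed status come from this PR. The remaining field names, the use of cell indices inside the buckets, and the NotebookReport class itself are assumptions, and the saved and test run outputs are left out.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class NotebookReport:
    """Hypothetical in-memory form of one *.compare.json report."""

    file: str                      # repo-relative notebook path
    duration_s: float              # wall-clock run time in seconds
    status: str                    # "passed" or "failed"
    ok_cells: list[int] = field(default_factory=list)
    slightly_changed: list[int] = field(default_factory=list)
    wrong: list[int] = field(default_factory=list)
    ai_notes: Optional[str] = None  # only filled when --ai-compare is on


def to_json(report: NotebookReport) -> str:
    """Serialize the report with stable key order for easier diffing."""
    return json.dumps(asdict(report), indent=2, sort_keys=True)
```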
Notes
- Ensure GOOGLE_API_KEY and GEMINI_API_KEY are set in your environment.
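A quick pre-flight check can catch a missing key before any notebook runs. This is a small sketch rather than part of the PR: the helper name missing_keys is hypothetical, and it treats an empty value as missing.

```python
import os
from typing import Mapping, Optional

REQUIRED_KEYS = ("GOOGLE_API_KEY", "GEMINI_API_KEY")


def missing_keys(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]
```

A runner could call missing_keys() at startup and exit with a clear message if the returned list is not empty.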
Work Still Needed
- Verify that all required packages are installed using the pip install <package> format.
- Remove leftover debug variables (e.g., raw_texts in _gemini_compare_batches and img_count in _summarize_outputs).
- Ensure non-.ipynb files are rejected.
- Convert the process into a weekly workflow, with reports automatically linked in the corresponding weekly issue.
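For the non-.ipynb item, one possible shape of the check is sketched below. The function name validate_notebook_args and the choice to raise ValueError are assumptions. The real runner might prefer to print an error and exit instead.

```python
from pathlib import Path


def validate_notebook_args(paths: list[str]) -> list[str]:
    """Return paths unchanged, or raise if any is not a .ipynb file name."""
    rejected = [p for p in paths if Path(p).suffix != ".ipynb"]
    if rejected:
        raise ValueError(f"not a notebook: {', '.join(rejected)}")
    return paths
```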