
feat: notebook tester + CLI + report summary

Open • andycandy opened this issue 3 months ago • 1 comment

Overview

This PR introduces automated testing of Jupyter notebooks within the repository. It adds a shell entrypoint (cookbook) and a Python test runner (test_nbclient.py) that:

  • Automatically discovers and executes notebooks.
  • Patches Colab-specific code for local execution.
  • Summarizes cell outputs and errors.
  • Optionally compares notebook outputs using Gemini AI for regression detection.
  • Generates detailed JSON reports for each notebook test run.
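
As a rough illustration of the discovery step, here is a minimal sketch. The function name `discover_notebooks` and the decision to skip `.ipynb_checkpoints` copies are assumptions made for this example; the actual logic in test_nbclient.py may differ.

```python
from pathlib import Path


def discover_notebooks(root="."):
    """Return every .ipynb file under root, sorted for a stable run order.

    Assumption: Jupyter autosave copies under .ipynb_checkpoints are skipped,
    since they duplicate real notebooks.
    """
    root_path = Path(root)
    return sorted(
        path
        for path in root_path.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )
```

Sorting keeps the order of test runs reproducible between invocations, which makes terminal summaries easier to compare.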

Features

  • Notebook Discovery: Automatically finds all .ipynb files in the repo, or runs specific notebooks.
  • Colab Compatibility: Patches google.colab.userdata.get calls to use environment variables for seamless local runs.
  • Cell Output Summarization: Captures and summarizes outputs, including errors, for each code cell.
  • Progress UI: Displays real-time progress and summary of test results in the terminal.
  • AI Output Comparison: When enabled, uses Gemini AI to classify output changes as ok_cells, slightly_changed, or wrong.
  • Reporting: Outputs results to reports/*.compare.json for further analysis.
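
The Colab patching idea can be sketched as a source-level rewrite. This is only an approximation under stated assumptions: the helper name `patch_colab_userdata` and the regular expressions are hypothetical, and the runner may patch cells in a different way (for example by injecting a stub module).

```python
import re

# Matches userdata.get('NAME') or userdata.get("NAME"), with an optional
# google.colab. prefix. Only simple string-literal arguments are handled.
_USERDATA_GET = re.compile(
    r"""(?:google\.colab\.)?userdata\.get\(\s*(['"])([A-Za-z_][A-Za-z0-9_]*)\1\s*\)"""
)
# Matches the usual Colab import line so it can be swapped for `import os`.
_COLAB_IMPORT = re.compile(
    r"^[ \t]*from google\.colab import userdata[ \t]*$", re.M
)


def patch_colab_userdata(source):
    """Rewrite Colab secret lookups in a cell to read environment variables."""
    patched = _COLAB_IMPORT.sub("import os", source)
    patched = _USERDATA_GET.sub(r"os.environ.get(\1\2\1)", patched)
    # If calls were rewritten but no `import os` exists yet, add one.
    needs_os = "os.environ.get(" in patched
    has_os = re.search(r"^[ \t]*import os\b", patched, re.M)
    if needs_os and not has_os:
        patched = "import os\n" + patched
    return patched
```

With this sketch, a cell containing `userdata.get('GOOGLE_API_KEY')` would read `os.environ.get('GOOGLE_API_KEY')` instead, and cells without Colab calls pass through unchanged.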

Usage

Entrypoint

Use the cookbook script to run notebook tests:

# Run all notebooks
./cookbook test

# Run a specific notebook
./cookbook test examples/Book_illustration.ipynb

# Run multiple files
./cookbook test "quickstarts/Models.ipynb","quickstarts/Audio.ipynb"

# Run with AI output comparison
./cookbook test examples/Book_illustration.ipynb --ai-compare

# Set a custom timeout (seconds)
./cookbook test examples/Book_illustration.ipynb --timeout=1200

# Specify a kernel (default: python3)
./cookbook test examples/Book_illustration.ipynb --kernel=python3

Options

  • --ai-compare: Enables AI-based output comparison.
  • --timeout=<seconds>: Sets cell execution timeout (default: 900).
  • --kernel=<name>: Specifies Jupyter kernel (default: python3).
  • [notebook.ipynb]: Path to a specific notebook. If omitted, all notebooks are tested.
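
For reference, the options above could be parsed roughly as follows. This is a sketch using the standard `argparse` module, not the parser in test_nbclient.py; `build_parser` and `parse_args` are hypothetical names, and the comma-splitting of notebook paths is inferred from the "multiple files" usage example.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(prog="cookbook test")
    parser.add_argument("notebooks", nargs="*",
                        help="notebook paths; all notebooks if omitted")
    parser.add_argument("--ai-compare", action="store_true",
                        help="enable AI-based output comparison")
    parser.add_argument("--timeout", type=int, default=900,
                        help="cell execution timeout in seconds")
    parser.add_argument("--kernel", default="python3",
                        help="Jupyter kernel name")
    return parser


def parse_args(argv):
    """Parse argv and expand comma-separated notebook lists into one list."""
    args = build_parser().parse_args(argv)
    paths = []
    for item in args.notebooks:
        paths.extend(part.strip() for part in item.split(",") if part.strip())
    args.notebooks = paths
    return args
```

An empty `args.notebooks` list would then be the signal to fall back to testing every notebook in the repository.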

Output

  • Progress and summary are displayed in the terminal.
  • Detailed results are saved as JSON in the reports/ directory, e.g., reports/examples__Book_illustration.ipynb.compare.json.
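
The report file name in the example above suggests that path separators are flattened to a double underscore. A minimal sketch of that mapping, assuming this rule holds in general (it is inferred from the single example, and `report_path` is a hypothetical helper):

```python
from pathlib import PurePosixPath


def report_path(notebook, reports_dir="reports"):
    """Map a repo-relative notebook path to its report file path.

    Assumption: directory separators become "__" and ".compare.json" is
    appended to the full notebook file name.
    """
    flat_name = "__".join(PurePosixPath(notebook).parts)
    return str(PurePosixPath(reports_dir) / f"{flat_name}.compare.json")
```

Flattening the path this way keeps all reports in one directory while avoiding name clashes between notebooks that share a file name in different folders.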

Example Report

Each report includes:

  • File path
  • Duration
  • Status (passed/failed)
  • Buckets for cell output comparison (ok_cells, slightly_changed, wrong)
  • AI notes (if enabled)
  • Saved and test run outputs for

Notes

  • Ensure GOOGLE_API_KEY and GEMINI_API_KEY are set in your environment.

andycandy • Aug 31 '25 08:08

Work Still Needed

  • Verify that all required packages are installed using the pip install <package> format.
  • Remove leftover debug variables (e.g., raw_texts in _gemini_compare_batches and img_count in _summarize_outputs).
  • Ensure non-.ipynb files are rejected.
  • Convert the process into a weekly workflow, with reports automatically linked in the corresponding weekly issue.
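
For the item on rejecting non-.ipynb files, one possible shape for the check is sketched below. The helper name `validate_notebook_paths` and the choice to raise `ValueError` are assumptions for illustration; the runner could just as well print an error and exit with a non-zero status.

```python
from pathlib import Path


def validate_notebook_paths(paths):
    """Return the paths as Path objects, rejecting anything that is not .ipynb."""
    rejected = [str(p) for p in paths if Path(p).suffix != ".ipynb"]
    if rejected:
        # Report every offending path at once rather than failing on the first.
        raise ValueError("Not a notebook (.ipynb) file: " + ", ".join(rejected))
    return [Path(p) for p in paths]
```

Running the check before any notebook executes means a typo in one argument fails fast instead of surfacing partway through a long test run.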

andycandy • Aug 31 '25 09:08