Add comprehensive Python testing infrastructure

Open llbbl opened this issue 2 months ago • 0 comments

About UnitSeeker

Hi! This PR is part of the UnitSeeker project, a human-guided initiative to help Python repositories establish testing infrastructure.

Key points:

Human-approved: Every PR is manually approved before work begins
Semi-automated with oversight: Created and controlled via a homegrown wrapper around Claude Code with human quality control
Infrastructure only: This PR intentionally contains only the testing setup without actual unit tests
Your repository, your rules: Feel free to modify, reject, or request changes - all constructive feedback is welcome
Follow-up support: All responses and discussions are personally written, not automated

Learn more about the project and see the stats on our progress at https://unitseeker.llbbl.com/

Summary

This PR adds a comprehensive testing infrastructure to the StarCoder project, providing a foundation for writing and running tests with minimal friction.

Changes Made

Package Management

Poetry Setup: Migrated from requirements.txt to Poetry for modern Python dependency management
Dependency Migration: All existing dependencies from requirements.txt preserved in pyproject.toml
Testing Dependencies: Added pytest, pytest-cov, and pytest-mock as development dependencies
Lock File: Generated poetry.lock for reproducible builds across environments

Testing Configuration

pytest Configuration: Comprehensive pytest setup in pyproject.toml with:
- Test discovery patterns for files, classes, and functions
- Coverage tracking for chat/ and finetune/ packages
- HTML, XML, and terminal coverage reporting
- Strict markers and config enforcement
- Custom markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.slow
- Coverage threshold placeholder (80%) ready to be enabled
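
As an illustration of how the three custom markers might be applied, here is a minimal sketch. Only the marker names come from the configuration above; the `add` helper and the test names are hypothetical and are not part of this PR.

```python
import time

import pytest


def add(a, b):
    """Hypothetical function under test, for illustration only."""
    return a + b


@pytest.mark.unit
def test_add_small_numbers():
    # Fast, isolated check: selected by `pytest -m unit`.
    assert add(2, 3) == 5


@pytest.mark.integration
def test_add_result_round_trips_through_a_file(tmp_path):
    # Touches the filesystem via pytest's built-in tmp_path fixture:
    # selected by `pytest -m integration`.
    out = tmp_path / "result.txt"
    out.write_text(str(add(2, 3)))
    assert out.read_text() == "5"


@pytest.mark.slow
def test_add_after_delay():
    # Stand-in for a long-running test; deselect with `pytest -m "not slow"`.
    time.sleep(0.01)
    assert add(-1, 1) == 0
```

Because strict markers are enabled, a misspelled marker (for example `@pytest.mark.untit`) should fail at collection time rather than being silently ignored.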

Directory Structure

tests/
├── __init__.py
├── conftest.py              # Shared fixtures
├── test_infrastructure.py   # Infrastructure validation tests
├── unit/
│   └── __init__.py
└── integration/
    └── __init__.py

Shared Test Fixtures

The tests/conftest.py file provides reusable fixtures for common testing scenarios:

temp_dir - Temporary directory for file operations
temp_file - Temporary test file
mock_config - Mock configuration dictionary
mock_tokenizer - Mock HuggingFace tokenizer
mock_model - Mock model for inference testing
sample_text & sample_code_snippet - Sample data
mock_dataset - Mock dataset for data pipeline testing
mock_training_args - Mock training configuration
mock_huggingface_hub - Mock HuggingFace Hub client
reset_environment - Auto-cleanup for environment variables
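
The fixture bodies are not reproduced in this description, so the following is only a sketch of how three of them could plausibly be written and consumed. The fixture names match the list above; the helper `build_mock_tokenizer`, the return values, and the example test are assumptions made for illustration.

```python
import os
import shutil
import tempfile
from pathlib import Path
from unittest.mock import MagicMock

import pytest


def build_mock_tokenizer():
    """Hypothetical helper: a MagicMock standing in for a tokenizer."""
    tokenizer = MagicMock(name="tokenizer")
    tokenizer.encode.return_value = [101, 102, 103]
    tokenizer.decode.return_value = "def hello(): pass"
    return tokenizer


@pytest.fixture
def temp_dir():
    """Yield a fresh directory and delete it after the test."""
    path = Path(tempfile.mkdtemp())
    yield path
    shutil.rmtree(path, ignore_errors=True)


@pytest.fixture
def mock_tokenizer():
    """Provide a tokenizer double so no real model files are needed."""
    return build_mock_tokenizer()


@pytest.fixture(autouse=True)
def reset_environment():
    """Restore os.environ to its pre-test state after every test."""
    saved = dict(os.environ)
    yield
    os.environ.clear()
    os.environ.update(saved)


def test_decoded_output_is_written(temp_dir, mock_tokenizer):
    # pytest injects both fixtures by matching the argument names.
    text = mock_tokenizer.decode(mock_tokenizer.encode("ignored"))
    target = temp_dir / "out.py"
    target.write_text(text)
    assert target.read_text() == "def hello(): pass"
```

Keeping the mock construction in a plain helper function is a design choice of this sketch: it lets the same double be reused outside pytest, since fixture functions themselves cannot be called directly.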

Validation Tests

Created test_infrastructure.py with 27 tests to validate:

✅ pytest is working correctly
✅ Python version requirements (3.8+)
✅ Project structure is correct
✅ Custom markers are registered and functional
✅ All fixtures work as expected
✅ Coverage and mocking packages are installed
✅ Test parametrization works

Development Commands

Two convenient shortcuts for running tests:

poetry run test    # Run all tests
poetry run tests   # Alternative command (both work)
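
Poetry script shortcuts like these are typically declared in a scripts table in pyproject.toml, with each name pointing at a Python callable. This description does not show how `test` and `tests` are wired, so the following is only a sketch of one plausible entry point; the module name `run_tests` and both function names are hypothetical.

```python
import sys

import pytest


def run_tests(argv=None):
    """Run pytest in-process and return its exit status as an int."""
    # Forward any extra command-line arguments (e.g. -v, -m unit) to pytest.
    args = list(sys.argv[1:] if argv is None else argv)
    return int(pytest.main(args))


def main():
    """Entry point a scripts entry could reference, e.g. "run_tests:main"."""
    sys.exit(run_tests())
```

Under this assumption, both shortcut names would map to the same callable, which is why either command gives the same result.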

Updated .gitignore

Added entries for:

Claude Code artifacts (.claude/)
VS Code settings (.vscode/, *.code-workspace)
Testing artifacts: no new entries needed, since these are already covered by the default Python gitignore

Running Tests

Setup

# Install dependencies (first time only)
poetry install

# Run all tests
poetry run pytest

# Run with verbose output
poetry run pytest -v

# Run only unit tests
poetry run pytest -m unit

# Run only integration tests
poetry run pytest -m integration

# Run tests without coverage
poetry run pytest --no-cov

# Generate and view coverage report
poetry run pytest
open htmlcov/index.html  # View HTML coverage report (macOS; use xdg-open on Linux)

Using the shortcuts

poetry run test      # Runs pytest with all configured options
poetry run tests     # Alternative (same as above)

Notes

Coverage Threshold

The 80% coverage threshold is currently commented out in pyproject.toml:

# Uncomment to enforce coverage threshold once actual tests are written:
# "--cov-fail-under=80",

Uncomment this line when you're ready to enforce minimum coverage requirements.

No Actual Unit Tests

This PR intentionally does not include unit tests for the codebase. The goal is to provide:

A working testing infrastructure
Validated configuration
Useful fixtures and examples
Clear documentation

Developers can immediately start writing tests without any setup overhead.

Poetry vs pip

Poetry was chosen for:

Modern dependency resolution
Automatic virtual environment management
Lockfile for reproducible builds
Built-in support for dev dependencies
Simple script shortcuts

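If you prefer to stick with requirements.txt, you can export the locked dependencies (on recent Poetry versions the export command is provided by the poetry-plugin-export plugin, which may need to be installed first):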

poetry export -f requirements.txt --output requirements.txt

Dependencies

All existing dependencies preserved:

tqdm==4.65.0
transformers==4.28.1
datasets==2.11.0
huggingface-hub==0.13.4
accelerate==0.18.0
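
One way to confirm that an installed environment actually matches these pins is a small check with the standard library's `importlib.metadata`. This is a sketch and not part of the PR; the function name `find_mismatches` is hypothetical, and the versions are copied from the list above.

```python
from importlib import metadata

# Pins copied from the dependency list above.
PINNED = {
    "tqdm": "4.65.0",
    "transformers": "4.28.1",
    "datasets": "2.11.0",
    "huggingface-hub": "0.13.4",
    "accelerate": "0.18.0",
}


def find_mismatches(pins):
    """Return {package: (expected, installed)} for every pin that is not met.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

Run inside the Poetry environment (for example via `poetry run python`), the script prints nothing when every pin is satisfied.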

Testing the Infrastructure

The infrastructure has been validated with 27 passing tests:

============================= 27 passed in 0.28s =============================

All features confirmed working:

✅ Test discovery and execution
✅ Coverage reporting (HTML, XML, terminal)
✅ Custom markers
✅ All fixtures
✅ Test parametrization
✅ Mocking capabilities

Next Steps

Start Writing Tests: Use the fixtures in conftest.py as building blocks
Organize Tests: Place unit tests in tests/unit/, integration tests in tests/integration/
Use Markers: Tag tests with @pytest.mark.unit, @pytest.mark.integration, or @pytest.mark.slow
Enable Coverage Threshold: Uncomment --cov-fail-under=80 when ready
Customize: Modify pyproject.toml settings to match your team's preferences

Questions?

Feel free to ask questions, request changes, or suggest improvements. All feedback is welcome and appreciated!

Oct 18 '25 01:10 llbbl