codefuse-evaluation
feat: Add comprehensive Python testing infrastructure
Add Comprehensive Python Testing Infrastructure
Summary
This PR establishes a complete testing infrastructure for the CodeFuse evaluation framework, transitioning from basic requirements.txt dependency management to a modern Poetry-based setup with comprehensive testing capabilities.
Changes Made
Package Management
- ✅ Set up Poetry as the package manager with pyproject.toml
- ✅ Migrated all dependencies from requirements.txt to Poetry format
- ✅ Added testing dependencies: pytest, pytest-cov, pytest-mock as dev dependencies
Testing Configuration
- ✅ Configured pytest with comprehensive settings in pyproject.toml:
  - 80% coverage threshold requirement
  - HTML, XML, and terminal coverage reporting
  - Custom test markers: unit, integration, slow
  - Strict configuration and marker enforcement
- ✅ Coverage settings configured to track codefuseEval and codefuseEval_202503 packages
- ✅ Exclusions set up for data, docker, figures, and other non-code directories
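To make the 80% threshold concrete, here is a minimal sketch of the arithmetic behind a "fail under" check. It assumes plain statement coverage and is not how pytest-cov is implemented; the function names `coverage_percent` and `meets_threshold` are hypothetical and do not come from this PR.

```python
def coverage_percent(covered_statements: int, total_statements: int) -> float:
    """Return statement coverage as a percentage."""
    if total_statements == 0:
        # Assumption for this sketch: nothing to cover counts as fully covered.
        return 100.0
    return 100.0 * covered_statements / total_statements


def meets_threshold(covered_statements: int, total_statements: int,
                    fail_under: float = 80.0) -> bool:
    """Return True when coverage is at or above the required threshold."""
    return coverage_percent(covered_statements, total_statements) >= fail_under
```

Under this simplified model, a run that executes 79 of 100 statements would fail the check, while 80 of 100 would pass.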
Directory Structure
- ✅ Created tests/ directory with proper __init__.py files
- ✅ Added tests/unit/ and tests/integration/ subdirectories for organized testing
- ✅ Comprehensive conftest.py with shared fixtures including:
  - Temporary directory and file system fixtures
  - Sample data generation (DataFrames, numpy arrays, JSONL files)
  - Mock models, tokenizers, and datasets
  - Environment cleanup utilities
  - Common test data and code snippets
Infrastructure Validation
- ✅ Created validation tests (test_infrastructure_validation.py) that verify:
  - Pytest functionality and fixture availability
  - Coverage configuration correctness
  - Mock functionality and test markers
  - File operations and data fixtures
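As an illustration of what a mock-functionality check could look like, here is a small sketch using the standard library's `unittest.mock`, which pytest-mock builds on. The test name and the mocked tokenizer are hypothetical and are not taken from test_infrastructure_validation.py.

```python
from unittest.mock import MagicMock


def test_mock_tokenizer_records_calls():
    # Stand-in for a real tokenizer; only the behaviour set here is defined.
    tokenizer = MagicMock(name="tokenizer")
    tokenizer.encode.return_value = [1, 2, 3]

    token_ids = tokenizer.encode("def f(): pass")

    # The mock returns the configured value and remembers how it was called.
    assert token_ids == [1, 2, 3]
    tokenizer.encode.assert_called_once_with("def f(): pass")
```

A test like this is collected by pytest without extra setup; with pytest-mock, the same mock could be created through the `mocker` fixture instead.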
Additional Setup
- ✅ Updated .gitignore with testing-related entries and Claude Code settings
- ✅ Poetry scripts configured: poetry run test and poetry run tests commands
- ✅ Package mode disabled to avoid installation issues during development
Dependencies
All original dependencies from requirements.txt have been preserved and migrated to Poetry format with appropriate version constraints compatible with Python 3.8-3.11.
Key Testing Dependencies Added:
- pytest ^7.4.0 - Main testing framework
- pytest-cov ^4.1.0 - Coverage reporting
- pytest-mock ^3.11.1 - Advanced mocking utilities
Running Tests
Basic Test Execution
# Run all tests
poetry run pytest
# Run with verbose output
poetry run pytest -v
# Run specific test file
poetry run pytest tests/test_infrastructure_validation.py
# Run tests with coverage
poetry run pytest --cov
Using Custom Markers
# Run only unit tests
poetry run pytest -m unit
# Run only integration tests
poetry run pytest -m integration
# Exclude slow tests
poetry run pytest -m "not slow"
Coverage Reports
- HTML Report: Generated in htmlcov/ directory
- XML Report: Generated as coverage.xml
- Terminal: Shows missing lines and coverage percentage
Validation Results
✅ All 14 validation tests pass successfully
✅ Testing infrastructure is fully functional
✅ Fixtures and markers work correctly
✅ Coverage reporting configured properly
Next Steps
The testing infrastructure is now ready for development teams to:
- Start writing unit tests in tests/unit/ for individual functions and classes
- Add integration tests in tests/integration/ for end-to-end workflows
- Use shared fixtures from conftest.py for common test data and mocks
- Run tests locally before committing code changes
- Monitor coverage to maintain the 80% threshold requirement
Notes
- Python Version: Configured for Python 3.8-3.11 compatibility
- Full Dependency Migration: All original dependencies preserved and working
- No Breaking Changes: Existing code functionality unchanged
- Ready for CI/CD: Infrastructure prepared for continuous integration setup
🤖 Generated with Claude Code