
LFS File pre-commit check

Open rtibbles opened this issue 1 month ago • 7 comments

Summary

  • Adds a pre-commit check to ensure files that should be checked in as git lfs files are not accidentally committed.
  • Adds a Python script (.pre-commit-hooks/check_lfs_pointers.py) that uses git's native tools to identify and validate LFS files
  • Provides clear error messages with step-by-step fix instructions when violations are detected

Implementation details:

  • Updates the global exclude in our pre-commit config to allow checking of our font files in the core static folder
  • Uses git check-attr filter to identify files marked for LFS (no manual .gitattributes parsing required)
  • Checks staged content (what will be committed) using git show :filepath to verify it starts with the LFS pointer header
  • Works correctly whether or not Git LFS is installed in the environment running the check
  • Python implementation for cross-platform compatibility
  • Only uses Python stdlib (subprocess, logging, sys)
  • Outputs to stderr for proper pre-commit integration
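
The approach in the bullets above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the script added in this PR: the function names are hypothetical, and the use of `-z` (NUL-separated output) with `git check-attr` is a choice made here for easier parsing.

```python
import subprocess
import sys

# Header named in the review notes; a valid pointer file starts with it.
LFS_POINTER_HEADER = b"version https://git-lfs.github.com/spec/v1"


def parse_check_attr(output):
    """Parse `git check-attr -z` output into a {path: value} mapping."""
    fields = output.split("\0")
    # Fields arrive as repeating (path, attribute, value) triples,
    # followed by one empty string after the final NUL.
    return {fields[i]: fields[i + 2] for i in range(0, len(fields) - 2, 3)}


def is_lfs_pointer(content):
    """Return True if the staged bytes begin with the LFS pointer header."""
    return content.startswith(LFS_POINTER_HEADER)


def lfs_tracked(paths):
    """Return the subset of paths whose `filter` attribute is `lfs`."""
    if not paths:
        return []
    result = subprocess.run(
        ["git", "check-attr", "-z", "filter", "--"] + list(paths),
        capture_output=True, text=True, check=True,
    )
    return [p for p, v in parse_check_attr(result.stdout).items() if v == "lfs"]


def staged_content(path):
    """Read the staged (index) version of a file, i.e. what will be committed."""
    result = subprocess.run(
        ["git", "show", ":" + path], capture_output=True, check=True
    )
    return result.stdout


def main(paths):
    """Report LFS-tracked files staged as raw content; return 1 if any exist."""
    bad = [p for p in lfs_tracked(paths) if not is_lfs_pointer(staged_content(p))]
    for path in bad:
        print(f"{path}: marked for LFS but staged as raw content", file=sys.stderr)
    return 1 if bad else 0
```

A hook entry point would typically call `sys.exit(main(sys.argv[1:]))`, since pre-commit passes the staged filenames as arguments. Because the index blob is read directly, the check does not depend on Git LFS being installed.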

This will prevent the recurring issue where binary files (fonts, fixtures) were accidentally committed as binary data instead of LFS pointers, which has occurred in multiple PRs (#7092, #6802, #6467).

References

Fixes #7099

Reviewer guidance

  • Run git lfs uninstall
  • Run touch test.woff
  • Add it to git and try to commit it
  • See error!

:robot: This was created by Claude Code. @rtibbles then reviewed the generated output, and did iterative rounds of updates before making it ready for review :robot:

rtibbles avatar Nov 26 '25 23:11 rtibbles

@coderabbitai full review

rtibbles avatar Nov 26 '25 23:11 rtibbles

✅ Actions performed

Full review triggered.

coderabbitai[bot] avatar Nov 26 '25 23:11 coderabbitai[bot]

Walkthrough

This PR adds a Git LFS validation pre-commit hook to detect when binary files are inadvertently committed instead of LFS pointers. It includes a new Python script that checks staged files for LFS attributes and verifies they contain pointer data, plus configuration updates to enable the hook.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Pre-commit configuration (.pre-commit-config.yaml) | Updated the exclude regex pattern to explicitly preserve the static/assets/fonts directory while excluding other static paths; added the check-lfs-pointers hook configuration with an entry point to the new validation script |
| LFS validation hook (.pre-commit-hooks/check_lfs_pointers.py) | New Python script that validates that staged files marked with the Git LFS filter attribute contain valid LFS pointer headers rather than binary content; includes helper functions for filter detection and pointer verification, with subprocess-based git operations and error reporting |
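
To illustrate the kind of exclude pattern described in the summary, here is a small sketch. The regex and the example paths are hypothetical and are not taken from this PR's config; the sketch only assumes that pre-commit applies `exclude` as a Python regular expression searched against each file path.

```python
import re

# Hypothetical pattern: exclude anything under a static/ directory
# unless it lives in static/assets/fonts/ (negative lookahead).
EXCLUDE = re.compile(r"(^|/)static/(?!assets/fonts/)")


def is_excluded(path):
    """Return True if the path would be skipped by the exclude pattern."""
    return EXCLUDE.search(path) is not None
```

With a pattern of this shape, font files remain visible to hooks such as check-lfs-pointers while other static build output is still skipped.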

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

  • Subprocess operations: The script uses subprocess.run() for git commands; verify proper error handling and argument escaping
  • LFS pointer validation logic: Review the header comparison logic (b"version https://git-lfs.github.com/spec/v1") to ensure it correctly identifies valid pointers
  • Exit code handling: Confirm the aggregation of failures and return logic correctly signals hook success/failure to pre-commit framework
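
The first and third review points can be made concrete with a short sketch. It is an assumption about one reasonable shape for this logic, not the code under review; `run_git` and `exit_code` are hypothetical names.

```python
import subprocess


def run_git(args):
    """Run git with a list of arguments; return stdout bytes, or None on failure.

    Passing a list (no shell=True) means paths are never interpreted by a
    shell, so no manual escaping is needed.
    """
    try:
        return subprocess.run(
            ["git"] + list(args), capture_output=True, check=True
        ).stdout
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Non-zero exit status, or git is not on PATH.
        return None


def exit_code(results):
    """Aggregate per-file results ({path: passed}) into (code, failed_paths)."""
    failures = sorted(path for path, ok in results.items() if not ok)
    return (1 if failures else 0), failures
```

Checking every file before returning, rather than stopping at the first violation, lets a single run report all offending files; pre-commit treats any non-zero exit status as a failed hook.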

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'LFS File pre-commit check' accurately and concisely summarizes the main change: adding a pre-commit hook to validate Git LFS files. |
| Linked Issues check | ✅ Passed | The PR fully addresses linked issue #7099 by implementing a linter/pre-commit check that detects files marked for LFS but committed as binary, preventing recurring incidents and providing clear error messages. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to implementing the LFS validation check: updates to .pre-commit-config.yaml and addition of the check_lfs_pointers.py script. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Description check | ✅ Passed | The pull request description is directly related to the changeset, clearly explaining the addition of an LFS pre-commit check and implementation details that match the file changes. |

coderabbitai[bot] avatar Nov 26 '25 23:11 coderabbitai[bot]

@coderabbitai full review

rtibbles avatar Nov 29 '25 17:11 rtibbles

✅ Actions performed

Full review triggered.

coderabbitai[bot] avatar Nov 29 '25 17:11 coderabbitai[bot]

(we should squash commit this)

rtibbles avatar Dec 01 '25 15:12 rtibbles