gtfs-validator
feat: Column-based storage for GTFS entities
Per the discussion in #1358 and GTFS Validator - Memory Reduction, this PR implements support for column-based storage of GTFS entities. This technique helps reduce the validator's memory footprint by not allocating memory for unused columns.
This PR is not yet ready for review but is meant to show what the implementation might look like.
See the implementation report for details on memory savings and performance.
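To illustrate the general idea (not the code in this PR), here is a minimal sketch of column-based storage. The names `StopTimeColumns` and `StopTimeView` are hypothetical and do not correspond to classes in gtfs-validator. It assumes each column is backed by a single array that is only allocated the first time a value is written to it, so columns missing from a feed cost no per-row memory and fall back to a default on read.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical column store: one array per column that actually holds data. */
final class StopTimeColumns {
    private final int rowCount;
    private final Map<String, int[]> intColumns = new HashMap<>();
    private final Map<String, String[]> stringColumns = new HashMap<>();

    StopTimeColumns(int rowCount) {
        this.rowCount = rowCount;
    }

    /** Allocates the backing array lazily, on the first write to the column. */
    void setInt(String column, int row, int value) {
        intColumns.computeIfAbsent(column, c -> new int[rowCount])[row] = value;
    }

    void setString(String column, int row, String value) {
        stringColumns.computeIfAbsent(column, c -> new String[rowCount])[row] = value;
    }

    /** Returns the default when the column was never allocated. */
    int getInt(String column, int row, int defaultValue) {
        int[] values = intColumns.get(column);
        return values == null ? defaultValue : values[row];
    }

    String getString(String column, int row, String defaultValue) {
        String[] values = stringColumns.get(column);
        return (values == null || values[row] == null) ? defaultValue : values[row];
    }

    /** Number of columns that currently occupy memory. */
    int allocatedColumnCount() {
        return intColumns.size() + stringColumns.size();
    }
}

/** Lightweight row view: holds only a store reference and a row index. */
final class StopTimeView {
    private final StopTimeColumns columns;
    private final int row;

    StopTimeView(StopTimeColumns columns, int row) {
        this.columns = columns;
        this.row = row;
    }

    String tripId() {
        return columns.getString("trip_id", row, "");
    }

    int stopSequence() {
        return columns.getInt("stop_sequence", row, 0);
    }

    /** pickup_type is optional in GTFS; 0 (regular pickup) is its default. */
    int pickupType() {
        return columns.getInt("pickup_type", row, 0);
    }
}

final class ColumnStorageDemo {
    public static void main(String[] args) {
        StopTimeColumns columns = new StopTimeColumns(2);
        columns.setString("trip_id", 0, "t1");
        columns.setInt("stop_sequence", 0, 1);

        StopTimeView first = new StopTimeView(columns, 0);
        System.out.println(first.tripId() + " seq=" + first.stopSequence()
            + " pickup_type=" + first.pickupType()
            + " allocated=" + columns.allocatedColumnCount());
    }
}
```

In this sketch, a feed that never sets `pickup_type` allocates nothing for it, whereas a row-per-object layout would reserve a field for it in every entity. The map lookup per access is one example of the kind of cost that has to be weighed against the memory saved.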
Please make sure these boxes are checked before submitting your pull request - thanks!
- [ ] Run the unit tests with `gradle test` to make sure you didn't break anything
- [ ] Add or update any needed documentation to the repo
- [ ] Format the title like "feat: [new feature short description]". Title must follow the Conventional Commit Specification (https://www.conventionalcommits.org/en/v1.0.0/).
- [ ] Linked all relevant issues
- [ ] Include screenshot(s) showing how this pull request works and fixes the issue(s)
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
✅ Rule acceptance tests passed.
- New Errors: 1 out of 1520 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- Dropped Errors: 2 out of 1520 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- New Warnings: 1 out of 1520 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- Dropped Warnings: 0 out of 1520 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- 0 out of 1520 sources (~0%) are corrupted.

Commit: 337aa15e14f5b5af4d1877fa037a5133cdec7930
Download the full acceptance test report here (report will disappear after 90 days).
Impressive work. In general I am a bit concerned with the added complexity vs memory savings.