gtfs-validator
feat: reformat trip and shape dist validator
Summary:
Resolves #1613 and #1611 by implementing a distance threshold for the `trip_distance_exceeds_shape_distance` notice.
Expected Behavior:
- Introduces an 11.1 m threshold for `trip_distance_exceeds_shape_distance`, triggering an `ERROR` for distances $\geq 11.1$ m.
- Creates a new notice, `trip_distance_exceeds_shape_distance_below_threshold`, with `WARNING` severity for distances $\lt 11.1$ m.
- This update streamlines the cross-validation of trips against shapes by adhering to the GTFS specification. This approach has resulted in minimal changes to validation outcomes as evidenced here. Any minor discrepancies arise from instances where the feed does not comply with the specification, often leading to `ERROR`-level notices such as `decreasing_or_equal_stop_time_distance` and `decreasing_shape_distance`.
- Because both stop-time and shape distances are expected to increase monotonically, the validator now checks only the last point instead of evaluating every point. This reduces the processing time complexity for these elements from linear to constant.
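The behavior described above can be sketched roughly as follows. This is a minimal illustration in Python, not the validator's actual code: the function name `classify_trip`, the constant `THRESHOLD_METERS`, and the `gap_meters` parameter are hypothetical, and how the distance in meters is measured is an assumption left to the caller. The sketch relies on both distance sequences being non-decreasing, so the last element is the maximum.

```python
from typing import Optional, Sequence, Tuple

# Threshold stated in this PR; the constant name is hypothetical.
THRESHOLD_METERS = 11.1


def classify_trip(
    stop_time_dists: Sequence[float],
    shape_dists: Sequence[float],
    gap_meters: float,
) -> Optional[Tuple[str, str]]:
    """Return (notice_code, severity), or None when no notice applies.

    stop_time_dists / shape_dists: shape_dist_traveled values, assumed to be
    non-decreasing as the GTFS specification expects.
    gap_meters: distance in meters by which the trip exceeds the shape
    (hypothetical input; the measurement method is not shown here).
    """
    if not stop_time_dists or not shape_dists:
        return None
    # Only the last points are compared: constant time instead of a full scan.
    if stop_time_dists[-1] <= shape_dists[-1]:
        return None
    if gap_meters >= THRESHOLD_METERS:
        return ("trip_distance_exceeds_shape_distance", "ERROR")
    return ("trip_distance_exceeds_shape_distance_below_threshold", "WARNING")
```

For example, `classify_trip([0, 12], [0, 10], 3.0)` yields the below-threshold `WARNING`, while the same trip with `gap_meters=11.1` yields the `ERROR`.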
Empirical Performance Comparison:
With $n$ the number of trips, $m$ the number of stop-times, and $k$ the number of shapes, the worst-case complexity is:
- $\Omega(k^2nm)$ for the `master` branch
- $\Omega(nk)$ for the feature branch (`feat/1613`)
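As a rough reading of these two bounds, and assuming they can be treated as growth rates (an assumption, since $\Omega$ only gives lower bounds), their ratio is:

$$\frac{k^2 n m}{n k} = k m$$

Under that reading, the work saved grows with the product of the number of shapes and the number of stop-times.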
Statistical Performance Comparison:
The performance improvements are depicted in the graph below. The datasets analyzed are from our catalog, with zipped file sizes of at least 1MB. Sizes have been normalized for a more meaningful comparison of slopes. The performance slope of the feature branch is significantly lower, decreasing from approximately 37 to approximately 25, indicating enhanced efficiency.
Please make sure these boxes are checked before submitting your pull request - thanks!
- [x] Run the unit tests with `gradle test` to make sure you didn't break anything
- [ ] Add or update any needed documentation to the repo
- [x] Format the title like "feat: [new feature short description]". Title must follow the Conventional Commit Specification (https://www.conventionalcommits.org/en/v1.0.0/).
- [x] Linked all relevant issues
- [ ] Include screenshot(s) showing how this pull request works and fixes the issue(s)
✅ Rule acceptance tests passed.
- New Errors: 0 out of 1485 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- Dropped Errors: 1 out of 1485 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- New Warnings: 0 out of 1485 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- Dropped Warnings: 0 out of 1485 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- 1 out of 1486 sources (~0%) are corrupted. Corrupted sources: us-district-of-columbia-dc-circulator-gtfs-486

Commit: 89d1cc12396fd51d6c90203de45a0f6b37cb43f9
Download the full acceptance test report here (report will disappear after 90 days).
❌ Invalid acceptance test.
- New Errors: 0 out of 1485 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- Dropped Errors: 17 out of 1485 datasets (~1%) are invalid due to code change, which is above the provided threshold of 1%.
- New Warnings: 25 out of 1485 datasets (~2%) are invalid due to code change, which is above the provided threshold of 1%.
- Dropped Warnings: 0 out of 1485 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- 1 out of 1486 sources (~0%) are corrupted. Corrupted sources: fi-etela-pohjanmaa-komia-liikenne-gtfs-1255

Commit: bb18676a979283f07758357fc6153893006fbdca
Download the full acceptance test report here (report will disappear after 90 days).
❌ Invalid acceptance test.
- New Errors: 0 out of 1520 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- Dropped Errors: 36 out of 1520 datasets (~2%) are invalid due to code change, which is above the provided threshold of 1%.
- New Warnings: 82 out of 1520 datasets (~5%) are invalid due to code change, which is above the provided threshold of 1%.
- Dropped Warnings: 0 out of 1520 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- 0 out of 1520 sources (~0%) are corrupted.

Commit: 97dc096354caf2a3bf6d4410fdb2e6bbde2852dc
Download the full acceptance test report here (report will disappear after 90 days).
❌ Invalid acceptance test.
- New Errors: 0 out of 1520 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- Dropped Errors: 37 out of 1520 datasets (~2%) are invalid due to code change, which is above the provided threshold of 1%.
- New Warnings: 82 out of 1520 datasets (~5%) are invalid due to code change, which is above the provided threshold of 1%.
- Dropped Warnings: 0 out of 1520 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- 0 out of 1520 sources (~0%) are corrupted.

Commit: b9deab73a0da1f510bab4c1b607ccf1db53df434
Download the full acceptance test report here (report will disappear after 90 days).
❌ Invalid acceptance test.
- New Errors: 0 out of 1520 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- Dropped Errors: 37 out of 1520 datasets (~2%) are invalid due to code change, which is above the provided threshold of 1%.
- New Warnings: 91 out of 1520 datasets (~6%) are invalid due to code change, which is above the provided threshold of 1%.
- Dropped Warnings: 0 out of 1520 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- 0 out of 1520 sources (~0%) are corrupted.

Commit: cb6887baad13b5b2ae3d9ed8b6319f2cf62008f2
Download the full acceptance test report here (report will disappear after 90 days).
❌ Invalid acceptance test.
- New Errors: 0 out of 1520 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- Dropped Errors: 37 out of 1520 datasets (~2%) are invalid due to code change, which is above the provided threshold of 1%.
- New Warnings: 91 out of 1520 datasets (~6%) are invalid due to code change, which is above the provided threshold of 1%.
- Dropped Warnings: 0 out of 1520 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
- 0 out of 1520 sources (~0%) are corrupted.

Commit: ff5278a825039862ca4745393598146dd4294f3f
Download the full acceptance test report here (report will disappear after 90 days).
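A note on reading these reports: the report for commit bb18676a979283f07758357fc6153893006fbdca shows 17 out of 1485 datasets as "~1%" yet flags it as above the 1% threshold. One plausible explanation, assumed here rather than confirmed by the report, is that the comparison uses the unrounded ratio while the displayed percentage is rounded. The sketch below illustrates that reading; the function names and the strict `>` comparison are hypothetical.

```python
def change_rate_percent(changed: int, total: int) -> float:
    """Unrounded share of datasets affected by the code change, in percent."""
    return 100.0 * changed / total


def exceeds_threshold(changed: int, total: int, threshold_percent: float = 1.0) -> bool:
    """Assumed check: compare the unrounded rate against the threshold."""
    return change_rate_percent(changed, total) > threshold_percent


# 17 of 1485 is about 1.14%: displayed as ~1% but still above the 1% threshold.
rate = change_rate_percent(17, 1485)
print(round(rate), exceeds_threshold(17, 1485))
```

Under this assumption, 1 out of 1485 (about 0.07%) stays below the threshold, which matches the passing report.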