[Feature Request]: AI-based Fallback for Detecting Flaky Visual Diffs
Summary
Loki currently performs pixel-by-pixel visual diffs, which can result in flaky test failures due to minor rendering inconsistencies (like anti-aliasing, sub-pixel shifts, font rendering differences, etc.).
This proposal suggests adding an optional AI-based fallback diffing step that only runs when a diff is detected, to reduce false positives and improve CI trust.
Motivation
- Many teams using Loki in CI pipelines face false positives that aren't meaningful UI regressions.
- Commercial tools like Applitools use AI/ML techniques to classify visual diffs more accurately.
- By integrating a lightweight AI-based diff check as a fallback, Loki can reduce flaky test noise while maintaining speed.
Proposed Solution
- Default behavior remains unchanged: Loki uses pixel-by-pixel comparisons.
- If a visual diff is detected, Loki conditionally runs a secondary comparison using an AI-based image analysis module.
- If this module determines that the change is non-significant, the test is either:
  - Marked as flaky (with a warning), or
  - Passed automatically (based on config).
- Configuration options:
  - Enable/disable AI fallback.
  - Thresholds for AI-based similarity.
  - Option to log warnings or fail softly.
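
To make the proposed flow concrete, here is a minimal sketch of how the fallback could be wired in. Every name in it (`AiFallbackConfig`, `similarityThreshold`, `onNonSignificant`, `compareWithFallback`, `SimilarityScorer`) is hypothetical and not an existing Loki option or API. The AI-based module is represented only as an injected scoring function, since this proposal does not pick a specific model, and the exact byte comparison stands in for whichever pixel diff Loki is configured to use.

```typescript
// Hypothetical sketch: none of these names exist in Loki today.
type FallbackOutcome = "pass" | "flaky" | "fail";

interface AiFallbackConfig {
  enabled: boolean;                  // enable/disable the AI fallback
  similarityThreshold: number;       // 0..1; scores at or above this count as non-significant
  onNonSignificant: "warn" | "pass"; // mark as flaky with a warning, or pass automatically
}

// Stand-in for the AI-based module: returns a similarity score in [0, 1].
type SimilarityScorer = (reference: Uint8Array, current: Uint8Array) => number;

// Simplified stand-in for the existing pixel-by-pixel comparison.
function pixelsMatch(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

function compareWithFallback(
  reference: Uint8Array,
  current: Uint8Array,
  config: AiFallbackConfig,
  scorer: SimilarityScorer
): FallbackOutcome {
  // Default behavior is unchanged: an exact match passes without touching the scorer.
  if (pixelsMatch(reference, current)) return "pass";

  // A diff was detected; without the opt-in fallback it is a failure, as today.
  if (!config.enabled) return "fail";

  // The secondary comparison runs only on detected diffs.
  const score = scorer(reference, current);
  if (score < config.similarityThreshold) return "fail";

  // Non-significant change: pass automatically or flag as flaky, per config.
  return config.onNonSignificant === "pass" ? "pass" : "flaky";
}
```

Because the scorer is only called after the pixel comparison fails, runs with no diffs pay no extra cost. Keeping it as an injected function would also let the actual model ship as an optional plugin rather than a core dependency.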
Benefits
- Reduces flaky test failures from minor, non-regressive changes.
- Improves CI confidence and developer experience.
- Keeps runtime performance high by only triggering AI checks on detected diffs.
- Provides flexibility via opt-in config or plugin support.
Looking forward to your thoughts!