[Feature Request]: AI-based Fallback for Detecting Flaky Visual Diffs

Open Anxhul10 opened this issue 7 months ago • 0 comments

Summary

Loki currently performs pixel-by-pixel visual diffs, which can cause flaky test failures from minor rendering inconsistencies such as anti-aliasing, sub-pixel shifts, and font rendering differences.
This proposal suggests adding an optional AI-based fallback diffing step that only runs when a diff is detected, to reduce false positives and improve CI trust.

Motivation

Many teams using Loki in CI pipelines face false positives that aren’t meaningful UI regressions.
Commercial tools like Applitools use AI/ML techniques to classify visual diffs more accurately.
By integrating a lightweight AI-based diff check as a fallback, Loki can reduce flaky test noise while maintaining speed.

Proposed Solution

Default behavior remains unchanged: Loki uses pixel-by-pixel comparisons.
If a visual diff is detected, Loki conditionally runs a secondary comparison using an AI-based image analysis module.
If this module determines that the change is non-significant, the test is either:
- Marked as flaky (with a warning), or
- Passed automatically (based on config).
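
The flow above could look roughly like the sketch below. Every name in it (`compareWithFallback`, `FallbackOptions`, `SimilarityScorer`, `toyScorer`) is hypothetical and not part of Loki. `toyScorer` is only a mean-absolute-difference placeholder standing in for whatever AI-based module would actually be used, and `pixelsMatch` stands in for Loki's existing comparison.

```typescript
// Hypothetical sketch: none of these types or functions exist in Loki today.
type Outcome = "pass" | "flaky" | "fail";

interface FallbackOptions {
  enabled: boolean;
  // Minimum similarity score (0..1) for a diff to count as non-significant.
  similarityThreshold: number;
  // What to report when the fallback judges the diff non-significant.
  onInsignificant: "pass" | "flaky";
}

// Stand-in signature for the AI-based module: returns a score in [0, 1].
type SimilarityScorer = (reference: Uint8Array, current: Uint8Array) => number;

// Strict byte-for-byte check, standing in for the existing pixel diff.
function pixelsMatch(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Placeholder scorer (NOT an AI model): 1 minus the normalized mean absolute difference.
const toyScorer: SimilarityScorer = (a, b) => {
  if (a.length !== b.length) return 0;
  if (a.length === 0) return 1;
  let total = 0;
  for (let i = 0; i < a.length; i++) {
    total += Math.abs(a[i] - b[i]);
  }
  return 1 - total / (a.length * 255);
};

function compareWithFallback(
  reference: Uint8Array,
  current: Uint8Array,
  options: FallbackOptions,
  scorer: SimilarityScorer
): Outcome {
  // Default path: an exact match never reaches the fallback.
  if (pixelsMatch(reference, current)) return "pass";
  // Fallback disabled: behave exactly as today.
  if (!options.enabled) return "fail";
  // The secondary check runs only because a diff was detected.
  const score = scorer(reference, current);
  return score >= options.similarityThreshold ? options.onInsignificant : "fail";
}
```

With `enabled: false` this reduces to the current behavior, so the fallback stays strictly opt-in.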
Configuration options:
- Enable/disable AI fallback.
- Thresholds for AI-based similarity.
- Option to log warnings or fail softly.
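
As a rough illustration of what these options might look like, here is a sketch of a config shape with defaults and basic validation. The option names (`enabled`, `similarityThreshold`, `onInsignificantDiff`), the helper `resolveAiFallbackConfig`, and the default threshold of 0.98 are all assumptions made for illustration. None of them exist in Loki, and where such options would live in the Loki config is left open.

```typescript
// Hypothetical config shape: Loki has no such options today.
interface AiFallbackConfig {
  // Enable/disable the AI fallback. Off by default so behavior is unchanged.
  enabled: boolean;
  // Similarity score (0..1) at or above which a diff counts as non-significant.
  similarityThreshold: number;
  // "warn": mark the test flaky and log a warning; "pass": pass it silently.
  onInsignificantDiff: "warn" | "pass";
}

// Placeholder defaults; 0.98 is an arbitrary illustrative value.
const DEFAULT_AI_FALLBACK: AiFallbackConfig = {
  enabled: false,
  similarityThreshold: 0.98,
  onInsignificantDiff: "warn",
};

// Merge user-supplied options over the defaults and reject invalid thresholds.
function resolveAiFallbackConfig(
  user: Partial<AiFallbackConfig> = {}
): AiFallbackConfig {
  const merged: AiFallbackConfig = { ...DEFAULT_AI_FALLBACK, ...user };
  const t = merged.similarityThreshold;
  // Written as a negated range check so that NaN is rejected too.
  if (!(t >= 0 && t <= 1)) {
    throw new RangeError(`similarityThreshold must be between 0 and 1, got ${t}`);
  }
  return merged;
}
```

Calling `resolveAiFallbackConfig({ enabled: true })` would turn the fallback on while keeping the other defaults, which keeps the feature opt-in for existing users.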

Benefits

Reduces flaky test failures from minor, non-regressive changes.
Improves CI confidence and developer experience.
Keeps runtime performance high by only triggering AI checks on detected diffs.
Provides flexibility via opt-in config or plugin support.

Looking forward to your thoughts! 🙌

May 25 '25 17:05 Anxhul10