test-infra
Support fuzzy string matching to compare failures
Use Jaro-Winkler string matching to compare failures. This helps when the error message contains randomly generated strings, as in https://github.com/pytorch/pytorch/pull/114697. For example,
jaroWinkler(
"/tmp/pip-install-1ffb916n/fbgemm-gpu_a232bb6f0fa24cea8b498f73f367969c/fbgemm_gpu/src/sparse_ops/sparse_ops_cpu.cpp:129:7: error: ‘optTypeMetaToScalarType’ was not declared in this scope; did you mean ‘c10::optTypeMetaToScalarType’?",
"/tmp/pip-install-g1l1attb/fbgemm-gpu_a8335f2b184946059273dcfd4193adee/fbgemm_gpu/src/sparse_ops/sparse_ops_cpu.cpp:129:7: error: ‘optTypeMetaToScalarType’ was not declared in this scope; did you mean ‘c10::optTypeMetaToScalarType’?"
) returns
0.8987928326805548
So I set the threshold to 0.85 to try it out. A threshold of 1.0 is the same as a === string comparison.
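To make the threshold idea concrete, here is a minimal, self-contained sketch of Jaro-Winkler similarity with a threshold check. It is an illustration only: the PR does not show which implementation backs jaroWinkler, so scores from this sketch may differ slightly from the value quoted above. The helper name isSameFailure and its default threshold are assumptions made for this example.

```typescript
// Jaro similarity: rewards characters shared by both strings within a
// sliding window, and penalises matched characters that are out of order.
function jaro(a: string, b: string): number {
  if (a === b) return 1;
  if (a.length === 0 || b.length === 0) return 0;

  // Two equal characters count as a match only if their positions are
  // at most `range` apart.
  const range = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aMatched: boolean[] = new Array(a.length).fill(false);
  const bMatched: boolean[] = new Array(b.length).fill(false);

  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const lo = Math.max(0, i - range);
    const hi = Math.min(b.length - 1, i + range);
    for (let j = lo; j <= hi; j++) {
      if (!bMatched[j] && a[i] === b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;

  // Walk both matched sequences in order and count positions that disagree.
  let outOfOrder = 0;
  let k = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[k]) k++;
    if (a[i] !== b[k]) outOfOrder++;
    k++;
  }
  const transpositions = outOfOrder / 2;

  return (
    (matches / a.length +
      matches / b.length +
      (matches - transpositions) / matches) /
    3
  );
}

// Jaro-Winkler: boosts the Jaro score for a shared prefix of up to 4 characters.
function jaroWinkler(a: string, b: string, prefixScale = 0.1): number {
  const j = jaro(a, b);
  const maxPrefix = Math.min(4, a.length, b.length);
  let prefix = 0;
  while (prefix < maxPrefix && a[prefix] === b[prefix]) prefix++;
  return j + prefix * prefixScale * (1 - j);
}

// Hypothetical helper: two failure lines are treated as the same failure
// when their similarity reaches the threshold. In this sketch only
// identical strings score 1, so a threshold of 1.0 behaves like ===.
function isSameFailure(a: string, b: string, threshold = 0.85): boolean {
  return jaroWinkler(a, b) >= threshold;
}

// The classic textbook pair; it differs by one transposition.
console.log(jaroWinkler("MARTHA", "MARHTA"));
```

Keeping the threshold as a parameter means the comparison can start strict (1.0) and be relaxed later without touching the call sites.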
Testing
Failures on https://github.com/pytorch/pytorch/pull/114697 are correctly shown as flaky and broken trunk
curl --request POST \
--url "http://localhost:3000/api/drci/drci?prNumber=114697" \
--header "Authorization: TOKEN" \
--data 'repo=pytorch'
Do you mind also checking that the threshold is high enough to prevent similar test names from being marked as the same? I'm also just generally interested in what would be counted as similar based on this.
That's a fair point. Let me find more examples for or against it.
On the other hand, this looks more flexible than the current way we compare failures, so I think I could set the threshold to 1.0 here if we're not entirely sure and tweak this value later.