SmokeDetector
Less fp's
- Excludes StackOverflow, Maths, Mathoverflow and Cross Validated from the "post is mostly images" reason (since StackOverflow can have `<img>` code counted and the other 3 have a lot of MathJax used)
- Adds Cross Validated to the exclusion list for the "mostly punctuation marks in {}" reason (MathJax)
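To make the idea of a per-site exclusion concrete, here is a minimal sketch. It is not SmokeDetector's actual rule code: the function name `mostly_images`, the `EXCLUDED_SITES` set, the site hostnames, the character-ratio heuristic and the `threshold` value are all assumptions made for illustration.

```python
import re

# Hypothetical exclusion list; the hostnames are illustrative assumptions.
EXCLUDED_SITES = {
    "stackoverflow.com",
    "math.stackexchange.com",
    "mathoverflow.net",
    "stats.stackexchange.com",
}

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def mostly_images(body_html: str, site: str, threshold: float = 0.5) -> bool:
    """Return True if <img> tags make up at least `threshold` of the body.

    Posts on excluded sites are never flagged by this check.
    """
    if site in EXCLUDED_SITES or not body_html:
        return False
    # Total number of characters occupied by <img ...> tags.
    img_chars = sum(len(m.group(0)) for m in IMG_TAG.finditer(body_html))
    return img_chars / len(body_html) >= threshold


body = '<p><img src="https://example.com/a.png" alt="x"></p>'
print(mostly_images(body, "superuser.com"))            # True
print(mostly_images(body, "stats.stackexchange.com"))  # False (excluded site)
```

The same post is flagged on one site and skipped on another, which is the trade-off discussed in this PR: the exclusion removes false positives on those sites, but would also hide any true positives there.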
Statistics:
Excluding StackOverflow, Maths, Mathoverflow and Cross Validated from the "post is mostly images" reason will result in:
- 31 fewer fp's
- 0 fewer tp's
The current accuracy of this reason is 17% (17). New accuracy: 40% (40).
Excluding Cross Validated from the "mostly punctuation marks in {}" reason will result in:
- 0 fewer tp's (all tp's caught by other reasons)
- 30 fewer fp's
Note: The failures are not because of my code but because of how the tests are set up.
Edit: Fixed now
Fewer fp's for the mostly-img reason would be great, but I don't think excluding sites is the optimal approach. See https://github.com/Charcoal-SE/SmokeDetector/pull/4190
MathJax would still get caught though
> Excludes StackOverflow, Maths, Mathoverflow and Cross Validated from the "post is mostly images" reason (since StackOverflow can have `<img>` code counted and the other 3 have a lot of MathJax used)

Isn't there a `stripcodeblocks` option that would help on Stack Overflow? As for the math sites... as far as I can tell, MathJax doesn't render as images, it just embeds the text in a `<span class="math-container">` (which is rendered by client-side JS).
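The two points in this comment can be sketched as follows. This is an illustration only, not how SmokeDetector implements `stripcodeblocks`: the helper names `strip_code_blocks` and `count_images` are hypothetical, and the example assumes a body in which code block content appears as literal, unescaped markup.

```python
import re

# Matches <pre>...</pre> and inline <code>...</code> spans (non-greedy).
CODE_BLOCK = re.compile(
    r"<pre\b[^>]*>.*?</pre>|<code\b[^>]*>.*?</code>",
    re.IGNORECASE | re.DOTALL,
)
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def strip_code_blocks(body_html: str) -> str:
    """Remove code blocks so markup quoted inside them is not inspected."""
    return CODE_BLOCK.sub("", body_html)


def count_images(body_html: str, strip_code: bool = True) -> int:
    """Count <img> tags, optionally ignoring those inside code blocks."""
    if strip_code:
        body_html = strip_code_blocks(body_html)
    return len(IMG_TAG.findall(body_html))


# A question that only quotes an <img> tag as code.
so_body = '<p>How do I center this?</p><pre><code><img src="a.png"></code></pre>'
# A math post: the formula sits in a span, not in an <img>.
math_body = '<p>Let <span class="math-container">$x^2$</span> be given.</p>'

print(count_images(so_body, strip_code=False))  # 1
print(count_images(so_body))                    # 0
print(count_images(math_body))                  # 0
```

Under these assumptions, stripping code blocks handles the Stack Overflow case without a site exclusion, and text-based MathJax contributes no images at all. Formulas uploaded as actual images would still be counted.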
I have updated the PR, and it no longer has anything to do with SO. Also, the issue is MathJax posted as images, not actual MathJax.