Change triggered/detected as Blank Diff on some sites when filter availability fluctuates

Open yenba opened this issue 2 years ago • 30 comments

Describe the bug: Occasionally a notification fires saying that there was a "change", but the diff is blank and the files are identical.

Version: v0.39.19.1, running in a Docker container on Ubuntu 22.04.1 LTS Server

To Reproduce: I'm not sure how to reproduce the behavior, as it seems inconsistent.

Share link: https://changedetection.io/share/ym-I7IBLMW4a

Expected behavior: The program should not trigger alerts if there are no changes in the diff.

Screenshots: No changes are detected in this diff comparison; however, it still triggered a change and a notification.

[image]

Here are the actual files compared in VS Code. Same thing, no difference between them.

[image: Screen Shot 2022-09-20 at 2 43 54 PM]

Desktop

  • OS: macOS
  • Browser: Chrome
  • Version: 105.0.5195.125

yenba avatar Sep 20 '22 18:09 yenba

I am experiencing the same issue and have been unable to reproduce it.

freddieleeman avatar Sep 20 '22 18:09 freddieleeman

I've seen it also. I would love to add an ENV flag to save each HTML file that was downloaded, to try to isolate it. I have a feeling it might be something in the encoding changing.

There is this attempt: https://github.com/dgtlmoon/changedetection.io/pull/925

But again, I can't be expected to do all the work here. It would be awesomesauce if someone else would help a bit :(

dgtlmoon avatar Sep 20 '22 20:09 dgtlmoon

Getting the same. I monitor a lot of GitHub tags pages, and suddenly sometimes one, sometimes multiple pages trigger blank changes. Sometimes it triggers blank changes on other sites too. I have no proper way to reproduce it. The only way is to add a lot of my monitors to yours and wait. [image]

"div.Box-row:nth-child(1) > div:nth-child(1) > div:nth-child(1) > div:nth-child(1) > h4:nth-child(1) > a:nth-child(1)"

bykidi avatar Sep 25 '22 15:09 bykidi

@bykidi can you hit the 'share url' button and paste in the link it generates?

dgtlmoon avatar Sep 25 '22 16:09 dgtlmoon

@bykidi btw... that dark mode looks awesome, how did you do it?

dgtlmoon avatar Sep 25 '22 16:09 dgtlmoon

What I've noticed: using a CSS selector picked manually in Firefox, instead of the one offered by the built-in visual selector, reduces or eliminates false detections on GitHub pages. But on some pages changedetection can't find my CSS selectors, which is why I'm forced to use the visual selector result. Examples:

  • WireGuard download page: https://download.wireguard.com/windows-client/ with "body > ul:nth-child(7) > li:nth-child(1) > a:nth-child(1)" results in: [image]
  • K-Lite download pages, full and update packages: https://codecguide.com/download_k-lite_codec_pack_mega.htm and https://codecguide.com/klcp_update.htm with ".tdcontent > h4:nth-child(6)" and ".tdcontent > h4:nth-child(5)" result in: [image]

About those that sometimes trigger false changes: here is a bunch of watches, hope some of them can also trigger on your side.

  • https://changedetection.io/share/GBhNAeP6frca
  • https://changedetection.io/share/dK34ZnckcSka
  • https://changedetection.io/share/er2XApq63hQa
  • https://changedetection.io/share/CBYplonj1Jga
  • https://changedetection.io/share/4M-wjTz9Zswa
  • https://changedetection.io/share/epV2uEgU4QUa
  • https://changedetection.io/share/Uv534DVIV64a
  • https://changedetection.io/share/v-L2Ft3LyZsa
  • https://changedetection.io/share/swmRUdLLGJwa
  • https://changedetection.io/share/RFM-bO2lb0ca
  • https://changedetection.io/share/JYAoQc_nYZIa
  • https://changedetection.io/share/OAbC9G2Y4gka
  • https://changedetection.io/share/XPtjn5ICqvMa
  • https://changedetection.io/share/_MxYVtnU60Ya (this one triggered today)

I use Dark Reader globally on all sites, with exceptions. [image]

bykidi avatar Sep 25 '22 16:09 bykidi

linking this issue with mine #908

bykidi avatar Sep 25 '22 17:09 bykidi

Morning false positives on a bunch of GitHub tags pages: [image] [image] [image]

  • https://changedetection.io/share/W52piCoIwz0a
  • https://changedetection.io/share/95fCdVFLbO8a
  • https://changedetection.io/share/95fCdVFLbO8a
  • https://changedetection.io/share/NPTWII0MvH4a
  • https://changedetection.io/share/NqUybbruJCMa

bykidi avatar Sep 26 '22 05:09 bykidi

@bykidi does it only happen with watches that use chrome? or does it happen for all types of requests?

dgtlmoon avatar Sep 26 '22 06:09 dgtlmoon

@dgtlmoon actually, i have no idea... all of my fetches use the latest version of chrome.

bykidi avatar Sep 26 '22 07:09 bykidi

@bykidi does it only happen with watches that use chrome? or does it happen for all types of requests?

For me, it happens in all types of requests, not just Chrome.

yenba avatar Sep 26 '22 11:09 yenba

I think I know - it's caused when you have a CSS/xPath filter applied, but the filter can not be found, then it is found again on the next check

I'm betting that your watches all have filters

dgtlmoon avatar Sep 27 '22 08:09 dgtlmoon

Ok so I don't know how this is fixable yet, because some people have the scenario that

  • They add a CSS filter for an element that doesn't exist yet, but SHOULD in the future (like .on-sale appearing when a cinema ticket goes on sale)
  • They want changedetection.io to keep checking and notify them when a change/filter was detected

There is a test to make sure this works https://github.com/dgtlmoon/changedetection.io/blob/3ebb2ab9ba593bea346c8ca20364f8690568170b/changedetectionio/tests/test_filter_exist_changes.py#L45

But here on this issue it's like

  • Filter existed for a while
  • Something in the JS or browser didn't work, so the page partly rendered but the filter was missing
  • Page rechecked, filter re-appeared
  • Notification was sent
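
The sequence above can be modelled in a few lines. This is only a sketch of the suspected mechanism, not changedetection.io's actual code: `check`, `state`, and the rule "update the stored checksum on every check, but save a snapshot only when the filter matched some text" are all assumptions made for illustration.

```python
import hashlib


def checksum(text: str) -> str:
    """Checksum of the filtered text (sha256 chosen arbitrarily for the sketch)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check(state: dict, filtered_text: str) -> bool:
    """Hypothetical check step; returns True when a 'change' is flagged.

    Assumption: the checksum is updated on every check, but a snapshot is
    only appended to the history when the filter actually matched some text.
    """
    new_sum = checksum(filtered_text)
    changed = state["last_checksum"] is not None and new_sum != state["last_checksum"]
    state["last_checksum"] = new_sum
    if filtered_text:
        state["history"].append(filtered_text)
    return changed


state = {"last_checksum": None, "history": []}
results = [
    check(state, "v1.2.3"),  # filter found, first check
    check(state, ""),        # page partly rendered, filter missing
    check(state, "v1.2.3"),  # filter found again
]

# The last check is flagged as a change, yet the two newest snapshots are
# identical, so a diff between them would be blank.
blank_diff = state["history"][-1] == state["history"][-2]
```

Under these assumptions the third check is flagged even though nothing on the page changed, which would match the blank diffs reported here.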

dgtlmoon avatar Sep 27 '22 08:09 dgtlmoon

I've got filters on everything because sometimes there are a lot of changes that won't fit into Telegram's message character limit, which is why I check for versions (mostly software) and then add the full diff link to the notification.

bykidi avatar Sep 27 '22 10:09 bykidi

Ok so I don't know how this is fixable yet, because some people have the scenario that

  • They add a CSS filter for an element that doesn't exist yet, but SHOULD in the future (like .on-sale appearing when a cinema ticket goes on sale)
  • They want changedetection.io to keep checking and notify them when a change/filter was detected

There is a test to make sure this works https://github.com/dgtlmoon/changedetection.io/blob/3ebb2ab9ba593bea346c8ca20364f8690568170b/changedetectionio/tests/test_filter_exist_changes.py#L45

But here on this issue it's like

  • Filter existed for a while
  • Something in the JS or browser didn't work, so the page partly rendered but the filter was missing
  • Page rechecked, filter re-appeared
  • Notification was sent

Hmm. That does make sense. Thanks for taking a look at it!

Maybe instead of "fixing" it, there could be some kind of a workaround. Something like an option to not send notifications if the {diff} field is blank?

At least in my case that would solve the blank notifications!
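
A minimal sketch of that workaround, assuming the two most recent snapshots are available as plain strings. `should_notify` is a hypothetical helper, not an existing changedetection.io option; it only relies on the standard library's `difflib`.

```python
import difflib


def should_notify(previous: str, current: str) -> bool:
    """Return False when the diff between two snapshots is empty."""
    diff_lines = list(
        difflib.unified_diff(
            previous.splitlines(),
            current.splitlines(),
            lineterm="",
        )
    )
    # unified_diff yields no lines at all when both inputs are identical.
    return len(diff_lines) > 0
```

A check like this suppresses the blank notifications, but it would also hide the case where the filter went missing and came back, which is the objection raised further down the thread.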

yenba avatar Sep 27 '22 14:09 yenba

Can confirm. Early this morning the techpowerup site was down and my instance triggered a 'filters not found 6 times' notification. Later it triggered blank changes on all of the watches that had previously been 'not found'. I think we need an 'only monitor for actual changes (ignore not found/found again)' option by default, and those who monitor 'out of stock/back in stock' should specially use that option.

bykidi avatar Oct 01 '22 10:10 bykidi

[image] [image]

bykidi avatar Oct 01 '22 10:10 bykidi

I've been seeing this as well for the past few weeks. It will usually trigger multiple sites and push notifications even though there is no change/diff

adamrgolf avatar Oct 02 '22 03:10 adamrgolf

I was thinking of a smarter way to deal with this: maybe store a ratio between 0.0 and 1.0 that reflects how many of the last 10(?) attempts succeeded.

If the success ratio < 0.5, then we can send some alert/notification such as "Looks like the filter is sometimes not available and may be sending false alerts, would you like to limit this watch to (insert solution here)"
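
One way the ratio idea could look, as a rough sketch. `FilterSuccessTracker` is a hypothetical class; the window of 10 and the 0.5 threshold come from the comment above, and "success" is assumed here to mean "the filter matched on that check".

```python
from collections import deque


class FilterSuccessTracker:
    """Hypothetical helper: tracks whether the filter matched on recent checks."""

    def __init__(self, window: int = 10, threshold: float = 0.5) -> None:
        # deque with maxlen drops the oldest outcome automatically.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, filter_found: bool) -> None:
        self.outcomes.append(filter_found)

    @property
    def ratio(self) -> float:
        if not self.outcomes:
            return 1.0  # no data yet, assume the filter is healthy
        return sum(self.outcomes) / len(self.outcomes)

    def is_flaky(self) -> bool:
        """True when the filter matched on fewer than `threshold` of recent checks."""
        return self.ratio < self.threshold


tracker = FilterSuccessTracker()
for found in [True, False, False, True, False, False, False, True, False, False]:
    tracker.record(found)
```

With three matches out of ten checks the tracker reports the watch as flaky, which is the point where the suggested warning could be shown.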

dgtlmoon avatar Oct 04 '22 09:10 dgtlmoon

@bykidi

Can confirm. Early this morning the techpowerup site was down and my instance triggered a 'filters not found 6 times' notification. Later it triggered blank changes on all of the watches that had previously been 'not found'.

Thanks for that - that was exactly what I was thinking was happening

dgtlmoon avatar Oct 04 '22 09:10 dgtlmoon

It is happening right now. [image] I have paused those watches. Weird stuff happens: I get a red notification that my specific filter is not found, but I can see that the page was rendered properly (proper page screenshot on the preview tab).

bykidi avatar Oct 04 '22 13:10 bykidi

There is indeed a change. Previously, I used this filter:

"div.Box-row:nth-child(1) > div:nth-child(1) > div:nth-child(1) > div:nth-child(1) > h4:nth-child(1) > a:nth-child(1)"

But when there is 'no filter', this exact field should be filtered like this:

"div.Box-row:nth-child(1) > div:nth-child(1) > div:nth-child(1) > div:nth-child(1) > div:nth-child(1) > h2:nth-child(1) > a:nth-child(1)"

bykidi avatar Oct 04 '22 14:10 bykidi

After testing with the browserless version specified in the installation guide, the problem is the same. I sometimes (like right now) still get false positives, mostly on GitHub. [image]

bykidi avatar Nov 07 '22 18:11 bykidi

Imo an 'ignore blank diff' option would work fine

NaruZosa avatar Nov 11 '22 06:11 NaruZosa

what we probably need is a 'stock/out of stock' mode which should ignore that and only monitor for actual changes...

bykidi avatar Nov 11 '22 13:11 bykidi

@bykidi so what's the solution in the case where the CSS filter doesn't exist any more? Say you set the CSS filter to point to something like .current-price, then .current-price disappears because the item is sold out, then it comes back again. You would never know.

there is no easy answer

@NaruZosa but that would ignore when the filter was missing

dgtlmoon avatar Nov 11 '22 20:11 dgtlmoon

I mean, for those kinds of monitors there could be an additional checkbox. If it's enabled, changedetection acts like it does right now. If it isn't, it should ignore filters that go missing and are then found again, and only notify if the filter is missing for more than %filter_failure_notification_threshold_attempts% times.
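
The checkbox proposal could be sketched roughly as follows. Everything here is hypothetical: `WatchState`, `evaluate`, and the `treat_reappearance_as_change` flag stand in for the proposed checkbox, and the default `miss_threshold` of 6 is only borrowed from the 'filters not found 6 times' message mentioned earlier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WatchState:
    last_text: Optional[str] = None
    consecutive_misses: int = 0


def evaluate(
    state: WatchState,
    filtered_text: Optional[str],
    treat_reappearance_as_change: bool = False,
    miss_threshold: int = 6,
) -> Optional[str]:
    """Return 'changed', 'filter-missing', or None (no notification).

    filtered_text is None when the filter matched nothing on this check.
    """
    if filtered_text is None:
        state.consecutive_misses += 1
        # Notify once, when the filter has been missing more than the threshold.
        if state.consecutive_misses == miss_threshold + 1:
            return "filter-missing"
        return None

    reappeared = state.consecutive_misses > 0
    state.consecutive_misses = 0
    changed = state.last_text is not None and filtered_text != state.last_text
    state.last_text = filtered_text
    if changed or (reappeared and treat_reappearance_as_change):
        return "changed"
    return None


# Checkbox off: a missing-then-found filter with identical content stays quiet.
quiet = WatchState()
quiet_events = [evaluate(quiet, t) for t in ["v1", None, "v1", "v2"]]

# Checkbox on: the reappearance itself counts as a change.
stock = WatchState()
stock_events = [
    evaluate(stock, t, treat_reappearance_as_change=True) for t in ["v1", None, "v1"]
]
```

The comparison is always made against the last text seen while the filter was present, so a flapping filter with unchanged content produces no 'changed' event unless the flag is set.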

bykidi avatar Nov 12 '22 10:11 bykidi

@bykidi yeah, almost. I think we're coming up with solutions based on assumptions though. The solution should be:

  • First and foremost, add logging to the checks; we shouldn't work on this without that ( #845 )
  • Second, if the filter was not found, send a link: 'hey, the filter wasn't found, do you want me to still consider this a change, or only notify you when the filter IS present AND the content is different?'
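
As an illustration of the first point, a per-check log line might record whether the filter was found and whether the checksum moved. This is a sketch using the standard `logging` module; `log_check` and the logger name `watch_checks` are hypothetical and not taken from the changedetection.io code base or from #845.

```python
import hashlib
import logging
from typing import Optional

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("watch_checks")


def log_check(
    watch_id: str,
    filtered_text: Optional[str],
    previous_checksum: Optional[str],
) -> Optional[str]:
    """Log one check and return the checksum to remember for the next one.

    filtered_text is None when the CSS/xPath filter matched nothing.
    """
    if filtered_text is None:
        logger.warning(
            "watch=%s filter_found=False previous=%s", watch_id, previous_checksum
        )
        return previous_checksum  # keep comparing against the last good content

    data = filtered_text.encode("utf-8")
    checksum = hashlib.sha256(data).hexdigest()
    logger.info(
        "watch=%s filter_found=True bytes=%d checksum=%s changed=%s",
        watch_id,
        len(data),
        checksum[:12],
        previous_checksum is not None and checksum != previous_checksum,
    )
    return checksum
```

A log like this would show directly whether a blank-diff notification was preceded by a check where the filter was not found.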

dgtlmoon avatar Nov 12 '22 10:11 dgtlmoon

Imagine if we had had that log available from the very start of this issue; we could have seen what was going on.

dgtlmoon avatar Nov 12 '22 10:11 dgtlmoon

btw https://github.com/dgtlmoon/changedetection.io/blob/f86763dc7a27ca71bf432da3ec31a827f35b1648/changedetectionio/tests/test_filter_exist_changes.py#L44

dgtlmoon avatar Dec 06 '22 11:12 dgtlmoon