"webpage fetching error for url"

Open Kamran12646 opened this issue 1 year ago • 27 comments

Hello dear community, Hello dear vinc3po, Since 5 p.m. I've been getting the following error with the ebAlert bot: "webpage fetching error for url:" Has the HTML structure of Kleinanzeigen really been changed again in this short time or is it due to my Raspberry Pi 4?

Kamran12646 avatar Nov 20 '23 20:11 Kamran12646

Same for me @Win10 since today.

makedamnsure avatar Nov 21 '23 00:11 makedamnsure

Same issue here :) running locally on windows, so it shouldn't be your raspberry pi.

Pelicana avatar Nov 21 '23 20:11 Pelicana

Did anyone find a solution yet?

Kamran12646 avatar Nov 22 '23 14:11 Kamran12646

nope, would love to know too.

Pelicana avatar Nov 22 '23 20:11 Pelicana

Same issue here @ Intel NUC (Proxmox)

DanielZ3108 avatar Nov 24 '23 20:11 DanielZ3108

same for me on MacOS..

henengel avatar Nov 25 '23 20:11 henengel

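Here is the hidden message: 'In deinem IP-Bereich kam es vor Kurzem mehrfach zu unsicheren Versuchen, unsere Plattform zu verwenden. Dies kann auch durch andere Personen erfolgt sein. Daher wurde dieser IP-Bereich zur Vorbeugung von Betrug zeitweilig von der Nutzung von Kleinanzeigen ausgeschlossen. Bitte versuche es später erneut.' In English: 'There have recently been multiple insecure attempts to use our platform from your IP range. These may also have been made by other people. To prevent fraud, this IP range has therefore been temporarily excluded from using Kleinanzeigen. Please try again later.' It seems that they are implementing some sort of anti-scalping security. I'll investigate whether this can be bypassed somehow.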

I'll keep you updated
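For anyone who wants to tell this block page apart from an ordinary parsing failure, here is a minimal sketch. It assumes the block arrives as HTTP 403 (as reported further down in this thread) or contains the word "IP-Bereich" from the German notice above. The helper name `looks_blocked` is hypothetical and not part of ebAlert.

```python
# Marker taken from the German block notice quoted above (an assumption:
# the real page may change its wording at any time).
BLOCK_MARKER = "IP-Bereich"


def looks_blocked(status_code: int, body: str) -> bool:
    """Return True if a response looks like the IP-range block page."""
    return status_code == 403 or BLOCK_MARKER in body
```

A caller could log "blocked by Kleinanzeigen" instead of the generic "webpage fetching error" when this returns True.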

vinc3PO avatar Nov 26 '23 14:11 vinc3PO

Any news?

Araibona avatar Dec 11 '23 13:12 Araibona

Unfortunately nothing yet.

max49944 avatar Dec 11 '23 15:12 max49944

I've tried changing the User-Agent header, but unfortunately that wasn't a solution.

alafad avatar Dec 29 '23 12:12 alafad

I don't think there will be a solution anymore, unfortunately I think it's been put on ice. Unfortunately no one cares about it

max49944 avatar Dec 30 '23 14:12 max49944

The best advice is to fetch less frequently and at random intervals.
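A minimal sketch of what fetching less often and at random intervals could look like in Python. The function names, the 10-minute base interval and the 50% jitter are assumptions for illustration, not ebAlert settings, and nothing here guarantees that the block is avoided.

```python
import random
import time
from typing import Callable


def next_delay(base_seconds: float, jitter: float = 0.5) -> float:
    """Scale base_seconds by a random factor in [1 - jitter, 1 + jitter]."""
    return base_seconds * random.uniform(1.0 - jitter, 1.0 + jitter)


def poll(check: Callable[[], None], base_seconds: float = 600.0, rounds: int = 3,
         sleep: Callable[[float], None] = time.sleep) -> None:
    """Call check() `rounds` times with a randomized pause between calls."""
    for i in range(rounds):
        check()
        if i < rounds - 1:  # no need to wait after the last round
            sleep(next_delay(base_seconds))
```

The `sleep` parameter is only there so the loop can be tried out without actually waiting.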

workinghard avatar Dec 30 '23 14:12 workinghard

I don't think there will be a solution anymore, unfortunately I think it's been put on ice. Unfortunately no one cares about it

I don't think it is really a bug. It should work again with other headers and, in case of high-frequency usage, with proxied requests.

alafad avatar Dec 30 '23 17:12 alafad

Indeed, the problem is not the header. If you try from the same IP address with similar headers in Postman, it goes through. However, with Python requests it does not. I have an old version that still works, but the new ones do not. I tried downgrading to the version that works, without success. I have been too busy to fully investigate the error. My next attempt is to try without the requests library, as I believe that might be the problem: eBay recognises it as Python and assumes that it is a bot.

To be continued...
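For reference, a minimal sketch of fetching a page without the requests library, using only Python's standard `urllib.request`. The header values are hypothetical, and nothing in this thread confirms that this gets past the detection; it only shows what such an attempt could look like.

```python
import urllib.request
from typing import Optional

# Hypothetical header values, chosen for illustration only.
DEFAULT_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "Accept-Language": "de-DE,de;q=0.9",
}


def build_request(url: str, headers: Optional[dict] = None) -> urllib.request.Request:
    """Build a GET request carrying browser-like headers."""
    return urllib.request.Request(url, headers=dict(headers or DEFAULT_HEADERS))


def fetch(url: str, timeout: float = 10.0) -> str:
    """Return the page body; urlopen raises urllib.error.HTTPError on e.g. 403."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```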

vinc3PO avatar Dec 30 '23 18:12 vinc3PO

Hey man, did you find any solution to bypass the detection? In the past I made my own monitor, which worked. I wanted to reactivate it today, but I only get a 403 and the same message.

I personally use Discord for notifications and a simple proxy function to monitor multiple URLs at once. Maybe these would be good features for yours too.

Would share my code but it is really shitty because I’m not a good coder 😄

svenisda avatar Feb 01 '24 22:02 svenisda

I would also appreciate a solution, is there any way to help?

Zippochonda avatar Feb 02 '24 09:02 Zippochonda

I would also appreciate a solution, is there any way to help?

I will try other scraping methods instead of bs4 today. If I find one, I will reply. Maybe we can all connect and build a super Kleinanzeigen monitor 😁

svenisda avatar Feb 02 '24 09:02 svenisda

Good news everyone,

it seems that if you update to the latest requests and urllib3, it works: requests 2.31.0 and urllib3 2.2.0.

Not sure how long it will take eBay to shut it down again, but for the time being it should work.

pip install --upgrade requests urllib3

Let me know if that works for you as well.
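To check whether an environment actually picked up those versions, a small sketch along these lines might help. The helper names are hypothetical, and the version comparison is deliberately simple (numeric dotted versions only).

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions as reported in this comment.
MINIMUMS = {"requests": "2.31.0", "urllib3": "2.2.0"}


def as_tuple(v: str) -> tuple:
    """Turn '2.31.0' into (2, 31, 0); non-numeric parts are ignored."""
    return tuple(int(part) for part in v.split(".") if part.isdigit())


def is_at_least(installed: str, minimum: str) -> bool:
    """Compare two dotted version strings numerically."""
    return as_tuple(installed) >= as_tuple(minimum)


def report() -> dict:
    """Map each package to True/False, or None if it is not installed."""
    result = {}
    for name, minimum in MINIMUMS.items():
        try:
            result[name] = is_at_least(version(name), minimum)
        except PackageNotFoundError:
            result[name] = None
    return result
```

Running `print(report())` with the same interpreter that runs ebAlert (for example the one behind `sudo python`) shows whether the upgrade reached that environment.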

vinc3PO avatar Feb 02 '24 18:02 vinc3PO

For me it is still not working after the update of requests and urllib3. I was also experimenting with different headers, but no solution so far.

beleza-pura avatar Feb 03 '24 08:02 beleza-pura

I've been investigating a little. Looks like they introduced Akamai Bot Protection. Mainly there are two things keeping the bot from working properly:

⚠️ FingerprintJS detected: 
https://static.kleinanzeigen.de/static/js/top.yt20r2l2bahn.js

⚠️ Akamai detected: 
https://www.kleinanzeigen.de/akam/13/435259e7

You can find more information about this bot protection here. My quick workaround was to use the trial version of the ZenRows API service to bypass the anti-bot protection. But since I will have to pay for it in the future, I'll try to figure something out on my own.
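A rough sketch of how one could look for the Akamai script in a fetched page using only the standard library. It assumes that a script URL containing "/akam/" (as in the URL above) is a usable marker; that is a guess based on this single example, and the names below are hypothetical.

```python
from html.parser import HTMLParser


class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every <script> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def find_akamai_scripts(html: str) -> list:
    """Return script URLs containing '/akam/' (assumed marker, see above)."""
    collector = ScriptSrcCollector()
    collector.feed(html)
    return [src for src in collector.sources if "/akam/" in src]
```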

beleza-pura avatar Feb 03 '24 12:02 beleza-pura

I found a very good workaround using Playwright and Chromium. I can use it as a headless browser, so it works in the background and you don't have to pay for any Akamai-solving service.

I have some contacts who sell Akamai, px and other APIs, but why pay when Playwright works 😊
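The code was not posted here, so the following is only a minimal sketch of how a headless Chromium fetch with Playwright's sync API can look, not the actual monitor. It assumes `pip install playwright` and `playwright install chromium` have been run; the function name and the timeout are hypothetical.

```python
def fetch_html(url: str, timeout_ms: int = 30000) -> str:
    """Load url in headless Chromium and return the rendered HTML."""
    # Imported lazily so this module still loads where Playwright is missing.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()
```

The returned HTML can then be handed to whatever parser the bot already uses, for example bs4.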

svenisda avatar Feb 03 '24 12:02 svenisda

it seems that if you update the latest requests and urllib it would work.

For me this fixes the problem on my work machine (Win10 Home) but unfortunately not on my thin client (Win10 IoT LTSC 21H2).

makedamnsure avatar Feb 03 '24 16:02 makedamnsure

It doesn't work for me (RPi3). I used pip install --upgrade requests urllib3 to update, but I still get the web fetching error:

 sudo python -m ebAlert links -a https://www.kleinanzeigen.de/s-dreibaum/k0
>> Adding url
<< webpage fetching error for url: https://www.kleinanzeigen.de/s-dreibaum/k0
<< Link and post added to the database

Zippochonda avatar Feb 06 '24 14:02 Zippochonda

It worked with this command on a Proxmox Container:

It doesn't work for me (RPi3), I used pip install --upgrade requests urllib3 to update. But i get still the webfetching error

Bot is currently working again

Thanks!

DanielZ3108 avatar Feb 06 '24 15:02 DanielZ3108

Bad News Everyone!!!

As noted by @beleza-pura it seems that ebay is starting a fight against bots.

In this case, the problem is not the Akamai bot protection itself, but that the requests library can't perform those JavaScript challenges. However, this means they are actively trying to stop us from using bots. It seems that they have invested money and have tools that analyse traffic, learn from it and block what they suspect is a bot. That means it is only a matter of time before the newly updated library gets blacklisted and blocked again.

If you start using Selenium or another library that drives a virtual browser, the Akamai bot protection will start learning your bot's behaviour and eventually block it.

What does it mean?

It means that this simple bot will soon be archived. To counter the Akamai bot protection, a much larger project must be undertaken in which the browser activity is randomized to act like a human.

Thank you all!

vinc3PO avatar Feb 08 '24 07:02 vinc3PO

This currently works. But as you mentioned it might be countered in the future. https://github.com/vinc3PO/ebayKleinanzeigenAlert/pull/41

yamanatoo avatar Feb 08 '24 10:02 yamanatoo

Since yesterday at 12:20 I have been getting the "webpage fetching error for url" error. I tried #41 but it doesn't work. @yamanatoo, does your fix still work for you?

This is the error message:

Starting Ebay alert
Processing link - id: 1 - link: https://www.kleinanzeigen.de/s-test/k0
2024-06-08 21:07:10,903 - get_session in ebAlert.crud.base - ERROR - Message: session not created: Chrome failed to start: exited normally.
(session not created: DevToolsActivePort file doesn't exist)
(The process started from chrome location /snap/chromium/2873/usr/lib/chromium-browser/chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Stacktrace: #0 0x55f7bce8e63a #1 0x55f7bcb8f65c #2 0x55f7bcbc3c95 #3 0x55f7bcbbff8f #4 0x55f7bcc099a4 #5 0x55f7bcbfd313 #6 0x55f7bcbcd586 #7 0x55f7bcbcdefe #8 0x55f7bce57b7f #9 0x55f7bce5bd0a #10 0x55f7bce459dc #11 0x55f7bce5c491 #12 0x55f7bce2b7ee #13 0x55f7bce7dc28 #14 0x55f7bce7de36 #15 0x55f7bce8d6f1 #16 0x7f174c46eac3

ERROR:ebAlert.crud.base:Message: session not created: Chrome failed to start: exited normally.
(session not created: DevToolsActivePort file doesn't exist)
(The process started from chrome location /snap/chromium/2873/usr/lib/chromium-browser/chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Stacktrace: #0 0x55f7bce8e63a #1 0x55f7bcb8f65c #2 0x55f7bcbc3c95 #3 0x55f7bcbbff8f #4 0x55f7bcc099a4 #5 0x55f7bcbfd313 #6 0x55f7bcbcd586 #7 0x55f7bcbcdefe #8 0x55f7bce57b7f #9 0x55f7bce5bd0a #10 0x55f7bce459dc #11 0x55f7bce5c491 #12 0x55f7bce2b7ee #13 0x55f7bce7dc28 #14 0x55f7bce7de36 #15 0x55f7bce8d6f1 #16 0x7f174c46eac3

<< Ebay alert finished

tchleb avatar Jun 08 '24 19:06 tchleb