LeBonCoin failed with error 403

Open buloto24 opened this issue 4 years ago • 29 comments

Error message: The requested resource cannot be found! Please make sure your input parameters are correct!
cUrl error: (0)
PHP error:
Query string: action=display&bridge=LeBonCoin&keywords=nintendo+sitch&region=13&department=&cities=&category=&pricemin=&pricemax=&estate=&roomsmin=&roomsmax=&squaremin=&squaremax=&mileagemin=&mileagemax=&yearmin=&yearmax=&cubiccapacitymin=&cubiccapacitymax=&fuel=&owner=&format=Html
Version: git.master.2714c3d
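
For anyone comparing reports in this thread, here is a minimal Python sketch that lists only the parameters actually set in a query string like the one above. The helper name `non_empty_params` is made up for illustration and is not part of RSS-Bridge.

```python
from urllib.parse import parse_qsl


def non_empty_params(query_string: str) -> dict:
    """Return only the query parameters that carry a value.

    parse_qsl drops blank values by default and decodes '+' as a space.
    """
    return dict(parse_qsl(query_string))


if __name__ == "__main__":
    qs = ("action=display&bridge=LeBonCoin&keywords=nintendo+sitch"
          "&region=13&department=&cities=&format=Html")
    for name, value in non_empty_params(qs).items():
        print(f"{name} = {value}")
```

This makes it easier to see that only `keywords` and `region` were filled in for this report.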

buloto24 avatar Oct 25 '20 15:10 buloto24

Quick investigation: region has changed to locations. I tried updating the request (and also a minimal request), but the refusal seems to come from somewhere else.

JackNUMBER avatar Oct 26 '20 19:10 JackNUMBER

Hi,

I have been getting a 403 for the past 2 days, although it worked great before.

[Mon Mar 01 11:58:32.768278 2021] [proxy_fcgi:error] [pid 495:tid <redacted>] [client <redacted>] AH01071: Got error 'PHP message: Exception: Invalid parameters value(s): keywords_, estat_e, y_earmax in /<redacted>/rss-bridge/lib/error.php:24\nStack trace:\n#0 /<redacted>/rss-bridge/lib/error.php(33): returnError('Invalid paramet...', 400)\n#1 /<redacted>/rss-bridge/lib/BridgeAbstract.php(229): returnClientError('Invalid paramet...')\n#2 /<redacted>/rss-bridge/actions/DisplayAction.php(133): BridgeAbstract->setDatas(Array)\n#3 /<redacted>/rss-bridge/index.php(38): DisplayAction->execute()\n#4 {main}'

Thanks

floviolleau avatar Mar 01 '21 11:03 floviolleau

@floviolleau

I suddenly got a 403 too. Changing the user agent (around line 362) works for me, but after about 10 calls it stops working again.

timat35 avatar Apr 05 '21 08:04 timat35

Ping @JackNUMBER as maintainer of this bridge.

em92 avatar Apr 05 '21 09:04 em92

@em92 I haven't managed to fix the 403 so far. @timat35 can you provide a PR?

JackNUMBER avatar Apr 05 '21 18:04 JackNUMBER

Is there a fix for this great bridge?

As for me I also get a 403 error with a different message:

Error message: `Unexpected response from upstream.
cUrl error:  (0)
PHP error: Creating default object from empty value`
Query string: `action=display&bridge=LeBonCoin&keywords=thinkpad&region=2&department=&cities=&category=&pricemin=&pricemax=1400&estate=&roomsmin=&roomsmax=&squaremin=&squaremax=&mileagemin=&mileagemax=&yearmin=&yearmax=&cubiccapacitymin=&cubiccapacitymax=&fuel=&owner=&format=Atom`
Version: `dev.2020-11-10`

hista avatar Apr 14 '21 15:04 hista

@hista Have you tried changing the user agent in the bridge? It is currently working for me.

timat35 avatar Apr 14 '21 15:04 timat35

Hi @timat35, I thought the fix didn't last long, because of what you wrote earlier:

> Changing the user agents (~ line 362 ) works for me.. after 10 call, not working anymore for me..

Does it mean it lasts longer than 10 calls now?

My current user agent in the bridge is `User-Agent: LBC;Android;10;SAMSUNG;phone;0aaaaaaaaaaaaaaa;wifi;8.24.3.8;152437;0`. How should I change it?

hista avatar Apr 14 '21 16:04 hista

Well, I just deleted some of the `a` characters, like this: `LBC;Android;10;SAMSUNG;phone;0aaaaaa;wifi;8.24.3.8;152437;0`

DataDome monitors the user agent, so we need a different one (this is my theory; I'm not sure why it works here, though).

timat35 avatar Apr 14 '21 20:04 timat35

Thanks @timat35, it works for now. I hope it lasts; this LeBonCoin bridge is awesome when it works :-)

hista avatar Apr 14 '21 20:04 hista

Hi guys, is the LeBonCoin bridge still working perfectly for you? Since yesterday, I sometimes get the same 403 error I mentioned earlier (https://github.com/RSS-Bridge/rss-bridge/issues/1820#issuecomment-819601471), and my alerts are now very slow, arriving long after the CACHE_TIMEOUT setting.

hista avatar May 20 '21 14:05 hista

The 403 mainly comes from LBC's bot protection. My server's IP has been blocked and I have been getting 403s for months. When I change the user agent, I get 2-3 requests with a 200 before it goes back to 403.

JackNUMBER avatar May 20 '21 23:05 JackNUMBER

Do we have a way to fix that, such as using the Googlebot user agent or proxying through Tor?

lapineige avatar Aug 25 '21 07:08 lapineige

Hello, I have a way to fix the 403. It's not free, though; you can find my contact info on my profile.

pointpaul avatar Aug 25 '21 13:08 pointpaul

Oh nice, we've got scammers now :thinking: (I was sure you would be a bot :smile:)

lapineige avatar Aug 25 '21 13:08 lapineige

Scraping all Leboncoin daily but scammer yes, lol

pointpaul avatar Aug 25 '21 13:08 pointpaul

I hope deploying a Docker image in the cloud will solve this for me. It's more complex than uploading files to a server, but it can be automated too.

JackNUMBER avatar Aug 25 '21 14:08 JackNUMBER

@pointpaul vade retro to https://www.growthhacking.fr/ ... The idea of this repo is to SHARE the code, not sell...

timat35 avatar Aug 25 '21 14:08 timat35

GL sharing datadome bypass for free then!

pointpaul avatar Aug 25 '21 14:08 pointpaul

> My server's IP has been blocked and I have been getting 403s for months. When I change the user agent, I get 2-3 requests with a 200 before it goes back to 403.

@JackNUMBER, have you tried making a list of user agents and picking one at random for each request? Here is a list of old user agents (created 6 years ago): https://gist.github.com/pzb/b4b6f57144aea7827ae4

em92 avatar Sep 11 '21 08:09 em92

> @JackNUMBER, have you tried making a list of user agents and picking one at random for each request? Here is a list of old user agents (created 6 years ago): https://gist.github.com/pzb/b4b6f57144aea7827ae4

@em92 I just tried it again and still get a 403 every time. EDIT: same on another server.

JackNUMBER avatar Sep 11 '21 20:09 JackNUMBER

@JackNUMBER Did you try to bypass the bot detection with https://github.com/MoterHaker/bypass-captcha-examples/blob/main/geo.captcha-delivery.com.js ?

hista avatar Sep 11 '21 22:09 hista

@hista I'm quite sure there is a way to bypass DataDome without exploiting poor people.

timat35 avatar Sep 11 '21 22:09 timat35

> @JackNUMBER Did you try to bypass the bot detection with https://github.com/MoterHaker/bypass-captcha-examples/blob/main/geo.captcha-delivery.com.js ?

Thank you @hista. RSS-Bridge is a PHP project.

JackNUMBER avatar Sep 11 '21 22:09 JackNUMBER

Hey, I am trying to understand how DataDome blocks requests.

Here is what I have so far. When browsing leboncoin in a browser, several requests are made. Among them, I've looked more closely at the requests made to:

  • dd.leboncoin.fr (dd stands for datadome)
  • api.leboncoin.fr

The request to dd.leboncoin.fr looks like this:

POST /js/ HTTP/2
Host: dd.leboncoin.fr
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://www.leboncoin.fr/
Content-Type: application/x-www-form-urlencoded
Content-Length: 3972
Origin: https://www.leboncoin.fr
Dnt: 1
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-site
Sec-Gpc: 1

jsData=<bigpayload>&events=<some events made the user like mouse move, key up, each event has a timestamp...>&eventCounters=<count of the number of events by type of event>

So it looks like this request sends DataDome information about how the user interacts with the website (mouse movements, etc.). The response to that request is:

{"status":200,"cookie":"datadome=mWVbdmoClFIyL2o2GK-ezox3P47-smtjN19A4ricR5tuHe~PhrnNjgilN_4y2dqd1bB-TYCkvoSyaH3U4ksZ8s_.uPSdVKyzef2xhjzbqNvVMR7bd6OnJgNqp_ZDGBx; Max-Age=31536000; Domain=.leboncoin.fr; Path=/; Secure; SameSite=Lax"}

We see that DataDome gave us a cookie. I wonder if it uses that cookie to follow us on the website, to keep track of real users and block the others.

If I remove the payload (jsData & co), the response is status 400, without a cookie.

The other request is the one that queries the API:
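
To make the shape of that response easier to inspect, here is a small Python sketch that pulls the cookie string out of the JSON body and splits it into its name, value and attributes. It is only an illustration: `parse_dd_response` is a hypothetical helper name, and the cookie value in the example is a shortened placeholder, not a real one.

```python
import json


def parse_dd_response(body: str) -> dict:
    """Split the 'cookie' field of the JSON response into its parts."""
    data = json.loads(body)
    parts = [p.strip() for p in data["cookie"].split(";") if p.strip()]
    # The first segment is name=value; the remaining ones are attributes.
    name, _, value = parts[0].partition("=")
    attributes = {}
    for part in parts[1:]:
        key, sep, val = part.partition("=")
        # Flag attributes such as 'Secure' have no '=' and are stored as True.
        attributes[key.lower()] = val if sep else True
    return {"status": data.get("status"), "name": name,
            "value": value, "attributes": attributes}


if __name__ == "__main__":
    example = ('{"status":200,"cookie":"datadome=PLACEHOLDER~value; '
               'Max-Age=31536000; Domain=.leboncoin.fr; Path=/; Secure; SameSite=Lax"}')
    info = parse_dd_response(example)
    # 31536000 seconds is 365 days.
    print(info["name"], int(info["attributes"]["max-age"]) // 86400, "days")
```

Read this way, the cookie is set for the whole `.leboncoin.fr` domain with a `Max-Age` of one year, so it would be sent along with requests to `api.leboncoin.fr` as well.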

POST /finder/search HTTP/2
Host: api.leboncoin.fr
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://www.leboncoin.fr/
Api_key: ba0c2dad52b3ec
Content-Type: application/json
Origin: https://www.leboncoin.fr
Content-Length: 198
Dnt: 1
Cookie: utag_main=_st:1648151365624$v_id:017fbd5e72ba005272a3c812e4e400044001900900bd0$_sn:1$_ss:1$_pn:1%3Bexp-session$ses_id:1648149557946%3Bexp-session; __Secure-InstanceId=e6f84352-df87-4cc3-8cd5-0f93410a3373;include_in_experiment=false
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-site
Sec-Gpc: 1

{"owner_type":"all","limit":35,"limit_alu":3,"sort_by":"relevance","sort_order":"desc","filters":{"enums":{"ad_type":["offer"]},"keywords":{"text":"ya"}},"listing_source":"direct-search","offset":0}

There you have the search filters as the payload. From here, I'm not sure what's happening. I'm not sure how the API decides whether I'm blocked or not, and I can't really experiment because the request always works for now. My guess is that setting the cookie datadome=blabla may help to avoid being blocked. Or maybe you have to send good enough user interaction data through the request to dd.leboncoin.fr to make your IP "valid" for a certain amount of time.

As DataDome seems to be based on AI, the behavior may evolve, and there are potentially multiple ways to bypass it.

If someone wants to give it a try, I've made a little Python script to try these requests (https://github.com/Skealz/reqlbc/blob/main/req_lbc.py). It would be great if people who are blocked could try it and report whether it works.
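
As a rough sketch of how that search payload is put together, the Python below rebuilds the same JSON body from a keyword. The function names are made up for illustration, and the field names and values are simply copied from the captured request above; nothing here is documented API.

```python
import json


def build_search_payload(keywords: str, limit: int = 35, offset: int = 0) -> dict:
    """Rebuild the JSON body seen in the captured /finder/search request."""
    return {
        "owner_type": "all",
        "limit": limit,
        "limit_alu": 3,
        "sort_by": "relevance",
        "sort_order": "desc",
        "filters": {
            "enums": {"ad_type": ["offer"]},
            "keywords": {"text": keywords},
        },
        "listing_source": "direct-search",
        "offset": offset,
    }


def encode_payload(payload: dict) -> str:
    """Serialize without spaces, as the browser did in the capture."""
    return json.dumps(payload, separators=(",", ":"))


if __name__ == "__main__":
    body = encode_payload(build_search_payload("ya"))
    print(len(body.encode("utf-8")), body)
```

With the keyword "ya", the compact encoding comes out at 198 bytes, which matches the Content-Length header in the capture.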

Skealz avatar Mar 25 '22 20:03 Skealz

I changed the endpoint in my PHP project to point to https://api.leboncoin.fr/finder/search instead of https://api.leboncoin.fr/api/adfinder/v1/search. I still get 403.

I checked the data returned with these 403 responses. They contain: {"url":"https://geo.captcha-delivery.com/captcha/?initialCid=AHrlqAAAAAMAfeednTqgQfoAWKt4JA==&cid=Zm8ddoCRZ_odYhn8CsQpzejqgYQAgtoZJtN4rMvsBVABBuJmiXJG~hrqH~BZiiV1kQ1ZIpB7fUwia6fSwREUG3KY0677oKtMTV~nmd-MOfwHEKhbc~U9HWMbXUUIzW5&referer=https%3A%2F%2Fapi.leboncoin.fr%2Ffinder%2Fsearch&hash=05B30BD9055986BD2EE8F5A199D973&t=fe&s=7501"} This means the block comes from the bot-blocking mechanism.

This is surprising to me because, from the same IP, I am able to issue requests (from Python) to the same API endpoint without being blocked. Maybe DataDome is able to identify something very specific about the request (timing, formatting... I don't know) to block it.

I will try to add a request to dd.leboncoin.fr to the PHP code, before the request to the API, to see if something happens.
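
For diagnosing reports like this one, a small Python sketch can tell a captcha challenge apart from an ordinary 403 by looking at the body. It assumes the body has the shape shown above (a JSON object with a `url` pointing at geo.captcha-delivery.com); the function name `captcha_challenge` is hypothetical and the example URL is shortened.

```python
import json
from urllib.parse import parse_qs, urlparse

CAPTCHA_HOST = "geo.captcha-delivery.com"


def captcha_challenge(body: str):
    """Return details of the captcha redirect, or None if the body is something else."""
    try:
        data = json.loads(body)
    except ValueError:
        return None  # not JSON at all, e.g. an HTML error page
    if not isinstance(data, dict) or not isinstance(data.get("url"), str):
        return None
    parsed = urlparse(data["url"])
    if parsed.hostname != CAPTCHA_HOST:
        return None
    query = parse_qs(parsed.query)
    # 'referer' holds the percent-encoded URL of the request that was blocked.
    return {"host": parsed.hostname, "blocked_url": query.get("referer", [None])[0]}


if __name__ == "__main__":
    body = ('{"url":"https://geo.captcha-delivery.com/captcha/?initialCid=abc'
            '&referer=https%3A%2F%2Fapi.leboncoin.fr%2Ffinder%2Fsearch&t=fe"}')
    print(captcha_challenge(body))
```

A bridge could use a check like this to report "blocked by bot protection" instead of a generic "Unexpected response from upstream" error.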

Skealz avatar Apr 01 '22 14:04 Skealz

@Skealz Thanks!

I got a 403 error though (with the Python code) after 2-3 days. But that's a good start; I'll try sending a random payload to DataDome. I'll also try to make it work with PHP when I have time (who knows when). Thanks again!

timat35 avatar Apr 13 '22 07:04 timat35

Hi @timat35, have you found some time since your last message? :-)

hista avatar Sep 14 '22 14:09 hista

It works when using a VPN. I chose the Netherlands on Browsec VPN for Firefox. No problems with the app on my phone.

mariolog avatar Aug 17 '23 18:08 mariolog