Immobilienscout24: Expose IDs Are Repeating

Open danielxul opened this issue 5 years ago • 5 comments

The bot no longer finds new flats for my search because all new flats use expose IDs of flats it has already contacted. Interestingly, different landlords seem to share the same expose IDs one after another.

danielxul avatar May 12 '20 07:05 danielxul

Could you tell me which website this is happening on? If possible, could you show some output from the script here? If different landlords are sharing the same expose IDs, then we must find some other information that distinguishes them.

nickirk avatar May 20 '20 10:05 nickirk

Oh, sorry. I didn't see in the title that it's on immobilienscout24. Yeah, please share some output from the script.

nickirk avatar May 20 '20 13:05 nickirk

I'm running into this as well. I have 2 theories on why this might occur:

  • We might have a problem in the file handling that serves as the persistence mechanism between runs. I noticed that the files href.json and href-old.json are never closed like sent_request.dat and diff.dat. However, in that case I would expect bigger problems (like corrupt files or at least invalid JSON).
  • We don't know for sure whether immoscout is consistent in the offers it returns. Perhaps once in a while one of the offers is missing (although it was available before, so we have already submitted a response to it) and is back again in the next run. Because only the latest response is stored in href.json, such an offer would look new when it reappears.
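For the first theory, a minimal sketch of how the JSON files could be read and written so that they are always closed. The helper names `save_hrefs` and `load_hrefs` are hypothetical and not taken from the script, and I am assuming the files hold a plain JSON list of expose links:

```python
import json
import os


def save_hrefs(path, hrefs):
    """Write a list of expose links to `path` as JSON."""
    # The with-block closes the file even if json.dump raises.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(hrefs, f)


def load_hrefs(path):
    """Read a JSON list of expose links; return [] if the file is missing."""
    if not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

This would at least rule out unflushed or half-written files as the cause.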

I think the second situation is more likely and should be resolved anyway. I propose to store the union of href.json and href-old.json at the end of each loop and make that our new href.json file. In the next loop, href-old.json is then updated to that full list. This way the list only ever grows.

@nickirk do you see a way to store the union of these files?
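Roughly what I have in mind, as a sketch only. It assumes both files contain a JSON list of expose links; the function names `read_list`, `merge_hrefs` and `update_href_file` are made up for illustration and do not exist in the script:

```python
import json
import os


def read_list(path):
    """Load a JSON list from `path`, or return [] if the file is missing."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def merge_hrefs(old, latest):
    """Return the union of both lists, old entries first, without duplicates."""
    seen = set()
    merged = []
    for href in list(old) + list(latest):
        if href not in seen:
            seen.add(href)
            merged.append(href)
    return merged


def update_href_file(href_path="href.json", old_path="href-old.json"):
    """Replace href.json with the union of href-old.json and href.json."""
    merged = merge_hrefs(read_list(old_path), read_list(href_path))
    with open(href_path, "w", encoding="utf-8") as f:
        json.dump(merged, f)
    return merged
```

Calling `update_href_file()` at the end of each loop would keep every link that was ever seen, so an offer that disappears for one run and then returns is not treated as new.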

korthout avatar Jun 09 '20 18:06 korthout

Today I ran some tests and got different expose IDs. What exactly do you mean by the same IDs? See the output:

26 new offers found
New offers id:  ['/expose/119787876', '/expose/119719458', '/expose/119493996', '/expose/93101990', '/expose/119429397', '/expose/119676346', '/expose/119448562', '/expose/119809351', '/expose/119731006', '/expose/119800263', '/expose/103060160', '/expose/119410642', '/expose/119034010', '/expose/113920625', '/expose/119834437', '/expose/119603125', '/expose/119619886', '/expose/76217718', '/expose/119079931', '/expose/97514229', '/expose/73071019', '/expose/119724887', '/expose/119806294', '/expose/117552158', '/expose/119366422', '/expose/119640645']
Time:  2020-06-20 12:57:58.732992
Sending message to:  /expose/119787876

Yes, sometimes landlords remove their offers and then put them back online. The script handles this by recording the expose IDs of all the offers it has sent a message to in the "diff.dat" file. If the user does not remove this file, then in principle you should not run into the problem of sending repeated messages to the same offer (I mean the same expose ID). One thing I can imagine is that the landlord removes an offer and puts it back online, and immoscout24 assigns it a new expose ID; in that case the script will still send a request to the new ID. Please provide detailed output for me to debug. I don't quite understand the situation here.

nickirk avatar Jun 20 '20 11:06 nickirk

Ah, I see. I designed the blacklist.href file to handle this issue back then, but it was a long time ago and I sort of forgot. Yes, we can just put all the expose IDs we have sent messages to into blacklist.href and ensure that we never send messages to them again. But I also thought that if landlords put their offers back online, it means they didn't find a suitable candidate, so it is reasonable to send requests to them again. That is why I didn't put this feature into action. But you can use the information from the diff.dat file to avoid sending requests to the same IDs. Let me know if you have any problem doing that.
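Something along these lines could work, as a rough sketch. I am assuming here that diff.dat is a plain-text file with one expose ID per line; if the actual format differs, the parsing needs to be adapted. The names `load_contacted` and `filter_new` are only for illustration:

```python
import os


def load_contacted(path="diff.dat"):
    """Return the set of expose IDs already contacted (assumed: one per line)."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def filter_new(offers, contacted):
    """Keep only offers that are not in `contacted`, preserving their order."""
    return [offer for offer in offers if offer not in contacted]
```

Before sending messages, the list of new offers would be passed through `filter_new` together with the result of `load_contacted()`, so that an ID that reappears is skipped.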

nickirk avatar Jun 20 '20 11:06 nickirk