Immobilienscout24: Expose IDs Are Repeating
The bot is no longer finding new flats for my search because all new flats use expose IDs of flats that were already contacted. Interestingly, different landlords share the same expose IDs sequentially.
Could you tell me which website this is happening on? If possible, could you show some output from the script here? If different landlords are sharing the same expose IDs, then we must find some other information that distinguishes them.
Oh, sorry. I didn't see in the title that it's on immobilienscout24. Yeah, please share some output from the script.
I'm running into this as well. I have 2 theories on why this might occur:
- We might have a problem in the file handling that serves as the persistence mechanism between runs. I noticed that the files `href.json` and `href-old.json` are never closed the way `sent_request.dat` and `diff.dat` are. However, in that case I would expect bigger problems (like corrupt files or at least invalid JSON).
- We don't know for sure whether immoscout is consistent in the offers it returns. Perhaps once in a while one of the offers is missing (although it was available before, so we have already submitted a response to it) and is back again in the next run. This happens because only the latest response is stored in `href.json`.
I think the second situation is more likely and should be resolved anyway. I propose to store the union of href.json and href-old.json at the end of each loop and make that our new href.json file. In the next loop, href-old.json is then updated to that full list. This way we make sure the list is ever-growing.
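A minimal sketch of that union step, assuming both files hold a plain JSON list of expose paths such as "/expose/119787876" (the actual file layout in the script may differ). The helper names `load_ids` and `merge_seen` are hypothetical and not part of the existing code:

```python
import json
from pathlib import Path


def load_ids(path):
    """Return the expose IDs stored in a JSON list file, or an empty set if the file is missing."""
    p = Path(path)
    if not p.exists():
        return set()
    # The with-block closes the file even if parsing fails.
    with p.open(encoding="utf-8") as f:
        return set(json.load(f))


def merge_seen(new_path="href.json", old_path="href-old.json"):
    """Write the union of both ID lists back to new_path and return it as a set."""
    merged = load_ids(new_path) | load_ids(old_path)
    with open(new_path, "w", encoding="utf-8") as f:
        # Sorted only to keep the file stable between runs.
        json.dump(sorted(merged), f)
    return merged
```

Calling `merge_seen()` at the end of each loop would make href.json only ever grow, so an offer that disappears for one run and comes back later is still recognised as known.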
@nickirk do you see a way how we can store the union of these files?
Today I ran some tests and got different expose IDs. What exactly do you mean by the same IDs? See the output:
26 new offers found
New offers id: ['/expose/119787876', '/expose/119719458', '/expose/119493996', '/expose/93101990', '/expose/119429397', '/expose/119676346', '/expose/119448562', '/expose/119809351', '/expose/119731006', '/expose/119800263', '/expose/103060160', '/expose/119410642', '/expose/119034010', '/expose/113920625', '/expose/119834437', '/expose/119603125', '/expose/119619886', '/expose/76217718', '/expose/119079931', '/expose/97514229', '/expose/73071019', '/expose/119724887', '/expose/119806294', '/expose/117552158', '/expose/119366422', '/expose/119640645']
Time: 2020-06-20 12:57:58.732992
Sending message to: /expose/119787876
Yes, sometimes landlords remove their offers and then put them back online. The script handles this by recording the expose IDs of all offers it has sent a message to in the "diff.dat" file. If the user does not remove this file, then in principle you should not run into the problem of sending repeated messages to the same offer (I mean the same expose ID). One thing I can imagine is that the landlord removes the offer and puts it back online, and immoscout24 assigns it a new expose ID; in that case the script will still send a request to the new ID. Please provide detailed output so I can debug this. I don't quite understand the situation yet.
Ah, I see. I designed the blacklist.href file to handle this issue back then, but that was a long time ago and I had more or less forgotten about it. Yes, we can just put all the expose IDs we have sent messages to into blacklist.href and ensure that we never send messages to them again. But I also thought that if landlords put their offers back online, it means they didn't find a suitable candidate, so it is reasonable to send them a request again. That is why I never put this feature into action. You can, however, use the information from the diff.dat file to avoid sending requests to the same IDs. Let me know if you have any problems doing that.
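A rough sketch of how that filtering could look. It assumes diff.dat contains one expose ID per line, which is only a guess at the format, and the function names `load_contacted` and `filter_new` are hypothetical:

```python
from pathlib import Path


def load_contacted(path="diff.dat"):
    """Read already-contacted expose IDs (assumed format: one ID per line)."""
    p = Path(path)
    if not p.exists():
        return set()
    with p.open(encoding="utf-8") as f:
        # Skip blank lines and strip trailing newlines.
        return {line.strip() for line in f if line.strip()}


def filter_new(offers, contacted):
    """Keep only offers that were not contacted yet, preserving their order."""
    return [offer for offer in offers if offer not in contacted]
```

For example, `filter_new(new_offer_ids, load_contacted())` would drop every ID that already appears in diff.dat before any message is sent.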