context_attentive_ir
Preprocessing AOL ClickURLs
Hello,
I am wondering how you obtain the document titles from the AOL ClickURLs. The AOL data provides only the domain of each clicked URL, not the full URL of the document that the user clicked for a given search query. For instance, if a user searches for "aircraft carrier" and then clicks the Wikipedia article https://en.wikipedia.org/wiki/Aircraft_carrier, the log shows only its domain: https://en.wikipedia.org/. In that case, do you use only the title of the Wikipedia main page (i.e., "Wikipedia, the free encyclopedia") as the document title for this example?
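To make the issue concrete, here is a minimal sketch of the truncation I mean. It is only my own illustration, not code from this repository or from the AOL release; the helper name `to_domain_only` is hypothetical, and I am assuming the logged ClickURL is simply the scheme and host of the clicked page.

```python
from urllib.parse import urlsplit


def to_domain_only(full_url: str) -> str:
    """Reduce a full URL to scheme + host, mimicking what the log appears to keep.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(full_url)
    # Keep only the scheme and network location; the path, query, and fragment are dropped.
    return f"{parts.scheme}://{parts.netloc}/"


full_url = "https://en.wikipedia.org/wiki/Aircraft_carrier"
click_url = to_domain_only(full_url)
print(click_url)  # https://en.wikipedia.org/
```

Since the path is gone, fetching `click_url` could only ever return the title of the site's landing page, not the title of the article that was actually clicked.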
Thank you so much!