context_attentive_ir

Preprocessing AOL ClickURLs

Open JinheonBaek opened this issue 10 months ago • 0 comments

Hello,

I am wondering how you obtained the titles of the documents from the AOL ClickURLs. In particular, the AOL data provides only the domains of the clicked URLs, not the full URLs of the documents that users clicked for their search queries. For instance, if a user searches "aircraft carrier" and then clicks the Wikipedia article https://en.wikipedia.org/wiki/Aircraft_carrier, we can see only its domain: https://en.wikipedia.org/. Given this, do you use only the title of the Wikipedia main page (i.e., "Wikipedia, the free encyclopedia") for this example?
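
To make the example concrete, here is a minimal sketch of the truncation I mean. It is my own illustration, not code from this repository or from the AOL release; `to_domain` is a hypothetical helper that only mimics how a full URL collapses to the domain-only form seen in the ClickURL field.

```python
from urllib.parse import urlsplit


def to_domain(url: str) -> str:
    """Hypothetical helper: keep only the scheme and host of a URL."""
    parts = urlsplit(url)
    # Drop the path, query, and fragment, which the log does not retain.
    return f"{parts.scheme}://{parts.netloc}/"


full_url = "https://en.wikipedia.org/wiki/Aircraft_carrier"
click_url = to_domain(full_url)
print(click_url)  # https://en.wikipedia.org/
```

Since the path is gone, every Wikipedia click would map to the same URL, so fetching a title from it would presumably always return the main page title.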

Thank you so much!

JinheonBaek, Aug 07 '23 22:08