arxiv-sanity-lite

papers.labml.ai

Open hnipun opened this issue 3 years ago • 10 comments

Hi @karpathy,

We built papers.labml.ai in May (introductory tweet) to discover research papers based on popularity on Twitter. We were using arxiv-sanity to discover papers and I started this as a side project inspired by it (partly because it was down from time to time).

We have worked on it on and off since May and have added a bunch of features, such as:

  • Popular papers based on Tweets
  • Links to source code, annotated implementations, videos, Reddit and Hacker News discussions, and other resources related to the paper
  • Conferences (ICLR 2022, NeurIPS 2021)
  • Short two-line summaries of the papers to quickly browse through lists of papers
  • Similar papers based on language model embeddings

And we are working on something very similar to tags on sanity-lite (which we call lists).

We would love to hear your feedback and suggestions. Thanks for releasing your work.

[Screenshots attached: 2021-11-14 at 10:24:45, 10:25:36, and 10:27:01]

hnipun avatar Nov 14 '21 05:11 hnipun

Hi Nipun, it's fun to hear from you! I actually have https://papers.labml.ai/papers/weekly pinned in my toolbar and visit it regularly, great work on the site and I look forward to seeing where you take it!

karpathy avatar Nov 14 '21 05:11 karpathy

@karpathy, Happy to hear you are finding it useful. Mostly, we make improvements based on our personal needs. Let us know if you have any suggestions for improvements. Thanks.

hnipun avatar Nov 15 '21 02:11 hnipun

@karpathy @hnipun could you guys add a feature that filters arXiv papers down to the ones that have GitHub repo links?

For example: if I search for CLIP, it would show only the papers that have a GitHub repo link (in the comments, the abstract, or under Code & Data).

That way only the papers with GitHub code get displayed on the screen (for folks who are looking for papers that have an implementation ready).

GeorvityLabs avatar Dec 07 '21 23:12 GeorvityLabs

@GeorvityLabs This would potentially require downloading the full text of the paper, dramatically increasing the complexity. Currently we can afford to only scrape the abstracts and this is very helpful. So I don't believe this is easy sadly.

karpathy avatar Dec 08 '21 06:12 karpathy

@karpathy that makes sense.

For some papers, the authors include GitHub repo links in their abstract, so scraping just the abstract would work in those cases.

But for most other papers, the GitHub repo links are usually included under Code & Data, Comments, or the Abstract section (on arxiv.org). So, if we manage to scrape these three sections separately, it would be possible to implement the feature.
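A rough sketch of that idea, assuming the abstract and comments text for each paper has already been fetched as plain strings. The function name `find_github_links` and the simple regex are illustrative assumptions, not anything arxiv-sanity-lite or papers.labml.ai actually uses:

```python
import re

# Deliberately simple pattern for github.com/<owner>/<repo> (an assumption;
# it will miss links written in unusual ways).
GITHUB_RE = re.compile(
    r"(?:https?://)?(?:www\.)?github\.com/([\w.-]+)/([\w.-]+)",
    re.IGNORECASE,
)


def find_github_links(*fields):
    """Return de-duplicated GitHub repo URLs found in the given text fields."""
    links = []
    for text in fields:
        for owner, repo in GITHUB_RE.findall(text or ""):
            repo = repo.rstrip(".")  # drop sentence-ending periods
            if repo.endswith(".git"):
                repo = repo[: -len(".git")]
            url = f"https://github.com/{owner}/{repo}"
            if url not in links:
                links.append(url)
    return links


# Hypothetical usage: keep only papers where some field mentions a repo.
abstract = "We release our code at https://github.com/openai/CLIP."
comments = "8 pages, 3 figures"
print(find_github_links(abstract, comments))
```

A paper would pass the "has code" filter whenever the returned list is non-empty. Links that only appear in the full text of the PDF would still be missed.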

GeorvityLabs avatar Dec 08 '21 09:12 GeorvityLabs

@karpathy

I tried using the davinci-codex engine to generate a Python script, just for fun, to see what Codex can do:


import webbrowser
from urllib.parse import quote_plus

# Get the search term from the user
input_string = input("Enter a string: ")

# Build an arxiv.org search URL, URL-encoding the term so spaces and symbols are safe
search_string = "https://arxiv.org/search/?query=" + quote_plus(input_string) + "&searchtype=all&source=header"

# Try to restrict results to papers with code (this parameter is Codex's guess)
search_string = search_string + "&filter=has-official-code:y"

# Print the search URL
print(search_string)

# Open the search URL in a new browser tab
webbrowser.open_new_tab(search_string)

I was wondering if there are any &filter options that would let us check the Code & Data, Abstract, and Comments sections separately.

In the above code, I don't think the "&filter=has-official-code:y" part is doing much of anything. But it would be awesome if we could have such filter options.
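One possible direction, sketched under assumptions: the public arXiv API (export.arxiv.org/api/query) documents field prefixes such as `abs:` (abstract) and `co:` (comment), combined with `AND`/`OR`. Whether a plain `github` term reliably matches repo URLs inside those fields is untested here, and the Code & Data tab is not covered by these prefixes. The helper name `build_query_url` is made up for illustration:

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(term, max_results=10):
    """Build an arXiv API URL for `term`, restricted to papers whose
    abstract or comment field mentions 'github' (assumed heuristic)."""
    search_query = f"all:{term} AND (abs:github OR co:github)"
    params = {
        "search_query": search_query,
        "start": 0,
        "max_results": max_results,
    }
    # urlencode percent-encodes ':' and parentheses and turns spaces into '+'
    return f"{ARXIV_API}?{urlencode(params)}"


print(build_query_url("CLIP"))
```

The URL returns an Atom feed, so the results would still need to be parsed before they could be shown in a browser tab.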

GeorvityLabs avatar Dec 08 '21 09:12 GeorvityLabs

I think GitHub repo links can be obtained from paperswithcode.com. I just added a github_links field to the papers table.

subramanya1997 avatar Feb 14 '22 10:02 subramanya1997

@subramanya1997 not all papers have code up on paperswithcode.com; people usually link their GitHub repos directly more often than via paperswithcode.com.

GeorvityLabs avatar Feb 14 '22 10:02 GeorvityLabs

@GeorvityLabs True. Some of them won't be mentioned in the paper but would be released later. I thought it is better to have some than to have none.

subramanya1997 avatar Feb 14 '22 11:02 subramanya1997

@subramanya1997 true. But after reading both paperswithcode and arXiv for years, one thing I have noticed is that most GitHub links usually don't make it to paperswithcode. But like you said, something is better than nothing.

GeorvityLabs avatar Feb 14 '22 11:02 GeorvityLabs