arxiv-sanity-preserver

Feature Request: Expanding to all of arXiv

Open mkanwal opened this issue 8 years ago • 10 comments

How feasible would it be to expand to all categories in arXiv?

Per #33, you mention that it's important to keep communities small so that "top papers" are still relevant. Couldn't this still be maintained by having users specify, as part of their account, which subcategories they work in? Top papers for a user would then use some sort of cross-category normalization to account for multiple communities of different sizes. Maybe we could also crowdsource the clustering of categories into research areas and offer those as presets (as has been done for ML currently).
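To make the normalization idea concrete, here is a minimal sketch, assuming each paper has a single category and some raw popularity score (for example, the number of library saves). The names `normalize_scores` and `top_papers` and the z-score approach are illustrative assumptions, not how arxiv-sanity actually ranks papers.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_scores(papers):
    """Map paper_id -> z-score of its raw score within its own category.

    `papers` is a list of (paper_id, category, raw_score) tuples.
    The tuple layout is a hypothetical one chosen for this sketch.
    """
    by_category = defaultdict(list)
    for _, category, score in papers:
        by_category[category].append(score)

    # Per-category mean and population standard deviation.
    stats = {c: (mean(v), pstdev(v)) for c, v in by_category.items()}

    normalized = {}
    for paper_id, category, score in papers:
        mu, sigma = stats[category]
        # A category with no spread carries no ranking signal.
        normalized[paper_id] = (score - mu) / sigma if sigma > 0 else 0.0
    return normalized


def top_papers(papers, user_categories, k=3):
    """Return the k best paper ids among the user's chosen categories."""
    scores = normalize_scores(papers)
    eligible = [pid for pid, category, _ in papers if category in user_categories]
    return sorted(eligible, key=lambda pid: scores[pid], reverse=True)[:k]


papers = [
    ("a", "cs.CV", 100),
    ("b", "cs.CV", 50),
    ("e", "cs.CV", 60),
    ("c", "math.AG", 5),
    ("d", "math.AG", 1),
]
print(top_papers(papers, {"cs.CV", "math.AG"}, k=2))
```

With raw scores alone, the two cs.CV papers with the most saves would crowd out the smaller community; after normalizing within each category, the leading math.AG paper ranks alongside the leading cs.CV paper. Cross-listed papers would need extra handling that this sketch leaves out.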

Would love to see this platform become widely adopted!

mkanwal avatar Dec 26 '16 16:12 mkanwal

Moreover, any chance of real-time fetching from arXiv? It seems to take a day or two for a paper to appear on the site.

Thank You.

RoyiAvital avatar Jan 17 '17 07:01 RoyiAvital

Afair papers are mostly released in bulk by arxiv once per day, so downloading more often than that isn't really necessary.

Moredread avatar Jan 18 '17 01:01 Moredread

Take this one for instance:

https://arxiv.org/abs/1701.04018 (cs.CV)

Published a few days ago. Still not on arXiv Sanity - http://www.arxiv-sanity.com/1701.04018.

Any way to make the fetching part more robust?

RoyiAvital avatar Jan 18 '17 06:01 RoyiAvital

@Moredread @RoyiAvital this is because, in its current terrible state, I have to manually ssh into the box that runs arxiv sanity, run an authentication script, and enter a password, or my credentials expire after ~3 days. And sometimes I forget. I can't find a way to automate this right now, but I'm working on switching AS to a longer-term solution anyway.

karpathy avatar Jan 18 '17 06:01 karpathy

@karpathy , I see.

Well, you do wonders with this site, so don't see it as a complaint :-).

Thank You.

RoyiAvital avatar Jan 18 '17 06:01 RoyiAvital

@karpathy can you just do something like that?

0 * * * * . /opt/deep_arxiv/config.sh; python3 /opt/deep_arxiv/scripts/arxiv_paper_fetch.py >> /opt/deep_arxiv/crontab.log
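For reference, the schedule `0 * * * *` in the cron line above runs the job at minute 0 of every hour. As a rough illustration of what a fetch script invoked that way might do, here is a minimal sketch assuming the public arXiv API query endpoint (`http://export.arxiv.org/api/query`) and its `search_query`, `start`, `max_results`, `sortBy`, and `sortOrder` parameters. The function names are hypothetical; this is not the contents of the `arxiv_paper_fetch.py` script mentioned above, and it does not address the credential expiry described earlier in the thread.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(categories, start=0, max_results=100):
    """Build an arXiv API URL listing the newest submissions in `categories`."""
    # e.g. "cat:cs.CV OR cat:cs.LG"
    search_query = " OR ".join(f"cat:{c}" for c in categories)
    params = {
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def fetch_feed(categories, start=0, max_results=100, timeout=30):
    """Download the raw Atom feed for the query (performs a network request)."""
    url = build_query_url(categories, start, max_results)
    with urlopen(url, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Only prints the URL; call fetch_feed() to actually hit the API.
    print(build_query_url(["cs.CV", "cs.LG"], max_results=50))
```

Since arXiv announces new papers in daily batches, as noted earlier in the thread, an hourly schedule mostly helps a failed run get retried quickly rather than surfacing papers sooner.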

BenderV avatar Feb 19 '17 16:02 BenderV

Would people find it useful to have arxiv-sanity also keep track of older papers (from before the project was started)? The value would be that one could add such papers to their library to better tailor their recommendations.

Not sure how far back arxiv-sanity currently goes...

mkanwal avatar May 24 '17 21:05 mkanwal

👋🏽 Hey friends - my friends and I built filtr.pub as a fun side project to address some of the gaps in Arxiv Sanity Preserver. We gather data from everything in CS and Stat (along with things like citation counts from Google Scholar, Papers with Code links, etc.), and also have additional functionality like search queries, custom date ranges, following custom keywords, etc. We have daily jobs that sync the latest data :)

If you're interested - please check it out! It's still in the really early stages but we'd love some feedback :)

jeetmehta avatar Jan 05 '21 15:01 jeetmehta

@jeetmehta please stop the SPAM... thanks...

alxhotel avatar Jan 05 '21 16:01 alxhotel

Will do - sorry! :)

jeetmehta avatar Jan 05 '21 16:01 jeetmehta