List papers citing a paper

Open ckreibich opened this issue 10 years ago • 12 comments

Xavi Anguera has suggested making the list of papers citing a paper queryable via the API. This needs a bit more thinking about the notion of paper identity (cluster ID) vs presentation to the user, but shouldn't be a big problem otherwise.

ckreibich avatar Feb 17 '14 17:02 ckreibich

I would love to see this, too. I need to analyze the set of citations to a particular article, and it is very painful to do this manually, screenful by screenful.

rpgoldman avatar May 07 '14 15:05 rpgoldman

Me too, this would make scholar.py an incredibly powerful tool!

jooweg avatar May 07 '14 15:05 jooweg

:+1: The thing is, how do we counter the API query limits? I sometimes wish Google provided free access to its repository, just like arXiv. Is setting up Tor a good idea?

arcolife avatar May 07 '14 16:05 arcolife

I'm willing to live with reasonable throttling. E.g., pulling 250 articles by hand (as I just did) is a horrible nuisance for me, but probably in the noise for Google, especially if I do it once in a blue moon.
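To make "reasonable throttling" concrete, here is a minimal sketch of a generator that spaces out successive requests. Nothing here is part of scholar.py: the name `throttled` and its parameters are hypothetical, and the right interval is an assumption you would have to pick yourself.

```python
import time


def throttled(items, min_interval, sleep=time.sleep, clock=time.monotonic):
    """Yield items so that consecutive yields are at least
    `min_interval` seconds apart.

    `sleep` and `clock` are injectable (hypothetical parameters) so the
    pacing logic can be exercised without actually waiting.
    """
    last = None
    for item in items:
        if last is not None:
            # Wait only for whatever part of the interval has not elapsed yet.
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item
```

A caller would wrap the list of result-page URLs in `throttled(urls, 30.0)` and fetch one page per iteration; the first item is yielded immediately, later ones only after the interval has passed.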

rpgoldman avatar May 07 '14 16:05 rpgoldman

Duly noted, folks! Support for this is on the way.

@arcolife, Tor will help you little regarding query limits; in fact, given that it's easy to identify Tor exits it might actually make things worse for you. The only real help will be distributed clients, but you'll have to build that botnet yourself. :)

ckreibich avatar May 07 '14 16:05 ckreibich

@ckreibich I see! :shit:

Btw, I've been building a recommendation engine for research papers. It would be nice to have this feature, as it would add to the currently available sources. I'm willing to contribute! :)

arcolife avatar May 07 '14 17:05 arcolife

I can take on this issue. I need a little guidance and help with the existing code, though. If I am understanding the problem:

You can get the link to the citing papers by accessing the url_citations attribute. E.g.:

querier.articles[0].attrs.get('url_citations')[0]

should return something like u'http://scholar.google.com/scholar?cites=5556531000720111691&as_sdt=2005&sciodt=0,5&hl=en'

And since we have a new search result, the goal is to parse this page into individual articles?
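As a rough sketch of the first step, the cluster ID can be pulled out of such a URL with the standard library alone. The helper names below are hypothetical and not part of scholar.py, and the `start` paging parameter is an assumption about how Scholar result pages are addressed, not something confirmed in this thread.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def cluster_id_from_citations_url(url):
    """Return the value of the `cites` query parameter, or None if absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("cites")
    return values[0] if values else None


def citations_page_url(cluster_id, start=0):
    """Build a URL for one page of citing papers.

    Assumption: results are paged with a `start` offset; adjust if the
    real pages are addressed differently.
    """
    params = urlencode({"cites": cluster_id, "start": start})
    return "http://scholar.google.com/scholar?" + params


url = ("http://scholar.google.com/scholar"
       "?cites=5556531000720111691&as_sdt=2005&sciodt=0,5&hl=en")
print(cluster_id_from_citations_url(url))  # prints 5556531000720111691
```

Each page built this way would still need to be fetched and parsed into individual articles, which is the part that needs the existing parsing code.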

chendaniely avatar Nov 03 '15 01:11 chendaniely

If it helps: this is how I stumbled on this issue: I was writing an article, and wanted to claim that the literature on topic X did not contain any article that addressed issue I.

Google Scholar had the right information to do this, but it was very painful to extract that information. I had to scroll through pages and pages of articles, moving from page to page interactively. And there was no way to check this claim for correctness over time. I.e., if I reran the query, I had no obvious way to check to see if the results were the same, or if new papers had appeared.

I was hoping to be able to automate this process at least somewhat.
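A minimal sketch of the "did anything change since last time" check, assuming each citing paper can be reduced to some stable identifier (a cluster ID or a normalized title; which one is an assumption, and `diff_snapshots` is a hypothetical helper, not part of scholar.py):

```python
def diff_snapshots(previous_ids, current_ids):
    """Compare two runs of the same query.

    Both arguments are iterables of stable paper identifiers. Returns the
    identifiers that appeared and disappeared between the two runs.
    """
    previous, current = set(previous_ids), set(current_ids)
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }
```

Saving the identifier list from each run (for example as a JSON file) and feeding two of them to this function would show whether new citing papers have appeared.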

rpgoldman avatar Nov 03 '15 16:11 rpgoldman

I can put up an ipython notebook with a working example of my extension module that implements this. @rpgoldman I'm pretty much in the same boat as you. Once I finish up implementing my extension I can see what would be the best way to fold the code in.

#10 seems to address the problem, but it's not merged, and I'm not 100% sure if it's doing what I want at the moment.

chendaniely avatar Nov 04 '15 19:11 chendaniely

Hi guys, is there any news on this issue? I'm about to implement the same thing. @chendaniely, did you implement something for this?

marianormuro avatar Aug 22 '16 17:08 marianormuro

Hey @marianormuro, sorry for the really late reply. My implementation is really hacky, janky, and untested. It probably shouldn't be used for 'serious' work.

chendaniely avatar Sep 26 '16 22:09 chendaniely

#83 seems to have solved the discussed issue.

pesho-ivanov avatar Mar 25 '19 02:03 pesho-ivanov