opensourcecontributors

Provide data in json via very simple API, suitable for further analysis by the users

Open nealmcb opened this issue 7 years ago • 5 comments

As noted in #5, #6, #83, #86, etc, users are often interested in further analysis of the results of their searches.

One simple way to help some of them would be for opensourcecontributors to offer a very simple API that returns the search results in JSON format. Ideally it would be structured so that users can easily run further queries and analysis with the GitHub API and their own language of choice. A few examples would go a long way.

This would probably also result in code that was useful for implementing the feature requests listed above.

nealmcb Jul 12 '17 02:07

There already is a simple API, actually. The front end uses it directly. No authentication is necessary. Would you be interested in documenting it?

These are the only endpoints: https://github.com/tenex/opensourcecontributors/blob/master/ghc-app/controller.go#L34

hut8 Jul 13 '17 02:07

Hmm. I tried all those endpoints, and the /user... ones return html, not json. E.g. https://opensourcecontributo.rs/user/nealmcb

The others were just 404 for me. Am I missing something?

nealmcb Jul 13 '17 05:07

Yeah, the missing piece of the puzzle is that nginx routes to the API only under the /api/ path. Check it out:

  • https://opensourcecontributo.rs/api/user/nealmcb
  • https://opensourcecontributo.rs/api/user/nealmcb/events
  • etc.
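For anyone scripting against this, here is a minimal Python sketch of the /api/ prefix rule described above. The names API_BASE, api_url and fetch_json are made up for illustration and are not part of the project; only the two URLs listed above are known to exist.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: every JSON endpoint lives under this prefix, per the comment above.
API_BASE = "https://opensourcecontributo.rs/api"

def api_url(*parts):
    """Build a URL under /api/ from path segments, escaping each segment."""
    return "/".join([API_BASE] + [quote(str(p), safe="") for p in parts])

def fetch_json(url, timeout=30):
    """Fetch a URL and parse the response body as JSON (no authentication)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

# Example (needs network access):
#   user = fetch_json(api_url("user", "nealmcb"))
```

Requesting the same path without the /api/ prefix returns the HTML front end instead, which is why json parsing fails on those URLs.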

joshjordan Jul 13 '17 05:07

Indeed - thank you!

I hope to find time to come back to provide proper documentation, but here is an API usage example, in Python, for how to retrieve all events for a user, and hints on sorting them out:

import codecs
import json
import urllib.request

def getevents(userid):
    """Retrieve and return all events for the given userid, across all pages."""

    # urlopen() yields bytes; wrap the response so json.load() reads UTF-8 text
    reader = codecs.getreader("utf-8")

    events = []
    pagenum = 1  # start at the first page
    while True:
        url = "https://opensourcecontributo.rs/api/user/{}/events/{}".format(userid, pagenum)
        page = json.load(reader(urllib.request.urlopen(url)))
        # A page reporting size 0 means we have run past the last page
        if page["size"] == 0:
            break
        events += page["events"]
        pagenum += 1

    return events

events = getevents("myuserid")  # put the userid you want in here

events is now a list with one dict per event, and each event's type field indicates whether it is an IssueCommentEvent, PushEvent, IssuesEvent, GollumEvent, CommitCommentEvent, etc.
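As a rough sketch of the "sorting them out" step, the events can be tallied by their type field. The sample list below is invented for illustration (real events carry more fields), and count_by_type is a hypothetical helper name.

```python
from collections import Counter

def count_by_type(events):
    """Count events per type; events lacking a type field count as 'unknown'."""
    return Counter(e.get("type", "unknown") for e in events)

# Invented sample data; in practice pass the list returned by getevents().
sample = [
    {"type": "PushEvent"},
    {"type": "IssuesEvent"},
    {"type": "PushEvent"},
]
counts = count_by_type(sample)

for event_type, n in counts.most_common():
    print(event_type, n)
```

The same pattern works for any other field you want to group on, as long as you check that the field is present in the events you actually get back.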

nealmcb Jul 13 '17 14:07

See also:

  • statistics: https://opensourcecontributo.rs/api/stats E.g. {"eventCount":684951818,"latestEvent":"2017-07-13T12:59:59Z","latestEventAge":5631}
  • ?? https://opensourcecontributo.rs/api/error
  • ?? https://opensourcecontributo.rs/api/aggregates
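A small Python sketch of reading the statistics response, using the example payload quoted above rather than a live request. The timestamp format is inferred from that one example, and the unit of latestEventAge is not documented here, so the code leaves it uninterpreted.

```python
import json
from datetime import datetime, timezone

# Example payload copied from the /api/stats output quoted above.
raw = '{"eventCount":684951818,"latestEvent":"2017-07-13T12:59:59Z","latestEventAge":5631}'
stats = json.loads(raw)

# Assumption: latestEvent is always a UTC timestamp in this exact format.
latest = datetime.strptime(stats["latestEvent"], "%Y-%m-%dT%H:%M:%SZ")
latest = latest.replace(tzinfo=timezone.utc)

print("events indexed:", stats["eventCount"])
print("latest event at:", latest.isoformat())
print("latestEventAge (unit not documented):", stats["latestEventAge"])
```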

nealmcb Jul 13 '17 14:07