x/pkgsite: API for pkg.go.dev

Open rhcarvalho opened this issue 4 years ago • 49 comments

Prior to pkg.go.dev, godoc.org has had a JSON API that can be used to, among other things, discover importers of a given package.

Example: https://api.godoc.org/importers/golang.org/x/net/html

Given that pkg.go.dev does a much better job at tracking importers thanks to Go Modules and the Module Proxy, it would be nice if the community could get access to a public API similar to that of godoc.org.

rhcarvalho avatar Jan 26 '20 20:01 rhcarvalho

@rhcarvalho - it would be helpful to get a sense of what your current use cases are for api.godoc.org, and what your feature requests are for an API for pkg.go.dev.

It sounds like getting the importers for a package is one of them. With pkg.go.dev being module aware, what specific information about importers would be useful to surface via an API?

For example:

  • importers for a specific version of a package
  • only importers of the latest version of a package
  • any importer for all versions of a package
  • something else?

Additionally, what other information would be useful to you to surface via an API?

/cc @tbpg who has also mentioned wanting an API for pkg.go.dev

julieqiu avatar Jan 29 '20 15:01 julieqiu

Without much thought, having an API to answer the more specific question "what are the importers of a specific version of a package" would make it plausible to derive the answer to the other items in your list.

At the moment I consume the godoc.org API and scrape data from pkg.go.dev to answer the question "who uses my package". As far as I can tell the data in the "Imported By" go.dev tab is unrelated to the version of the package I'm currently browsing.

Here are the other endpoints in the GoDoc API: https://github.com/golang/gddo/blob/7365cb292b8bfd13dfe514a18b207b9cc70e6ceb/gddo-server/main.go#L901-L904

  • /search: takes a q parameter. Can be useful to explore known packages/modules.
  • /packages: attempts to return all known packages, no pagination. Doesn't seem too usable as is.
  • /importers/: useful, hard to compute without central knowledge.
  • /imports/: dispensable, easy to compute offline with go list.

So if we need a more specific request, here it is:

api.go.dev MVP

  • /importers/{module-or-package}/@v/<version>: returns the list of importers of a given module/package at a given version. URL scheme could be tuned to match the Go Proxy specification (go help goproxy).
  • /search/: matches the functionality of https://pkg.go.dev/search?q=hello, returns JSON instead.

rhcarvalho avatar Jan 29 '20 16:01 rhcarvalho

Another endpoint idea (standalone or as part of a set of info returned for a module): available versions of a given module.

tbpg avatar Mar 18 '20 17:03 tbpg

@tbpg

Another endpoint idea (standalone or as part of a set of info returned for a module): available versions of a given module.

Shouldn't people use cmd/go for that? Not least because of the fall-through semantics of the go env GOPROXY variable. Noting https://github.com/golang/go/issues/37367

myitcv avatar Mar 18 '20 18:03 myitcv

What information beyond https://proxy.golang.org/<module>/@v/list would be provided by that endpoint?
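For readers unfamiliar with that endpoint, here is a minimal sketch of how a tool might query it without cmd/go. The helper names (escapePath, listURL, parseList, fetchList) are illustrative and not part of any official package. The sketch assumes the proxy protocol's case-encoding, where each uppercase ASCII letter in the module path is written as "!" plus its lowercase form, and a response body with one version per line. It does not validate module paths.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// escapePath applies the module proxy case-encoding: each uppercase
// ASCII letter becomes '!' followed by its lowercase form.
// (Hypothetical helper; it does no path validation.)
func escapePath(path string) string {
	var b strings.Builder
	for _, r := range path {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// listURL builds the <proxy>/<module>/@v/list URL.
func listURL(proxy, module string) string {
	return strings.TrimSuffix(proxy, "/") + "/" + escapePath(module) + "/@v/list"
}

// parseList splits a response body into versions, one per line.
func parseList(body string) []string {
	return strings.Fields(body)
}

// fetchList performs the HTTP request and returns the listed versions.
func fetchList(client *http.Client, proxy, module string) ([]string, error) {
	resp, err := client.Get(listURL(proxy, module))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseList(string(body)), nil
}

func main() {
	// Offline demonstration; fetchList is what would touch the network.
	fmt.Println(listURL("https://proxy.golang.org", "github.com/BurntSushi/toml"))
	fmt.Println(parseList("v1.0.0\nv1.1.0\n"))
}
```

Calling fetchList(http.DefaultClient, "https://proxy.golang.org", "golang.org/x/net") would issue the real request. The returned versions should not be assumed to be sorted.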

julieqiu avatar Mar 18 '20 18:03 julieqiu

Shouldn't people use cmd/go for that? Not least because of the fall-through semantics of the go env GOPROXY variable. Noting #37367

There are some cases where cmd/go might not be available and a normal web API would be helpful. I think it's reasonable to assume pkg.go.dev will only return the modules/versions it knows about.

What information beyond https://proxy.golang.org/<module>/@v/list would be provided by that endpoint?

None. That works. :)

tbpg avatar Mar 18 '20 18:03 tbpg

There are some cases where cmd/go might not be available and a normal web API would be helpful

I'd be wary of encouraging tool authors to query pkg.go.dev/a proxy directly. Because it could well introduce skew compared to the answers from cmd/go.

myitcv avatar Mar 18 '20 18:03 myitcv

In my use-case I don't want to execute cmd/go binaries as I just want to grab some metadata (i.e. detected license of all dependencies) from a dozen modules and organize it on a webpage. If I need local clones of those modules it really complicates this internal tool.

adamdecaf avatar Mar 18 '20 18:03 adamdecaf

In https://github.com/golang/go/issues/37952 I raised the question of whether module/package license file information could be surfaced in the output of cmd/go list. Following a conversation on yesterday's golang-tools call, we concluded that doing so would be a bad idea; reasons summarised in https://github.com/golang/go/issues/37952#issuecomment-611527845.

@ianthehat instead suggested exposing the information via a pkg.go.dev API, leveraging the fact that the content and presentation on pkg.go.dev has already jumped through the relevant legal hoops.

This comment is therefore to explicitly request that we include license file information in the API. Thank you!

myitcv avatar Apr 09 '20 13:04 myitcv

👋 Thanks for all the new Go tools like pkg.go.dev, they're super useful. Some input on this topic from the perspective of Libraries.io:

  • +1 to a /search/ endpoint, or maybe just adding pagination to this url so it's possible to get a list of all packages: https://index.golang.org/index?since=TIME&limit=LIMIT. Many popular package repositories offer a full listing like this, e.g. https://packagist.org/packages/list.json for Packagist

  • What information beyond https://proxy.golang.org/<module>/@v/list would be provided by that endpoint?

Currently to get the publish times for all versions, you have to make N+1 requests: one for @v/list and then @v/xxx.info for each version and grab the "Time" value for each one. Assuming it's difficult to change the Go proxy spec, an API endpoint that returns all versions with metadata would be really nice.
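The N+1 pattern described above can be sketched roughly as follows. This is only an illustration: the fetchFunc type and the versionTimes function are made-up names, and the fake in-memory responses in main stand in for real proxy requests. It assumes each .info response is a JSON object with "Version" and "Time" fields, with the time in RFC 3339 format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// info mirrors the fields read from a @v/<version>.info response.
type info struct {
	Version string
	Time    time.Time
}

// fetchFunc returns the body for a path relative to the module's
// proxy URL, such as "@v/list". (Hypothetical abstraction so the
// sketch can run without network access.)
type fetchFunc func(path string) ([]byte, error)

// versionTimes makes one request for the version list and then one
// request per version to collect publish times: N+1 requests in total.
func versionTimes(fetch fetchFunc) ([]info, error) {
	list, err := fetch("@v/list")
	if err != nil {
		return nil, err
	}
	var out []info
	for _, v := range strings.Fields(string(list)) {
		raw, err := fetch("@v/" + v + ".info")
		if err != nil {
			return nil, err
		}
		var i info
		if err := json.Unmarshal(raw, &i); err != nil {
			return nil, fmt.Errorf("decoding %s.info: %w", v, err)
		}
		out = append(out, i)
	}
	return out, nil
}

func main() {
	// Fake responses standing in for a real proxy.
	fake := map[string]string{
		"@v/list":        "v1.0.0\nv1.1.0\n",
		"@v/v1.0.0.info": `{"Version":"v1.0.0","Time":"2020-01-02T03:04:05Z"}`,
		"@v/v1.1.0.info": `{"Version":"v1.1.0","Time":"2020-06-07T08:09:10Z"}`,
	}
	fetch := func(path string) ([]byte, error) {
		body, ok := fake[path]
		if !ok {
			return nil, fmt.Errorf("no such path %q", path)
		}
		return []byte(body), nil
	}
	infos, err := versionTimes(fetch)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, i := range infos {
		fmt.Println(i.Version, i.Time.Format(time.RFC3339))
	}
}
```

A single endpoint returning all versions with their metadata would replace the loop with one request, which is the point of the feature request.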

  • As far as I can tell the data in the "Imported By" go.dev tab is unrelated to the version of the package I'm currently browsing.

Anyone know if there are plans to fix this?

tiegz avatar Jun 05 '20 20:06 tiegz

  • As far as I can tell the data in the "Imported By" go.dev tab is unrelated to the version of the package I'm currently browsing.

Anyone know if there are plans to fix this?

No immediate plans. We currently gather that information from import statements in the code, so there is no version information attached. The go.mod file doesn't have all the version information we need.

So we understand it's an approximation and we want to fix it, but it's going to take some time.

jba avatar Jun 08 '20 12:06 jba

@julieqiu my use for an API would be to:

  1. search and list packages
  2. for a package, access all the information as available for a given package in each of the tabs, such as here https://pkg.go.dev/github.com/gin-gonic/gin?tab=licenses ... basically everything that is available as HTML in the frontend https://github.com/golang/pkgsite/tree/master/internal/frontend should be available in a JSON api

pombredanne avatar Jul 12 '20 13:07 pombredanne

@julieqiu oh and please do not retire http://api.godoc.org/packages unless there is an alternative!

pombredanne avatar Jul 12 '20 13:07 pombredanne

it would be helpful to get a sense of what your current use cases are for api.godoc.org, and what your feature requests are for an API for pkg.go.dev.

As a downstream package maintainer for Fedora, I'm also interested in an API. We have our own tool, Anitya, to track package releases, but it was not designed to track Git commits. And many Go packages still don't publish versions. So any information about the latest published commit, such as the date of the commit and new dependencies, would be very helpful. I would gather data from the API in Python and compare it to the latest version we have for a given package.

I'm also interested in getting the License info, so we could find recursively all the licenses used in a static binary.

eclipseo avatar Aug 29 '20 21:08 eclipseo

and BTW my general use case is for https://github.com/nexB/scancode-toolkit and related projects to provide my users with license, origin and dependencies details. And for https://github.com/nexb/vulnerablecode where we provide vulnerabilities details.

pombredanne avatar Aug 30 '20 07:08 pombredanne

My present use case is discovering major version upgrades. Currently, we have to scrape pkg.go.dev. An alternative to a pkg.go.dev API would be an "official" way (via go list, for example) to retrieve the latest major version of a module. We could use it in tools like go-mod-upgrade and icholy/gomajor, but that would depend on golang/go#40357 (deprecated notification) and golang/go#40323 (notify about newer major versions). Thanks!

StevenACoffman avatar Sep 21 '20 14:09 StevenACoffman

@StevenACoffman I've updated gomajor to find newer versions using the module proxy. The performance is acceptable.
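As a rough illustration of the idea (not gomajor's actual implementation), a tool can derive the module path of the next major version and probe whether it exists. The nextMajor and latestMajor names and the exists callback are assumptions made for this sketch; in practice exists would query a module proxy. The sketch ignores gopkg.in-style paths and stops at the first missing major version, so it would miss a module that skipped a major number.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nextMajor returns the module path for the major version after the
// one encoded in path. Paths without a /vN suffix are treated as v1
// (which also covers v0). gopkg.in-style paths are not handled.
func nextMajor(path string) string {
	prefix, major := path, 1
	if i := strings.LastIndex(path, "/v"); i >= 0 {
		if n, err := strconv.Atoi(path[i+2:]); err == nil && n >= 2 {
			prefix, major = path[:i], n
		}
	}
	return prefix + "/v" + strconv.Itoa(major+1)
}

// latestMajor walks forward one major version at a time until exists
// reports that the next path is unknown.
func latestMajor(path string, exists func(string) bool) string {
	latest := path
	for next := nextMajor(latest); exists(next); next = nextMajor(latest) {
		latest = next
	}
	return latest
}

func main() {
	// A fake set of known module paths standing in for proxy lookups.
	known := map[string]bool{
		"example.com/mod/v2": true,
		"example.com/mod/v3": true,
	}
	exists := func(p string) bool { return known[p] }
	fmt.Println(latestMajor("example.com/mod", exists))
}
```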

icholy avatar Oct 05 '20 05:10 icholy

My use case is importing modules/packages to another package manager (Guix) where builds are completely reproducible and monolithic vendoring is strongly discouraged. Having a JSON API would simplify retrieving metadata like licenses, brief package descriptions, dependencies and hashes.

0x2b3bfa0 avatar Oct 17 '20 12:10 0x2b3bfa0

I didn't see it mentioned yet, but I found this link to be helpful:

https://github.com/golang/gddo/wiki/API

I hope a similar API will be available for the new site, or at least that the existing API will be maintained.

89z avatar Dec 27 '20 01:12 89z

We'll be keeping api.godoc.org around for a while.

jba avatar Dec 28 '20 20:12 jba

@jba with all due respect, that comment is pretty vague. I am actively using that API, so if it goes down, I'm just going to end up cURLing this:

https://pkg.go.dev/search?q=Query

and scraping the HTML, as I suspect others will be too. So any further detail would be appreciated.

89z avatar Dec 28 '20 20:12 89z

Our current plan is to keep api.godoc.org around until we have a suitable replacement.

jba avatar Jan 05 '21 13:01 jba

/search/: matches the functionality of https://pkg.go.dev/search?q=hello, returns JSON instead.

Building on this point, it would be great if a real-time fuzzy search API were exposed. That way command line tools could wrap/use this.

myitcv avatar Feb 05 '21 06:02 myitcv

Hello everyone, is there a public API for pkg.go.dev? If there is, I want to use it for my package library, gopack-cli. For now I'm using the godoc API to download Go package libraries from gopack-cli.

restuwahyu13 avatar Jul 22 '21 08:07 restuwahyu13

@restuwahyu13 , keep using api.godoc.org.

jba avatar Jul 22 '21 12:07 jba

But is api.godoc.org the same as pkg.go.dev sir @jba ?

restuwahyu13 avatar Jul 22 '21 15:07 restuwahyu13

@restuwahyu13 pkg.go.dev doesn't have an API yet, that's why this issue is open and not closed.

fzipp avatar Jul 22 '21 15:07 fzipp

Stopping by to ask that any first version of an API provide some means to glean the source code / repo location of a package that uses a vanity path/url/name.

shellscape avatar Oct 07 '21 03:10 shellscape

I have similar use cases as others, so I resorted to scraping the UI as well (:nauseated_face:), and put it behind an API/CLI: https://github.com/guseggert/pkggodev-client

guseggert avatar Oct 20 '21 15:10 guseggert

The /packages api endpoint, https://api.godoc.org/packages, is returning {"error":{"message":"Internal Server Error"}} .

Has this endpoint been migrated?

sig-aarena avatar Jan 21 '22 19:01 sig-aarena

Seems like most api.godoc.org endpoints are erroring, currently?

e.g. https://api.godoc.org/packages, https://api.godoc.org/imports/github.com/goburrow/cache, etc.

I'm particularly interested in the imports endpoint. Has this moved somewhere else or is there a mirror I can use?

tills13 avatar Jan 28 '22 18:01 tills13

This is starting to be really problematic. @jba could someone kindly look into this? This is really looking bad

pombredanne avatar Jan 28 '22 23:01 pombredanne

@julieqiu do you know what happened?

pombredanne avatar Jan 28 '22 23:01 pombredanne

Will get on this today.

jba avatar Jan 31 '22 13:01 jba

@jba you rock... that's very kind of you! :bow:

pombredanne avatar Jan 31 '22 14:01 pombredanne

Note that the only alternative today would be to resort to extensive pseudo-random screen scraping of https://pkg.go.dev/search, laced with some extensive recursive hitting of https://index.golang.org/index, to get a complete package picture. It can work if this is what is recommended, but it would feel a bit like it is 1999 all over again. ;) If anything, and if that's the way to go, then at least I could create a common service that does this for everyone?

pombredanne avatar Jan 31 '22 14:01 pombredanne

api.godoc.org should be back up.

jba avatar Feb 01 '22 12:02 jba

@jba this is responding but not doing anything for me:

$ wget https://api.godoc.org/packages
--2022-02-01 14:33:56--  https://api.godoc.org/packages
Resolving api.godoc.org (api.godoc.org)... 142.250.179.147, 2a00:1450:400e:80f::2013
Connecting to api.godoc.org (api.godoc.org)|142.250.179.147|:443... connected.
HTTP request sent, awaiting response...

this stays there hanging waiting, no I/O nor CPU used.

pombredanne avatar Feb 01 '22 13:02 pombredanne

@jba strike this out. It completed after a while! Thank you ++ :bow:

pombredanne avatar Feb 01 '22 13:02 pombredanne

Prior to pkg.go.dev, godoc.org has had a JSON API that can be used to, among other things, discover importers of a given package.

Sorry for the stupid question, but where is the documentation for using this "JSON API" that godoc.org had?

ZiViZiViZ avatar Feb 24 '22 15:02 ZiViZiViZ

@ZiViZiViZ https://github.com/golang/gddo/wiki/API

srenatus avatar Feb 25 '22 08:02 srenatus

Thanks but this does not seem to have enough information. I am looking to be able to get the license for a package. I was hoping there is a way to do something like that:

 $ curl -s https://api.godoc.org/search?q=gopkg.in/[email protected] |jq '.results[].license'
"Apache License 2.0"

ZiViZiViZ avatar Feb 25 '22 16:02 ZiViZiViZ

@ZiViZiViZ the way I accomplish this is by making a request to https://pkg.go.dev/gopkg.in/yaml.v2?tab=licenses and using an HTML parser to grab the license headers. (replace gopkg.in/yaml.v2 with whichever package you're looking for the license for)

tills13 avatar Feb 25 '22 16:02 tills13

Thank you. That is what I am doing right now but I thought there might be a better way to do this.

ZiViZiViZ avatar Feb 28 '22 12:02 ZiViZiViZ

I am part of the Software Heritage initiative to archive all versions of public code and packages.

I'm looking for a way of listing all go packages, planning then to download the source for each release. For example in the Rust ecosystem, crates.io has an API that allows us to get all the metadata for a known package and each of its releases, which is very practical.

It appears that there only exists https://api.godoc.org/packages which returns a single 300MB+ JSON file containing only URLs to the repositories of packages (and some import_count metric that is always 0), but nothing more. https://index.golang.org/index is not looking any different. Am I missing anything worthwhile that would help?

Thanks in advance.

Alphare avatar Mar 03 '22 14:03 Alphare

please keep this issue on topic and refer questions to https://github.com/golang/go/wiki/Questions

seankhliao avatar Mar 03 '22 17:03 seankhliao

Rather than hiding what I'd consider a lot of really useful comments by folks (because I landed here for many of the same reasons they did, and there is extremely little available with regard to go package APIs) it would be infinitely more useful to explain to folks how to use the wiki for questions - or specify that tangential and related questions here are not welcome, and folks should seek the answers on the wiki. As it is now, that curt reply seems to imply that we should post questions on the wiki itself, which I doubt is what you're actually after.

The go team has been kicking this can for years, so users are naturally going to have a lot of questions. This reply https://github.com/golang/go/issues/36785#issuecomment-579820081 specifically asks users what features they're after in an API - thus the issue is fair game for feature-related questions. Go packages are the very last major ecosystem package registry to lack a comprehensive API and that which doesn't make its data readily available. Because of this, people are naturally going to have a myriad of questions about trying to access that data. I'd humbly ask @seankhliao to be a bit more gracious in your moderation of this issue.

shellscape avatar May 07 '22 20:05 shellscape

  • Use case: generating static documentation, or running a local documentation server, but wanting to be able to query (with javascript, at the time of the page view by a web browser) the "[Imported by: NNN]" number that appears if instead viewed on pkg.go.dev.

  • currently api.godoc.org/importers is returning {"error":{"message":"Internal Server Error"}}

golightlyb avatar Jun 29 '22 02:06 golightlyb

For what it's worth, we have been running api.godoc.org as an unmonitored service (meaning we don't page people for it), approximately "best effort" although even that may be too generous. We discovered today that it has been down for 30+ days because the disk filled. Given that being down for a month was a non-issue, we've decided to leave it down.

The godoc.org redirects will continue to run (approximately forever), but api.godoc.org will no longer be accessible. Or rather it will continue to be inaccessible.

rsc avatar Aug 08 '22 21:08 rsc

@rsc fair enough if this is the decision but the previous comment is mine, right at the start of that 30+ day period, saying it's down, and three people agreeing.

If this is the decision, again, fine. But I submit that it shouldn't be based on people not complaining. To be fair, I raised this pretty much as soon as it happened, and it was never fixed and didn't get a response (fair enough, priorities), so anyone saying the same thing wouldn't have been adding anything, and nobody knew they had a 30-ish day deadline to object.

(I don't intend for this to sound harsh, just as a disagreement! 🙂)

Edit: as an aside that might be helpful for people with an interest in this, Google's experimental Open Source Insights has an API. I wasn't aware of it until recently but it could very well resolve this entire conversation.

golightlyb avatar Aug 08 '22 21:08 golightlyb

honestly this reflects pretty poorly on the entire team and the project in general. compared to other ecosystems and their package management, and user experience, the go community leaves a lot to be desired.

shellscape avatar Aug 08 '22 22:08 shellscape

Nice. Comment on the actions taken, or lack thereof, and get hidden as off-topic. If the team isn't open to polite and honest criticism, lock the topic.

shellscape avatar Aug 08 '22 22:08 shellscape

I use the Dash doc viewer on my Mac, which I find super useful for building a local index of all sorts of docs, Go included. I just found I couldn't install a new Go docset into it, presumably as it can no longer build an index from api.godoc.org.

It would be nice to point the author towards an alternate index if this one isn't returning, as the usefulness of Dash is greatly diminished for me without rapid access to the docs of Go packages I use frequently.

gwatts avatar Aug 09 '22 02:08 gwatts

Dash 6.3.1 now relies on web scraping. It works, but an API would be better.

AlekSi avatar Aug 15 '22 04:08 AlekSi