NullPointerException when trying to read file-content because encoding is not set

Open centic9 opened this issue 4 years ago • 4 comments

Describe the bug The test case below triggers a NullPointerException. Since some version, github-api expects "encoding" to always be set in GHContent, but it appears that it is not set for files in some repositories.

To Reproduce Steps to reproduce the behavior:

  1. Run the unit-test below
  2. It fails with NullPointerException
    @Test
    public void testNullPointerException() throws IOException {
        GitHub github = GitHub.connect();

        final PagedSearchIterable<GHContent> list = github.searchContent().
                repo("Savyonify/cryptApiSet/").
                filename("build.gradle").list();

        for(GHContent match : list) {
            System.out.println("Reading " + match.getHtmlUrl());
            try (final InputStream stream = match.read()) {
                assertNotNull(stream);
            }
        }
    }

Expected behavior I would expect to still be able to read those files.

Desktop (please complete the following information):

  • OS: Linux
  • Browser: N/A
  • Version: 1.95, 1.107, 1.108

Additional context It seems GitHub now returns some files in search results that no longer exist. Maybe we can handle these more gracefully?

centic9 avatar Mar 07 '20 09:03 centic9

@centic9 I think this might be another expression of #751 - encoding returning null sounds like a similar problem.

bitwiseman avatar Mar 31 '20 15:03 bitwiseman

@centic9 Could you try this again with v1.109?

bitwiseman avatar Apr 01 '20 21:04 bitwiseman

Tried with 1.109; unfortunately, the test case posted above still fails with the same NPE:

    Caused by: java.lang.NullPointerException
        at org.kohsuke.github.GHContent.read(GHContent.java:175)
        at GitHubSupportTest.testNullPointerException(GitHubSupportTest.java:85)

centic9 avatar Apr 02 '20 07:04 centic9

@centic9 Thanks for this report, the test, and for trying it again.

Found the problem. It is due to the # characters in your folder/file names.

The search your example runs is this URL:

https://api.github.com/search/code?q=repo%3ASavyonify%2FcryptApiSet%2F+filename%3Abuild.gradle

The first item it returns starts this way:

{
  "total_count": 2,
  "incomplete_results": false,
  "items": [
    {
      "name": "build.gradle",
      "path": "BC Projects/#6 Tron Java/crypto/build.gradle",
      "sha": "74fd8cc18c4e3c84bd69d8b3bc1a618a5d743bf1",
      "url": "https://api.github.com/repositories/223952556/contents/BC%20Projects/#6%20Tron%20Java/crypto/build.gradle?ref=7b9fc9f06caef84ac8c970b578b67982cb852ddf",
      "git_url": "https://api.github.com/repositories/223952556/git/blobs/74fd8cc18c4e3c84bd69d8b3bc1a618a5d743bf1",
      "html_url": "https://github.com/Savyonify/cryptApiSet/blob/7b9fc9f06caef84ac8c970b578b67982cb852ddf/BC%20Projects/#6%20Tron%20Java/crypto/build.gradle",
      "repository": {
        "id": 223952556,
        "node_id": "MDEwOlJlcG9zaXRvcnkyMjM5NTI1NTY=",
        "name": "cryptApiSet",
        "full_name": "Savyonify/cryptApiSet",
        "private": false,
        "owner": {

If you get this same item separately from the Search API (by listing the contents of folders), you get this:

{
  "name": "build.gradle",
  "path": "BC Projects/#6 Tron Java/crypto/build.gradle",
  "sha": "74fd8cc18c4e3c84bd69d8b3bc1a618a5d743bf1",
  "size": 323,
  "url": "https://api.github.com/repos/Savyonify/cryptApiSet/contents/BC%20Projects/%236%20Tron%20Java/crypto/build.gradle?ref=7b9fc9f06caef84ac8c970b578b67982cb852ddf",
  "html_url": "https://github.com/Savyonify/cryptApiSet/blob/7b9fc9f06caef84ac8c970b578b67982cb852ddf/BC%20Projects/%236%20Tron%20Java/crypto/build.gradle",
  "git_url": "https://api.github.com/repos/Savyonify/cryptApiSet/git/blobs/74fd8cc18c4e3c84bd69d8b3bc1a618a5d743bf1",
  "download_url": "https://raw.githubusercontent.com/Savyonify/cryptApiSet/7b9fc9f06caef84ac8c970b578b67982cb852ddf/BC%20Projects/%236%20Tron%20Java/crypto/build.gradle",
  "type": "file",
  "content": "cGx1Z2lucyB7CiAgICBpZCAnamF2YScKfQoKdmVyc2lvbiAnMS4wLjAnCgpz\nb3VyY2VDb21wYXRpYmlsaXR5ID0gMS44CgpyZXBvc2l0b3JpZXMgewogICAg\nbWF2ZW5DZW50cmFsKCkKfQoKZGVwZW5kZW5jaWVzIHsKICAgIHRlc3RDb21w\naWxlIGdyb3VwOiAnanVuaXQnLCBuYW1lOiAnanVuaXQnLCB2ZXJzaW9uOiAn\nNC4xMicKICAgIGNvbXBpbGUgImNvbS5tYWRnYWcuc3Bvbmd5Y2FzdGxlOmNv\ncmU6MS41OC4wLjAiCiAgICBjb21waWxlICJjb20ubWFkZ2FnLnNwb25neWNh\nc3RsZTpwcm92OjEuNTguMC4wIgogICAgY29tcGlsZSBwcm9qZWN0KCI6Y29t\nbW9uIikKfQo=\n",
  "encoding": "base64",
  "_links": {
    "self": "https://api.github.com/repos/Savyonify/cryptApiSet/contents/BC%20Projects/%236%20Tron%20Java/crypto/build.gradle?ref=7b9fc9f06caef84ac8c970b578b67982cb852ddf",
    "git": "https://api.github.com/repos/Savyonify/cryptApiSet/git/blobs/74fd8cc18c4e3c84bd69d8b3bc1a618a5d743bf1",
    "html": "https://github.com/Savyonify/cryptApiSet/blob/7b9fc9f06caef84ac8c970b578b67982cb852ddf/BC%20Projects/%236%20Tron%20Java/crypto/build.gradle"
  }
}

Notice the difference in url fields (which should be the same):

Search API : https://api.github.com/repositories/223952556/contents/BC%20Projects/#6%20Tron%20Java/crypto/build.gradle?ref=7b9fc9f06caef84ac8c970b578b67982cb852ddf
Bare API   : https://api.github.com/repos/Savyonify/cryptApiSet/contents/BC%20Projects/%236%20Tron%20Java/crypto/build.gradle?ref=7b9fc9f06caef84ac8c970b578b67982cb852ddf

The # in the search URL results in GitHub treating that URL as https://api.github.com/repositories/223952556/contents/BC%20Projects/, which is a valid request but returns completely different data than expected. This code is where the problem occurs:

https://github.com/github-api/github-api/blob/c1c919097a80df9a230b82c091f3f99502ebb916/src/main/java/org/kohsuke/github/GHContent.java#L397-L399
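To illustrate the truncation described above, here is a minimal sketch that parses the Search API url with the standard java.net.URI class. The class name FragmentDemo is made up for this example and this is not code from github-api; it only shows that a generic URI parser treats everything after the unencoded # as a fragment, so the path ends at BC%20Projects/ and the ?ref=... part is no longer a query.

```java
import java.net.URI;

public class FragmentDemo {
    // The "url" field exactly as returned by the Search API in the item above.
    static final String SEARCH_URL =
            "https://api.github.com/repositories/223952556/contents/BC%20Projects/"
            + "#6%20Tron%20Java/crypto/build.gradle"
            + "?ref=7b9fc9f06caef84ac8c970b578b67982cb852ddf";

    public static void main(String[] args) {
        URI uri = URI.create(SEARCH_URL);

        // The path stops at the first unencoded '#'.
        System.out.println("path:     " + uri.getRawPath());
        // There is no query: the "?ref=..." part sits behind the '#'.
        System.out.println("query:    " + uri.getRawQuery());
        // Everything after '#' is parsed as the fragment.
        System.out.println("fragment: " + uri.getRawFragment());
    }
}
```

Since a fragment is not part of the request that reaches the server, a request built from this URL would address the BC%20Projects/ folder rather than the build.gradle file.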

This library generally still shouldn't throw a NullPointerException, but this is first and foremost a bug in the GitHub Search API. Fixing it in the library would involve switching to constructing the URL by hand, which is not reliable. Special-casing # is possible but likely to turn into a game of whack-a-mole. If the URL isn't correctly encoded for #, there are probably other issues.
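For reference, the special-casing mentioned above could look roughly like the sketch below. This is not the library's actual fix: encodeHash and HashEncodingSketch are hypothetical names, and the sketch assumes a contents URL never carries a legitimate fragment. It only handles # and so has exactly the whack-a-mole weakness described above.

```java
import java.net.URI;

public class HashEncodingSketch {
    // Hypothetical helper, not part of github-api: percent-encodes every
    // literal '#' so it stays in the path instead of starting a fragment.
    // Assumption: the URL is a contents URL that never has a real fragment.
    static String encodeHash(String url) {
        return url.replace("#", "%23");
    }

    public static void main(String[] args) {
        String searchUrl =
                "https://api.github.com/repositories/223952556/contents/BC%20Projects/"
                + "#6%20Tron%20Java/crypto/build.gradle"
                + "?ref=7b9fc9f06caef84ac8c970b578b67982cb852ddf";

        URI fixed = URI.create(encodeHash(searchUrl));

        // After encoding, the full file path and the ref query are preserved.
        System.out.println("path:  " + fixed.getRawPath());
        System.out.println("query: " + fixed.getRawQuery());
    }
}
```

Any other character that the Search API leaves unencoded would need its own rule, which is why this approach does not scale.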

bitwiseman avatar Apr 21 '20 02:04 bitwiseman