github-api

GHRepository.getContent does not properly handle files larger than 1 MB

Open blacelle opened this issue 3 years ago • 4 comments

Describe the bug: For a file of size >= 1 MB, the library fails with "Unrecognized encoding: none" in org.kohsuke.github.GHContent.read().

To Reproduce: Push a file larger than 1 MB, then call GHRepository.getContent().read().

Expected behavior: The file content is fetched properly.

Additional context: The GitHub API has specific behavior for large files; see https://docs.github.com/en/rest/repos/contents#size-limits.

Between 1-100 MB: Only the raw or object custom media types are supported. Both will work as normal, except that when using the object media type, the content field will be an empty string and the encoding field will be "none". To get the contents of these larger files, use the raw media type.

I suppose that in such an edge case we would have to rely on the Accept: application/vnd.github.v3.raw header.
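As a rough illustration of what relying on that header could look like outside the library, here is a minimal sketch that builds a contents request asking for the raw bytes. It uses the JDK's java.net.http API rather than github-api itself; the class name RawContentRequest, the build method, and the OWNER/REPO/PATH placeholders are made up for this example, and the Bearer token header is an assumption about how the caller authenticates.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class RawContentRequest {
    // Media type quoted in the issue text above.
    static final String RAW_MEDIA_TYPE = "application/vnd.github.v3.raw";

    /**
     * Builds a GET request for a contents URL that asks for the raw file bytes.
     * The token is optional; pass null for an anonymous request.
     */
    static HttpRequest build(String contentsUrl, String token) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(contentsUrl))
                .header("Accept", RAW_MEDIA_TYPE)
                .GET();
        if (token != null && !token.isEmpty()) {
            // Assumption: the caller authenticates with a bearer token.
            builder.header("Authorization", "Bearer " + token);
        }
        return builder.build();
    }

    public static void main(String[] args) {
        // OWNER, REPO and PATH are placeholders, not real values.
        HttpRequest request = build(
                "https://api.github.com/repos/OWNER/REPO/contents/PATH", null);
        System.out.println(request.method() + " " + request.uri());
        System.out.println("Accept: " + request.headers().firstValue("Accept").orElse(""));
    }
}
```

Sending the request (for example with HttpClient and BodyHandlers.ofInputStream()) is left out so the sketch stays free of network access.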

Oct 29 '22 12:10 blacelle

@bitwiseman What's the suggestion for this problem?

Dec 20 '22 19:12 antrix190

@antrix190 In the read() method, detect the state described above and, instead of refreshing, make a request using the raw media type. See readBlob() for an example.
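To make that suggestion concrete, here is a rough sketch of the branching it describes. This is not the library's code: ContentReadSketch, RawFetcher, and fetchRaw are hypothetical names, and the raw request is hidden behind a callback so the logic can be read (and run) without touching the network. It assumes the base64 payload may contain line breaks, which is why the MIME decoder is used.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Base64;

final class ContentReadSketch {
    /** Hypothetical stand-in for the extra API call that uses the raw media type. */
    interface RawFetcher {
        InputStream fetchRaw() throws IOException;
    }

    /**
     * Returns the file bytes for a contents response.
     * encoding and content mirror the "encoding" and "content" JSON fields.
     */
    static InputStream read(String encoding, String content, RawFetcher rawFetcher)
            throws IOException {
        if ("base64".equals(encoding)) {
            // Usual case: the content is inline. The MIME decoder ignores line breaks.
            return new ByteArrayInputStream(Base64.getMimeDecoder().decode(content));
        }
        if ("none".equals(encoding)) {
            // Large-file case from the issue: content is empty, so fetch the raw bytes.
            return rawFetcher.fetchRaw();
        }
        throw new IOException("Unrecognized encoding: " + encoding);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = read("none", "", () -> new ByteArrayInputStream(new byte[] {1, 2, 3}));
        System.out.println(in.readAllBytes().length);
    }
}
```

With this shape, the extra request is only made when the encoding is "none", so files with inline content need no additional API call.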

Jun 30 '23 20:06 bitwiseman

If it works for all cases, shouldn’t we try to use raw always?

Jul 01 '23 10:07 gsmet

@gsmet The raw call is an extra API call. It isn't needed in most cases: source files over 1 MB are not very common.

Jul 01 '23 22:07 bitwiseman