rest.js

Get binary data from Content API with raw media type

Open andrewcarter opened this issue 5 years ago • 8 comments

Using repos.getContents with mediaType: { format: "raw" }

Even though the raw media type is specified, the returned data.content is still stringified, so it seems unsuitable for binary data. For example, when requesting an image, the binary data comes back mangled.

Is there any way to get around this with octokit/rest? The only way I found was to set mediaType: { format: "object" } and then base64-decode the returned content, which is obviously not optimal.

Otherwise, I may just fetch the raw data directly bypassing octokit/rest.
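For reference, the base64 workaround mentioned above could look roughly like the sketch below. It assumes the response data carries content and encoding fields, as the standard file response does; decodeBase64Content and the sample object are hypothetical names used only for illustration.

```javascript
// Hypothetical helper (not part of octokit): turn the base64 `content`
// field of a Contents API file response into a Buffer of the original bytes.
function decodeBase64Content(data) {
  if (typeof data.content !== "string" || data.encoding !== "base64") {
    throw new TypeError("expected a file object with base64-encoded content");
  }
  // Node's base64 decoder ignores the line breaks found in the payload.
  return Buffer.from(data.content, "base64");
}

// Stand-in for response.data; the string encodes the 8-byte PNG signature.
const sample = { encoding: "base64", content: "iVBORw0K\nGgo=\n" };
const bytes = decodeBase64Content(sample);
console.log(bytes.length, bytes[0].toString(16));
```

Working on a Buffer rather than a string keeps the bytes intact, at the cost of the extra base64 transfer size the comment above calls not optimal.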

Relates to octokit/octokit.js#507

andrewcarter avatar Feb 01 '20 04:02 andrewcarter

Could you put together a code example as a minimal test case?

gr2m avatar Feb 01 '20 04:02 gr2m

This is the idea:

const response = await octokit.repos.getContents({
    owner: "foo",
    repo: "bar",
    path: "path/to/binary/file",
    mediaType: {
        format: "raw"
    }
});
console.log(response.data.content)

Expected: raw data of the file
Actual: mangled data (UTF-8 encoded?)

To rule out console.log as the cause, I checked the size of response.data.content. The size of the content returned by octokit/rest does differ from the actual content-length, which indicates the mangling happens in the library.
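One plausible explanation for the size mismatch, assuming the body is decoded as UTF-8 text somewhere along the way (the thread does not confirm this), is that bytes which are invalid UTF-8 get replaced during decoding. A minimal sketch of that effect, using the first four bytes of a PNG file:

```javascript
// Bytes that are not valid UTF-8 on their own (0x89 is a stray continuation byte).
const original = Buffer.from([0x89, 0x50, 0x4e, 0x47]);

// Decoding as text replaces invalid sequences with U+FFFD ...
const asText = original.toString("utf8");

// ... so encoding the text again does not give back the original bytes.
const roundTripped = Buffer.from(asText, "utf8");

console.log(original.length, roundTripped.length);
console.log(original.equals(roundTripped));
```

Once the replacement has happened, the original bytes cannot be recovered from the string, which is why the raw body would have to be read as bytes in the first place.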

andrewcarter avatar Feb 01 '20 04:02 andrewcarter

I'm heads down working on v17 right now and have to focus on that.

The lower-level method to send a request is https://github.com/octokit/request.js/, which in turn is a slight wrapper around https://github.com/bitinn/node-fetch/.

I guess it will be the same using request:

const response = await request('GET /repos/:owner/:repo/contents/:path', {
    owner: "foo",
    repo: "bar",
    path: "path/to/binary/file",
    mediaType: {
        format: "raw"
    }
});
console.log(response.data.content)

Can you check how you can access the binary data correctly using node-fetch?

If there is a change necessary, it will probably need to happen in https://github.com/octokit/request.js. If you could start out creating a PR with a failing test, that'd be awesome.

gr2m avatar Feb 01 '20 06:02 gr2m

@andrewcarter did you figure out a workaround?

Until I figure this out properly, I would suggest you do this:

const requestOptions = octokit.repos.getContents.endpoint({
    owner: "foo",
    repo: "bar",
    path: "path/to/binary/file",
    mediaType: {
        format: "raw"
    }
});

The requestOptions then look something like this:

{
  method: 'GET',
  url: 'https://api.github.com/repos/foo/bar/contents/path%2Fto%2Fbinary%2Ffile',
  headers: {
    accept: 'application/vnd.github.v3.raw',
    'user-agent': 'octokit-rest.js/0.0.0-development octokit-core.js/2.4.0 Node.js/12.15.0 (macOS Catalina; x64)'
  }
}

Now you can send the actual request with whatever request library you prefer, such as fetch in the browser or node-fetch in Node.js.
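A rough sketch of that step, assuming a fetch-compatible function is available (the global fetch in recent Node.js versions, or node-fetch passed in explicitly). fetchBinary and its fetchImpl parameter are hypothetical names, not part of octokit:

```javascript
// Hypothetical helper: send endpoint()-style request options and return the
// response body as raw bytes.
// `fetchImpl` is an assumption: any fetch-compatible function will do.
async function fetchBinary(requestOptions, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(requestOptions.url, {
    method: requestOptions.method,
    headers: requestOptions.headers,
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  // arrayBuffer() returns the body bytes without any text decoding.
  return Buffer.from(await response.arrayBuffer());
}

// Usage (not executed here): const bytes = await fetchBinary(requestOptions);
```

Reading the body through arrayBuffer() instead of text() is the point of the exercise: the bytes never pass through a string.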

The proper solution would need to be tested & implemented in https://github.com/octokit/request.js/

gr2m avatar Feb 10 '20 23:02 gr2m

Not sure how that workaround works with private repos, since it doesn't include the token. Also, I think this is a bug on GitHub's side: it seems to be returning content-type: text/plain, which is simply wrong.

djMax avatar Mar 27 '20 03:03 djMax

With v17, you can get the current token using const { token } = await octokit.auth(), then pass it as an authorization: token ${token} header.
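Combined with the endpoint() output from the earlier comment, adding that header could be sketched as follows. withToken is a hypothetical helper and the token value is a placeholder; in practice the token would come from octokit.auth() as described above.

```javascript
// Hypothetical helper: return a copy of the request options with the
// authorization header added, leaving the original object untouched.
function withToken(requestOptions, token) {
  return {
    ...requestOptions,
    headers: {
      ...requestOptions.headers,
      authorization: `token ${token}`,
    },
  };
}

// Stand-in for the object returned by octokit.repos.getContents.endpoint(...).
const baseOptions = {
  method: "GET",
  url: "https://api.github.com/repos/foo/bar/contents/path%2Fto%2Fbinary%2Ffile",
  headers: { accept: "application/vnd.github.v3.raw" },
};
const authed = withToken(baseOptions, "PLACEHOLDER_TOKEN");
console.log(authed.headers);
```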

gr2m avatar Mar 27 '20 15:03 gr2m

Does the problem persist for you? Did you check in with GitHub if the problem is on their side (with the content-type: text/plain header)?

gr2m avatar Apr 18 '21 20:04 gr2m

I just stumbled over this; however, it's more that the behaviour of mediaType: { format: 'raw' } surprised me. I was expecting that calling octokit.rest.repos.getContent(...) with it set to raw would return a content-file schema type with the content property in utf8 instead of base64, but it actually returns the file content directly.
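For text files, code that has to cope with both shapes could branch on the type of the response data. The sketch below is only an illustration: contentToText is a hypothetical helper, and it assumes the non-raw response carries encoding and content fields as in the content-file schema.

```javascript
// Hypothetical helper: return file text whether `data` is the raw body
// (a string) or a content-file object with base64-encoded content.
function contentToText(data) {
  if (typeof data === "string") {
    return data; // raw format: the response data is the file content itself
  }
  if (data && data.encoding === "base64" && typeof data.content === "string") {
    return Buffer.from(data.content, "base64").toString("utf8");
  }
  throw new TypeError("unrecognized response shape");
}

console.log(contentToText("hello\n"));
console.log(contentToText({ encoding: "base64", content: "aGVsbG8K\n" }));
```

This only makes sense for text content; binary files would still need to be handled as bytes.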

Hope this makes sense. It definitely is workable for me, but wanted to drop this piece of info here

NOTE: I use this lib indirectly by using '@actions/github' (so not 100% sure it fits here)

k3rnelpan1c-dev avatar Jun 28 '21 20:06 k3rnelpan1c-dev