github3.py
file_contents can't be used with large files, but Contents() objects from directory_contents can
When dealing with files over 1 MB via the Contents API, getting a Contents() object through repo.file_contents() fails, because that call also tries to download the file's contents in the same request.
However, directory_contents works, because it does not populate the contents of the Contents objects it returns until refresh is called on them.
This inconsistency is somewhat annoying: when using directory_contents you need to remember to call refresh, whereas when you just want the metadata of a specific file you need to fall back to iterating over the directory's contents.
It would be nice if file_contents accepted a parameter like "no_content" and, under the hood, used the directory-listing method so that a directory-style Contents object (metadata only) could be obtained for a single file.
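Until something like that exists, the workaround can be wrapped in a small helper. This is only a sketch: file_metadata is a hypothetical name, not part of github3.py, and it assumes directory_contents yields (name, Contents) pairs, which is how I'm using it in the session further down. I have not checked how it behaves for files in the repository root, where the directory part is an empty string.

```python
import posixpath


def file_metadata(repo, path, ref=None):
    """Return the directory-style entry for `path`, or None if not found.

    Hypothetical helper, not part of github3.py. It assumes that
    repo.directory_contents(directory, ref=...) yields (name, contents)
    pairs and that those entries carry metadata only (no file body).
    """
    # Repository paths always use forward slashes, so split with posixpath.
    directory, name = posixpath.split(path)
    for entry_name, entry in repo.directory_contents(directory, ref=ref):
        if entry_name == name:
            return entry
    return None
```

Usage would be something like file_metadata(repo, "src/main/json/assets.json", ref="refs/heads/rouesnwi-ansible"), after which .sha is available without the file body having been downloaded.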
EDIT:
Played around with this a bit more and it's even worse: the pseudo Contents objects point to the correct file right up until you call refresh on them, at which point they change what they point to entirely:
c = [ c for c in repo.directory_contents("src/main/json", ref="refs/heads/rouesnwi-ansible") if c[0] == "assets.json" ][0][1]
c.sha
Out[327]: u'6e88c54be71bdd9c8ca2978998b3b9efb0a7f76a'
c.refresh()
Out[328]: <Content [src/main/json/assets.json]>
c.sha
Out[329]: u'26d7663a95934d55ca1a9fc7705ca76ff82e9689'
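A defensive check along these lines at least makes the change visible. Again just a sketch: refresh_checked is a hypothetical helper, not part of github3.py, and it only relies on the .sha attribute and refresh() method used in the session above. It detects the mismatch; it does not explain or fix it.

```python
def refresh_checked(content):
    """Refresh `content` and report whether its sha changed.

    Hypothetical helper, not part of github3.py. Returns a tuple
    (sha_before, sha_after, unchanged).
    """
    sha_before = content.sha
    # In the session above, refresh() updates the object in place.
    content.refresh()
    sha_after = content.sha
    return sha_before, sha_after, sha_before == sha_after
```

If the third element comes back False, the refreshed object no longer describes the blob that the directory listing returned.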