git-annex-adapter
git-annex-adapter copied to clipboard
Call git-annex commands from Python
================= Git-Annex-Adapter
This package lets you interact with git-annex_ from within Python.
Necessary commands are executed using subprocess and use their
batch versions whenever possible.
.. _git-annex: https://git-annex.branchable.com/
I'm developing this as needed, so feel free to ask if there's any functionality you want me to implement.
You might also like to check out datalad_'s annexrepo_ support package which is probably more featureful than this.
.. _datalad: https://github.com/datalad/datalad/ .. _annexrepo: https://docs.datalad.org/en/stable/generated/datalad.support.annexrepo.html#module-datalad.support.annexrepo
Requirements
- Python 3
- git-annex 6.20170101 (or later)
- pygit2 0.24 (or later)
Usage
To create a git-annex repository from scratch::
>>> from pygit2 import init_repository
>>> from git_annex_adapter import init_annex
>>> init_repository('/path/to/repo')
pygit2.Repository('/path/to/repo/.git/')
>>> init_annex('/path/to/repo')
git_annex_adapter.repo.GitAnnexRepo(/path/to/repo/.git/)
To wrap an existing git-annex repository::
>>> from git_annex_adapter.repo import GitAnnexRepo
>>> repo = GitAnnexRepo('/tmp/repo')
The GitAnnexRepo is a subclass of pygit2.Repository. Git-annex specific
functionality is accessed via the annex property of it, which is
a mapping object from git-annex keys to AnnexedFile objects::
>>> for key in repo.annex:
... print(key)
SHA256E-s3--2c26...
SHA256E-s3--baa5...
SHA256E-s3--fcde...
>>> key = 'SHA256E-s3--2c26...'
>>> repo.annex[key]
git_annex_adapter.repo.AnnexedFile('SHA256E-s3--2c26...')
You can also get a tree representation of any git tree-ish object with
annexed file entries replaced with AnnexedFile objects::
>>> tree = repo.annex.get_file_tree() # treeish='HEAD'
>>> tree
git_annex_adapter.repo.AnnexedFileTree(4d7f...)
>>> set(tree)
{'foo', 'bar', 'baz', 'README', 'directory'}
>>> tree['foo']
git_annex_adapter.repo.AnnexedFile(SHA256E-s3--2c26...)
>>> tree['directory']
git_annex_adapter.repo.AnnexedFileTree(8b54...)
>>> tree['directory/file'] # or tree['directory']['file']
<pygit2.Blob object at 0x...>
The AnnexedFile objects can be used to access and manipulate
information about a file.
The metadata property of the AnnexedFile is a mutable mapping
object from fields to sets of values::
>>> foo = tree['foo']
>>> for field, values in foo.metadata:
... print('{}: {}'.format(field, values))
author: {'me'}
numbers: {'1', '2', '3'}
>>> foo.metadata['numbers'] |= {'0'}
>>> foo.metadata['numbers'] -= {'3'}
>>> foo.metadata['numbers']
{'0', '2'}
>>> del foo.metadata['author']
>>> 'author' in foo.metadata
False
>>> foo.metadata['lastchanged']
'2017-07-19@15-00-00'
Calling Processes
If you need low-level access to the git-annex processes, you can do it
via the classes included in process module::
>>> from git_annex_adapter.process import ...
Subclasses of GitAnnexBatchProcess return relevant output (usually
one line or a dict object) whenever called with a line of input.
For example, git-annex metadata --batch --json::
>>> proc = GitAnnexMetadataBatchJsonProcess('/path/to/repo')
>>> proc(file='foo')
{..., 'key':'SHA256E-s3--2c26...', 'fields': ...}
>>> proc(file='foo', fields={'numbers': ['1', '2', '3']})
{..., 'key': ..., 'fields': {'numbers': ['1', '2', '3'], ...}}
Subclasses of GitAnnexRunner call a single program with
different arguments. They return a subprocess.CompletedProcess
when called, which captures stdout and stderr. For example, to run
git-annex version::
>>> runner = GitAnnexVersionRunner('/path/to/repo')
>>> runner(raw=True)
CompletedProcess(..., stdout='6.20170101', stderr='')
>>> print(runner().stdout)
git-annex version: 6.20170101
...