[Feat] g.extension: improved git support

Open ninsbl opened this issue 5 years ago • 17 comments

Is your feature request related to a problem? Please describe.

Since GRASS moved to git, it would be nice to support more options for working with git repositories and, more specifically, common git workflows. This includes:

  • [ ] authentication (see #1162): With proxies, some authentication is already supported, and with the switch from urllib to requests (#624, see: https://requests.readthedocs.io/en/master/user/authentication/) authentication via a .netrc file is (probably) implemented as well (and thus just needs tests and documentation). However, adding explicit options for authentication might make it easier for users to work with e.g. private repos...
  • [x] branches (implemented in PR #1130): When providing the full download link to a zip file, people can already install AddOns from specific branches. That, however, requires some knowledge of how git platforms work (and some extra steps to get the full URL). Just pointing to a repository URL fetches master by default. Should be relatively easy to implement...
  • [x] forks (see #1177): If people have forks of the official repository, it would be nice to be able to install AddOns from branches in those repos for testing... This might require a flag or the like to indicate that the given URL represents a fork of the official repo...
  • [ ] add tests for download of repos (see PR #1158): Currently, installation from online resources is not tested. This should be added, at least for the most common options.
  • [ ] add documentation for installation from git repositories (see #1162): A short git section and mainly examples should do...
  • [ ] make sure other related tickets are sufficiently addressed:
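
To illustrate the .netrc route mentioned in the authentication item, here is a minimal sketch of how credentials for a host can be looked up with Python's standard-library netrc module. The helper name lookup_credentials and the sample entry are made up for illustration; this is not code from g.extension. According to its documentation, requests performs a comparable lookup on its own when no explicit auth argument is passed.

```python
import netrc
import os
import tempfile


def lookup_credentials(host, netrc_path):
    """Return (login, password) for host from a netrc file, or None."""
    entry = netrc.netrc(netrc_path).authenticators(host)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password


# Demo with a throwaway file instead of the real ~/.netrc
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "netrc")
    with open(demo_path, "w") as handle:
        # Hypothetical entry, for illustration only
        handle.write("machine github.com login demo-user password demo-token\n")
    demo_credentials = lookup_credentials("github.com", demo_path)
    missing_credentials = lookup_credentials("example.org", demo_path)

print(demo_credentials)
print(missing_credentials)
```

A test for g.extension could follow the same pattern: write a temporary netrc file, point the download code at it, and check that the expected credentials are picked up.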

Describe the solution you'd like

  1. Update the manual for most relevant git workflows.
  2. Implement missing features in g.extension

Additional context

The aim would be to foster usage, development, and contribution...

ninsbl avatar May 12 '20 10:05 ninsbl

Also (suggestion by Panos) support to install an add-on from a specific commit.

NikosAlexandris avatar Jul 09 '20 08:07 NikosAlexandris

Branch option implemented in: 1ed8b2c

ninsbl avatar Dec 02 '20 21:12 ninsbl

Backported to the upcoming GRASS GIS 7.8.5 in d3d71a0

neteler avatar Dec 02 '20 21:12 neteler

New related issue reported in #1150

neteler avatar Dec 05 '20 10:12 neteler

Also (suggestion by Panos) support to install an add-on from a specific commit.

@tmszi implemented that in 13ff843

ninsbl avatar Dec 15 '20 09:12 ninsbl

Since tagging will not be possible in grass-addons, we would like to refer to a fixed commit in order to access an addon as it was at a given point in time.

Example: r.learn.ml2, the version from May 2020: https://github.com/OSGeo/grass-addons/tree/d033fe07998066af5b50ea3b3adb2b587e767a76/grass7/raster/r.learn.ml2

The call would be

g.extension extension=r.learn.ml2 branch=d033fe07998066af5b50ea3b3adb2b587e767a76

Would that be feasible?

anikaweinmann avatar Dec 16 '20 14:12 anikaweinmann

I see, that way you would get a "stable" version of an addon and exclude one possible point of failure...

With the current approach using svn export, I am afraid that is not possible. svn export has a revision option, but I did not get it to work with a git hash...

I looked a bit into alternatives to svn export. The only solutions I found were either to download a full archive (.zip or .tar.gz) or to download single files through the API (https://github.com/OSGeo/grass/pull/1177#issuecomment-745298056), which can be a bit cumbersome, especially for bundled modules...
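
As a rough sketch of the full-archive option: the helper below builds a GitHub archive URL for a given reference, following the archive URL scheme that appears later in this thread for a release tag. The function name archive_url is hypothetical, and that the same scheme accepts a branch name or commit hash is an assumption here, not something verified in this thread.

```python
def archive_url(owner, repo, ref, fmt="zip"):
    """Build a GitHub archive URL for a branch name, tag, or commit hash."""
    if fmt not in ("zip", "tar.gz"):
        raise ValueError("fmt must be 'zip' or 'tar.gz'")
    return "https://github.com/{}/{}/archive/{}.{}".format(owner, repo, ref, fmt)


# The commit from the r.learn.ml2 example above
url = archive_url("OSGeo", "grass-addons", "d033fe07998066af5b50ea3b3adb2b587e767a76")
print(url)
```

Note that such an archive contains the whole repository at that reference, so for a single addon it transfers far more data than is needed.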

ninsbl avatar Dec 16 '20 21:12 ninsbl

The following code (not fine tuned):

#!/usr/bin/env python3

import json
import os
import urllib.request  # "import urllib" alone does not load the request submodule

user = "ninsbl"
repo = "grass-addons"
reference = 'd033fe07998066af5b50ea3b3adb2b587e767a76'

original_url = "https://api.github.com/repos/{user}/{repo}/contents/".format(user=user, repo=repo)

addon_dirs = ["grass7/imagery/i.sentinel"]


def download_git(base_url, directory, reference, lstrip=2):
    """Download the files of one directory and return its subdirectories."""
    # List the directory content at the given reference (branch or commit)
    with urllib.request.urlopen(base_url + directory + '?ref=' + reference) as req:
        content = json.loads(req.read())

    directories = []
    for element in content:
        # Drop the leading path components (here: grass7/imagery)
        path = os.path.join(*element['path'].split('/')[lstrip:])
        print(path)
        parent = os.path.dirname(path)
        if parent:
            os.makedirs(parent, exist_ok=True)

        if element['download_url'] is not None:
            # Regular file: download it
            urllib.request.urlretrieve(element['download_url'], path)
        else:
            # Subdirectory: remember it for a later API call
            directories.append(element['path'])
    return directories


# Work through the queue; pop() avoids modifying the list while iterating over it
while addon_dirs:
    print(addon_dirs)
    addon_dir = addon_dirs.pop(0)
    addon_dirs += download_git(original_url, addon_dir, reference)

would download a given addon (here i.sentinel) recursively through the GitHub API (meaning all subdirectories in the case of bundled modules or modules with libraries). I have not fully tested whether the commit reference works as expected.
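
The two pure steps of that approach can be isolated so they are testable without network access. The sketch below builds the contents API URL with a properly encoded ref parameter and splits a directory listing into files and subdirectories, using the download_url field in the same way as the script above. The function names and the sample listing are hypothetical.

```python
from urllib.parse import quote, urlencode


def contents_url(owner, repo, path, ref):
    """Build the GitHub contents API URL for a path at a given reference."""
    return "https://api.github.com/repos/{}/{}/contents/{}?{}".format(
        owner, repo, quote(path.strip("/")), urlencode({"ref": ref})
    )


def split_listing(listing):
    """Split a directory listing into file entries and subdirectory paths."""
    files = [entry for entry in listing if entry["download_url"] is not None]
    subdirs = [entry["path"] for entry in listing if entry["download_url"] is None]
    return files, subdirs


# Made-up listing that mimics the fields used by the script above
sample = [
    {"path": "grass7/imagery/i.sentinel/Makefile", "download_url": "https://example.org/Makefile"},
    {"path": "grass7/imagery/i.sentinel/i.sentinel.download", "download_url": None},
]
files, subdirs = split_listing(sample)
api_url = contents_url("OSGeo", "grass-addons", "grass7/imagery/i.sentinel", "abc123")
print(api_url)
print(subdirs)
```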

That would remove the dependency on svn. The API could also be used instead of git, e.g. to list addons. With the move to the API, it would also be possible to fetch addons from GitHub on Windows (at least Python AddOns), as an alternative to downloading precompiled addons (see also https://trac.osgeo.org/grass/ticket/3298)... In addition, we could make the build system a context-dependent dependency (needed only for C modules or Python modules with a library). That would make at least some g.extension functionality available for people without access to a build system...

The downside is that it downloads each file individually and uncompressed. On my box with a decent internet connection, that is ~ 8 sec for i.sentinel, compared to ~ 5 sec when using svn export...

That would however be a bigger change and requires sufficient discussion. Personally, I think the advantages of dropping svn and maybe even git dependencies outweigh disadvantages of increased data transfer... But there can be different opinions and I may overlook something important...

ninsbl avatar Dec 17 '20 08:12 ninsbl

@ninsbl I tested different ways to download a single folder from GitHub, but did not find one that is really faster than svn export. On my machine, your code above is sometimes faster and sometimes a little slower than svn export.

I have found a git alternative to svn export, but it takes much longer (45 sec instead of 7 or 8 sec):

import os
import subprocess

import grass.script as grass
from git import Repo # python -m pip install gitpython

# Commit to check out (the hash from the git checkout command below)
reference = '0e61c94d2b4aab5f22a6b001cf0b2dc2c46662ba'

# mkdir folder && cd folder
curr_path = os.getcwd()
new_repo_path = grass.tempdir()

# git init
new_repo = Repo.init(new_repo_path)

# git remote add origin https://github.com/OSGeo/grass-addons.git
origin = new_repo.create_remote('origin', 'https://github.com/OSGeo/grass-addons.git')
os.chdir(new_repo_path)

# git config core.sparseCheckout true
process = subprocess.run(['git', 'config', 'core.sparseCheckout', 'true'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# Check the exit code; output on stderr alone does not necessarily mean failure
if process.returncode != 0:
    grass.fatal(_("Configuring the single extension does not work: <git config core.sparseCheckout true> failed"))

# echo "grass7/imagery/i.sentinel"  > .git/info/sparse-checkout
extension_folder = "grass7/imagery/i.sentinel"
with open('.git/info/sparse-checkout', 'w') as git_conf:
    git_conf.write(extension_folder + '\n')

# git pull origin master # I'm not sure if you need the pull, but in python I have not been able to get it to run otherwise
origin.pull('master')

# git checkout -b hash_branch 0e61c94d2b4aab5f22a6b001cf0b2dc2c46662ba
try:
    new_repo.git.checkout(reference, b="hash_branch")
except Exception:
    # Format after translating so the message catalog lookup still matches
    grass.fatal(_("Reference <{}> not found").format(reference))
os.chdir(curr_path)

anikaweinmann avatar Jan 29 '21 15:01 anikaweinmann

@anikaweinmann, thanks for following up. IMHO, the two main arguments for e.g. switching to API download (instead of svn / git) are: a) to reduce dependencies for g.extension (thus, I would also avoid new dependencies like gitpython), and b) to allow installing an extension at a specific version (since we have no AddOn releases / release branches).

If there are no major performance issues with API download, I would say those two reasons by far outweigh a few seconds of download time and a few bytes of additional network traffic. i.sentinel is an addon collection with relatively many files and also images in the documentation, so it is in a way the worst-case scenario for the API download. And here the performance difference is 4 sec (svn export) to 3-8 sec (API download). For smaller addons like e.g. v.median, the difference is 2 sec (svn) to 1.5 sec (API), where my test included the start of the Python interpreter. So, API download may even give a performance gain (though that is irrelevant given a) and b)).

The API download should probably be tested on a slower internet connection...
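
For such a comparison, a small timing helper keeps the measurements consistent across download methods and connections. This is only a sketch: time_call and summarize are hypothetical names, and the sleep call is a stand-in for a function that performs the actual download.

```python
import statistics
import time


def time_call(func, repeats=3):
    """Run func several times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to min, median and max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


# Stand-in workload; replace with a call that performs the download
durations = time_call(lambda: time.sleep(0.01), repeats=3)
summary = summarize(durations)
print(summary)
```

Reporting the median over several runs reduces the influence of a single slow request, which matters when individual timings range from 3 to 8 sec as reported above.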

ninsbl avatar Feb 02 '21 10:02 ninsbl

@ninsbl would it be possible to add support for this? (at least for Python based addons):

Example:

# install actinia command line execution tool
C:\>g.extension extension=importer url=https://github.com/mundialis/ace
Downloading precompiled GRASS Addons <ace>...
ERROR: Cannot open URL:
       http://wingrass.fsv.cvut.cz/grass78/x86_64/addons/grass-7.8.5/ace.zip

The code is there but with the GitHub style name scheme:

  • https://github.com/mundialis/ace/releases/tag/1.0.0
    • --> https://github.com/mundialis/ace/archive/1.0.0.zip
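
The mapping between the two URLs above can be sketched as a small helper. The function name release_to_archive_url is hypothetical, and the expected path layout is inferred only from the example above, not from g.extension code.

```python
from urllib.parse import urlsplit


def release_to_archive_url(release_url, fmt="zip"):
    """Turn a GitHub release tag URL into the matching archive download URL."""
    parts = urlsplit(release_url)
    segments = parts.path.strip("/").split("/")
    # Expected path layout: <owner>/<repo>/releases/tag/<tag>
    if len(segments) != 5 or segments[2:4] != ["releases", "tag"]:
        raise ValueError("Not a release tag URL: {}".format(release_url))
    owner, repo, tag = segments[0], segments[1], segments[4]
    return "{}://{}/{}/{}/archive/{}.{}".format(
        parts.scheme, parts.netloc, owner, repo, tag, fmt
    )


archive = release_to_archive_url("https://github.com/mundialis/ace/releases/tag/1.0.0")
print(archive)
```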

neteler avatar Mar 12 '21 10:03 neteler

This should work on Linux:

g.extension url=https://github.com/mundialis/ace/archive/1.0.0.zip

Given the error message, I assume you are looking for a way to install Python addons on MS Windows?

ninsbl avatar Mar 12 '21 17:03 ninsbl

Yes, sorry, I forgot to add that it doesn't work on Windows.

neteler avatar Mar 12 '21 19:03 neteler

After the move from svn to git and with the restructured addons repository, it is about time for some general refactoring of g.extension. See #1700 for additional discussion of structural improvements. That possibly includes renaming: #1688

ninsbl avatar Jul 21 '21 14:07 ninsbl

Please see/consider also #376.

NikosAlexandris avatar Sep 04 '21 07:09 NikosAlexandris

Shouldn't this milestone be moved to 8.2?

veroandreo avatar Jan 18 '22 19:01 veroandreo

Here is how the git command can be used to get an addon together with the history of its code:

time (git clone --no-checkout --single-branch -b grass8 --filter=tree:0 https://github.com/OSGeo/grass-addons.git git_test; cd git_test; git sparse-checkout init --cone; git sparse-checkout set src/vector/v.stream.order; git checkout grass8; cd src/vector/v.stream.order; git log -n 1 .)

Or getting a version with a specific commit hash:

time (git clone --no-checkout --single-branch -b grass8 --filter=tree:0 https://github.com/OSGeo/grass-addons.git git_test; cd git_test; git sparse-checkout init --cone; git sparse-checkout set src/vector/v.stream.order; git checkout 5e55ded; cd src/vector/v.stream.order; git log -n 1 .)
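
For use from Python, the same sequence could be assembled as argument lists for subprocess. This is a minimal sketch: the function name sparse_checkout_commands is hypothetical, and git -C is used in place of the cd steps of the one-liners above. The commands are only built and printed here, not executed.

```python
def sparse_checkout_commands(repo_url, target_dir, addon_path, branch, ref=None):
    """Return the git commands (as argument lists) for a sparse checkout."""
    clone = [
        "git", "clone", "--no-checkout", "--single-branch", "-b", branch,
        "--filter=tree:0", repo_url, target_dir,
    ]
    in_repo = ["git", "-C", target_dir]
    return [
        clone,
        in_repo + ["sparse-checkout", "init", "--cone"],
        in_repo + ["sparse-checkout", "set", addon_path],
        # Check out a specific commit if given, otherwise the branch head
        in_repo + ["checkout", ref or branch],
    ]


commands = sparse_checkout_commands(
    "https://github.com/OSGeo/grass-addons.git",
    "git_test",
    "src/vector/v.stream.order",
    "grass8",
    ref="5e55ded",
)
for command in commands:
    print(" ".join(command))
# To run them: subprocess.run(command, check=True) for each entry
```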

ninsbl avatar May 07 '23 20:05 ninsbl