task icon indicating copy to clipboard operation
task copied to clipboard

feat: Support multiple providers for Remote Taskfiles and support directories

Open pbitty opened this issue 1 year ago • 5 comments

This PR aims to support multiple providers for Remote Taskfiles using go-getter, as I proposed in this comment in [#1317]. It is currently marked as Draft as I aim to get some early feedback, and iterate on it if we want to go forward with this approach or something similar.

This would support the following shapes of URLs as provided by go-getter (non-exhaustive):

URL Description
https://example.com/Taskfile.yaml File over HTTP
https://example.com/some-archive.zip Directory from ZIP archive over HTTP
github.com/go-task/task//testdata/includes/Taskfile.yml?ref=some_branch Directory from Github ref
[email protected]:repo/path Directory from any Git repository (via local git executiy using SSH)
my-bucket.s3.amazonaws.com/path/to/Taskfile.yml File from S3
my-bucket.s3.amazonaws.com/path/to/folder/ Directory from S3
my-bucket.s3.amazonaws.com/path/to/archive.zip ZIP file from S3
my-bucket.s3.amazonaws.com/path/to/archive.zip?taskfile=AnotherTaskfile.yml ZIP file from S3 with non-standard entrypointTaskfile

The aims of the change are as follows:

  • To support multiple source providers for remote taskfiles
  • To provide a well-defined syntax for source URLs that can support future providers if desired
  • To support downloading directories containing remote taskfiles and supporting files, not only single taskfiles
  • As an extension, we support downloading directories in the form of compressed archives (eg. zip, tgz, etc.) thanks to go-getter

These are not directly tied to go-getter but they are satisfied by the library fairly well.


Some points of note with the current implementation:

  • There doesn't seem to be a need for multiple remote nodes - as long as one satisfies the Getter and Detector interfaces, one can plug a new provider into RemoteNode.

  • The contract between Node and the rest of the code is that the Node places its content in a directory and provides FileDirectory and Filename, which are used for accessing the Taskfile and Taskfile directory and to set variables like .TASKFILE_DIR.

  • FileNode could eventually be replaced by logic in RemoteNode as well, if we'd like. go-getter supports local files by using symlinks and we could leverage this. I'm just not sure how well this would on Windows - I would be glad to test it.

  • The interaction between Node, Reader and Cache feels a bit confusing right now in terms of their lifecycle. It feels like we read the node contents a bit too late in the game, and that we could read the contents and deal with caching before we return a Node to the Reader. I didn't try to change this yet - I wanted to get some eyes on the overall idea, then make a plan to refactor it - either in this PR or a future one.

    ie: instead of calling NewNode() in the reader and then later call node.Read(), we would call something like r.loadNode() which already returns a fully loaded Node either from the cache or from remote. In other words, node initialization would include "reading". This would allow us to return a node with an immutable FileDirectory. Currently we have to replace the node if we end up loading its content from the cache because the directory changes.

This is also a big PR and I'd be happy to break it up into more manageable pieces if there's consensus. For example, some test and node/cache changes could be done first before adding go-getter.

What do you all think about the above?

pbitty avatar Aug 27 '24 18:08 pbitty

I realize this PR is pretty big and I am making many decisions here. If you'd like to talk about this in real-time perhaps on Discord, let me know!

pbitty avatar Aug 27 '24 21:08 pbitty

This was meant to be marked as Draft from the start - I only noticed it now.

pbitty avatar Sep 09 '24 19:09 pbitty

+1 from me as I desperately need GitLab support. Currently there’s no way to provide a GITLAB_TOKEN for repos that require authentication when using remote taskfiles.

mgbowman avatar Sep 10 '24 17:09 mgbowman

I don't love the way this PR turned out. I see two areas that are awkward right now:

How the FileNode and RemoteNode integrate with each other

Each node needs to be able to handle relative include paths in its own includes via ResolveEntrypoint. This requires that the FileNode be aware of remote nodes, and vice-verse. I have to investigate this a bit.

One idea is to have a single node type that handles file and remote. Another is to refactor this more and separate the path resolution aspect from the download aspect - although I'm not sure how much they can be separate since they depend on each other.

The sequence between node loading and caching

Currently I am storing remote node data in a temp directory until the user accepts it, then we move it to the cache. This means the node's "source" data points to the temp directory and we must reload it if we end up using the cached version (here). I'm trying to think if there's simpler logic there.

Caching behaviour in general

With the ability to download directories, this opens the door to larger downloads. The behaviour of automatically downloading every time feels a bit heavy. I wonder if we could have an option to change the default so that once a remote include is cached, we will always use the cache. For people using versioned refs (eg. v1.2.3) this shouldn't be an issue since the ref content is expected to be immutable.

In this case two options I see are: a) Decide to change the default to always use the cache if it's there and rely on --download to force a refresh b) Add an env var (or CLI arg or both) that allows us to invert the behaviour

pbitty avatar Sep 12 '24 16:09 pbitty

Thanks for your work on this! I would definitely be keen for an option to use the cache by default. To me it ties in nicely to this other feature request I created https://github.com/go-task/task/issues/1402

c-ameron avatar Sep 16 '24 13:09 c-ameron

This approach seems a bit heavy-handed and it's trying hard to contort go-getter into the shapes of go-task. I will propose a lighter approach.

pbitty avatar Nov 04 '24 18:11 pbitty