
Support downloading/uploading entire data directory

Open anowell opened this issue 10 years ago • 6 comments

Would like to download an entire data directory via: algo download <data-directory-uri> [local-directory]

and be able to do so with concurrency similar to how algo upload works.

(Waiting for #1 to replace how concurrency works)

anowell avatar Aug 29 '15 02:08 anowell

Might just use 'sync' instead of upload / download

Argoday avatar Aug 29 '15 17:08 Argoday

I'd actually like to use 'cp'... I think it's the same as your suggested 'sync', but semantically more like scp.

algo cp [-r] <source>... <dest> (bonus points for aliasing to acp)

However, if you allow either source or dest to arbitrarily be a data URI, then you need the data:// protocol prefix to resolve ambiguity (.my/foo could be a local directory). I don't like the ergonomics of forcing data:// on the command-line, so I favor deducing user intent in the most common, least ambiguous cases. I propose these ambiguity resolution rules (that only apply if none of the args use a "data://" prefix):

  1. If a single source is specified
    • If source resolves locally, but dest does not and dirname(dest) is not a local directory: upload
    • If dest resolves locally, but source does not: download
  2. If multiple sources are specified
    • If all sources resolve locally and dest does not: upload
    • If dest resolves locally, but any sources do not: download
  3. All other cases are ambiguous and result in a warning to explicitly use data:// prefix
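
For illustration, the rules above could be sketched roughly like this in Python. This is not the CLI's implementation: the function name `guess_direction`, the return strings, and the injectable `exists`/`isdir` checks are all hypothetical, and treating an empty dirname(dest) as the current directory is my own assumption about what "dirname(dest) is a local directory" should mean for a bare filename.

```python
import os.path
from typing import Callable, Sequence

DATA_PREFIX = "data://"


def guess_direction(
    sources: Sequence[str],
    dest: str,
    exists: Callable[[str], bool] = os.path.exists,
    isdir: Callable[[str], bool] = os.path.isdir,
) -> str:
    """Return 'explicit', 'upload', 'download', or 'ambiguous' (hypothetical labels)."""
    if not sources:
        raise ValueError("at least one source is required")

    # The rules only apply when no argument uses the data:// prefix.
    if any(p.startswith(DATA_PREFIX) for p in [*sources, dest]):
        return "explicit"

    dest_local = exists(dest)
    sources_local = [exists(s) for s in sources]

    if len(sources) == 1:
        # Rule 1: single source.
        # Assumption: a bare filename's parent is the current directory.
        parent = os.path.dirname(dest) or "."
        if sources_local[0] and not dest_local and not isdir(parent):
            return "upload"
        if dest_local and not sources_local[0]:
            return "download"
    else:
        # Rule 2: multiple sources.
        if all(sources_local) and not dest_local:
            return "upload"
        if dest_local and not all(sources_local):
            return "download"

    # Rule 3: everything else needs an explicit data:// prefix.
    return "ambiguous"
```

Passing `exists` and `isdir` in as parameters makes the dependence on local state visible: the same command line can give a different answer on a machine with a different directory layout.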

Notes:

  • Ambiguity resolution could vary between machines. Examples and usage help should emphasize using data:// for maximum portability.
  • We could still add ambiguity resolution for direct copy between data URIs in the future (when neither source nor dest nor dirname(dest) resolve locally).

anowell avatar Aug 31 '15 19:08 anowell

'cp' ~= 'sync', either term is good

I don't like going down the route of solving the ambiguity problem here ... just have one side or both have the data:// prefix ... it is:

  1. simple
  2. not surprising
  3. doesn't depend on local state to know what it does
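
As a rough sketch of the prefix-only approach (again hypothetical, not code from the CLI): the direction is read off the arguments alone, with no filesystem checks. The name `direction_from_prefix`, the return strings, and the choice to reject a mix of local and data:// sources are assumptions made for this example.

```python
from typing import Sequence

DATA_PREFIX = "data://"


def direction_from_prefix(sources: Sequence[str], dest: str) -> str:
    """Decide the copy direction from data:// prefixes only (no local state)."""
    if not sources:
        raise ValueError("at least one source is required")

    src_remote = [s.startswith(DATA_PREFIX) for s in sources]
    dest_remote = dest.startswith(DATA_PREFIX)

    # Assumption: mixing local and remote sources in one call is rejected.
    if any(src_remote) and not all(src_remote):
        raise ValueError("sources mix local paths and data:// URIs")

    if all(src_remote) and dest_remote:
        return "remote-copy"
    if all(src_remote):
        return "download"
    if dest_remote:
        return "upload"

    # Neither side is marked as remote, so there is nothing to copy to or from.
    raise ValueError("prefix the remote side with data://")
```

The result depends only on the strings typed, so the same command behaves the same way on every machine.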

Argoday avatar Aug 31 '15 20:08 Argoday

Note: scp is fully deterministic and does not rely on local state, instead choosing to use remote decorators.

Argoday avatar Aug 31 '15 21:08 Argoday

  1. it's also the only command in the entire algo utility that would require the data:// prefix.

I also don't like the dependence on local state. This is why I've punted so far on implementing 'cp' and stuck with separate 'download' and 'upload' commands. But I also prefer that a CLI tool assumes intuitive/expected behavior in the sloppy cases (as long as it provides a way to be explicit, e.g. curl guesses protocol if not specified, scp assumes username based on local state if not specified).

I imagine this as a first experience with the Data API from the CLI:

$ algo ls
anowell
$ algo ls anowell
$ algo mkdir anowell/foo
Created directory data://anowell/foo
$ algo ls anowell
foo
$ algo cp myfile.txt anowell/foo
Warn: potentially ambiguous paths - prefix remote paths with data:// to avoid this warning.
Uploaded data://anowell/foo/myfile.txt

I just find myself thinking "why make that an error, and force them to re-type it when we can confidently know what they intended"

anowell avatar Sep 01 '15 18:09 anowell

of course, for the sake of arguing with myself:

  1. adding ambiguity resolution is backward compatible. removing it is not.

anowell avatar Sep 01 '15 18:09 anowell