dpm-js

Datapackage name differences

Open morty opened this issue 9 years ago • 10 comments

After running dpm install, the package name shown in the tree output and the directory created to hold the downloaded datapackage may not match. This happens when the name in datapackage.json differs from the name returned by okfn/datapackage-identifier's parse function, which derives the name from the URL.

e.g.

curl http://example.com/foo/datapackage.json
{
  "name": "bar",
  ...
}

Using dpm install on this URL will put the files in datapackages/foo but the tree output at the end of the run will show datapackages/bar.
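The mismatch can be illustrated with a minimal sketch. It assumes the URL-derived name is simply the last directory segment before datapackage.json; `nameFromUrl` is a hypothetical helper written for this illustration, not the actual code of datapackage-identifier's parse function.

```javascript
// Hypothetical helper (an assumption, not datapackage-identifier's code):
// take the last directory segment of the URL path as the package name.
function nameFromUrl(url) {
  const segments = new URL(url).pathname.split('/').filter(Boolean);
  // Drop the trailing file name so the containing directory is used.
  if (segments[segments.length - 1] === 'datapackage.json') {
    segments.pop();
  }
  return segments[segments.length - 1];
}

const url = 'http://example.com/foo/datapackage.json';
const descriptor = { name: 'bar' }; // contents of the fetched datapackage.json

const installDir = `datapackages/${nameFromUrl(url)}`; // where files are written
const treeLabel = `datapackages/${descriptor.name}`;   // what the tree output shows

console.log(installDir);               // datapackages/foo
console.log(treeLabel);                // datapackages/bar
console.log(installDir === treeLabel); // false
```

Under this assumption, any URL whose last directory segment differs from the name field produces the inconsistency described above.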

morty avatar Jul 30 '15 18:07 morty

Confirmed, for example https://gist.github.com/mchelen/7c972c58f921c58d8c32:

$ dpm install https://gist.githubusercontent.com/mchelen/7c972c58f921c58d8c32/raw/c57c987daf16f11ab4477ccfb76be780a32769f2/datapackage.json
dpm http GET https://gist.githubusercontent.com/mchelen/7c972c58f921c58d8c32/raw/c57c987daf16f11ab4477ccfb76be780a32769f2/data.csv
dpm http 200 https://gist.githubusercontent.com/mchelen/7c972c58f921c58d8c32/raw/c57c987daf16f11ab4477ccfb76be780a32769f2/data.csv
.
└─┬ datapackages
  └─┬ blargh
    ├── datapackage.json
    └─┬ data
      └── data.csv

$ find .
.
./datapackages
./datapackages/c57c987daf16f11ab4477ccfb76be780a32769f2
./datapackages/c57c987daf16f11ab4477ccfb76be780a32769f2/data.csv
./datapackages/c57c987daf16f11ab4477ccfb76be780a32769f2/datapackage.json

I guess the question is which one it should be: the URL or the name? I'm thinking name, because someone could host the files in any arbitrary directory structure.

mchelen avatar Aug 05 '15 00:08 mchelen

I did some research on this a few days ago. If we can confirm that the name in datapackage.json should be the directory name for the install, I can create a PR with the changes.

alvaropinot avatar Apr 04 '16 16:04 alvaropinot

Thanks @alvaropinot. There's actually an open issue on whether name stays as a required attribute in datapackage.json: https://github.com/dataprotocols/dataprotocols/issues/237

danfowler avatar Apr 07 '16 15:04 danfowler

@alvaropinot as per @danfowler, I think this is something that may change, and there are some other juicier items to work on if you are interested :smile: so I'd suggest we leave this one.

rufuspollock avatar Apr 08 '16 10:04 rufuspollock

@rgrp sure, I'll be glad to help with whatever is juicier. Just tell me :D

alvaropinot avatar Apr 08 '16 22:04 alvaropinot

@alvaropinot fantastic! OK how about looking at the "render" stuff e.g. #48. I've been working on the underlying lib so it would be good to sync - can you jump on https://gitter.im/frictionlessdata/chat and ping me ...

rufuspollock avatar Apr 09 '16 10:04 rufuspollock

Experiencing the same behavior. When installing a package from GitHub, dpm uses the branch name as the installation folder (master), overwriting previous resource files if they share the same file names.

Perhaps dpm could use name as the installation folder, and default to the last part of the URL if name is not present.

inigoflores avatar Apr 27 '16 16:04 inigoflores

@inigoflores thanks for reporting. There are two issues here:

  • Using master as the data package name - that is a definite bug
  • Using the name attribute vs. the package "name" derived from the URL

As per the above discussion, we are considering deprecating name. Obviously, in that case we need to use something like the URL or equivalent to create a storage path (a bit like Go). We are planning to work on this ASAP.

In the meantime, we should try and fix the bug with master - if you can track that down that would be super helpful -- probably an issue in datapackage-identifier

rufuspollock avatar Apr 27 '16 21:04 rufuspollock

@rgrp thanks for your prompt response.

  • Regarding master as the package name, it's not a new bug per se. It's just that the URL to datapackage.json contains master as the last part of the path.

    dpm install https://raw.githubusercontent.com/codeforspain/ds-empleo/master/datapackage.json
    dpm install https://raw.githubusercontent.com/codeforspain/ds-organizacion-administrativa/master/datapackage.json
    

    Therefore, dpm installs every package under datapackages/master. Sorry for not describing the problem better.

  • As for deprecating name, I've read issue dataprotocols/dataprotocols#237 with interest, and I see the case for moving towards a unique ID. However, I would prefer dealing with names rather than with IDs.

What I was suggesting is to implement the following behavior:

  • If name is present, use it as the installation folder.
  • If name is missing, use id instead.
  • If both are missing (I don't believe this scenario is allowed) use the URL.

Not sure if this makes sense.
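A minimal sketch of that fallback order, for illustration only. `chooseInstallDir` is a hypothetical function, not part of dpm; it assumes the descriptor may carry `name` and `id` fields and that the URL fallback uses the last directory segment before datapackage.json.

```javascript
// Hypothetical sketch of the proposed fallback: name, then id, then URL.
// None of this is dpm's actual implementation.
function chooseInstallDir(descriptor, url) {
  // 1. Prefer the name declared in datapackage.json.
  if (descriptor.name) {
    return descriptor.name;
  }
  // 2. Otherwise fall back to the id, if one is declared.
  if (descriptor.id) {
    return descriptor.id;
  }
  // 3. As a last resort, use the last directory segment of the URL.
  const segments = new URL(url).pathname.split('/').filter(Boolean);
  if (segments[segments.length - 1] === 'datapackage.json') {
    segments.pop();
  }
  return segments[segments.length - 1];
}

const url = 'https://raw.githubusercontent.com/codeforspain/ds-empleo/master/datapackage.json';

console.log(chooseInstallDir({ name: 'ds-empleo' }, url)); // ds-empleo
console.log(chooseInstallDir({ id: 'abc-123' }, url));     // abc-123
console.log(chooseInstallDir({}, url));                    // master
```

The last call shows why the URL is only a last resort: with raw GitHub URLs it yields the branch name. A real implementation would also need to check that the chosen value is safe to use as a directory name.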

inigoflores avatar Apr 28 '16 09:04 inigoflores

I've just discovered by reading the docs at /doc/command-identifier.md that you can actually install a package through its GitHub URL:

dpm install https://github.com/codeforspain/ds-empleo
dpm install https://github.com/codeforspain/ds-organizacion-administrativa

This solves my problem, as packages are installed under the right folder.

Perhaps these instructions should appear in Readme.md (or at least be linked from it).

Thanks!

inigoflores avatar May 10 '16 18:05 inigoflores