Add something like `parent_id` to TSV specification

Open frederik-elwert opened this issue 6 years ago • 6 comments

Currently, the TSV specification allows only a free-text entry for parent. This makes it impossible to link to parent entries that are part of the same dataset. Would it make sense to add a column like parent_id that holds the id of the parent entry?

frederik-elwert avatar Aug 16 '19 12:08 frederik-elwert

Yes it does make sense, thanks. At the moment I'm working on a few modifications to this TSV spec based on other feedback, and will add this to that list. Should be up for comment within a few days.

kgeographer avatar Aug 16 '19 16:08 kgeographer

I've labeled the existing TSV spec as v0.1 and created a draft v0.2 and modified the examples. Labeled these "for comment" -- before coding the parsing in WHG, I'd want to hear comments, corrections, etc.

Many thanks for weighing in.

kgeographer avatar Aug 19 '19 15:08 kgeographer

Just a request for clarification: the spec for parent_id now states:

> URI for a web-published record of the parent_name above

How would I describe that an entity in the same file is the parent, which would have an id, but not necessarily a web-published record (yet)? Would something like #parent123 work? (Resembling a local id reference in XML.)

frederik-elwert avatar Sep 08 '19 07:09 frederik-elwert

Ah, good point. When dataset files are uploaded to WHG, records are assigned a placeid in our system that will remain constant through any future updating. So they are effectively published and web-accessible. If parents are uploaded separately and first, then their URIs can be used in files that follow, but it is unreasonable to expect that workflow.

So would it work to allow (and parse) values like "#2345" for parent_id? On import, rows with a "#" in that position would be processed last, after placeids have been assigned to the other rows.
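To make the proposal concrete, here is a minimal sketch of how such an import could work. It is not WHG's actual parser: the function `import_rows`, the `title` column, and the starting placeid are assumptions for illustration only. It is also a slight variation on "process those rows last": every row gets a placeid in a first pass, and the "#" references are resolved in a second pass.

```python
import csv
import io
from itertools import count


def import_rows(tsv_text, start_placeid=1000):
    """Hypothetical two-pass import: assign placeids, then resolve '#id' parents."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    next_placeid = count(start_placeid)  # stand-in for system-assigned ids
    placeid_by_local_id = {}
    records, deferred = [], []

    # Pass 1: every row gets a placeid; rows with a local reference are set aside.
    for row in reader:
        parent = (row.get("parent_id") or "").strip()
        record = {
            "placeid": next(next_placeid),
            "title": row["title"],
            "parent": parent or None,  # a URI, a '#id' reference, or None
        }
        placeid_by_local_id[row["id"]] = record["placeid"]
        if parent.startswith("#"):
            deferred.append(record)
        records.append(record)

    # Pass 2: replace each '#id' with the placeid assigned to that row.
    for record in deferred:
        local_id = record["parent"][1:]
        if local_id not in placeid_by_local_id:
            raise ValueError(f"parent_id #{local_id} does not match any row id")
        record["parent"] = placeid_by_local_id[local_id]

    return records
```

Because every placeid exists before any reference is resolved, the order of rows in the file does not matter in this sketch.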

kgeographer avatar Sep 08 '19 08:09 kgeographer

Yes, that sounds reasonable. In practice, I assume processing might become a bit more complex when more than two levels of hierarchy are included. But I guess this could be solved.

frederik-elwert avatar Sep 08 '19 10:09 frederik-elwert

Yes, solvable. Probably simplest as a database operation after all rows have been inserted. Settling this spec now means modifying a bunch of code and sample datasets, so I need to make upgrades to the spec as seldom as possible. Thanks again.
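As a rough illustration of the "database operation after all rows have been inserted" idea, the sketch below uses SQLite from Python. The table layout, the column names (`local_id`, `parent_ref`, `parent_placeid`), and the function `load_and_link` are all hypothetical and not taken from the WHG codebase. Since every row already has a placeid when the update runs, any number of hierarchy levels resolves in a single statement.

```python
import sqlite3


def load_and_link(rows):
    """Insert all rows first, then resolve '#id' parents in one UPDATE.

    rows: iterable of (local_id, title, parent_ref) tuples (assumed shape).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE places (
               placeid INTEGER PRIMARY KEY AUTOINCREMENT,
               local_id TEXT UNIQUE,
               title TEXT,
               parent_ref TEXT,
               parent_placeid INTEGER
           )"""
    )
    conn.executemany(
        "INSERT INTO places (local_id, title, parent_ref) VALUES (?, ?, ?)",
        rows,
    )
    # Self-lookup: strip the leading '#' and find the row with that local id.
    conn.execute(
        """UPDATE places
           SET parent_placeid = (
               SELECT p.placeid FROM places AS p
               WHERE p.local_id = substr(places.parent_ref, 2)
           )
           WHERE parent_ref LIKE '#%'"""
    )
    conn.commit()
    return conn
```

A three-level example, with children listed before their parents, could be loaded as `load_and_link([("3", "Village", "#2"), ("2", "District", "#1"), ("1", "Province", None)])`.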

kgeographer avatar Sep 08 '19 18:09 kgeographer