CONC line generated in SOUR:PAGE if exporting a source citation with too many characters

Open Jefferson49 opened this issue 2 years ago • 5 comments

Observed with webtrees 2.1.16

If a source citation is entered with more than the allowed 248 characters (WHERE_WITHIN_SOURCE:= {Size=1:248}), webtrees accepts the long citation text. However, when exporting the GEDCOM, a CONC line is added. This results in a non-standard SOUR:PAGE:CONC export.

Expected behavior: Source citation text with more than the allowed 248 characters should be cut to 248 characters while entering the data.

GEDCOM snippet after entering the data and viewing with "Edit the raw GEDCOM"

0 @I7118@ INDI
1 EMIG
2 SOUR @S10309@
3 PAGE Schiffspassagierliste "La Olivia", Hamburg - Buenos Aires, 10.9.1927, https://www.ancestry.de/sharing/3420816?mark=7b22746f6b656e223a225a7534354e6d356547517064755645566463655763755a6654355434494e494b4d654f53795349425345513d222c22746f6b656e5f76657273696f6e223a225632227d

GEDCOM snippet after export from webtrees

0 @I7118@ INDI
1 EMIG
2 SOUR @S10309@
3 PAGE Schiffspassagierliste "La Olivia", Hamburg - Buenos Aires, 10.9.1927, https://www.ancestry.de/sharing/3420816?mark=7b22746f6b656e223a225a7534354e6d356547517064755645566463655763755a6654355434494e494b4d654f53795349425345513d222c22746f6b656e5f76657
4 CONC 273696f6e223a225632227d
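
For readers comparing the two snippets: the exported form carries the same value, because a CONC line is appended to the previous value without any added space. Below is a minimal sketch of that folding step in Python. The function name `merge_conc` and the shortened sample lines are illustrative assumptions, not webtrees code.

```python
def merge_conc(lines):
    """Fold CONC continuation lines back into the preceding line's value.

    Hypothetical helper for illustration; assumes well-formed GEDCOM lines
    of the form "<level> <tag> <value>".
    """
    merged = []
    for line in lines:
        parts = line.split(" ", 2)
        tag = parts[1] if len(parts) > 1 else ""
        value = parts[2] if len(parts) > 2 else ""
        if tag == "CONC" and merged:
            # CONC concatenates directly, with no separator inserted.
            merged[-1] += value
        else:
            merged.append(line)
    return merged


# Shortened, made-up sample in the same shape as the export above.
exported = [
    "3 PAGE Passenger list, https://example.org/share?token=abc",
    "4 CONC def123",
]
print(merge_conc(exported))
```

Applied to the exported record, this yields the single PAGE line shown in the first snippet.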

Jefferson49, May 05 '23 05:05

You raise two issues.

  1. GEDCOM 5.5.1 says that lines must be "no longer than 255 characters", implying that shorter lines are OK. It also recommends breaking long lines after a non-space character - which would also imply lines shorter than 255. Indeed, I have seen many GEDCOM files that wrap at 80 characters. Thus I believe we are allowed to add CONC pretty much anywhere.

  2. allowing PAGE values longer than 248 characters. The max length is missing in the code. I'll add it.
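
The wrapping rule described in point 1 can be sketched as follows. This is an illustration under stated assumptions, not the webtrees exporter: the function name `split_with_conc` and its parameters are hypothetical, and the default width of 255 is taken from the limit quoted above. The cut point is moved left so that no chunk ends or begins with a space.

```python
def split_with_conc(level, tag, value, max_len=255):
    """Split one GEDCOM line into a first line plus CONC continuations.

    Hypothetical helper; max_len counts the whole line including level and tag.
    """
    prefix = f"{level} {tag} "
    conc_prefix = f"{level + 1} CONC "
    lines = []
    rest = value
    while True:
        room = max_len - len(prefix)
        if room < 1:
            raise ValueError("max_len is too small for the line prefix")
        if len(rest) <= room:
            lines.append(prefix + rest)
            return lines
        cut = room
        # Break between two non-space characters so that no space is lost
        # when a reader concatenates the pieces again.
        while cut > 1 and (rest[cut - 1] == " " or rest[cut] == " "):
            cut -= 1
        lines.append(prefix + rest[:cut])
        rest = rest[cut:]
        prefix = conc_prefix


# Small width chosen only to make the wrapping visible.
sample = "a" * 10 + " " + "b" * 10
for line in split_with_conc(3, "PAGE", sample, max_len=17):
    print(line)
```

Any width up to the limit produces lines that concatenate back to the same value, which is why files wrapped at 80 characters are still readable.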

fisharebest, May 05 '23 14:05

> Thus I believe we are allowed to add CONC pretty much anywhere.

It just looked very weird to me at first sight, but I agree.

> allowing PAGE values longer than 248 characters. The max length is missing in the code. I'll add it.

I also think that this is the kernel of the issue. Thank you for looking at it.

Jefferson49, May 05 '23 16:05

I'd like to understand the proposed fix. I've been relying on webtrees' non-strict implementation of the 5.5.1 standard, mostly to accommodate FamilySearch's own "cite this record" recommendation. I liked it because, even though it is not guaranteed to transfer to other genealogy applications, at least it was perfectly human-readable and consistent with CONC usage.

I have a large number of existing non-compliant PAGE records and need to understand if and how I need to change them.

If the PAGE values are to be clipped can it accommodate a pasting workflow? So that the whole line is available for editing (to see what to delete to get it into the character limit) without throwing away clipped data.

tronsmit, May 09 '23 17:05

> If the PAGE values are to be clipped can it accommodate a pasting workflow?

This is my concern - and the reason why I have not (yet??) implemented this.

In GEDCOM 7.0, all length limits are removed.

I think that pasting an over-long value into a citation field, and having it silently truncated would be a very poor user experience.

The policy of webtrees is to be able to handle all GEDCOM - whether valid or invalid - but to encourage new data to be valid.

I think we should replace the "hard limit" with a soft limit. If you try to enter too many characters, you will see a warning - but they will be accepted.
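
A minimal sketch of such a soft limit, written in Python for illustration only. The names `PAGE_SOFT_LIMIT` and `check_page_value` are hypothetical and the warning text is made up; the real change would live in the webtrees edit form. The point is that the value is returned untouched and the caller only receives a warning to display.

```python
PAGE_SOFT_LIMIT = 248  # WHERE_WITHIN_SOURCE {Size=1:248} in GEDCOM 5.5.1


def check_page_value(value, soft_limit=PAGE_SOFT_LIMIT):
    """Return the value unchanged, plus any warnings to show the user.

    Hypothetical helper: nothing is truncated, over-long input is only flagged.
    """
    warnings = []
    if len(value) > soft_limit:
        warnings.append(
            f"PAGE value has {len(value)} characters; "
            f"GEDCOM 5.5.1 allows at most {soft_limit}."
        )
    return value, warnings


pasted = "x" * 300
kept, notes = check_page_value(pasted)
print(len(kept), notes)
```

With this shape, a pasted citation stays fully editable, and the user can decide what to trim.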

fisharebest, May 09 '23 20:05

I disagree that a CONC after PAGE is a valid construct for strict v5.5.1 GEDCOM.

However, since v5.5.1 allows custom tags anywhere, if we allow PAGE text longer than 248 characters, then a _CONC would be allowed.

PAGE._CONC could then be converted into an unlimited payload for v7 GEDCOM in the future!

Norwegian-Sardines, May 10 '23 04:05