webtrees
CONC line generated in SOUR:PAGE if exporting a source citation with too many characters
Observed with webtrees 2.1.16
If a source citation with more than the allowed 248 characters (WHERE_WITHIN_SOURCE:= {Size=1:248}) is entered, webtrees accepts the long citation text. However, when exporting the GEDCOM, a CONC line is added. This results in a non-standard SOUR:PAGE:CONC export.
Expected behavior: Source citation text with more than the allowed 248 characters should be truncated to 248 characters when the data is entered.
GEDCOM snippet after entering the data and viewing with "Edit the raw GEDCOM":
0 @I7118@ INDI
1 EMIG
2 SOUR @S10309@
3 PAGE Schiffspassagierliste "La Olivia", Hamburg - Buenos Aires, 10.9.1927, https://www.ancestry.de/sharing/3420816?mark=7b22746f6b656e223a225a7534354e6d356547517064755645566463655763755a6654355434494e494b4d654f53795349425345513d222c22746f6b656e5f76657273696f6e223a225632227d
GEDCOM snippet after export from webtrees:
0 @I7118@ INDI
1 EMIG
2 SOUR @S10309@
3 PAGE Schiffspassagierliste "La Olivia", Hamburg - Buenos Aires, 10.9.1927, https://www.ancestry.de/sharing/3420816?mark=7b22746f6b656e223a225a7534354e6d356547517064755645566463655763755a6654355434494e494b4d654f53795349425345513d222c22746f6b656e5f76657
4 CONC 273696f6e223a225632227d
You raise two issues.
- GEDCOM 5.5.1 says that lines must be "no longer than 255 characters", implying that shorter lines are OK. It also recommends breaking long lines after a non-space character, which would also imply lines shorter than 255. Indeed, I have seen many GEDCOM files that wrap at 80 characters. Thus I believe we are allowed to add CONC pretty much anywhere.
- allowing PAGE values longer than 248 characters. The max length is missing in the code. I'll add it.
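The wrapping rule described in the first point can be sketched as follows. This is only an illustration, not the webtrees exporter: the function name `wrap_with_conc` and the `max_len` parameter are hypothetical, and the length is counted here over the whole line (level, tag, and value) without the line terminator, which is an assumption about how the limit is applied.

```python
def wrap_with_conc(level: int, tag: str, value: str, max_len: int = 255) -> list[str]:
    """Split one logical GEDCOM line into a first line plus CONC continuations.

    Hypothetical helper: every chunk that is followed by a CONC line is made
    to end in a non-space character where possible, so a reader can rejoin
    the chunks without losing a trailing space.
    """
    prefix = f"{level} {tag} "
    conc_prefix = f"{level + 1} CONC "
    if max_len <= max(len(prefix), len(conc_prefix)):
        raise ValueError("max_len leaves no room for a value")

    lines = []
    rest = value
    while True:
        room = max_len - len(prefix)
        if len(rest) <= room:
            lines.append(prefix + rest)
            return lines
        cut = room
        # Move the cut to the left until the chunk ends in a non-space character.
        while cut > 1 and rest[cut - 1] == " ":
            cut -= 1
        lines.append(prefix + rest[:cut])
        rest = rest[cut:]
        prefix = conc_prefix  # all further lines are continuations
```

With a smaller `max_len`, such as 80, the same function produces the shorter wrapped lines mentioned above.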
> Thus I believe we are allowed to add CONC pretty much anywhere.
It just looked very weird to me on first sight, but I agree.
> allowing PAGE values longer than 248 characters. The max length is missing in the code. I'll add it.
I also think that this is the kernel of the issue. Thank you for looking at it.
I'd like to understand the proposed fix. I've been relying on webtrees' non-strict implementation of the 5.5.1 standard, mostly to accommodate FamilySearch's own "cite this record" recommendation. I liked it, because even though it is not guaranteed to transfer to other genealogy applications - at least it was perfectly human-readable and consistent with CONC usage.
I have a large number of existing non-compliant PAGE records and need to understand if and how I need to change them.
If the PAGE values are to be clipped, can it accommodate a pasting workflow, so that the whole line is available for editing (to see what to delete to get it within the character limit) without throwing away the clipped data?
> If the PAGE values are to be clipped can it accommodate a pasting workflow?
This is my concern - and the reason why I have not (yet??) implemented this.
In GEDCOM 7.0, all length limits are removed.
I think that pasting an over-long value into a citation field, and having it silently truncated would be a very poor user experience.
The policy of webtrees is to be able to handle all GEDCOM - whether valid or invalid - but to encourage new data to be valid.
I think we should replace the "hard limit" with a soft limit. If you try to enter too many characters, you will see a warning - but they will be accepted.
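A soft limit of this kind could look roughly like the sketch below. It is not webtrees code: `check_page_value`, `FieldResult`, and `PAGE_SOFT_LIMIT` are hypothetical names, and the only value taken from the discussion is the 248-character size of WHERE_WITHIN_SOURCE in GEDCOM 5.5.1.

```python
from dataclasses import dataclass
from typing import Optional

# Size limit of WHERE_WITHIN_SOURCE in GEDCOM 5.5.1, as quoted in the report.
PAGE_SOFT_LIMIT = 248


@dataclass
class FieldResult:
    value: str               # the value exactly as entered, never truncated
    warning: Optional[str]   # None when the value is within the limit


def check_page_value(value: str, soft_limit: int = PAGE_SOFT_LIMIT) -> FieldResult:
    """Accept any PAGE value, but attach a warning if it exceeds the soft limit."""
    if len(value) > soft_limit:
        excess = len(value) - soft_limit
        message = (
            f"PAGE value has {len(value)} characters; "
            f"GEDCOM 5.5.1 allows {soft_limit} ({excess} too many)."
        )
        return FieldResult(value, message)
    return FieldResult(value, None)
```

Because the pasted text is kept in full, the user can still see the whole value and decide what to shorten, which is the pasting workflow asked about earlier in the thread.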
I disagree that a CONC after PAGE is a valid construct for strict v5.5.1 GEDCOM.
However, since v5.5.1 allows custom tags anywhere, if we allow PAGE text longer than 248 characters, then a _CONC would be allowed.
A PAGE:_CONC could then be converted into an unlimited payload for v7 GEDCOM in the future!
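Such a conversion might be sketched as follows. The function `join_conc` is hypothetical, and treating `_CONC` the same way as `CONC` is exactly the assumption proposed in the comment above, not existing behavior. The sketch also assumes that each continuation directly follows the line it continues, that the continued line already carries a value, and it does not check the level numbers.

```python
def join_conc(lines: list[str], conc_tags: tuple[str, ...] = ("CONC", "_CONC")) -> list[str]:
    """Merge CONC-style continuation lines into the line they continue."""
    merged: list[str] = []
    for line in lines:
        # A GEDCOM line is "level tag [value]"; split off at most three parts.
        parts = line.split(" ", 2)
        tag = parts[1] if len(parts) > 1 else ""
        value = parts[2] if len(parts) > 2 else ""
        if tag in conc_tags and merged:
            # CONC means "concatenate without inserting a space or newline".
            merged[-1] += value
        else:
            merged.append(line)
    return merged
```

Applied to the exported snippet from the original report, this would restore the single long PAGE line that was entered.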