A chapter label that has a cross-reference note!
I just found that this gave invalid OSIS:
\c 142
\cl 142. Mezmur \x - \xo 142:0 \xt 1Sa.22:1; 24:3\x*
\d Davut'un Maskili - Mağaradayken ettiği dua
The OSIS snippet is as follows:
<chapter sID="Ps.142" osisID="Ps.142" n="142" />
<!-- cl --><milestone type="x-chapterLabel" n="142. Mezmur<note type="crossReference"><reference type="annotateRef" subType="x-origin">142:0 </reference><reference>1Sa.22:1; 24:3</reference></note>" />
It's evident that u2o.py was not expecting the xref and deferred appending the /> until after processing it.
I've no idea whether ParaTExt still permits this. The translators were using an older version of ParaTExt when I was provided with the SFM files.
BTW, the language is Turkish.
Aside: In an earlier edition of the SFM file, the xref was attached to the canonical Psalm title. One of the translators must have thought it was a good idea to move it.
I'm not sure how to address this. Having the footnote in the chapter label doesn't seem right to me. I would need more information on whether this is actually acceptable in USFM.
I will ask my contact at UBSICAP.
I have asked the question by adding this issue.
Meanwhile, FYI, for the record: Psalm 142 was merely the last of 12 locations in this translation where a chapter label had an xref note.
\cl 3. Mezmur \x - \xo 3:0 \xt 2Sa.15:13-17:22\x*
\cl 34. Mezmur \x - \xo 34:0 \xt 1Sa.21:13-15\x*
\cl 51. Mezmur \x - \xo 51:0 \xt 2Sa.12:1-15\x*
\cl 52. Mezmur \x - \xo 52:0 \xt 1Sa.22:9-10\x*
\cl 54. Mezmur \x - \xo 54:0 \xt 1Sa.23:19; 26:1\x*
\cl 56. Mezmur \x - \xo 56:0 \xt 1Sa.21:13-15\x*
\cl 57. Mezmur \x - \xo 57:0 \xt 1Sa.22:1; 24:3\x*
\cl 59. Mezmur \x - \xo 59:0 \xt 1Sa.19:11\x*
\cl 60. Mezmur \x - \xo 60:0 \xt 2Sa.8:13; 1Ta.18:12\x*
\cl 63. Mezmur \x - \xo 63:0 \xt 1Sa.23:14\x*
\cl 89. Mezmur \x - \xo 89:0 \xt 1Kr.4:31\x*
\cl 142. Mezmur \x - \xo 142:0 \xt 1Sa.22:1; 24:3\x*
Subsequently, I have discovered that there are also 2 chapter labels with a footnote.
\cl 9. Mezmur\f + \fr 9:0 \ft Birçok İbranice elyazmasında 9. ve 10. Mezmur birleşik yazılır.\f*
\cl 42. Mezmur\f + \fr 42:0 \ft Birçok İbranice elyazmasında 42. ve 43. Mezmur birleşik yazılır.\f*
NB. Observe that in this case, there was no space between the end of the word Mezmur and the start of the footnote.
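For anyone wanting to check their own SFM files for the same pattern, here is a minimal Python sketch that lists `\cl` lines carrying an embedded `\x ... \x*` or `\f ... \f*` note. The function name `find_cl_notes` and the regex are my own illustration, not part of u2o.py, and the sketch assumes each `\cl` marker and its note sit on a single line, as in the examples above. It handles both the xref case (space before the note) and the footnote case (no space).

```python
import re

# A \cl line whose text ends with an embedded cross-reference (\x ... \x*)
# or footnote (\f ... \f*). The space before the note is optional.
CL_WITH_NOTE = re.compile(
    r"^\\cl\s+(?P<label>.*?)\s*"
    r"(?P<note>\\(?P<kind>[xf])\s.*?\\(?P=kind)\*)\s*$"
)


def find_cl_notes(sfm_text):
    """Return (label, kind, note) for every cl line that carries a note.

    kind is "x" for a cross-reference and "f" for a footnote.
    """
    hits = []
    for line in sfm_text.splitlines():
        m = CL_WITH_NOTE.match(line)
        if m:
            hits.append((m.group("label"), m.group("kind"), m.group("note")))
    return hits


if __name__ == "__main__":
    sample = (
        "\\c 142\n"
        "\\cl 142. Mezmur \\x - \\xo 142:0 \\xt 1Sa.22:1; 24:3\\x*\n"
        "\\d Davut'un Maskili - Mağaradayken ettiği dua\n"
    )
    for label, kind, note in find_cl_notes(sample):
        print(label, kind, note)
```

Run over a whole book, this should report each affected chapter label so they can be counted before and after conversion.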
Aside: As a provisional workaround, I made myself a simple TextPipe filter to fix these 14 instances of invalid OSIS.
TextPipe Single User Edition 10.7
Filter Title: T:\Custom\Turkish\Fix OSIS chapter labels with notes for Turkish Bible.fll
Filter List
-----------
Filter options
| [X] Log to file
| [ ] Append to logfile
| Log filename: .\textpipe.log
| Threshold 500
| [X] Log comment filters
|
|--Input from file(s)
| [ ] Confirm before processing each file
| [ ] Confirm before processing read/only files
| [ ] Delete input files after processing
| [ ] Process inside compressed files
| Process binary files
|
|--Comment...
| | Fix OSIS chapter labels with notes for Turkish Bible
| |
| | - 12 xrefnotes
| | - 2 footnotes
| |
| +--Restrict to lines matching [<!-- cl -->]
| | [ ] Include line numbers
| | [ ] Include filename
| | [X] Match case
| | [ ] Count matches
| | Pattern type: 0
| | [X] UTF8 Support
| | [ ] Ignore empty matches
| | Context before: 0
| | Context after: 0
| |
| +--Perl pattern [(<note .+>.+</note>)(" />)] with [$2$$1]
| [X] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
| [ ] Maximum match (greedy)
| [ ] Allow comments
| [ ] '.' matches newline
| [X] UTF-8 Support
| [ ] Process longest strings first
| [ ] Simultaneous search
| [ ] Log summary only
|
+--Output to file(s)
[ ] Only update date on changed files
[ ] Append mode
[ ] Change extension to: .txt
[X] Open output file
Only output modified files
[ ] Remove empty output files
Files List
----------
U:\OSIS\Turkish\Recent\bible.tur2006.osis
The essential line is this PCRE replace:
+--Perl pattern [(<note .+>.+</note>)(" />)] with [$2$$1]
The processed OSIS file passed XML syntax check and validated.
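For those without TextPipe, the same post-processing step can be sketched in Python. This is only an approximation of the filter above, under two assumptions of mine: that the intent of the replacement is to emit the closing `" />` first and the note after it, and that the unticked "Maximum match (greedy)" option corresponds to non-greedy quantifiers. The function name `fix_chapter_labels` is hypothetical; nothing here is part of u2o.py.

```python
import re

# Note element wrongly embedded in the n="..." attribute, followed by the
# attribute's closing quote and the end of the milestone tag.
# Non-greedy quantifiers are an assumption about the TextPipe settings.
NOTE_IN_LABEL = re.compile(r'(<note .+?>.+?</note>)(" />)')


def fix_chapter_labels(osis_text):
    """Move a note out of the chapter-label milestone on '<!-- cl -->' lines.

    Lines without the '<!-- cl -->' marker are passed through untouched,
    mirroring the 'Restrict to lines matching' step of the filter.
    """
    fixed = []
    for line in osis_text.splitlines(keepends=True):
        if "<!-- cl -->" in line:
            # Close the milestone first, then emit the note after it.
            line = NOTE_IN_LABEL.sub(r"\2\1", line)
        fixed.append(line)
    return "".join(fixed)


if __name__ == "__main__":
    broken = (
        '<!-- cl --><milestone type="x-chapterLabel" n="142. Mezmur'
        '<note type="crossReference"><reference type="annotateRef" '
        'subType="x-origin">142:0 </reference>'
        '<reference>1Sa.22:1; 24:3</reference></note>" />\n'
    )
    print(fix_chapter_labels(broken), end="")
```

As with the TextPipe filter, this only repairs the generated OSIS after the fact; the note ends up as a sibling following the milestone, and the converted file should still be validated afterwards.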
See also #67.
Is it a workaround for this problem?