
A chapter label that has a cross-reference note!

Open DavidHaslam opened this issue 7 years ago • 8 comments

I just found that this USFM gave invalid OSIS:

\c 142
\cl 142. Mezmur \x - \xo 142:0 \xt 1Sa.22:1; 24:3\x*
\d Davut'un Maskili - Mağaradayken ettiği dua 

The OSIS snippet is as follows:

<chapter sID="Ps.142" osisID="Ps.142" n="142" />
<!-- cl --><milestone type="x-chapterLabel" n="142. Mezmur<note type="crossReference"><reference type="annotateRef" subType="x-origin">142:0 </reference><reference>1Sa.22:1; 24:3</reference></note>" />

It's evident that u2o.py was not expecting the xref: it deferred appending the closing " /> until after processing the note, so the <note> element ended up inside the value of the n attribute.

I've no idea whether Paratext still permits this. The translators were using an older version of Paratext at the time I was provided with the SFM files.

btw. The language is Turkish.

Aside: In an earlier edition of the SFM file, the xref was attached to the canonical Psalm title. One of the translators must have thought it was a good idea to move it.

DavidHaslam avatar Dec 27 '18 15:12 DavidHaslam

I'm not sure how to address this. Having the footnote in the chapter label doesn't seem right to me. I would need more information on whether this is actually acceptable in USFM.

adyeths avatar Dec 27 '18 17:12 adyeths

I will ask my contact at UBSICAP.

DavidHaslam avatar Dec 27 '18 20:12 DavidHaslam

I have asked the question by adding this issue.

DavidHaslam avatar Dec 28 '18 18:12 DavidHaslam

Meanwhile, FYI, and just for the record: Psalm 142 was merely the last of 12 locations in this translation where a chapter label had an xref note.

\cl 3. Mezmur \x - \xo 3:0 \xt 2Sa.15:13-17:22\x*
\cl 34. Mezmur \x - \xo 34:0 \xt 1Sa.21:13-15\x*
\cl 51. Mezmur \x - \xo 51:0 \xt 2Sa.12:1-15\x*
\cl 52. Mezmur \x - \xo 52:0 \xt 1Sa.22:9-10\x*
\cl 54. Mezmur \x - \xo 54:0 \xt 1Sa.23:19; 26:1\x*
\cl 56. Mezmur \x - \xo 56:0 \xt 1Sa.21:13-15\x*
\cl 57. Mezmur \x - \xo 57:0 \xt 1Sa.22:1; 24:3\x*
\cl 59. Mezmur \x - \xo 59:0 \xt 1Sa.19:11\x*
\cl 60. Mezmur \x - \xo 60:0 \xt 2Sa.8:13; 1Ta.18:12\x*
\cl 63. Mezmur \x - \xo 63:0 \xt 1Sa.23:14\x*
\cl 89. Mezmur \x - \xo 89:0 \xt 1Kr.4:31\x*
\cl 142. Mezmur \x - \xo 142:0 \xt 1Sa.22:1; 24:3\x*

DavidHaslam avatar Dec 28 '18 18:12 DavidHaslam

Subsequently, I have discovered that there are also 2 chapter labels with a footnote.

\cl 9. Mezmur\f + \fr 9:0 \ft Birçok İbranice elyazmasında 9. ve 10. Mezmur birleşik yazılır.\f* 
\cl 42. Mezmur\f + \fr 42:0 \ft Birçok İbranice elyazmasında 42. ve 43. Mezmur birleşik yazılır.\f* 

NB. Observe that in these two cases there is no space between the end of the word Mezmur and the start of the footnote.
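For anyone wanting to locate such lines in their own SFM files, here is a minimal Python sketch that separates the label text from any embedded \x ... \x* or \f ... \f* note on a \cl line. It is not taken from u2o.py; the function name split_chapter_label is hypothetical, and the sketch makes no claim about whether USFM actually permits notes in \cl. It handles both variants seen above, with or without a space before the note.

```python
import re

# Matches one complete cross-reference (\x ... \x*) or footnote (\f ... \f*).
# The backreference \1 makes sure the closing marker matches the opening one.
NOTE = re.compile(r'\\(x|f) .*?\\\1\*')


def split_chapter_label(line):
    """Return (label_text, notes) for a \\cl line.

    Hypothetical helper for illustration only; not part of u2o.py.
    """
    if not line.startswith("\\cl "):
        raise ValueError("not a \\cl line")
    body = line[len("\\cl "):]
    # Collect every embedded note, then remove them to leave the bare label.
    notes = [m.group(0) for m in NOTE.finditer(body)]
    label = NOTE.sub("", body).strip()
    return label, notes


if __name__ == "__main__":
    sample = "\\cl 142. Mezmur \\x - \\xo 142:0 \\xt 1Sa.22:1; 24:3\\x*"
    print(split_chapter_label(sample))
```

A converter could then emit the label as the milestone's n attribute and write the notes after the milestone, which is the same shape the workaround further down produces.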

DavidHaslam avatar Dec 29 '18 09:12 DavidHaslam

Aside: As a provisional workaround, I made myself a simple TextPipe filter to fix these 14 instances of invalid OSIS.

TextPipe Single User Edition 10.7

Filter Title: T:\Custom\Turkish\Fix OSIS chapter labels with notes for Turkish Bible.fll

Filter List
-----------
Filter options
|  [X] Log to file
|  [ ] Append to logfile
|  Log filename: .\textpipe.log
|  Threshold 500
|  [X] Log comment filters
|
|--Input from file(s)
|     [ ] Confirm before processing each file
|     [ ] Confirm before processing read/only files
|     [ ] Delete input files after processing
|     [ ] Process inside compressed files
|     Process binary files
|   
|--Comment...
|  |  Fix OSIS chapter labels with notes for Turkish Bible
|  |  
|  |   - 12 xrefnotes
|  |   -  2 footnotes
|  |
|  +--Restrict to lines matching [<!-- cl -->]
|     |  [ ] Include line numbers
|     |  [ ] Include filename
|     |  [X] Match case
|     |  [ ] Count matches
|     |  Pattern type: 0
|     |  [X] UTF8 Support
|     |  [ ] Ignore empty matches
|     |  Context before: 0
|     |  Context after: 0
|     |
|     +--Perl pattern [(<note .+>.+</note>)(" />)] with [$2$$1]
|           [X] Match case
|           [ ] Whole words only
|           [ ] Case sensitive replace
|           [ ] Prompt on replace
|           [ ] Skip prompt if identical
|           [ ] First only
|           [ ] Extract matches
|               Maximum text buffer size 4096
|           [ ] Maximum match (greedy)
|           [ ] Allow comments
|           [ ] '.' matches newline
|           [X] UTF-8 Support
|           [ ] Process longest strings first
|           [ ] Simultaneous search
|           [ ] Log summary only
|         
+--Output to file(s)
      [ ] Only update date on changed files
      [ ] Append mode
      [ ] Change extension to: .txt
      [X] Open output file
      Only output modified files
      [ ] Remove empty output files    

Files List
----------
U:\OSIS\Turkish\Recent\bible.tur2006.osis

The essential line is this PCRE replacement:

+--Perl pattern [(<note .+>.+</note>)(" />)] with [$2$$1]

The processed OSIS file passed XML syntax check and validated.
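For those without TextPipe, roughly the same post-processing can be sketched in Python. This is only an approximation of the filter above, under these assumptions: each chapter label sits on a single line containing <!-- cl -->, the match is non-greedy (the filter has "Maximum match (greedy)" unticked), and the intent is to move the closing " /> in front of the note. The function name fix_chapter_labels is hypothetical and not part of u2o.py.

```python
import re

# Group 1: the whole <note ...>...</note> element (non-greedy).
# Group 2: the milestone's closing quote and "/>" that ended up after the note.
NOTE_IN_LABEL = re.compile(r'(<note .+?>.+?</note>)(" />)')


def fix_chapter_labels(osis_text):
    """Move a note out of the n attribute on chapter label lines.

    Hypothetical helper mirroring the TextPipe filter; only lines that
    contain the "<!-- cl -->" comment are touched.
    """
    fixed = []
    for line in osis_text.splitlines(keepends=True):
        if "<!-- cl -->" in line:
            # Swap the groups: close the milestone first, then emit the note.
            line = NOTE_IN_LABEL.sub(r"\2\1", line)
        fixed.append(line)
    return "".join(fixed)
```

Applied to the Psalm 142 line from the original report, this yields n="142. Mezmur" /> followed by the complete <note> element. The result should still be checked against the OSIS schema, as was done for the TextPipe output.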

DavidHaslam avatar Dec 29 '18 09:12 DavidHaslam

See also #67.

LAfricain avatar Jan 01 '19 16:01 LAfricain

Is it a workaround for this problem?

LAfricain avatar Jan 20 '19 11:01 LAfricain