readium-js-viewer

Issues with <a> tags getting removed in Readium Js viewer reader.

Open 575755 opened this issue 5 years ago • 5 comments

Hi, we are facing issues with the highlight feature when migrated data is loaded.

  • When we highlight the same word in the same content, the XPath produced by the old reader (not Readium) differs from the one produced by the new reader.
  • After debugging, we found that a few elements do not appear in the new reader (Readium JS viewer).
  • The physical .html file contains tags that are missing from the content rendered in our new reader. (We checked the elements in the developer console.)

Below are examples of the old-reader and new-reader XPath values for the same text, each followed by its decoded form:

Old Reader:

L2h0bWwvYm9keS9wWzJdL2E6OnBhcmVudE5vZGUsL2h0bWwvYm9keS9wWzJdL2E6OnBhcmVudE5vZGUsMTgsMjc=

/html/body/p[2]/a::parentNode,/html/body/p[2]/a::parentNode,18,27

New Reader:

L2h0bWwvYm9keS9wWzNdOjpwYXJlbnROb2RlLC9odG1sL2JvZHkvcFszXTo6cGFyZW50Tm9kZSwxOCwyNw==

/html/body/p[3]::parentNode,/html/body/p[3]::parentNode,18,27

[Screenshots: Capture1_missing_a_tag, Capture2_with_a_tag]

575755 avatar Sep 17 '20 08:09 575755
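
The two encoded strings in the post above are Base64, and decoding them makes the difference easy to inspect. The following is a minimal sketch, assuming the layout seen in the decoded examples (two comma-separated paths followed by two integers). The names `HighlightLocator` and `decode_locator` are hypothetical and not part of Readium, and reading the two integers as character offsets is an assumption.

```python
import base64
from typing import NamedTuple


class HighlightLocator(NamedTuple):
    """Hypothetical container; the field names are assumptions, not Readium's."""
    start_path: str
    end_path: str
    start_offset: int  # assumed meaning of the first trailing integer
    end_offset: int    # assumed meaning of the second trailing integer


def decode_locator(encoded: str) -> HighlightLocator:
    # Base64 -> "path,path,int,int" (layout inferred from the examples in this thread).
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Assumes the paths themselves contain no commas, as in the examples.
    start_path, end_path, start, end = decoded.split(",")
    return HighlightLocator(start_path, end_path, int(start), int(end))


old = decode_locator(
    "L2h0bWwvYm9keS9wWzJdL2E6OnBhcmVudE5vZGUsL2h0bWwvYm9keS9wWzJdL2E6OnBhcmVudE5vZGUsMTgsMjc="
)
new = decode_locator(
    "L2h0bWwvYm9keS9wWzNdOjpwYXJlbnROb2RlLC9odG1sL2JvZHkvcFszXTo6cGFyZW50Tm9kZSwxOCwyNw=="
)

# The integers match; only the element path differs.
print(old.start_path)
print(new.start_path)
print(old.start_offset == new.start_offset and old.end_offset == new.end_offset)
```

Comparing the decoded fields shows that the two locators differ only in the element path: the old one ends in an `a` step under `p[2]`, and the new one has no `a` step and points at `p[3]`.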

Hi, please treat this issue as a priority.

575755 avatar Sep 18 '20 13:09 575755

When an HTML tag is present in the raw source but missing in the computed DOM, this may be a sign that there is a mismatch between XHTML and HTML (e.g. self-closing tags). Just a thought.

danielweck avatar Sep 19 '20 07:09 danielweck
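
One way to check for the self-closing mismatch mentioned above is to scan the source file for elements written as `<tag ... />` that are not HTML void elements. Under HTML parsing rules the trailing slash on a non-void element is ignored, so the element stays open, whereas an XML parser treats it as an empty element. The sketch below uses Python's standard `html.parser` purely as a diagnostic scanner. It does not reproduce a browser's DOM, the helper names are hypothetical, and whether this is the cause of the problem in this thread is not established.

```python
from html.parser import HTMLParser
from typing import List, Tuple

# Void elements per the HTML standard; these may legitimately be written as <br/>.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
})


class SelfClosingFinder(HTMLParser):
    """Hypothetical helper: records non-void tags written in XHTML empty-tag form."""

    def __init__(self) -> None:
        super().__init__()
        self.findings: List[Tuple[str, int]] = []

    def handle_startendtag(self, tag, attrs):
        # html.parser calls this for XHTML-style empty tags such as <a id="x"/>.
        if tag not in VOID_ELEMENTS:
            line, _column = self.getpos()
            self.findings.append((tag, line))


def find_self_closing(markup: str) -> List[Tuple[str, int]]:
    finder = SelfClosingFinder()
    finder.feed(markup)
    finder.close()
    return finder.findings


# Made-up sample markup, not taken from the reporter's file.
sample = '<p>See <a id="note1"/>the note.<br/>Next line <img src="x.png"/></p>'
for tag, line in find_self_closing(sample):
    print(f"self-closing <{tag}/> on line {line}")
```

Any tag this reports is a candidate for rewriting with separate opening and closing tags, or for serving the file as XHTML so that the empty-element form is honoured.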

Hi Daniel,

We have only an .html file for the physical content. Below are two screenshots: one shows the physical file as seen in the browser, and the other shows the physical file in our code base. We compared the two files and, following your suggestion about self-closing tags, tried changing the file in our code base.

[Screenshots: img1_with_a_tag, img2_missing_A_tag]

In that file, one tag is written as a self-closing tag, so we tried changing it to a normal tag with separate opening and closing tags. After the change, we get this:

[Screenshots: img1, img2]

Even after making these changes, we still face the same issue. We cannot tell whether a duplicate tag is being added or a tag is being removed in the DOM structure, because the XPath differs only for this tag.

575755 avatar Sep 21 '20 05:09 575755

Hi Daniel,

We have attached screenshots of the following:

[Screenshots: data-src content1, data-src content2, rendered content]

575755 avatar Oct 16 '20 13:10 575755

Perhaps this explanation helps? https://github.com/readium/readium-js-viewer/issues/747#issuecomment-714428559

Another thing comes to mind (which could explain the closing tag issue): the HTML vs. XHTML file extension triggers different parsing behaviour in some web browsers when the HTTP Content-Type header is not available.

danielweck avatar Oct 22 '20 15:10 danielweck
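
If the content is served over HTTP, one way to take the file extension out of the picture is to send an explicit Content-Type for each resource. The sketch below only shows the extension-to-type mapping. The table and the `explicit_content_type` helper are hypothetical, and the thread does not say how the reader in question loads its files, so treat this as an illustration under that assumption.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping table; extend it to cover the other resource types you serve.
CONTENT_TYPES = {
    ".html": "text/html",
    ".htm": "text/html",
    ".xhtml": "application/xhtml+xml",
    ".xht": "application/xhtml+xml",
}


def explicit_content_type(resource_path: str) -> Optional[str]:
    """Return the Content-Type to send for a resource, or None if it is unmapped."""
    suffix = PurePosixPath(resource_path).suffix.lower()
    return CONTENT_TYPES.get(suffix)


# Made-up example paths.
print(explicit_content_type("OEBPS/chapter1.html"))
print(explicit_content_type("OEBPS/chapter1.xhtml"))
```

A file declared as `text/html` is parsed with HTML rules and one declared as `application/xhtml+xml` with XML rules, so the declared type should match how the markup was actually written.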