html-react-parser

Parsing a file with CRLF End-of-Lines creates unwanted `__html_dom_parser_carriage_return_placeholder_` tags and breaks NextJS hydration

Open olivierr91 opened this issue 8 months ago • 5 comments

Expected Behavior

End-of-lines with CRLF or CR should be recognized and treated properly.

Actual Behavior

When parsing an HTML string containing CRLF end-of-lines, invalid tags named _html_dom_parser_carriage_return_placeholder_<random-number>_ are created, causing the HTML to display incorrectly and NextJS to fail hydration.

Steps to Reproduce

  1. Load the following SVG using Node and pass the content to parse()

Image

  2. A bizarre and invalid HTML tag gets created, and the SVG does not display:

Image

  3. Save the file with LF endings instead and retry. The SVG is parsed correctly.
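Before comparing the CRLF and LF versions of a file, it can help to confirm which line endings the loaded string actually contains. The following is a minimal sketch only: `countLineEndings` is a hypothetical helper, not part of html-react-parser, and it assumes the file content has already been read into a string.

```typescript
// Hypothetical helper (not part of html-react-parser):
// counts how often each line-ending style occurs in a string.
interface LineEndingCounts {
  crlf: number; // "\r\n" (Windows style)
  cr: number; // lone "\r"
  lf: number; // lone "\n" (Unix style)
}

function countLineEndings(text: string): LineEndingCounts {
  const counts: LineEndingCounts = { crlf: 0, cr: 0, lf: 0 };
  // "\r\n" is listed first so a CRLF pair is not counted as a separate CR and LF.
  const pattern = /\r\n|\r|\n/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    if (match[0] === "\r\n") {
      counts.crlf++;
    } else if (match[0] === "\r") {
      counts.cr++;
    } else {
      counts.lf++;
    }
  }
  return counts;
}
```

In Node, the input could come from something like `fs.readFileSync(path, "utf8")`, where `path` points at the SVG in question. A non-zero `crlf` or `cr` count means the string contains the carriage returns this issue is about.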

Reproducible Demo

Will provide if needed. But the problem here is probably faster to spot in your codebase than for me to write a demo.

Environment

  • Version: 5.2.2
  • Platform: NextJS/NodeJS
  • Browser: Edge
  • OS: Windows x64

olivierr91, Apr 06 '25 03:04

@olivierr91 thanks for creating this issue. I'm unable to reproduce the bug that you're seeing: https://stackblitz.com/edit/html-react-parser-1755?file=src%2Findex.tsx

Just checking, is your SVG valid?

remarkablemark, Apr 06 '25 03:04

@remarkablemark here is complete component code that reproduces the issue. Note that it only happens when parse() is used inside a NextJS client component; there is no problem inside a server component:

"use client";
import parse from "html-react-parser";
import { ReactNode } from "react";

export default function MyClientComponent(): ReactNode {
    return parse(
        '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 18 18"><path\r\nd="M2,18c-0.5,0-1-0.2-1.4-0.6S0,16.6,0,16V2c0-0.5,0.2-1,0.6-1.4S1.5,0,2,0h7v2H2v14h14V9h2v7c0,0.6-0.2,1-0.6,1.4S16.6,18,16,18H2z M6.7,12.7l-1.4-1.4L14.6,2H11V0h7v7h-2V3.4L6.7,12.7z" /></svg>',
    );
}

Please note the carriage return \r within the string that will trigger the issue. Is it possible the HTML parsing library used client-side is different from server-side?

Yes, this is a valid SVG. It can be validated here (after unescaping the end-of-line characters): https://validator.w3.org/check

olivierr91, Apr 06 '25 04:04

@olivierr91 yes, this library uses separate client and server HTML parsers. Are you able to replace \r with \n in your HTML string?

remarkablemark, Apr 06 '25 05:04

@remarkablemark Yes, that is the workaround I am using. But I feel it's a bug that should be fixed. What is the library used client-side? I will open an issue directly with them.

olivierr91, Apr 07 '25 15:04

See html-dom-parser

This may be related to https://github.com/remarkablemark/html-dom-parser/pull/902 and https://github.com/remarkablemark/html-dom-parser/pull/923

remarkablemark, Apr 07 '25 17:04
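
For anyone hitting the same problem, the workaround discussed in this thread (replacing \r with \n before calling parse()) might look roughly like the sketch below. `normalizeLineEndings` is a hypothetical helper name, not an API of html-react-parser or html-dom-parser, and converting every carriage return to a line feed is an assumption that suits markup like the SVG above, where the exact whitespace character does not matter.

```typescript
// Hypothetical helper (not provided by html-react-parser or html-dom-parser):
// converts CRLF pairs and lone CR characters to LF.
function normalizeLineEndings(html: string): string {
  // "\r\n?" matches a carriage return with an optional following line feed,
  // so a CRLF pair becomes a single "\n" rather than two.
  return html.replace(/\r\n?/g, "\n");
}
```

In the client component from the earlier comment, this would be applied to the string before parsing, for example `parse(normalizeLineEndings(svgSource))`, where `svgSource` stands for the SVG markup. It only sidesteps the behavior reported here; it does not change how the client-side parser treats carriage returns.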