
Vite fails to inject assets into the correct place in the document.

Open jjchmielowiec opened this issue 8 months ago • 2 comments

Describe the bug

The Problem

I am trying to build my project, but Vite is injecting my scripts and CSS into a string literal within an inline script instead of inside the real <head> block. This happens because the string literal contains the first instance of the string </head>, and Vite assumes this is the end of the head block. The reproduction URL has a very similar case where the injection lands in a comment instead of a string literal, but the problem is the same.
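
A minimal sketch of the failure mode, assuming the injection boils down to replacing the first regex match of </head>. The function naiveInjectBeforeHeadClose and the asset path are hypothetical stand-ins for illustration, not Vite's actual code:

```typescript
// Hypothetical stand-in for a first-match injection; NOT Vite's real code.
function naiveInjectBeforeHeadClose(html: string, tag: string): string {
  // A non-global regex replace only touches the first occurrence.
  // A replacer function is used so "$" in the tag is never interpreted.
  return html.replace(/<\/head>/i, () => `${tag}</head>`);
}

const source = [
  "<!doctype html>",
  "<html>",
  "<head>",
  "<script>const template = '<head></head>';</script>",
  "</head>",
  "<body></body>",
  "</html>",
].join("\n");

// Assumed asset tag, for illustration only.
const injected = '<script type="module" src="/assets/index.js"></script>';
const output = naiveInjectBeforeHeadClose(source, injected);

// Locate the injected tag relative to the inline script's string literal.
const injectedAt = output.indexOf(injected);
const stringLiteralStart = output.indexOf("'<head>");
const inlineScriptEnd = output.indexOf("</script>\n</head>");
const landedInsideInlineScript =
  injectedAt > stringLiteralStart && injectedAt < inlineScriptEnd;

console.log(landedInsideInlineScript);
console.log(output);
```

The tag ends up inside the quoted string rather than before the real closing tag, which matches the behaviour described above.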

The Cause

In the file packages/vite/src/node/plugins/html.ts, starting at line 1467 is a section on injecting into various parts of the HTML document. It is evident right away that this is using regex to identify the correct location to inject. It is widely known that HTML and RegEx don't mix particularly well. To make matters more difficult, RegEx in JS does not support variable length look behind assertions. Because of the limitations of RegEx, especially in JS, RegEx is not a robust solution for identifying the end of a head block, or really any other location in an HTML document that relies on understanding HTML structure.

Solution

Without a reliable way to identify the correct location using RegEx, my recommendation is to instead use JSDOM. This is likely a less performant solution, but it would be more robust and reliable. If performance is a concern, a flag could be added to the config to use the RegEx version instead. In my case lower performance would be an acceptable trade off for correctness.

Reproduction

https://stackblitz.com/edit/vitejs-vite-fly8vq?file=index.html

Steps to reproduce

The problem will be reproduced any time the document has the string </head> above the real closing tag to the head block.

Run vite build in the terminal at the reproduction URL and compare the index.html in the build output to the source index.html. You will see that the script tag is inserted in the comment instead of the actual HTML.

System Info

Though it is likely irrelevant in this case:
Windows 10, vite, vite-plugin-inline-source, vite-plugin-minify

Used Package Manager

npm

Logs

No response

Validations

Apr 08 '25 17:04 jjchmielowiec

I just ran into this issue when setting up a project and had some commented out code that included </head> above the 'real' and uncommented out </head>

Apr 22 '25 18:04 lordmcfuzz

For an even sillier example that breaks:

<!doctype html><title>The <header> element</title>

Vite will mistake the text <header> inside that title for a <head> tag.

I alleviated a similar issue a while back in Hugo and found the same problem in VS Code a couple of weeks ago. I haven't found the time to make a PR there or here yet, but feel free to steal the code I wrote there.

The problem is threefold:

  • Using regular expressions to naively search for tags within an arbitrary HTML fragment is impossible. Many constructs can be nested and hold arbitrary text that looks like tags, so determining if a piece of code is actually a tag is impossible without context.
  • Beyond that, using regular expressions in the form /<tagname[^>]*>/ to match entire tags isn't reliable. This leads to injections after <header> instead of <head> or inside an attribute value that contains a >.
  • Scanning through the whole document for a specific injection point, then falling back to searching for another injection point, is inefficient and increases the risks of false positives.
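
The second point can be checked directly. The pattern below has the /<tagname[^>]*>/ form mentioned above; it is an illustration and not necessarily the exact expression Vite uses. The data-note attribute is made up for the example:

```typescript
// Pattern of the form /<tagname[^>]*>/; an illustration, not Vite's exact regex.
const headOpen = /<head[^>]*>/i;

// 1. "<header>" inside the title matches, because [^>]* happily accepts "er".
const titleCase = "<!doctype html><title>The <header> element</title>";
const titleResult = headOpen.exec(titleCase);
const titleMatch = titleResult ? titleResult[0] : null;

// 2. A ">" inside a quoted attribute value ends the match too early.
//    (data-note is a made-up attribute.)
const attrCase = '<head data-note="a > b"><meta charset="utf-8"></head>';
const attrResult = headOpen.exec(attrCase);
const attrMatch = attrResult ? attrResult[0] : null;

console.log(titleMatch); // <header>
console.log(attrMatch);  // <head data-note="a >
```

In the first case an injection "after the head tag" would land inside the title text; in the second it would land in the middle of an attribute value.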

I recommend scanning the document from the start, and only reading past whitespace, comments, the doctype, the <html> tag and the <head> tag. HTML comments cannot be nested, so they can be consumed reliably, and the risk of encountering a breaking attribute on the <html> or <head> tags is low.
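
A rough sketch of that scan, under stated assumptions: the names matchAt, findHeadInjectionIndex and injectIntoHead are hypothetical, the comment pattern ignores the abruptly closed forms such as <!-->, and attributes on <html> and <head> are assumed not to contain a >. This is not the code from the Hugo change mentioned above:

```typescript
// Sticky (y) patterns only match at lastIndex, so nothing is ever "searched
// for" further down the document.
const WHITESPACE = /\s+/y;
const COMMENT = /<!--[\s\S]*?-->/y; // simplification: ignores <!--> and <!--->
const DOCTYPE = /<!doctype[^>]*>/iy;
const HTML_OPEN = /<html(?=[\s/>])[^>]*>/iy;
const HEAD_OPEN = /<head(?=[\s/>])[^>]*>/iy; // lookahead rejects <header>

// Returns the index just past the match at `index`, or -1 if no match there.
function matchAt(re: RegExp, html: string, index: number): number {
  re.lastIndex = index;
  const m = re.exec(html);
  return m ? index + m[0].length : -1;
}

// Returns the index right after <head ...>, or, when the document has no
// explicit <head> tag, the index where the skippable prelude ends.
function findHeadInjectionIndex(html: string): number {
  let i = 0;
  for (;;) {
    let next = -1;
    for (const re of [WHITESPACE, COMMENT, DOCTYPE, HTML_OPEN]) {
      next = matchAt(re, html, i);
      if (next !== -1) break;
    }
    if (next !== -1) {
      i = next; // consumed one prelude item, keep scanning
      continue;
    }
    const afterHead = matchAt(HEAD_OPEN, html, i);
    return afterHead !== -1 ? afterHead : i;
  }
}

// Inserts `tags` at the start of the head.
function injectIntoHead(html: string, tags: string): string {
  const at = findHeadInjectionIndex(html);
  return html.slice(0, at) + tags + html.slice(at);
}

const commented =
  '<!-- </head> <head> -->\n<!doctype html>\n<html lang="en">\n<head>\n<title>x</title></head>';
const withHeader = "<!doctype html><title>The <header> element</title>";

console.log(findHeadInjectionIndex(commented));
console.log(injectIntoHead(withHeader, "<link rel=\"stylesheet\" href=\"/a.css\">"));
```

Because the scan stops at the first thing that is not part of the prelude, text that merely looks like a tag later in the document is never considered. Note that this variant injects at the start of the head rather than before </head>.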

May 02 '25 06:05 DominoPivot