nuxt-security icon indicating copy to clipboard operation
nuxt-security copied to clipboard

Script injection vulnerability in combination with useHead

Open jschroeter opened this issue 8 months ago • 8 comments
trafficstars

Environment

- Operating System: Linux
- Node Version:     v18.20.3
- Nuxt Version:     3.16.0
- CLI Version:      3.23.0
- Nitro Version:    2.11.6
- Package Manager:  [email protected]
- Builder:          -
- User Config:      compatibilityDate, devtools, modules
- Runtime Modules:  [email protected]
- Build Modules:    -

Nuxt Security Version

v2.2.0

Default setup used?

Yes, the bug happens even if the security option is not customized

Security options


Reproduction

  • open https://stackblitz.com/edit/nuxt-starter-94vb5abo?file=app.vue,nuxt.config.ts
  • it's a default Nuxt project with default nuxt-security config
  • in app.vue, the string maliciousUserInput contains two script tags
  • this string is used as value for useHead/useHeadSafe link/meta tags

Description

  • when nuxt-security is turned off, the script is not executed as it's just a string inside of a meta tag attribute value (<meta property="og:title" content="<script></script><script>alert(1)</script>">). All good.
  • but when enabling nuxt-security, it adds nonce attributes which then result in <meta property="og:title" content="<script nonce="xXNuFdmH0kSn/o6NLkUvBI1S"></script><script nonce="xXNuFdmH0kSn/o6NLkUvBI1S">alert(1)</script>">; hence it terminates the content attribute and the script injection is successful. When opening the page, an alert is shown - which should not happen.
  • using useHeadSafe only helps partly, it prevents the script injection on the <link> tag but the one in <meta> tag still persists

On first thought, I see two solutions:

  1. make nuxt-security nonce logic smarter to not replace
  2. adjust useHead/useHeadSafe to encode <> characters so that nuxt-security's logic won't match anymore

What do you think?

Additional context

No response

Logs


jschroeter avatar Mar 18 '25 16:03 jschroeter

Hey there,

Thanks for reporting this issue. Interesting case to be sure!

@vejja do you have some ideas why this happends? :)

Baroshem avatar Mar 19 '25 10:03 Baroshem

Hi all This is essentially the same problem as #594 @GalacticHypernova would you by any chance have some time to modify the regex?

@Baroshem The issue is on our side, and is related to the regex mentioned in #594. Basically we are replacing all <script> elements without verifying if they are valid HTML elements. If you remember, we used to rely on Cheerio, which correctly parses the DOM element by element, but it was too resource-intensive, especially in workers. We moved to regex-based parsing, and our regex is too lax

The correct solution is as suggested by @jschroeter:

  1. make nuxt-security nonce logic smarter to not replace

vejja avatar Mar 19 '25 11:03 vejja

This issue is indeed the downside of having regex based parsing, but I wouldn't necessarily say it's an issue of the module.

Having regex based parsing ensures valid scripts are captured:

<div>"<script>alert(1)</script>"</div>

But it is also stateless, and can't know the context of the match. I could look into modifying the regex a bit, but its complexity will undoubtedly increase, by a margin dependent on how the build process would parse it in different configurations.

For example, if one disables minifications, and the build process bundles it as is, there may be a need to handle multiline attribute patterns to capture the following and correctly assess that it's safe:

<div
class="someClass"
someAttr="<script>alert(1)</script>"
>

There could potentially be other such edge cases that would require the pattern to grow in complexity if we want to accurately handle it all, but I believe there's also the question of how much we can truly do all the work for users.

I believe everyone knows that placing script tags in any tag is bad practice, even if it's a simple string. We mustn't forget that the more complex the patterns are, the more expensive and resource-intensive they become. Just like if developers use direct v-html at runtime on unsanitized user input, there are certain standards that any developer should follow as it is infeasible for an automated algorithm to capture all potential quirks.

That being said, if you'd like me to look into trying to improve the regex, let me know!

GalacticHypernova avatar Mar 19 '25 15:03 GalacticHypernova

I’m wondering whether we should modify our algorithm.

Instead of trying to craft the perfect regex (which, as @GalacticHypernova rightly says, will exponentially grow in complexity), why don’t we parse the input string recursively until we find a <script> tag ?

The logic could be:

  • use regex to find any pair of matching opening and closing tags (carefully detecting self-closing elements)
  • move inside the inner text inside that pair
  • start again recursively until we find a script

That logic would mimic the standard strategy of an html parser, eliminating false positives like the one evidenced in this issue. In addition, it would be more readable and maintainable than a greedy recursive regex.

What do you guys think ?

vejja avatar Mar 19 '25 19:03 vejja

Thanks a lot for looking into it asap and proposing a fix - and also for keeping an eye on performance!

Just my 2 cents on...

I believe everyone knows that placing script tags in any tag is bad practice, even if it's a simple string.

Yes and no :-) Obviously it's bad practice but it can happen with user generated content. Now we can argue that the dev has to manually sanitise all input, which is kind of what the Unhead docs suggests. But when reading the Nuxt docs it sounds like simply using useHeadSafe is fine, like the name implies. We could open an issue there and suggest that useHeadSafe should encode < and >? But actually there is no problem with those characters as long as you don't use nuxt-security.

I wouldn't necessarily say it's an issue of the module

Like said, there is no issue if the module isn't used. People add nuxt-security to improve security but in this case ironically a new issue occurs. So if possible, I think it would make most sense to fix it here.

Let me know when I can help e.g. by trying it out.

jschroeter avatar Mar 21 '25 09:03 jschroeter

Hi @jschroeter Unfortunately the PR is only a very preliminary draft. I asked Copilot to propose a fix and it kind of understood what the intent was but it is not addressing the issue. I’m short of time right now, obviously if you want to give it a shot it would be wonderful. I’ll try to find time later to fix

vejja avatar Mar 27 '25 16:03 vejja

happen with user generated content

I get that, but ideally this situation should never happen. It's the same concept as using v-html, which directly bypasses all security restrictions. It's just something that shouldn't happen in the first place.

This issue has 2 sides:

  1. Using regex makes it far more memory efficient, which is critical for constrained environments like serverless or cloud infrastructure that bills on usage. But it also goes off of the assumption that the entire HTML is controlled by the developer, and puts trust that the developer follows web standards. Due to it being stateless, it can't reliably know the context of the tag without becoming exponentially resource-intensive and thus negating the very benefit it came to utilize.
  2. Not using regex (i.e. actually parsing the HTML) address the stateless downside of regex, and provides increased accuracy, but becomes a bottleneck in constrained environments.

If I look at this at an overview, the impact of using a parser is far worse than that of the stateless regex. Many people use serverless deployments, far more than those who create html based on user content. By addressing the accuracy concern, we will indirectly introduce a critical bottleneck that will render many serverless-deployed apps unusable, and thus impact a lot more developers.

This situation isn't really easy to solve, and it requires some careful planning. If there can't be a thorough enough regex while maintaining performance, it would be a serious risk to try and use a parser or something else that's far more memory intensive.

I have been experimenting a bit with regexes but I am yet to find a solution, I'll keep looking though.

GalacticHypernova avatar Mar 28 '25 12:03 GalacticHypernova

Hey!

I sincerely apologize for the amount of time it took me to come up with a solution to this.

Right about the time I made the previous iteration, I've gotten extremely busy with my personal life and had practically no time to attend to this properly.

I've made a PR to address this issue #658

GalacticHypernova avatar Oct 29 '25 11:10 GalacticHypernova