nh3 clean doesn't include html, head or body tags even when included in ALLOWED_TAGS

Open · barkhabol opened this issue 1 year ago · 1 comment

While using the nh3 library, we came across a use case where a field is expected to contain HTML, but we need to remove any content that could enable an XSS attack. Calling nh3.clean() directly on the input text does not give the expected result: a lot of useful markup is stripped, which ends up modifying the HTML template supplied as input.

import nh3
text = '''
<!DOCTYPE html>
<html>
<head>
  <title>HTML Tutorial</title>
</head>
<body>
  <h1>This is a heading</h1>
  <p>This is a paragraph.</p>
</body>
</html>
'''

# Build a local allowlist instead of mutating the module-level nh3.ALLOWED_TAGS.
tags = nh3.ALLOWED_TAGS | {'title', 'head', 'html', 'div', 'body'}

print(nh3.clean(text, tags=tags, strip_comments=False))

Output: 
<title>HTML Tutorial</title>
 <h1>This is a heading</h1>
 <p>This is a paragraph.</p> 

We don't want the html, head, or body tags to be stripped. Is there a limitation in the nh3 library that prevents these tags from being allowed?

barkhabol, Dec 11 '23 11:12

Blocked on https://github.com/rust-ammonia/ammonia/issues/183

messense, Dec 13 '23 06:12
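
Until the upstream issue is resolved, one possible workaround is to sanitize only the body fragment and then wrap the result in a document skeleton that the application itself controls. The sketch below is not from the thread and is not part of nh3: the helper name wrap_clean_fragment and its parameters are hypothetical, and it assumes the title and body content are available separately rather than extracted from a full untrusted document.

```python
import html
from typing import Callable


def wrap_clean_fragment(title: str, body_fragment: str,
                        clean: Callable[[str], str]) -> str:
    """Sanitize a body fragment and wrap it in a trusted HTML skeleton.

    `clean` is any sanitizer callable (for example nh3.clean); it is only
    applied to the body fragment. The html/head/body tags come from this
    trusted template, not from the untrusted input.
    """
    safe_body = clean(body_fragment)
    # The title is treated as plain text and escaped, not sanitized as HTML.
    safe_title = html.escape(title)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        f"  <title>{safe_title}</title>\n"
        "</head>\n"
        "<body>\n"
        f"{safe_body}\n"
        "</body>\n"
        "</html>\n"
    )
```

With nh3 installed, this would be called as wrap_clean_fragment("HTML Tutorial", fragment, nh3.clean). The trade-off is that any attributes or extra elements the original document had on html, head, or body are discarded, so this only fits cases where a fixed skeleton is acceptable.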