dspace-angular
HTML tags in administrative workflow view are being evaluated
Describe the bug First of all, I am not sure whether this is really a bug or more of a feature. I came across this while testing XSS on a local DSpace 7.1.1 setup, where I entered some HTML/JS code in the resource submission form. Angular's sanitizer appears to detect my attempts and make them safe. Still, in the administrative workflow view the HTML tags are evaluated.
To Reproduce
Steps to reproduce the behavior:
Submit new resource, enter
<img src="https://upload.wikimedia.org/wikipedia/en/thumb/9/9a/Trollface_non-free.png/220px-Trollface_non-free.png" onerror=alert(document.cookie);>
in any of the text fields.
Hit deposit and check the administrative workflow.
Expected behavior HTML tags are stripped entirely, or at least are not evaluated, as in the other views.
Thanks @firoball for the ticket.
- Verified this does NOT occur on Item pages (i.e. /items/[uuid]). On these pages HTML is not evaluated.
- However, it strangely DOES occur on any lists of items (search/browse/mydspace), in either the dc.title or dc.description.abstract fields. JavaScript in HTML doesn't appear to execute, but I can get images to load and also links or formatting to work.
This is unexpected behavior. I'll bring this over onto our board to work on.
We have some contributors who write quite elaborate abstracts using XMLUI's "Simple HTML Fragment" feature. We'll have to clean up these metadata if text-field markup is unimplemented, or implemented in a different way, and the demand will still exist.
A quick look at lightweight markup languages says to me that they are all much too powerful for metadata markup. If we can eliminate external references (links and images) then this becomes much less problematic. Maybe an examination of print journals will suggest a small set of essential features. From a security standpoint, a small language that only deals with formatting would be best.
OTOH I recall that some sites have requested additional markup such as MathML.
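To make the "small language that only deals with formatting" idea concrete, here is a minimal sketch, not a proposal for the actual dspace-angular implementation. It escapes everything first and then re-enables only a short list of attribute-less formatting tags, so links, images, and event handlers can never get through. The function names and the tag list are hypothetical, chosen only for illustration.

```typescript
// Hypothetical allowlist: formatting only, no tag that can reference external content.
const FORMATTING_TAGS = ["b", "i", "em", "strong", "sub", "sup", "p", "br"];

// Escape every character that is significant in HTML text or attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Escape the whole value, then restore only exact, attribute-less
// opening/closing tags from the allowlist. Anything else stays escaped.
function allowFormattingOnly(input: string): string {
  const escaped = escapeHtml(input);
  const tagPattern = new RegExp(
    `&lt;(/?)(${FORMATTING_TAGS.join("|")})&gt;`,
    "gi"
  );
  return escaped.replace(
    tagPattern,
    (_match: string, slash: string, name: string) =>
      `<${slash}${name.toLowerCase()}>`
  );
}
```

Because a tag is only restored when it has no attributes at all, something like `<b onclick=...>` or `<img src=...>` remains visible as escaped text rather than being evaluated. Markup such as MathML would need a separate discussion; this sketch does not attempt it.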
NOTE for future work. This needs discussion. The initial ticket here implies that these HTML tags should not be evaluated. But, it seems we were evaluating some tags in XMLUI, see https://wiki.lyrasis.org/display/DSDOC7x/Simple+HTML+Fragment+Markup
Therefore, we should discuss whether we can bring that XMLUI feature forward, while also ensuring that XSS (and similar) attacks cannot be achieved. Angular seems to block those, but that may need more testing.
See also #1404
A lot could be accomplished if we could simply avoid normalizing whitespace in these fields. Then one could have line breaks and paragraph breaks.
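As a rough illustration of how whitespace could be preserved without evaluating any HTML, the sketch below splits a metadata value into paragraphs and lines. A template could then render each line as plain text. The function name `splitParagraphs` is hypothetical, and this is only one possible approach; a CSS rule such as `white-space: pre-line` on the displaying element would be another.

```typescript
// Split a plain-text metadata value into paragraphs (separated by blank lines),
// each paragraph being a list of its non-empty, trimmed lines.
function splitParagraphs(text: string): string[][] {
  return text
    .replace(/\r\n?/g, "\n") // normalize Windows / old Mac line endings
    .split(/\n\s*\n/) // one or more blank lines end a paragraph
    .map((paragraph) =>
      paragraph
        .split("\n")
        .map((line) => line.trim())
        .filter((line) => line.length > 0)
    )
    .filter((lines) => lines.length > 0);
}
```

Since the result is just nested arrays of strings, the view layer can emit one paragraph element per outer entry and a line break between inner entries, with the text itself still rendered as escaped plain text.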
See also #1763 which is loosely related
As explained in #1762, we definitely need line breaks to not be stripped and to be displayed.
We have enabled HTML to the extent provided in XMLUI and have many abstracts that rely on it for proper display. While security is paramount, it would be great if we could continue to offer the option to support HTML to the level currently offered in XMLUI.
I'd rate line break preservation and display absolutely essential and continued HTML support desirable but not essential.