cofoundry icon indicating copy to clipboard operation
cofoundry copied to clipboard

Concurrency issue in HtmlSanitizer

Open JornWildt opened this issue 2 years ago • 8 comments

I found this error in the Cofoundry error log - it seems like there is a concurrency issue in HtmlSanitizer.

Message: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.

   at System.Collections.Generic.HashSet`1.Contains(T item)
   at Ganss.XSS.HtmlSanitizer.IsAllowedAttribute(IAttr attribute)
   at Ganss.XSS.HtmlSanitizer.<DoSanitize>b__127_1(IAttr a)
   at System.Linq.Enumerable.WhereEnumerableIterator`1.ToList()
   at Ganss.XSS.HtmlSanitizer.DoSanitize(IHtmlDocument dom, IParentNode context, String baseUrl)
   at Ganss.XSS.HtmlSanitizer.SanitizeDom(String html, String baseUrl)
   at Ganss.XSS.HtmlSanitizer.Sanitize(String html, String baseUrl, IMarkupFormatter outputFormatter)
   at Cofoundry.Core.Web.Internal.HtmlSanitizer.Sanitize(String source, IHtmlSanitizationRuleSet ruleSet)
   at Cofoundry.Core.Web.Internal.HtmlSanitizer.Sanitize(IHtmlContent source)
   at Cofoundry.Domain.HtmlSanitizerHelper.Sanitize(IHtmlContent source)
   at AspNetCore._PageBlockTypes_RichTextWithMedia_RichTextWithMedia.ExecuteAsync() in /PageBlockTypes/RichTextWithMedia/RichTextWithMedia.cshtml:line 8
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageCoreAsync(IRazorPage page, ViewContext context)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageAsync(IRazorPage page, ViewContext context, Boolean invokeViewStarts)
   at Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderAsync(ViewContext context)
   at Cofoundry.Web.RazorViewRenderer.RenderViewAsync(ViewContext viewContext, ViewEngineResult viewResult, Object model)
   at Cofoundry.Web.PageBlockRenderer.RenderBlockAsync(ViewContext viewContext, IEditablePageViewModel pageViewModel, IEntityVersionPageBlockRenderDetails blockViewModel)
   at Cofoundry.Web.PageTemplateRegionTagBuilder.RenderBlocksToHtml(PageRegionRenderDetails pageRegion, Dictionary`2 regionAttributes, VisualEditorState visualEditorState)
   at Cofoundry.Web.PageTemplateRegionTagBuilder.RenderRegion(PageRegionRenderDetails pageRegion)
   at Cofoundry.Web.PageTemplateRegionTagBuilder.InvokeAsync()

JornWildt avatar Nov 09 '22 21:11 JornWildt

First up can I check a couple of things:

  • Are you definately awaiting any async calls in your views/templates?
  • Is this an intermittent issue, or does it happen on every render?
  • Did anything change prior to the issue occuring e.g. version update
  • What version of Cofoundry / .NET are you running?

HeyJoel avatar Nov 10 '22 12:11 HeyJoel

Are you definately awaiting any async calls in your views/templates?

Sorry, can't answer that - that is rather difficult to verify.

Is this an intermittent issue, or does it happen on every render?

Certainly an intermittent issue - the log indicates that there were three simultaneous requests to the website.

Did anything change prior to the issue occuring e.g. version update

No.

What version of Cofoundry / .NET are you running?

Cofoundry: 0.9.1 (Nuget package number). Target framework is netcoreapp3.1.

JornWildt avatar Nov 10 '22 20:11 JornWildt

Looking into "mganss", there error is here: https://github.com/mganss/HtmlSanitizer/blob/master/src/HtmlSanitizer/HtmlSanitizer.cs#L638

        private bool IsAllowedAttribute(IAttr attribute)
        {
            return AllowedAttributes.Contains(attribute.Name)
                // test html5 data- attributes
                || (AllowDataAttributes && attribute.Name != null && attribute.Name.StartsWith("data-", StringComparison.OrdinalIgnoreCase));
        }

Here AllowedAttributes is a standard HashSet - which is not thread safe, so two concurrent requests can run into the reported problem.

Probably not something Cofoundry can do anything about - besides adding some locking statements.

JornWildt avatar Nov 10 '22 20:11 JornWildt

I don't see why there would be a concurrency issue with the HashSet unless it was being altered while the check was being made, and on a cursory look over the code I can't see how that might happen, given we create the mganss sanitizer and the ruleset on demand and have everything registered as transient. That's why I was thinking about un-awaited async calls in your views, as I wouldn't expect anything in a request to run concurrently anyway.

Are you altering the sanitizer rulesets at all?

There must be something somewhere, but I'm not seeing it as yet, I will probably have to do a deeper dive into the code.

HeyJoel avatar Nov 10 '22 22:11 HeyJoel

I don't see why there would be a concurrency issue with the HashSet unless it was being altered while the check was being made

Now, I haven't experienced it myself with HashSet<T> - but with Dictionary<T> you cannot even have two concurrent readers! It is most likely the same with HashSet<T>. So don't look for concurrent read/write - all you need is two concurrent reads on the same collection - and that is not difficult to imagine.

JornWildt avatar Nov 10 '22 22:11 JornWildt

For the code here (any code in fact) to be thread safe, you need to use the concurrent versions of the collections. You have to be quite unlucky to experience issues - but it does happen.

JornWildt avatar Nov 10 '22 22:11 JornWildt

The docs for Dictionary indicate that you can have two concurrent readers, but not if there are concurrent writes. The docs don't mention anything for HashSet, but my assumption would be the same - I'm not expecting the HashSet to be modified after it's been constructed, but looking at the error message that mentions "a concurrent update was performed...", it seems it must be.

I'll need to do a deeper dive when i have some time, but can you let me know if you are you altering/customizing the sanitizer rulesets at all?

HeyJoel avatar Nov 11 '22 09:11 HeyJoel

Okay, I must have been mistaken somehow. Pretty sure that we had issues with reading a standard dictionary, but cannot reproduce it.

I have not, consciously, done anything to modify the sanitizer rulesets.

JornWildt avatar Nov 11 '22 14:11 JornWildt

As part of the v0.12 release there are improvements in both the mganss html sanaitizer and the Cofoundry implementation to improve the immutability of sanitizer configuration, which should illiminate the kinds of issues you are seeing.

I wasn't able to replicate the issue though so feel free to reopen the issue if you're still having problems after the update. I will close this for now; the changed will be released with v0.12.

HeyJoel avatar Apr 03 '24 13:04 HeyJoel