htmlpurifier
False positive "x<y"
We noticed an issue with the HTML Purifier used in SuiteCRM.
Cleaning the value "x<y" results in just "x", while "x < y" results in the correct "x < y".
From testing, it seems that as soon as there is no space after "<", everything after it is stripped.
I can reproduce this with the demo app, which confirms it's not an issue specific to SuiteCRM: http://htmlpurifier.org/demo.php?filter%5BAutoFormat.AutoParagraph%5D=0&filter%5BAutoFormat.DisplayLinkURI%5D=0&filter%5BAutoFormat.Linkify%5D=0&filter%5BAutoFormat.RemoveEmpty.Predicate%5D=colgroup%3A%0D%0Ath%3A%0D%0Atd%3A%0D%0Aiframe%3Asrc%0D%0A&filter%5BAutoFormat.RemoveEmpty%5D=0&filter%5BAutoFormat.RemoveSpansWithoutAttributes%5D=0&filter%5BNull_CSS.AllowedProperties%5D=1&filter%5BCore.CollectErrors%5D=0&filter%5BHTML.Allowed%5D=z&filter%5BHTML.Doctype%5D=XHTML+1.0+Transitional&filter%5BHTML.SafeObject%5D=0&filter%5BHTML.TidyLevel%5D=light&filter%5BURI.DisableExternalResources%5D=0&filter%5BNull_URI.Munge%5D=1&html=aaa+x%3Cz+sdgdfg&submit=Submit
Input
test x<y test
Output
test x
Expected
test x<y test
Options
I'm not sure which filter causes this, so here is the full config:
$config = \HTMLPurifier_Config::createDefault();
$baseConfigs = [];
$baseConfigs['HTML.Doctype'] = 'XHTML 1.0 Transitional';
$baseConfigs['Core.Encoding'] = 'UTF-8';
$hidden_tags = array('script' => true, 'style' => true, 'title' => true, 'head' => true);
$baseConfigs['Core.HiddenElements'] = $hidden_tags;
$baseConfigs['URI.Base'] = $sugar_config['site_url'] ?? null;
$baseConfigs['CSS.Proprietary'] = true;
$baseConfigs['HTML.TidyLevel'] = 'none';
$baseConfigs['HTML.ForbiddenElements'] = array('body' => true, 'html' => true);
$baseConfigs['AutoFormat.RemoveEmpty'] = false;
$baseConfigs['Cache.SerializerPermissions'] = 0775;
$baseConfigs['Filter.ExtractStyleBlocks.TidyImpl'] = false;
$baseConfigs['Output.FlashCompat'] = true;
$baseConfigs['HTML.DefinitionID'] = 'Sugar HTML Def';
$baseConfigs['HTML.DefinitionRev'] = 2;
$baseConfigs['Attr.EnableID'] = true;
$baseConfigs['Attr.IDPrefix'] = 'sugar_text_';
foreach ($baseConfigs as $key => $value) {
    $config->set($key, $value);
}
$purifier = new \HTMLPurifier($config);
echo $purifier->purify('test x<y test') . "\n";
"test z<y test" is not valid HTML. Wrapping it with a doctype, html, body, etc. will ensure it's processed correctly. The Core.ConvertDocumentToFragment option may also work...
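
For context on the "not valid HTML" point: a "<" followed directly by a letter reads as the start of a tag, so a literal less-than sign in HTML is normally written as "&lt;". If the value is known to be plain text rather than HTML, one possible workaround is to escape it before it reaches the purifier. The sketch below uses PHP's built-in htmlspecialchars(). The helper name escapePlainText is hypothetical and is not part of HTML Purifier or SuiteCRM, and it only makes sense for input that is not supposed to contain any markup.

```php
<?php
// Hypothetical helper, not part of HTML Purifier or SuiteCRM.
// Assumption: $text is plain text, so every "<", ">", "&" and quote
// is meant literally and can safely be turned into an HTML entity.
function escapePlainText(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

// The bare "<" becomes "&lt;" and no longer looks like a tag start.
echo escapePlainText('test x<y test'), "\n"; // prints: test x&lt;y test
```

The escaped string should then pass through purify() unchanged, since "&lt;" is an ordinary character reference. This does not help for fields that legitimately mix markup and text, because "<y" cannot be told apart from an opening tag there.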