Implementation of 'htmlspecialchars' is not complete

Open linsmod opened this issue 8 years ago • 8 comments

htmlspecialchars will not translate any entity in this list: "&amp;amp;", "&amp;quot;", "&amp;#039;", "&amp;lt;", "&amp;gt;"

but HtmlSpecialCharsEncode does not implement this logic, so if the input contains any of the entities in the list above, the output is unexpected, e.g. &amp;amp; is translated to &amp;amp;amp;

I found the issue when working with WordPress 4.6.1.

linsmod avatar Oct 14 '16 17:10 linsmod

Also, the function should not translate the symbol & to &amp;amp; when it is not an attribute. This is my personal fix:

```csharp
internal static string HtmlSpecialCharsEncode(string str, int index, int length, QuoteStyle quoteStyle, string charSet)
{
    if (str == null) return String.Empty;

    Debug.Assert(index + length <= str.Length);

    StringBuilder result = new StringBuilder(length);

    // quote style is anded to emulate PHP behavior (any value is allowed):
    string single_quote = (quoteStyle & QuoteStyle.SingleQuotes) != 0 ? "&#039;" : "'";
    string double_quote = (quoteStyle & QuoteStyle.DoubleQuotes) != 0 ? "&quot;" : "\"";

    // Split on '&' so that every existing ampersand passes through unencoded.
    var strArray = new string(str.Skip(index).Take(length).ToArray()).Split('&');
    var strList = new List<string>();
    foreach (var item in strArray)
    {
        for (int i = 0; i < item.Length; i++)
        {
            char c = item[i];
            switch (c)
            {
                case '&': // never reached: '&' was consumed by the Split above
                    result.Append("&amp;"); break;
                case '"':
                    result.Append(double_quote); break;
                case '\'':
                    result.Append(single_quote); break;
                case '<':
                    result.Append("&lt;"); break;
                case '>':
                    result.Append("&gt;"); break;
                default:
                    result.Append(c); break;
            }
        }
        strList.Add(result.ToString());
        result.Clear();
    }

    // Re-join the encoded segments with the original, unencoded '&'.
    return string.Join("&", strList);
}
```

linsmod avatar Oct 14 '16 19:10 linsmod

Does the behaviour of Phalanger differ from PHP?

It appears you are stating that Phalanger escapes &amp; to &amp;amp; when run through htmlspecialchars(). If that is your intended message, then this behaviour is consistent with the PHP implementation and thus is not a bug and will not be changed.

lucyllewy avatar Oct 16 '16 16:10 lucyllewy

Right. If you have a small test case in PHP, please run it with both Phalanger and legacy PHP first to see whether the results differ.

jakubmisek avatar Oct 16 '16 16:10 jakubmisek

Code:

```php
echo 'quote_style:'.$quote_style;
echo 'charset:'.$charset;
echo 'double_encode:'.$double_encode;
echo $string;
echo htmlspecialchars($string);
$string = @htmlspecialchars( $string, $quote_style, $charset, $double_encode );
die($string);
```

Official PHP output: quote_style:3charset:UTF-8double_encode:http://localhost:8000/wp-admin/load-styles.php?c=0&amp;dir=ltr&amp;load%5B%5D=dashicons,buttons,forms,l10n,login&amp;ver=4.6.1http://localhost:8000/wp-admin/load-styles.php?c=0&amp;amp;dir=ltr&amp;amp;load%5B%5D=dashicons,buttons,forms,l10n,login&amp;amp;ver=4.6.1http://localhost:8000/wp-admin/load-styles.php?c=0&amp;dir=ltr&amp;load%5B%5D=dashicons,buttons,forms,l10n,login&amp;ver=4.6.1

Phalanger output: quote_style:3charset:UTF-8double_encode:http://localhost:8000/wp-admin/load-styles.php?c=0&amp;dir=ltr&amp;load%5B%5D=dashicons,buttons,forms,l10n,login&amp;ver=4.6.1http://localhost:8000/wp-admin/load-styles.php?c=0&amp;amp;dir=ltr&amp;amp;load%5B%5D=dashicons,buttons,forms,l10n,login&amp;amp;ver=4.6.1http://localhost:8000/wp-admin/load-styles.php?c=0&amp;amp;dir=ltr&amp;amp;load%5B%5D=dashicons,buttons,forms,l10n,login&amp;amp;ver=4.6.1

linsmod avatar Oct 17 '16 04:10 linsmod

When htmlspecialchars( $string ) is invoked twice, official PHP and Phalanger produce the same result. However, when the second invocation passes the extra parameters, i.e. htmlspecialchars( $string ) followed by htmlspecialchars( $string, $quote_style, $charset, $double_encode );

the test results differ.

Official PHP seems to have fixed the &amp;amp; output, but Phalanger has not.
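
The difference described above matches PHP's documented fourth parameter, `double_encode`: when it is false, PHP leaves existing HTML entities alone but still encodes bare special characters. The following is a minimal sketch, assuming the value passed in the test was false (the empty output after `double_encode:` is consistent with false, null, or an empty string); the shortened URL and the variable names `$twice`, `$once`, and `$bare` are illustrative only.

```php
<?php
// Input that already contains an encoded ampersand, like the WordPress URL above.
$string = 'load-styles.php?c=0&amp;dir=ltr';

// double_encode = true (the default): the existing &amp; is encoded again.
$twice = htmlspecialchars($string, ENT_QUOTES, 'UTF-8', true);

// double_encode = false: existing entities are left untouched.
$once = htmlspecialchars($string, ENT_QUOTES, 'UTF-8', false);

// A bare '&' that is not part of an entity is still encoded, even with double_encode = false.
$bare = htmlspecialchars('a & b &lt; c', ENT_QUOTES, 'UTF-8', false);

echo $twice, "\n"; // load-styles.php?c=0&amp;amp;dir=ltr
echo $once, "\n";  // load-styles.php?c=0&amp;dir=ltr
echo $bare, "\n";  // a &amp; b &lt; c
```

The third case is why splitting the input on `&` and never encoding it is not equivalent to PHP's behaviour: only ampersands that begin an existing entity are skipped.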

linsmod avatar Oct 17 '16 05:10 linsmod

Please can you tidy your test case to separate the outputs so that we can see what is output by which part of the test. It is still not clear what you are actually stating is the problem, i.e. which circumstance causes the issue you perceive. As an example, it is not clear what value $double_encode has in your example: is it true, false, empty string, null, ....?

lucyllewy avatar Oct 17 '16 10:10 lucyllewy

I know little about PHP. How can I tell whether $double_encode is true, false, an empty string, null, or something else?
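
One way to check this is `var_dump()`, which prints the type as well as the value. Plain `echo` prints false, null, and an empty string all as nothing, which is why the `double_encode:` label in the output above is followed directly by the URL. The sketch below also puts each result on its own labelled line; the values assigned at the top are hypothetical stand-ins for whatever the caller actually passes.

```php
<?php
// Hypothetical stand-in values; in the real test they come from the calling code.
$string        = 'load-styles.php?c=0&amp;dir=ltr';
$quote_style   = ENT_QUOTES;  // equals 3, as echoed in the output above
$charset       = 'UTF-8';
$double_encode = false;       // assumption for illustration

// var_dump() shows type and value, so false, null and '' can be told apart.
echo "quote_style: ";
var_dump($quote_style);
echo "charset: ";
var_dump($charset);
echo "double_encode: ";
var_dump($double_encode);

// One labelled line per call, so it is clear which call produced which output.
$default = htmlspecialchars($string);
$params  = htmlspecialchars($string, $quote_style, $charset, $double_encode);

echo "input:   ", $string, "\n";
echo "default: ", $default, "\n";
echo "params:  ", $params, "\n";
```

Running the same file under both PHP and Phalanger and comparing the `params:` lines would isolate the call that differs.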

linsmod avatar Oct 18 '16 04:10 linsmod

I also ran into this problem when trying WordPress 4.6.1 with Phalanger. https://github.com/hlizard/WpDotNet/commit/72a0595beac77b84d5493c9a37796b674e07bca4

What a surprise to meet a fellow countryman. I don't know PHP either. What a coincidence!

hlizard avatar Oct 23 '16 11:10 hlizard