gumbo-query
Crash when the selector string includes a '(' character.
I am searching for a "script" node in Facebook's HTML source. The node looks like <.s.c.r.i.p.t.>require("TimeSlice").guard(function() ... (the dots in "script" are only there so that this line displays normally on the issue page).
So I use this selector to find the node: CSelection c = doc.find("script:contains(require("TimeSlice"))");
But the app crashed with the error "terminate called after throwing an instance of 'std::string'". GDB says it crashes in the doc.find function.
If I use CSelection c = doc.find("script:contains(require)"), it works well, but those nodes are not the ones I want. So I think gumbo-query's "contains" filter does not support '(' inside it.
No, gumbo-query does not properly parse string sequences like this. Once the parser encounters a "special" character like "(" or even a double quote ("), it discards whatever operation it was previously doing (such as parsing a string) and changes its task based on the newly discovered "special" character.
I'm not criticizing @lazytiger when I criticize stuff like this, just to be clear. He said he blindly ported cascadia over without doing too much testing, assuming that cascadia worked well. The problem is that cascadia has lots of bugs, _lots_ of them. This is one such example, improper string parsing. It doesn't parse strings by consuming all data between two matching unescaped quote characters, but rather has a per-character context, checking every single character starting at a quote for a "special" character, then assuming it's done parsing the string once it finds one.
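To make the difference concrete, here is a minimal sketch of the "consume everything between two matching unescaped quotes" approach. It is not code from gumbo-query or cascadia: `parse_quoted` is a hypothetical helper, and its escape handling is simplified (a backslash just makes the next character literal, with none of the hex escapes CSS allows).

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper for illustration; not part of gumbo-query or cascadia.
// Expects input[pos] to be an opening ' or ". On success, returns the string
// contents with escapes resolved and moves pos just past the closing quote.
// Returns std::nullopt (leaving pos untouched) if there is no opening quote
// or the string is never terminated.
std::optional<std::string> parse_quoted(std::string_view input, std::size_t& pos) {
    if (pos >= input.size() || (input[pos] != '"' && input[pos] != '\'')) {
        return std::nullopt;
    }
    const char quote = input[pos];
    std::string out;
    std::size_t i = pos + 1;
    while (i < input.size()) {
        const char c = input[i];
        if (c == '\\') {
            // Simplified escape: take the next character literally.
            if (i + 1 >= input.size()) {
                return std::nullopt;  // dangling backslash at end of input
            }
            out.push_back(input[i + 1]);
            i += 2;
        } else if (c == quote) {
            // Only the matching, unescaped quote ends the string.
            pos = i + 1;
            return out;
        } else {
            // '(' , ')' and the other quote kind are ordinary characters here.
            out.push_back(c);
            ++i;
        }
    }
    return std::nullopt;  // ran out of input before the closing quote
}
```

With something like this, a selector parser that accepted a quoted argument, e.g. script:contains('require("TimeSlice")'), would get back the whole text require("TimeSlice") and could resume normal parsing at the ")" that follows the closing quote. Whether gumbo-query's grammar would accept that form is an assumption here, not something verified.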