PyDev.Debugger
Optimize var_to_xml
Most of the time seems to be spent in saxutils.escape and urllib.quote, so ideally we would have a cythonized version which does both in a single pass.
On a related note, Falcon has a utility which should be faster too:
https://github.com/falconry/falcon/blob/master/falcon/util/uri.py
As a note for potential implementors: ideally we'd create a pure Python version which does the XML escaping and the urllib quote/quote_plus in a single pass. quote/quote_plus already iterates char-by-char, so this would avoid all those replaces, each of which creates a new string. It could then also be cythonized when Cython is available.
I actually think the string concats may be what takes long. BTW, why are we using quote from urllib? This makes no sense to me.
What I'm trying to say is: why are we URL-escaping the value? Shouldn't we XML-escape it instead?
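A minimal sketch of the single-pass idea, not the actual var_to_xml code. It assumes Python 3 (`urllib.parse.quote` instead of `urllib.quote`), the default `saxutils.escape` entities and the default `safe="/"` set; the names `_EscapeQuoteTable` and `escape_and_quote` are made up for illustration. Because both escapes work character by character, the combined result for each character can be cached and applied with `str.translate`:

```python
from urllib.parse import quote
from xml.sax.saxutils import escape


class _EscapeQuoteTable(dict):
    """Maps a code point to its XML-escaped, then URL-quoted form (lazily)."""

    def __init__(self, safe="/"):
        super().__init__()
        self._safe = safe  # assumption: same safe set the caller passes to quote()

    def __missing__(self, code):
        # Computed once per distinct character, then served from the dict.
        encoded = quote(escape(chr(code)), safe=self._safe)
        self[code] = encoded
        return encoded


_TABLE = _EscapeQuoteTable()


def escape_and_quote(text):
    # str.translate walks the string once and builds a single result string,
    # instead of one intermediate string per replace() call.
    return text.translate(_TABLE)
```

For any input without lone surrogates this should give the same result as `quote(escape(text))`. Whether it is actually faster would have to be measured against the real workload.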
The client/server protocol currently requires the urllib escaping: each command is a single line which is URL-escaped, so it can't contain newlines.
Now, the contents of each command are XML, so whatever goes inside the XML must also be XML-escaped.
This could probably be changed to use a newer protocol (such as MessagePack), but that would break backwards compatibility with clients, so it would be considerably more work and the gains may still be marginal compared to just improving the current implementation.
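To make the two layers concrete, here is a rough sketch under stated assumptions: the `<var>` element and the `encode_command`/`decode_command` helpers are hypothetical and do not reflect the real wire format, and it uses Python 3's `urllib.parse`. The value is XML-escaped so it can sit inside the XML, then the whole payload is URL-quoted so it fits on one line:

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote
from xml.sax.saxutils import escape


def encode_command(value):
    # Layer 1: XML-escape the value so it can be embedded in the XML payload.
    xml = "<var>%s</var>" % escape(value)
    # Layer 2: URL-quote the payload so the command is a single line.
    return quote(xml)


def decode_command(line):
    # The client undoes the layers in reverse order: unquote, then parse XML.
    return ET.fromstring(unquote(line)).text


line = encode_command("a < b\nsecond & line")
print("\n" in line)           # False: the newline is sent as %0A
print(decode_command(line))   # the original two-line value
```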
Wouldn't XML-escaping the disallowed symbols be backward compatible?
No (clients are currently required to undo the URL escape).
I think it would only be backward incompatible if the original string contains something that looks like a URL-escaped pattern. Otherwise unescaping would just be a no-op, right?
Agreed, if newlines are escaped and the user doesn't have any variables which would break the URL unescaping, it should also work.
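A small sketch of that compatibility argument, assuming Python 3 and a client that calls `unquote` on every line. The helpers `xml_only_encode` and `survives_unquote` are hypothetical; the character references used for line breaks are one possible choice, not something the protocol defines:

```python
from urllib.parse import unquote
from xml.sax.saxutils import escape


def xml_only_encode(value):
    # Hypothetical alternative: XML-escape and turn line breaks into
    # character references so the command still fits on one line.
    return escape(value, {"\n": "&#10;", "\r": "&#13;"})


def survives_unquote(encoded):
    # True when a legacy client's url-unescape step leaves the text unchanged.
    return unquote(encoded) == encoded


print(survives_unquote(xml_only_encode("a < b\nc & d")))   # True: unquote is a no-op
print(survives_unquote(xml_only_encode("discount=%41")))   # False: %41 becomes "A"
```

The second case is the one that would break: any variable whose value happens to contain a valid `%XX` sequence gets silently altered by an old client.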