bibxml-service
Check whether `re.escape()` can be used in equality comparisons
I think `re.escape()` is supposed to be used with PostgreSQL’s `like_regex` operator, when we want to safely make a literal string part of a regular expression.
It is mostly used this way.
However, there are a couple of places in this service where it is also used in literal comparisons, for example:
https://github.com/ietf-ribose/bibxml-service/blob/b49a3614a92b4f5db03f4b38d25074930dc0a5f6/xml2rfc_compat/fetchers.py#L39-L40
https://github.com/ietf-ribose/bibxml-service/blob/b49a3614a92b4f5db03f4b38d25074930dc0a5f6/main/tests/test_query.py#L94
I suspect this could introduce undesirable behavior. For example, `re.escape()` escapes dots (`re.escape('"foo.bar"') == '"foo\\.bar"'`), meaning that if `docid` contains a dot, PostgreSQL may not match it with the `==` operator.
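A minimal sketch of the concern, using a made-up `docid` value (not taken from the service's data). It only shows what `re.escape()` does on the Python side; how PostgreSQL then treats the extra backslash is the part that needs checking.

```python
import re

# Hypothetical identifier containing a dot; not taken from real data.
docid = "foo.bar"

# re.escape() prepares a string for use inside a regular expression,
# so it inserts a backslash before the dot.
escaped = re.escape(docid)

print(escaped)           # the escaped form contains a backslash
print(escaped == docid)  # the escaped string is no longer equal to the original

# As a regex pattern, the escaped form still matches the original literally.
print(re.fullmatch(escaped, docid) is not None)
```

If the escaped value is interpolated into a literal comparison instead of a regex, the comparison receives the backslash as part of the string unless the query language happens to discard it.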
Could we check whether `re.escape()` is safe there and, if not, remove it or replace it with another way of escaping (e.g., using `json.dumps()` on those strings)?
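A rough sketch of the `json.dumps()` idea. The helper name `quote_literal` and the `@.id == ...` condition are hypothetical and only illustrate the shape of the change; whether JSON string escaping matches what the query on the PostgreSQL side expects would still need to be verified.

```python
import json


def quote_literal(value: str) -> str:
    """Hypothetical helper: render value as a double-quoted JSON string literal."""
    return json.dumps(value)


# Hypothetical identifiers; the second contains a quote and a backslash.
plain = "foo.bar"
tricky = 'a"b\\c'

# The dot is left alone; only characters that JSON requires escaping are escaped.
print(quote_literal(plain))
print(quote_literal(tricky))

# Illustrative condition string; the surrounding quotes come from json.dumps().
condition = "@.id == %s" % quote_literal(plain)
print(condition)

# Decoding the literal gives back the original string unchanged.
print(json.loads(quote_literal(tricky)) == tricky)
```

Unlike `re.escape()`, this leaves regex metacharacters such as dots untouched, so it would only be appropriate for literal comparisons and not for `like_regex` patterns.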