bibxml-service

Check whether `re.escape()` can be used in equal comparisons

Open strogonoff opened this issue 2 years ago • 0 comments

I think `re.escape()` is meant to be used with PostgreSQL’s `like_regex` operator, when we want to safely make a literal string part of a regular expression.

It is mostly used this way.

However, there are a couple of places in this service where it is also used in literal comparisons, for example:

https://github.com/ietf-ribose/bibxml-service/blob/b49a3614a92b4f5db03f4b38d25074930dc0a5f6/xml2rfc_compat/fetchers.py#L39-L40

https://github.com/ietf-ribose/bibxml-service/blob/b49a3614a92b4f5db03f4b38d25074930dc0a5f6/main/tests/test_query.py#L94

I suspect this could introduce undesirable behavior. For example, `re.escape()` escapes dots (`re.escape('"foo.bar"') == '"foo\\.bar"'`), meaning that if `docid` contains a dot, PostgreSQL may not match it with the `==` operator.
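To illustrate the concern, here is a minimal Python-only sketch. It does not touch PostgreSQL; it only shows that an escaped string still works as a regex pattern but is no longer equal to the original as a literal. The value `foo.bar` is an illustrative stand-in for a real docid.

```python
import re

# Illustrative docid containing a dot (not taken from real data).
docid = "foo.bar"

# re.escape() prefixes regex metacharacters such as "." with a backslash.
escaped = re.escape(docid)

# Used as a regex pattern, the escaped form matches only the literal string.
assert re.fullmatch(escaped, docid) is not None
assert re.fullmatch(escaped, "fooXbar") is None

# Without escaping, the dot is a wildcard and would also match "fooXbar".
assert re.fullmatch(docid, "fooXbar") is not None

# Used in a literal comparison, the escaped form is a different string,
# so an equality test against the stored value would fail.
assert escaped != docid
```

If the same thing happens inside an equality condition sent to the database, the stored `foo.bar` would be compared against `foo\.bar`. Whether that actually fails to match depends on how the query layer treats the backslash.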

Could we check whether using `re.escape()` is safe there and, if it is not, remove it or replace it with another way of escaping (e.g., using `json.dumps()` on those strings)?
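A rough sketch of the `json.dumps()` idea follows. It assumes the literal ends up inside a double-quoted string in a JSON-path-like condition; the helper name `quote_literal` and the `@.id` field are hypothetical and not taken from the codebase. Whether PostgreSQL accepts every escape sequence that `json.dumps()` can emit would still need to be verified.

```python
import json

def quote_literal(value: str) -> str:
    """Hypothetical helper: return value as a double-quoted JSON string.

    json.dumps() escapes quotes and backslashes but leaves dots alone.
    """
    return json.dumps(value)

# Illustrative docid with a dot and embedded double quotes.
docid = 'foo.bar "draft"'
quoted = quote_literal(docid)

# Hypothetical condition string; the field name is for illustration only.
condition = f"@.id == {quoted}"

# The quoted form round-trips to the original value, unlike re.escape().
assert json.loads(quoted) == docid
```

Unlike `re.escape()`, this only escapes characters that would end or corrupt the string literal, so a dot in the value stays a plain dot.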

strogonoff · Jun 13 '22 04:06