Fix the bug regarding unquoted strings in collection types in the DSL
Explores a solution to #1630
As shown in the issue above, our regex DSL has a problem with the quoting of text elements in Python collection types. Quite simply, to respect Python grammar, we must put in quotation marks everything that ends up being a string when it is used in a collection type. However, if those same elements are not used in a collection type, there is no need to add quotation marks (and we should not add them, as the user never asked for them).
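To make the grammar constraint concrete, here is a small check using only the standard library. The helper name `is_valid_python_literal` is made up for this illustration and is not part of the DSL:

```python
import ast


def is_valid_python_literal(text: str) -> bool:
    """Return True if `text` parses as a Python literal (list, str, int, ...)."""
    try:
        ast.literal_eval(text)
        return True
    except (ValueError, SyntaxError):
        return False


# A list of strings is only a valid literal when its items are quoted.
print(is_valid_python_literal("['a', 'b']"))  # True
print(is_valid_python_literal("[a, b]"))      # False: bare names are not literals
print(is_valid_python_literal("[1, 2]"))      # True: integers need no quotes
```

Any text generated for a collection of strings therefore has to include the quotes, whereas text generated for a standalone string does not.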
We need to have a system that allows us to add these quotes when handling collection types within our DSL. We need to:
- Automatically assess whether some elements need to be quoted (for instance, basic Python types)
- Let the user specify whether others should be quoted (for instance, for the `Regex` class)
- Do both of the above for all terms and Python types beyond the basic ones containing a single element. This is required because a collection type could contain, for instance, a `Sequence`, a `Literal`, a `QuantifyMinimum`... and we want to quote the whole containing term, not the items it contains
The solution envisioned uses 2 properties possessed by each term, knowing that all Python types are eventually turned into terms (the naming of those properties will be improved):
- `requires_quoting`: whether this term should be quoted if it were in a collection type. The value is either set by the user or deduced from the type of the term or from the other terms it contains.
- `apply_quotation`: whether the term is in a collection type such that it should be quoted (if applicable). The value defaults to `False` and is then set to `True` in the `python_types_to_terms` function when the term is contained in a collection type.
In the `to_regex` function, we wrap the content of the term in `repr` if both properties above are `True`.
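The interplay between the two properties can be sketched as follows. This is a minimal, self-contained illustration and not the actual implementation: the classes `Term`, `Text`, `Integer`, `Sequence` and `ListOf` are simplified stand-ins, the rule used by `Sequence` to deduce `requires_quoting` is an assumption, and `ListOf` sets `apply_quotation` itself where the real code would do so in `python_types_to_terms`.

```python
class Term:
    """Hypothetical, simplified stand-in for a DSL term."""

    def __init__(self, requires_quoting: bool = False):
        self.requires_quoting = requires_quoting
        self.apply_quotation = False  # off until the term is placed in a collection

    def content(self) -> str:
        raise NotImplementedError

    def to_regex(self) -> str:
        text = self.content()
        # Quote only when the term is quotable AND sits in a collection type.
        if self.requires_quoting and self.apply_quotation:
            return repr(text)
        return text


class Text(Term):
    def __init__(self, value: str):
        super().__init__(requires_quoting=True)
        self.value = value

    def content(self) -> str:
        return self.value  # regex escaping omitted for brevity


class Integer(Term):
    def content(self) -> str:
        return r"-?\d+"


class Sequence(Term):
    def __init__(self, terms):
        # Assumed deduction rule: quotable if any contained term is.
        super().__init__(requires_quoting=any(t.requires_quoting for t in terms))
        self.terms = terms

    def content(self) -> str:
        # Children keep apply_quotation=False, so only the whole sequence is quoted.
        return "".join(t.to_regex() for t in self.terms)


class ListOf(Term):
    def __init__(self, item: Term):
        super().__init__()
        self.item = item
        item.apply_quotation = True  # stands in for python_types_to_terms

    def content(self) -> str:
        inner = self.item.to_regex()
        return rf"\[{inner}(, {inner})*\]"


print(Text("abc").to_regex())                                # abc
print(ListOf(Text("abc")).to_regex())                        # \['abc'(, 'abc')*\]
print(ListOf(Sequence([Text("a"), Text("b")])).to_regex())   # \['ab'(, 'ab')*\]
print(ListOf(Integer()).to_regex())                          # \[-?\d+(, -?\d+)*\]
```

Keeping the two flags separate means a term never has to know where it is used: `requires_quoting` describes the term itself, while `apply_quotation` describes its context.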