ert icon indicating copy to clipboard operation
ert copied to clipboard

Improvement when parsing ert-config

Open ManInFez opened this issue 3 years ago • 7 comments

When parsing arguments for forward model jobs or workflow jobs we see some strange behaviour when escaping certain characters:

  • spaces works only in double quotes
  • commas works only with single quotes

Only one should be necessary

FORWARD_MODEL REPLACE_STRING(<FROM>='some, thing', <TO>="some stuff", <FILE>=file.txt) : success
FORWARD_MODEL REPLACE_STRING(<FROM>='some, thing', <TO>='some stuff', <FILE>=file.txt) : -> some stuff becomes somestuff
FORWARD_MODEL REPLACE_STRING(<FROM>="some, thing", <TO>="some stuff", <FILE>=file.txt) : util abort

ManInFez avatar Jan 24 '22 14:01 ManInFez

I am not sure this is possible to solve without major breaking of current config parsing.

ManInFez avatar Feb 03 '22 12:02 ManInFez

Spaces outside of quotes also behaves oddly. For instance:

  • <OPTIONAL_ARGUMENT>= stuff will be ignored and
  • <FAULTY_ARGUMENT>=part-included part-excluded will have the second part ignored.

markusdregi avatar Mar 18 '22 12:03 markusdregi

I have added these 5 examples to the test for the C function that parses these strings, and they work fine.

One of the following things could be going on:

  1. That C function (subst_list_add_from_string) is maybe not used anymore.
  2. The string is pre-processed somewhere before calling subst_list_add_from_string.
  3. The string is post-processed somewhere after calling subst_list_add_from_string.

I will try to figure out which one it is...

verveerpj avatar Mar 30 '22 10:03 verveerpj

It seems to be number 2. subst_list_add_from_string receives a modified string. Somewhere the arguments are split and then put together again, incorrectly.

verveerpj avatar Mar 30 '22 12:03 verveerpj

It turns out that when the config file is read, the lines are split by space and double quotes (") are removed. Later when a FORWARD_MODEL key is detected, the parts are simply concatenated before further processing. So, in the process double quotes and whitespace are lost.

It will be a bit of a job to fix that while making sure that nothing breaks elsewhere:

The splitting happens in config_parse__ in config_parser.cpp, using a basic parser from libecl. The re-joining happens in forward_model_parse_job_deprecated_args in forward_model.cpp.

Some options for fixing:

  1. Disable the splitting. This will require hunting down all places in the code where a split string is expected and making adjustments. A lot of work with the risk of missing some cases.
  2. Make the split smarter by only removing double quotes at the beginning and end, and afterwards re-joining with a space. This would have the effect of retaining the input string with all whitespace replaced with a single space. This solution could have unexpected side-effects within other parts of the code.
  3. ~~Replace all double quotes with single quotes before parsing, same effects as previous point, and assumes that double quotes are equivalent to single quotes.~~
  4. ~~Disallow double quotes (reporting an error), resulting in the same effect of whitespace being replaced with a single space.~~
  5. Store the raw string along with the split string and use that with the FORWARD_MODEL key instead of the re-joined string. Should be safe, but it is somewhat inelegant to drag along both raw and split strings in the configuration object.

verveerpj avatar Mar 30 '22 14:03 verveerpj

Double quotes are necessary in https://fmu-docs.equinor.com/docs/ert/reference/forward_models.html#ECL2CSV-examples in order to escape the comment characters --.

berland avatar Mar 31 '22 14:03 berland

Will be fixed as part of rewriting the config parser. This issue should be used as reference to things we need to fix as part of the work.

oysteoh avatar Sep 01 '22 07:09 oysteoh

There is a prototype for a new parser in #4720 where a test test_quotations_in_forward_model_arglist is added to document the new possible behaviour for this.

mortalisk avatar Jan 26 '23 10:01 mortalisk

This one solved now with the merge of the new parsing @mortalisk ?

oysteoh avatar Mar 08 '23 13:03 oysteoh

This will be solved when the new parser replacing the old one and is used by default. Not actionable issue

oysteoh avatar Mar 09 '23 11:03 oysteoh