haystack
haystack copied to clipboard
Pipeline validation fails for PreProcessor initialised with strings containing spaces for the remove_substrings param
Describe the bug
When using the PreProcessor with the remove_substrings parameter with a configuration where the PreProcessor should remove substrings with a space in the string, pipeline validation fails if using load_from_config or load_from_yaml.
Error message
raise PipelineConfigError(
haystack.errors.PipelineConfigError: 'with space' is not a valid variable name or value. Use alphanumeric characters or dash, underscore and colon only.)
This is also true for any parameter value in custom nodes which is very limiting regarding what you can implement.
Expected behavior Pipeline validation does not care about the formatting of the values that are passed in. Checking variable names is fine but values shouldn't be checked.
Additional context
To Reproduce
from haystack.pipelines import Pipeline
if __name__ == '__main__':
pipeline_cfg = {
'version': '1.8',
'components': [
{
'name': 'preprocessor',
'type': 'PreProcessor',
'params': {
'remove_substrings': ['with space']
}
}
],
'pipelines': [
{
'name': 'indexing',
'nodes': [
{'name': 'preprocessor', 'inputs': ['File']}
]
}
]
}
pp = Pipeline.load_from_config(pipeline_cfg, pipeline_name='indexing')
FAQ Check
- [x ] Have you had a look at our new FAQ page?
System:
- OS:
- GPU/CPU:
- Haystack version (commit or version number): 1.8
- DocumentStore:
- Reader:
- Retriever: