php-unstructured-text-parser
Template matching - Needs to be exact character position?
How closely does the template need to match the original text? Perfectly, or are whitespace differences ignored?
For example, my template might look like this:
Name: {%name}
and my parsed text like this:
Name: Charlie Brown
Would it fail to match because of the difference in whitespace?
That's what I'm currently observing.
Alright, discovered one issue.
My template file has a trailing line break at the end (common in the programming world):
Name: {%name}
(note the trailing blank line)
Passing in a string without this trailing line break causes the match to fail. Adding a line break causes the match to be successful.
So maybe we need the library to trim off leading and trailing whitespace? Would that whitespace ever be important for matching? If so, could an explicit pattern be used to capture it when needed?
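To illustrate the trimming idea, here is a minimal sketch in Python. It is only an illustration: the library is written in PHP, this is not its code, and `normalize_edges` is a hypothetical helper name. It strips leading and trailing whitespace from both sides before comparing, so a template file that ends in a line break lines up with input that does not.

```python
def normalize_edges(s: str) -> str:
    """Hypothetical helper: drop leading and trailing whitespace,
    including a final line break added by an editor."""
    return s.strip()


# A file saved with a trailing line break vs. a string passed in without one.
from_file = "Name: Charlie Brown\n"
passed_in = "Name: Charlie Brown"

# Exact comparison fails only because of the trailing line break.
print(from_file == passed_in)

# After trimming both sides, the two strings compare equal.
print(normalize_edges(from_file) == normalize_edges(passed_in))
```

Trimming only touches the edges of the string; whitespace inside the text is left alone.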
Investigation continues on why this whitespace difference exists in my test files and templates.
Confirmed. Whitespace within lines and at the end of lines does influence the matching. Even a single extra space can cause a mismatch. Hmmm... gonna need to find a more relaxed approach.
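One possible "more relaxed approach" is to turn the template into a regular expression in which every run of whitespace matches any run of whitespace. The sketch below is in Python and is only an illustration, not how the library works internally. It assumes placeholders are written as `{%name%}`; `literal_to_regex`, `template_to_pattern` and `parse` are hypothetical names.

```python
import re

# Assumed placeholder form: {%name%}
PLACEHOLDER = re.compile(r"\{%([A-Za-z_]\w*)%\}")


def literal_to_regex(literal: str) -> str:
    """Escape literal text, but let each whitespace run match any whitespace run."""
    out = []
    for token in re.split(r"(\s+)", literal):
        if not token:
            continue
        out.append(r"\s+" if token.isspace() else re.escape(token))
    return "".join(out)


def template_to_pattern(template: str) -> re.Pattern:
    """Build a whitespace-tolerant regex from a template string."""
    # split() with one capture group alternates: literal, name, literal, ...
    parts = PLACEHOLDER.split(template.strip())
    pieces = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            pieces.append(f"(?P<{part}>.+?)")  # placeholder: lazy named group
        else:
            pieces.append(literal_to_regex(part))
    # Tolerate leading and trailing whitespace around the whole text.
    return re.compile(r"\s*" + "".join(pieces) + r"\s*")


def parse(template: str, text: str):
    """Return the captured values as a dict, or None if the text does not match."""
    match = template_to_pattern(template).fullmatch(text)
    return match.groupdict() if match else None


print(parse("Name: {%name%}\n", "Name:    Charlie Brown"))
```

With this sketch, extra spaces after `Name:` and a missing trailing line break no longer prevent a match. The trade-off is that whitespace can never be significant; a template that needs exact spacing would require a separate, explicit pattern.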
Sorry for getting back to you so late on this, @ConnectGrid. Before investigating further, I wanted to check whether it's still an issue and, if you continued to use the parser, how you dealt with it.
I also noticed that your template's variable is missing the closing percent character; it should be Name: {%name%}. I'm assuming that's just a typo in this issue, though, since otherwise it wouldn't have matched the value at all, with or without spaces.
Let me know the status and your findings and I will have a look at this.