php-unstructured-text-parser

Template matching - Needs to be exact character position?

Open ConnectGrid opened this issue 1 year ago • 3 comments

How closely does the template need to match the original text? Perfectly, or are whitespace differences ignored?

For example, my template might look like this:

Name: {%name}

and my parsed text like this:

Name:           Charlie Brown

Would it fail to match because of the difference in whitespace?

That's what I'm currently observing.

ConnectGrid • Jan 16 '24 23:01

Alright, discovered one issue.

My template file has a trailing line break at the end (common in the programming world):

Name: {%name}

(note the trailing blank line)

Passing in a string without this trailing line break causes the match to fail. Adding a line break causes the match to be successful.

So maybe we need the library to trim off leading and trailing whitespace? Would that whitespace ever be important for matching? If so, could an explicit pattern be used to capture it when needed?
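As a rough illustration of the trimming idea, here is a minimal sketch in Python (not the library's own PHP code). The helper name `normalize_edges` is hypothetical; it only shows that stripping leading and trailing whitespace, and unifying line endings, makes a template with a trailing line break compare equal to one without.

```python
def normalize_edges(text: str) -> str:
    """Unify line endings and strip leading/trailing whitespace.

    Hypothetical helper for illustration; not part of the parser library.
    """
    # Treat Windows (\r\n) and bare \r line endings the same as \n.
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    # Drop whitespace at the very start and end, including a trailing newline.
    return unified.strip()


with_newline = "Name: {%name%}\n"
without_newline = "Name: {%name%}"

# Both forms collapse to the same string once the edges are normalized.
print(normalize_edges(with_newline) == normalize_edges(without_newline))
```

Applying the same normalization to both the template and the input text before matching would sidestep the trailing-line-break mismatch, at the cost of never being able to match on leading or trailing whitespace.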

Investigation continues on why this whitespace difference exists in my test files and templates.

ConnectGrid • Jan 17 '24 00:01

Confirmed. Whitespace within lines and at the end of lines does influence the matching. Even a single space can cause a mismatch. Hmmm... gonna need to find a more relaxed approach.
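One possible "more relaxed" approach, sketched in Python purely as an illustration: turn the template into a regular expression in which every run of whitespace matches any run of whitespace. This is an assumption about how such matching could work, not how the library does it. The function names `template_to_regex` and `_literal` are hypothetical, and the `{%name%}` placeholder syntax is assumed.

```python
import re

# Assumed placeholder syntax: {%variable_name%}
PLACEHOLDER = re.compile(r"\{%([A-Za-z_]\w*)%\}")


def _literal(chunk: str) -> str:
    """Escape literal template text, letting whitespace runs match loosely."""
    pieces = re.split(r"\s+", chunk)
    # Each whitespace run in the template becomes "one or more whitespace".
    return r"\s+".join(re.escape(piece) for piece in pieces)


def template_to_regex(template: str) -> re.Pattern:
    """Build a whitespace-tolerant regex from a template (hypothetical helper)."""
    template = template.strip()
    parts = []
    pos = 0
    for match in PLACEHOLDER.finditer(template):
        parts.append(_literal(template[pos:match.start()]))
        # Capture the variable lazily so surrounding whitespace is not swallowed.
        parts.append(f"(?P<{match.group(1)}>.+?)")
        pos = match.end()
    parts.append(_literal(template[pos:]))
    # Tolerate leading and trailing whitespace around the whole text.
    return re.compile(r"\s*" + "".join(parts) + r"\s*")


pattern = template_to_regex("Name: {%name%}\n")
result = pattern.fullmatch("Name:           Charlie Brown")
print(result.group("name") if result else "no match")
```

In this sketch a single space in the template matches any amount of whitespace in the text, but at least one whitespace character is still required where the template has one, so `Name:Charlie` would not match.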

ConnectGrid • Jan 17 '24 00:01

Sorry for getting back to you so late on this, @ConnectGrid. Before investigating further, I wanted to check whether this is still an issue and, if you continued to use the parser, how you dealt with it.

I also noticed that the variable in your template is missing a percent sign at the end: it should be Name: {%name%}. I'm assuming that's a typo in the issue here; otherwise it wouldn't have matched the value at all, with or without spaces.

Let me know the status and your findings and I will have a look at this.

aymanrb • Aug 26 '24 06:08