
handling of quoted text

Open missinglink opened this issue 5 years ago • 0 comments

currently, quotes are considered boundary characters with the same semantic meaning as a comma or tab.

I think it would be better to treat quoted sections as 'literal', so that no permutations are generated for those sections of the text.

e.g. something like 'A B C "D E F" G H' would produce permutations of: [A, B, C], [A, B], [A], [B, C], [B], [C], [D, E, F], [G, H], [G], [H] (where the quoted group "D E F" is emitted whole and produces no sub-permutations)

this can probably be achieved by recording the leading and trailing boundary characters which were used to delimit each section. we can then check if BOTH the leading and trailing characters are from the 'quote' class, and if so, disable permutations for that group.
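
a rough sketch of the idea in Python, not the parser's actual code: every name here (`Section`, `split_sections`, `permutations`, `all_permutations`, `QUOTES`, `SEPARATORS`) is hypothetical, and the boundary classes are assumed to be just the double quote, comma and tab. each section records the boundary character on either side of it, and a section delimited by quotes on both sides is returned whole.

```python
from dataclasses import dataclass
from typing import List, Optional

# assumed boundary classes, for illustration only
QUOTES = {'"'}
SEPARATORS = {',', '\t'}


@dataclass
class Section:
    tokens: List[str]
    leading: Optional[str]   # boundary char before the section (None at start of input)
    trailing: Optional[str]  # boundary char after the section (None at end of input)

    @property
    def literal(self) -> bool:
        # literal only when BOTH delimiters come from the 'quote' class
        return self.leading in QUOTES and self.trailing in QUOTES


def split_sections(text: str) -> List[Section]:
    """Split text on boundary chars, remembering which char delimited each side."""
    sections: List[Section] = []
    buf: List[str] = []
    leading: Optional[str] = None
    for ch in text:
        if ch in QUOTES or ch in SEPARATORS:
            tokens = ''.join(buf).split()
            if tokens:
                sections.append(Section(tokens, leading, ch))
            buf = []
            leading = ch
        else:
            buf.append(ch)
    tokens = ''.join(buf).split()
    if tokens:
        sections.append(Section(tokens, leading, None))
    return sections


def permutations(section: Section) -> List[List[str]]:
    """Contiguous token windows, longest first per start; literal sections stay whole."""
    if section.literal:
        return [section.tokens]
    n = len(section.tokens)
    return [section.tokens[i:j] for i in range(n) for j in range(n, i, -1)]


def all_permutations(text: str) -> List[List[str]]:
    out: List[List[str]] = []
    for section in split_sections(text):
        out.extend(permutations(section))
    return out


if __name__ == '__main__':
    for window in all_permutations('A B C "D E F" G H'):
        print(window)
```

one caveat with checking only the two neighbouring characters: in input like 'A "B" C "D"', the unquoted section C also sits between two quote characters (the closing quote of B and the opening quote of D), so a real implementation would probably need to track whether a quote is opening or closing as well.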

missinglink, May 02 '19 08:05