minbpe

Count only non-overlapping occurrences of a pair

Open Majdoddin opened this issue 9 months ago • 0 comments

For example, you want to count 2 (not 4) occurrences of the pair 'aa' in the text 'aaaaa', because merge() can replace it only twice. In other words, the counted occurrences should not overlap.
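
A minimal sketch of what such counting could look like, to make the proposal concrete. The function name `count_pairs_nonoverlapping` is hypothetical and this is not the repository's existing code. It relies on the fact that two adjacent pairs can only be equal when three identical tokens appear in a row, so skipping a pair that equals the one just counted is enough to avoid overlaps.

```python
from collections import Counter


def count_pairs_nonoverlapping(ids):
    """Count consecutive pairs, skipping an occurrence that overlaps
    the occurrence of the same pair counted one position earlier.

    Hypothetical helper for illustration; `ids` is any sequence
    (a list of ints or a str).
    """
    counts = Counter()
    prev_pair = None  # pair counted at the previous position, None if skipped
    for pair in zip(ids, ids[1:]):
        if pair == prev_pair:
            # Same pair as the one just counted, so it shares an element
            # with it (e.g. the middle 'a' in 'aaa'). Skip it and reset,
            # so the next identical pair is counted again.
            prev_pair = None
            continue
        counts[pair] += 1
        prev_pair = pair
    return counts


naive = Counter(zip("aaaaa", "aaaaa"[1:]))   # counts overlapping pairs
fixed = count_pairs_nonoverlapping("aaaaa")  # counts only mergeable pairs
print(naive[("a", "a")], fixed[("a", "a")])  # prints: 4 2
```

Pairs that cannot overlap with themselves are unaffected: in 'abab' the pair ('a', 'b') is still counted twice.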

See the discussion in #51. sentencepiece also implements it this way, per this paper (p. 3).

Majdoddin, May 06 '24 09:05