regexp2
Is there any workaround for `split`?
Thanks for this nice library!
I'm using this library from another language that can compile to Golang.
I've now finally hit the case where I use a library that needs to split on a regex. You mention in the README that you're still working on this. Do you happen to have a draft or other unfinished code that can do some splitting (maybe slow, maybe wrong in edge cases)?
I had written a split function (based on C#) for the code-gen version of the library. I suspect it'll work with the main version as well, but there are probably edge cases:
// Split splits the given input string using the pattern and returns
// a slice of the parts. Count limits the number of matches to process.
// If Count is -1, then it will process the input fully.
// If Count is 0, returns nil. If Count is 1, returns the original input.
// The only expected error is a Timeout, if it's set.
//
// If capturing parentheses are used in the Regex expression, any captured
// text is included in the resulting string array
// For example, a pattern of "-" Split("a-b") will return ["a", "b"]
// but a pattern with "(-)" Split ("a-b") will return ["a", "-", "b"]
func (re *Regexp) Split(input string, count int) ([]string, error) {
	if count < -1 {
		return nil, errors.New("count too small")
	}
	if count == 0 {
		return nil, nil
	}
	if count == 1 {
		return []string{input}, nil
	}
	if count == -1 {
		// no limit
		count = math.MaxInt64
	}

	// iterate through the matches
	priorIndex := 0
	var retVal []string
	var txt []rune

	m, err := re.FindStringMatch(input)
	for ; m != nil && count > 0; m, err = re.FindNextMatch(m) {
		txt = m.text
		// if we have an m, we don't have an err
		// append our match
		retVal = append(retVal, string(txt[priorIndex:m.Index]))
		// append any capture groups, skipping group 0
		gs := m.Groups()
		for i := 1; i < len(gs); i++ {
			retVal = append(retVal, gs[i].String())
		}
		priorIndex = m.Index + m.Length
		count--
	}

	if err != nil {
		return nil, err
	}

	if txt == nil {
		// we never matched, return the original string
		return []string{input}, nil
	}

	// append our remainder
	retVal = append(retVal, string(txt[priorIndex:]))

	return retVal, nil
}
It uses the private `m.text` field, but I'm sure it could be written without it for your purposes. Let me know if you run into any issues. I could look at adding this to the main library version.
@i-am-the-slime did this help?
@dlclark Yes, very much so, thanks!