scala-library-next

Adding splitAsList to String

Open BalmungSan opened this issue 3 years ago • 4 comments

I personally always call toList after a split since I don't like using Arrays for multiple reasons.

Additionally, this implementation should be considerably more efficient than the current approach of calling split and then toList. I also personally believe that its semantics are more useful.

Happy to receive any kind of feedback about the implementation and, especially, about the tests.

PS: Someone also mentioned that having a splitAsIterator could be helpful, especially for potentially large Strings. Thoughts on that one?


Motivated by a long-standing personal desire for such a method, plus a recent conversation in the Discord channel.

BalmungSan, Jan 11 '22 17:01

Re splitAsIterator: I was thinking of using java.util.regex.Matcher. It could also be a splitAsLazyList, by wrapping the iterator. Something like:

def splitAsIterator(pattern: String): Iterator[String] = {
  val pat = Pattern.compile(pattern)
  val mat = pat.matcher(str)

But I'm blanking on how to use mat.find to actually get it done.
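
One way the mat.find loop could be finished, as a rough sketch rather than the implementation proposed in this issue. Taking `str` as an explicit parameter is an assumption made to keep the snippet self-contained. It also makes no attempt to match String.split on trailing empty strings or zero-width matches: every piece between matches is returned as-is.

```scala
import java.util.regex.Pattern

// Hypothetical helper: lazily yields the pieces of `str` between matches of `pattern`.
// Assumption: unlike String.split, trailing empty pieces are kept.
def splitAsIterator(str: String, pattern: String): Iterator[String] = {
  val mat = Pattern.compile(pattern).matcher(str)
  new Iterator[String] {
    private var start = 0     // index where the next piece begins
    private var done = false  // true once the final piece has been returned

    def hasNext: Boolean = !done

    def next(): String = {
      if (done) throw new NoSuchElementException("next on empty iterator")
      if (mat.find()) {
        // Piece runs from the end of the previous match to the start of this one.
        val piece = str.substring(start, mat.start())
        start = mat.end()
        piece
      } else {
        // No more matches: the remainder of the string is the last piece.
        done = true
        str.substring(start)
      }
    }
  }
}
```

With this sketch, `splitAsIterator("a,b,,c", ",").toList` gives `List("a", "b", "", "c")`, and nothing past the current match is scanned until the next element is requested.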

s5bug, Jan 11 '22 18:01

I hope we can do something in this area, as I think it's sad that a common task like string-splitting requires Array, given that we constantly tell language newcomers not to use Array unless they're doing Java interop or super high performance work.

SethTisue, Jan 11 '22 18:01

If you call it split*, make sure it behaves exactly like String.split (i.e., Pattern.split) regarding empty matches, empty strings and empty substrings. The behavior is very touchy. You can look up the implementation (and the tests) in Scala.js: https://github.com/scala-js/scala-js/blob/b0f1dc501edb15203176c636d35323c2bcc24550/javalib/src/main/scala/java/util/regex/Pattern.scala#L168-L220
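
For reference, a few of the edge cases in question, shown with plain String.split on the JVM. The value names are only for illustration.

```scala
// Trailing empty strings are dropped by default...
val trailingDropped = "a,b,,".split(",").toList     // List("a", "b")

// ...but kept when a negative limit is passed.
val trailingKept = "a,b,,".split(",", -1).toList    // List("a", "b", "", "")

// A leading empty string (from a non-empty match at index 0) is kept.
val leadingKept = ",a".split(",").toList            // List("", "a")

// An empty input yields a single empty string, not an empty array.
val emptyInput = "".split(",").toList               // List("")

// An input made up only of the separator yields nothing at all.
val onlySeparator = ",".split(",").toList           // List()
```

Any split* method that promises the same behavior would need to reproduce each of these results.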

sjrd, Jan 11 '22 21:01

Hi @sjrd, sorry for the delay. I changed the implementation to return List("") when the input string is empty and preserveEmptySubStrings is set to true.
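
To illustrate that case, here is a minimal sketch of what such a method might look like. Only the names splitAsList and preserveEmptySubStrings come from this thread. The signature, the literal (non-regex) separator, and the choice to drop every empty piece when the flag is false are all assumptions, not the actual implementation.

```scala
// Hypothetical sketch: builds the List directly, with no intermediate Array.
// Assumption: with preserveEmptySubStrings = false, all empty pieces are dropped.
def splitAsList(str: String, separator: String, preserveEmptySubStrings: Boolean): List[String] = {
  require(separator.nonEmpty, "separator must not be empty")
  val builder = List.newBuilder[String]
  var start = 0
  var idx = str.indexOf(separator, start)
  while (idx >= 0) {
    val piece = str.substring(start, idx)
    if (preserveEmptySubStrings || piece.nonEmpty) builder += piece
    start = idx + separator.length
    idx = str.indexOf(separator, start)
  }
  // Whatever follows the last separator is the final piece.
  val last = str.substring(start)
  if (preserveEmptySubStrings || last.nonEmpty) builder += last
  builder.result()
}
```

Under these assumptions, `splitAsList("", ",", preserveEmptySubStrings = true)` returns `List("")`, while the same call with `false` returns `Nil`.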

BalmungSan, Jan 13 '22 17:01