splitter() that doesn't eat sentinels

Open dlangBugzillaToGithub opened this issue 9 years ago • 8 comments

turkeyman reported this on 2016-07-18T09:03:34Z

Transferred from https://issues.dlang.org/show_bug.cgi?id=16288

CC List

  • aronrobert293
  • greeenify
  • jack (@JackStouffer)

Description

I want a version of splitter that doesn't eat the sentinels.
I want to split AT the sentinels, but the sentinel should be the first
element of the bucket.

e.g.: assert(equal(splitter("hello  world", ' '), [ "hello", " ", " world" ]));

Note the weird behaviour since there are 2 spaces. More useful when
the data is not strings.

jack (@JackStouffer) commented on 2016-07-18T20:19:29Z

Does std.regex.splitter!(Yes.keepSeparators) suffice?

turkeyman commented on 2016-07-19T00:40:24Z

That's fine.
Does that already exist? I couldn't see anything on dlang.org.

Obviously the Pred function should remain the first template arg; the flag can be second...

jack (@JackStouffer) commented on 2016-07-19T04:00:14Z

(In reply to Manu from comment #2)
> That's fine.
> Does that already exist? I couldn't see anything on dlang.org.
> 
> Obviously the Pred function should remain the first template arg; the flag
> can be second...

It's in the nightlies and on the prerelease docs.

turkeyman commented on 2016-07-21T09:35:55Z

Wait up. I misread... you say std.regex.splitter.
No, that's not what I'm asking for. I'm interested in std.algorithm.iteration.splitter. The option should be available there.

greeenify commented on 2016-12-30T21:18:06Z

pull: https://github.com/dlang/phobos/pull/5008

turkeyman commented on 2016-12-31T02:20:55Z

(In reply to greenify from comment #5)
> pull: https://github.com/dlang/phobos/pull/5008

Doesn't implement desired behaviour.

greeenify commented on 2016-12-31T08:14:41Z

> Doesn't implement desired behaviour.

Fair enough - I tried to make it similar to `splitter` in `std.regex`, but it seems that even this didn't work out:

"a..b.c".splitter!(Yes.keepSeparators)(regex("[.]")).writeln;

> ["a", ".", "", ".", "b", ".", "c"]

"a..b.c".splitter!(Yes.keepSeparators)('.').writeln;

> ["a", ".", ".", "b", ".", "c"]

From the example you posted, you want it to yield something like this, right?

> ["a", ".", ".b", ".c"]

And there's another common use case, dropping the empty pieces - though that one is simply splitter.filter!`!a.empty`:

> ["a", "b", "c"]

However, at least the existing behavior for No.keepSeparators is the same:

"a..b.c".splitter!(No.keepSeparators)(regex("[.]")).writeln;

> ["a", "", "b", "c"]

"a..b.c".splitter!(No.keepSeparators)('.').writeln;

> ["a", "", "b", "c"]
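
For reference, a short sketch of how the cases above look with what already ships in Phobos. The wrapper names (`splitPlain`, `splitNonEmpty`, `splitKeepingSeparators`) are hypothetical and only there to keep the example self-contained; the `Yes.keepSeparators` flag is used with `std.regex.splitter` only, since that is where it exists today.

```d
import std.algorithm.iteration : filter, splitter;
import std.array : array;
import std.range.primitives : empty;
import std.regex : regex, regexSplitter = splitter; // renamed to avoid a clash
import std.typecons : Yes;

/// std.algorithm splitter: separators are dropped, empty pieces are kept.
string[] splitPlain(string s)
{
    return s.splitter('.').array;
}

/// Same split, with the empty pieces filtered out afterwards.
string[] splitNonEmpty(string s)
{
    return s.splitter('.').filter!(p => !p.empty).array;
}

/// std.regex splitter with Yes.keepSeparators: each separator is yielded
/// as its own element between the pieces.
string[] splitKeepingSeparators(string s)
{
    return regexSplitter!(Yes.keepSeparators)(s, regex("[.]")).array;
}

unittest
{
    assert(splitPlain("a..b.c") == ["a", "", "b", "c"]);
    assert(splitNonEmpty("a..b.c") == ["a", "b", "c"]);
    assert(splitKeepingSeparators("a.b.c") == ["a", ".", "b", ".", "c"]);
}
```

None of these gives the "separator starts the next piece" result asked for in the description.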

turkeyman commented on 2020-07-18T11:40:21Z

They are all interesting cases. I think splitter should be configurable like this.
