pipetools
`join` util
I'd like to propose a new `join` util.
The idea is simple: it calls `foreach(str)` before calling `str.join`. So, whenever you have something like
| foreach(str)
| ', '.join
you can replace it with
| join(', ')
You can customize the string conversion by passing either a function or a format string as the second parameter, e.g.:
| join(', ', lambda x: '-{}-'.format(x)) # function
| join(', ', '-{}-') # fmt string
Implementation
from pipetools import foreach

def join(delim, formatter=str):
    '''
    join(' ')
    join(' ', fmtFn)
    join(' ', fmtString)
    '''
    return foreach(formatter) | delim.join
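To make the intended behaviour concrete without depending on pipetools, here is a minimal plain-Python sketch of the same idea. The name `plain_join` and the explicit format-string handling are assumptions for illustration only; in the proposal above, `foreach` is what turns a format string into a function.

```python
def plain_join(delim, formatter=str):
    """Return a function that formats every item and joins the results.

    Hypothetical stand-in for the proposed util, written without pipetools.
    """
    if isinstance(formatter, str):
        # Treat a string as a format template, e.g. '-{}-'.
        formatter = formatter.format

    def joiner(items):
        return delim.join(formatter(item) for item in items)

    return joiner


print(plain_join(', ')([1, 2, 3]))
print(plain_join(', ', '-{}-')([1, 2, 3]))
print(plain_join(', ', lambda x: '-{}-'.format(x))([1, 2, 3]))
```

The three calls mirror the three forms listed in the docstring: default `str` conversion, a format string, and a formatting function.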
Tests
def test_join(self):
    r = [1, 2, 3] > (pipe
        | join(', ')
    )
    self.assertEqual(r, '1, 2, 3')

def test_join_with_formatter(self):
    r = [1, 2, 3] > (pipe
        | join(', ', lambda x: '-{}-'.format(x))
    )
    self.assertEqual(r, '-1-, -2-, -3-')

def test_join_with_fmtString(self):
    r = [1, 2, 3] > (pipe
        | join(', ', '-{}-')
    )
    self.assertEqual(r, '-1-, -2-, -3-')
Hey, thanks for the suggestion!
Some things for discussion:
- The name - I think `join` might be too generic, because it could also be used to join different sequences together (think SQL JOIN), so maybe it should be `strjoin` or something.
- Not sure about the version with formatting - just because for me:
      | join(', ', '-{}-')
  is much less understandable than:
      | foreach('-{}-') | ', '.join
- Would it be possible to automatically perform the string conversion if we encounter the `str.join` function? I think it should be, in which case you could just do this and it would work:
      [1, 2, 3] > pipe | ', '.join
  And it would also solve questions 1 and 2 and look more like natural Python. Even though it might be a bit more tricky to implement.
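As a rough sketch of what "encountering the `str.join` function" could mean in practice: a bound `', '.join` exposes the string it belongs to via `__self__`, so it can be recognized at runtime. The helpers `is_str_join` and `stringify_if_str_join` below are hypothetical names, and this is not how pipetools currently behaves.

```python
def is_str_join(func):
    """True if func is the bound join method of a str instance."""
    return (
        getattr(func, "__name__", None) == "join"
        and isinstance(getattr(func, "__self__", None), str)
    )


def stringify_if_str_join(func):
    """Wrap a bound str.join so items go through str() first (hypothetical)."""
    if not is_str_join(func):
        return func
    return lambda items: func(str(item) for item in items)


joiner = stringify_if_str_join(', '.join)
print(joiner([1, 2, 3]))
```

A pipe implementation could apply such a wrapper whenever a function is added to the pipeline, leaving every other callable untouched.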
Let me know what you think.
> The name - I think `join` might be too generic, because it could also be used to join different sequences together (think SQL JOIN), so maybe it should be `strjoin` or something.
I don't see this as a problem. It's not usual to have multiple `join`s in the same namespace. IME, whenever I have a `join` in my code, it's this one. It doesn't clash with SQL's `join` because SQL is usually embedded in a string. Neither can it clash with `str.join()` because the latter is a method of `str`. And, in the rare case where a clash happens, people can import the parent module to disambiguate.
> Not sure about the version with formatting - just because for me:
> | join(', ', '-{}-')
> is much less understandable than:
> | foreach('-{}-')
> | ', '.join
It doesn't bother me. I used something equivalent in Java for years. Maybe it's just a matter of familiarity / getting used to it? :man_shrugging:
> Would it be possible to automatically perform the string conversion if we encounter the `str.join` function? I think it should be, in which case you could just do this and it would work:
> [1, 2, 3] > pipe | ', '.join
> And it would also solve questions 1 and 2 and look more like natural Python. Even though it might be a bit more tricky to implement.
I wonder if this wouldn't be a little too magical? :thinking:
And in some cases the string conversion would happen more than once. If you perform the conversion yourself, e.g.:
| foreach('-{}-')
| ', '.join
by the time the code reached the `join` part, it wouldn't know about it and would convert it again.
PS: Sorry for the huge delay in the reply.
PPS: Thank you for the project. It changed my life.
Seriously. When I first learned about it, I was like: "Where have you been my entire life?!"
:joy:
No worries, there's no rush ;) I'm glad you find the library useful!
About the `join` name, I meant that it's something that could potentially also be in the pipetools namespace, as a general purpose collection-joining operation.
Think something like:
In [1]: orders > join(users, X.user_id, X.id) | list
Out[1]: [([order1, order2], [user1]),
         ([order3], [user2])]
Maybe `concat` could be used instead, since for concatenating collections there's already `chain`.
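For reference, one possible reading of the collection-joining operation illustrated above, written in plain Python. The function name `group_join`, its signature, and the dict-based sample data are all assumptions; nothing like this exists in pipetools as far as this thread shows.

```python
from collections import defaultdict


def group_join(left, right, left_key, right_key):
    """Group both collections by key and pair up the groups that share a key.

    Hypothetical sketch: returns a list of (left_items, right_items) tuples,
    ordered by the first appearance of each key in `left`.
    """
    left_groups = defaultdict(list)
    for item in left:
        left_groups[left_key(item)].append(item)

    right_groups = defaultdict(list)
    for item in right:
        right_groups[right_key(item)].append(item)

    return [
        (items, right_groups[key])
        for key, items in left_groups.items()
        if key in right_groups
    ]


orders = [
    {"id": 1, "user_id": 10},
    {"id": 2, "user_id": 10},
    {"id": 3, "user_id": 20},
]
users = [{"id": 10}, {"id": 20}]

pairs = group_join(orders, users, lambda o: o["user_id"], lambda u: u["id"])
for order_group, user_group in pairs:
    print(order_group, user_group)
```

Plain lambdas stand in here for the `X.user_id` and `X.id` expressions used in the example.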
About automatically converting to string before `str.join`, I wouldn't worry about double string conversion as that's not going to do anything. But I'd be more worried about silently continuing instead of failing when it's not able to produce anything useful - like with some objects that don't have `__str__` implemented.
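Both points can be checked in plain Python. The sketch below uses a hypothetical `Order` class with no `__str__` or `__repr__`; it shows that converting an existing string again changes nothing, that a bare `str.join` fails loudly on non-strings, and that an automatic `str()` would instead succeed with the default object representation.

```python
class Order:
    """Hypothetical class that defines neither __str__ nor __repr__."""

    def __init__(self, order_id):
        self.order_id = order_id


# Converting already-formatted strings a second time is a no-op.
formatted = ['-{}-'.format(n) for n in [1, 2, 3]]
assert [str(s) for s in formatted] == formatted

# Without any conversion, str.join refuses non-string items.
try:
    ', '.join([1, 2, 3])
    raw_join_failed = False
except TypeError:
    raw_join_failed = True

# With an automatic str(), objects lacking __str__ do not fail;
# they fall back to the default "<... object at 0x...>" text.
text = ', '.join(str(order) for order in [Order(1), Order(2)])
print(raw_join_failed)
print(text)
```

The last line is the "silently continuing" case: the join succeeds, but the output is unlikely to be what the caller wanted.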