`join` util

Open tfga opened this issue 4 years ago • 4 comments

I'd like to propose a new join util.

The idea is simple: it calls foreach(str) before calling str.join. So, whenever you have something like

| foreach(str)
| ', '.join

you can replace it with

| join(', ')

You can customize the string conversion by passing either a function or a format string as the second parameter, e.g.:

| join(', ', lambda x: '-{}-'.format(x))        # function
| join(', ', '-{}-')                            # fmt string

Implementation

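from pipetools import foreach

def join(delim, formatter=str):
    '''
    Convert each item to a string, then join the results with `delim`.

    join(' ')               # items converted with str
    join(' ', fmtFn)        # items converted with a function
    join(' ', fmtString)    # items formatted with a format string, e.g. '-{}-'
    '''
    return foreach(formatter) | delim.join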
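
For readers without pipetools at hand, here is a rough plain-Python sketch of what the proposed util is meant to compute. The name `join_sketch` is made up for illustration, and treating a string formatter as a single-argument `str.format` template is a simplification of how `foreach` handles format strings, not the library's actual code.

```python
def join_sketch(delim, formatter=str):
    """Hypothetical stand-in for the proposed join util, without pipetools."""
    # Assumption: a string formatter is used as a one-argument format template,
    # roughly mirroring foreach('-{}-').
    if isinstance(formatter, str):
        convert = formatter.format
    else:
        convert = formatter

    def run(items):
        # Convert every item first, then join the converted strings.
        return delim.join(convert(item) for item in items)

    return run


plain = join_sketch(', ')([1, 2, 3])
with_function = join_sketch(', ', lambda x: '-{}-'.format(x))([1, 2, 3])
with_template = join_sketch(', ', '-{}-')([1, 2, 3])
```

The three results correspond to the three cases exercised by the tests in this issue.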

Tests

def test_join(self):
    
    r = [1, 2, 3] > (pipe
                    | join(', ')
                    )
    
    self.assertEqual(r, '1, 2, 3')
    
    
def test_join_with_formatter(self):
    
    r = [1, 2, 3] > (pipe
                    | join(', ', lambda x: '-{}-'.format(x))
                    )
    
    self.assertEqual(r, '-1-, -2-, -3-')
    
    
def test_join_with_fmtString(self):
    
    r = [1, 2, 3] > (pipe
                    | join(', ', '-{}-')
                    )
    
    self.assertEqual(r, '-1-, -2-, -3-')

tfga avatar May 28 '20 23:05 tfga

Hey, thanks for the suggestion!

Some things for discussion:

  1. The name - I think join might be too generic, because it could also be used to join different sequences together (think SQL JOIN), so maybe it should be strjoin or something.

  2. Not sure about the version with formatting - just because for me:

    | join(', ', '-{}-')
    

    is much less understandable than:

    | foreach('-{}-') 
    | ', '.join
    
  3. Would it be possible to automatically perform the string conversion if we encounter the str.join function? I think it should be, in which case you could just do this and it would work:

    [1, 2, 3] > pipe | ', '.join
    

    And it would also solve questions 1 and 2 and look more like natural Python. Even though it might be a bit more tricky to implement.
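
One way the detection in point 3 might look, as a sketch only: a bound `', '.join` exposes the delimiter string as `__self__` and has `__name__ == 'join'`, so it can be recognised and wrapped. The helper names below are hypothetical and nothing here is part of pipetools.

```python
def is_str_join(func):
    """Return True if func looks like a bound str.join, e.g. ', '.join."""
    bound_to = getattr(func, '__self__', None)
    return isinstance(bound_to, str) and getattr(func, '__name__', None) == 'join'


def stringify_before_join(func):
    """Hypothetical wrapper: convert items with str before a bound str.join."""
    if not is_str_join(func):
        return func  # leave every other callable untouched
    return lambda items: func(map(str, items))


joined = stringify_before_join(', '.join)([1, 2, 3])
untouched = stringify_before_join(len)
```

A pipe could apply such a wrapper to each function it composes, which is where the extra implementation effort mentioned above would go.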

Let me know what you think.

0101 avatar May 30 '20 22:05 0101

  1. The name - I think join might be too generic, because it could also be used to join different sequences together (think SQL JOIN), so maybe it should be strjoin or something.

I don't see this as a problem. It's not usual to have multiple joins in the same namespace. IME, whenever I have a join in my code, it's this one. It doesn't clash with SQL's join because SQL is usually embedded in a string. Neither can it clash with str.join() because the latter is a method of str. And, in the rare case where a clash happens, people can import the parent module to disambiguate.

  2. Not sure about the version with formatting - just because for me:

    | join(', ', '-{}-')

    is much less understandable than:

    | foreach('-{}-')
    | ', '.join

It doesn't bother me. I used something equivalent in Java for years. Maybe it's just a matter of familiarity / getting used to it? :man_shrugging:

  3. Would it be possible to automatically perform the string conversion if we encounter the str.join function? I think it should be, in which case you could just do this and it would work:

    [1, 2, 3] > pipe | ', '.join

And it would also solve questions 1 and 2 and look more like natural Python. Even though it might be a bit more tricky to implement.

I wonder if this wouldn't be a little too magical? :thinking:

And in some cases the string conversion would happen more than once. If you perform the conversion yourself, e.g.:

| foreach('-{}-')
| ', '.join

by the time execution reached the join step, it wouldn't know the items had already been converted and would convert them again.

tfga avatar Oct 28 '20 21:10 tfga

PS: Sorry for the huge delay in the reply.

PPS: Thank you for the project. It changed my life.

Seriously. When I first learned about it, I was like: "Where have you been my entire life?!"
:joy:

tfga avatar Oct 30 '20 19:10 tfga

No worries, there's no rush ;) I'm glad you find the library useful!

About the join name, I meant that it's something that could potentially also be in the pipetools namespace, as a general-purpose collection-joining operation.

Think something like:

In [1]: orders > join(users, X.user_id, X.id) | list

Out[1]: [([order1, order2], [user1]), 
         ([order3], [user2])]
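
To make the shape of that output concrete, here is a plain-Python sketch of one possible reading of such a collection join. The function name `group_join`, the key functions, and the choice to drop users without orders are all assumptions for illustration. No such util exists in pipetools.

```python
def group_join(left, right, left_key, right_key):
    """Hypothetical collection join: group left items by key, attach matching right items."""
    groups = {}
    for item in left:
        # Each key maps to a pair: (left items, right items).
        groups.setdefault(left_key(item), ([], []))[0].append(item)
    for item in right:
        key = right_key(item)
        if key in groups:  # assumption: right items without a match are dropped
            groups[key][1].append(item)
    return list(groups.values())


orders = [
    {'id': 1, 'user_id': 10},
    {'id': 2, 'user_id': 10},
    {'id': 3, 'user_id': 20},
]
users = [{'id': 10}, {'id': 20}]

result = group_join(orders, users, lambda o: o['user_id'], lambda u: u['id'])
```

Here `result` has the same structure as `Out[1]` above: one tuple per user id, holding that user's orders and the matching user records.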

Maybe concat could be used instead, since for concatenating collections there's already chain.

About automatically converting to string before str.join, I wouldn't worry about double string conversion as that's not going to do anything. But I'd be more worried about silently continuing instead of failing when it's not able to produce anything useful - like with some objects that don't have __str__ implemented.
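
Both points can be checked in plain Python. This is a small illustration with a made-up `Order` class, not pipetools code: converting a string again leaves it unchanged, while an object without its own `__str__` does not raise and instead falls back to the default object representation.

```python
class Order:
    """Hypothetical class that defines neither __str__ nor __repr__."""


already_strings = ['-1-', '-2-', '-3-']
# str() on a str returns an equal string, so a second conversion changes nothing.
converted_again = [str(s) for s in already_strings]

# No exception here: str() falls back to object.__repr__,
# producing something like '<...Order object at 0x...>'.
joined = ', '.join(map(str, [Order(), Order()]))
```

So an automatic conversion would quietly produce a string of default representations rather than fail, which is the silent-continuation concern raised above.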

0101 avatar Nov 02 '20 21:11 0101