returns icon indicating copy to clipboard operation
returns copied to clipboard

Add transducers support

Open sobolevn opened this issue 5 years ago • 23 comments

https://medium.com/javascript-scene/transducers-efficient-data-processing-pipelines-in-javascript-7985330fe73d https://dev.to/romanliutikov/understanding-transducers-in-javascript-4pdg

Currently we use very straight-to-action helpers. Sometimes we might need more efficient ones. Here why we need transducers.

sobolevn avatar Sep 16 '20 06:09 sobolevn

https://ramdajs.com/docs/#expand

thepabloaguilar avatar Apr 25 '21 18:04 thepabloaguilar

https://github.com/cognitect-labs/transducers-python

thepabloaguilar avatar Apr 25 '21 19:04 thepabloaguilar

We can implement based in the link above, it was inspired by Clojure implementation that has some cool features like early termination using Reduced sentinel!

In fact almost everything there can be translated to returns using typing, specially the transducers parts! I've written the transducers for map and filter:

def mapping(
    function: Callable[[_T], _T],
) -> Callable[
    [Callable[[List[_T], _T], List[_T]]],
    Callable[[List[_T], _T], List[_T]],
]:
    def reducer(
        reducing: Callable[[List[_T], _T], List[_T]],
    ) -> Callable[[List[_T], _T], List[_T]]:
        def map_(
            collection: List[_T],
            item: _T,
        ) -> List[_T]:
            return reducing(collection, function(item))
        return map_
    return reducer

def filtering(
    predicate: Callable[[_T], bool],
) -> Callable[
    [Callable[[List[_T], _T], List[_T]]],
    Callable[[List[_T], _T], List[_T]],
]:
    def reducer(
        reducing: Callable[[List[_T], _T], List[_T]],
    ) -> Callable[[List[_T], _T], List[_T]]:
        def filter_(
            collection: List[_T],
            item: _T
        ) -> List[_T]:
            if predicate(item):
                return reducing(collection, item)
            return collection
        return filter_
    return reducer

P.S.: The types in my code are wrong yet 'cause transducers should work with any kind of input and there I'm fixing List[_T]

thepabloaguilar avatar Apr 25 '21 19:04 thepabloaguilar

Great stuff!

sobolevn avatar Apr 25 '21 20:04 sobolevn

Maybe is because I'm so tired now after many hours in front of a computer, but I can't figure out how to solve the problem I'm getting.

Transducer map implementation:

_ValueType = TypeVar('_ValueType')
_NewValueType = TypeVar('_NewValueType')

_AccType = TypeVar('_AccType')
_NewAccType = TypeVar('_NewAccType')


def tmap(
    function: Callable[[_ValueType], _NewValueType],
) -> Callable[
    [Callable[[_AccType, _NewValueType], _AccType]],
    Callable[[_AccType, _ValueType], _AccType],
]:
    def reducer(
        reducing: Callable[[_AccType, _NewValueType], _AccType],
    ) -> Callable[[_AccType, _ValueType], _AccType]:
        def map_(acc: _AccType, value: _ValueType) -> _AccType:
            return reducing(acc, function(value))
        return map_
    return reducer

Every type seems correct:

  • function should be the function we normally pass to a map, like, lambda item: str(item)
    • Callable[[_ValueType], _NewValueType]
  • reducing is the function to reduce/next step, like, lambda acc, value: acc + value
    • Callable[[_AccType, _NewValueType], _AccType]
  • The others are basic types, should be correct as well

But when I try to make a composition I got a error, but it should work!

Test case:

from typing import List
from returns.transducers import tmap

def to_str(number: int) -> str:
    ...

f_step = tmap(to_str)
second_step = f_step(tmap(to_str))

Test output:

E   Actual:
E     main:14: error: Argument 1 has incompatible type "Callable[[Callable[[_AccType, Any], _AccType]], Callable[[_AccType, Any], _AccType]]"; expected "Callable[[Any, str], Any]" (diff)
E     main:14: error: Cannot infer function type argument (diff)

Like, Any shouldn't be there.


Other test cases works fine:

from returns.transducers import tmap

def to_str(number: int) -> str:
    ...

reveal_type(tmap(to_str))  # N: Revealed type is 'def [_AccType] (def (_AccType`-3, builtins.str*) -> _AccType`-3) -> def (_AccType`-3, builtins.int*) -> _AccType`-3'
from typing import List
from returns.transducers import tmap

def to_str(number: int) -> str:
    ...

def append(collection: List[str], item: str) -> List[str]:
    ...

reveal_type(tmap(to_str)(append))  # N: Revealed type is 'def (builtins.list*[builtins.str], builtins.int) -> builtins.list*[builtins.str]'
from typing import List
from returns.transducers import tmap

def to_str(number: int) -> str:
    ...

def append(collection: List[str], item: str) -> List[str]:
    ...

my_list: List[str]
reveal_type(tmap(to_str)(append)(my_list, 2))  # N: Revealed type is 'builtins.list*[builtins.str]'

thepabloaguilar avatar Apr 26 '21 04:04 thepabloaguilar

I'd love to know if you have any tips @sobolevn!!

thepabloaguilar avatar Apr 26 '21 04:04 thepabloaguilar

Btw, the actual implementation is working normally:

>>> def append(a, v):
...     a.append(v)
...     return a

>>> tmap(str)(tmap(int)(append))([], 1)
[1]

thepabloaguilar avatar Apr 26 '21 04:04 thepabloaguilar

Ok, I think the composition I've made is wrong! I'll see tomorow

thepabloaguilar avatar Apr 26 '21 05:04 thepabloaguilar

@sobolevn I remember a discussion from #451, what will be a great name for this feature?

Options:

  • Transducers (no change the name)
  • Transform
  • Reducees (from Elixir, tks @bruno-delfino1995 for the link)
  • (Other options you can suggest)

thepabloaguilar avatar Apr 26 '21 18:04 thepabloaguilar

Ok, I think the composition I've made is wrong! I'll see tomorow

Well, my composition was wrong and my thought too hahaha

The types are working correctly with the composition:

>>> from returns.transducers import tmap
>>> append = lambda a, v: a.append(v) or a
>>> tmap(int)(tmap(str)(append))([], '1')
['1']

thepabloaguilar avatar Apr 27 '21 03:04 thepabloaguilar

@sobolevn You can see the progress in this draft PR (yet) or in a very preview documentation here

thepabloaguilar avatar Apr 27 '21 06:04 thepabloaguilar

I have a suggestion, in the actual implementation way the user need to build "everything" by hand like:

xform = compose(
    tmap(func),
    tfilter(func),
)

result = transduce(
    xform,
    reducing_func,
    initial,
    data_to_process,
)

My ideia is to incapsulate in a class:

t = (
    Transformation
        .map(func)
        .filter(func)
)

result = t(
    reducing_func,
    initial,
    data_to_process,
)

Using a class makes clear to the users that they are receiving a transformation instead of a random function!

thepabloaguilar avatar Apr 27 '21 14:04 thepabloaguilar

Yeah, why not!

sobolevn avatar Apr 27 '21 14:04 sobolevn

Well, after some tests I can't find a great solution design 😞 The user most provide a function, and this function can be different (map != filter) My idea now is to provide .from_(map, filter) method to initialize the instance, but I think it is not worth it! For each transducer function we will need to add an initialization method.

Do you have any suggestion @sobolevn?? If not, I'll continue with the normal solution

thepabloaguilar avatar Apr 28 '21 03:04 thepabloaguilar

For each transducer function we will need to add an initialization method.

Sorry, I don't understand. Can you please show me the code?

sobolevn avatar Apr 28 '21 06:04 sobolevn

Sure, but think in our Result container where we have .from_value(...), .from_failure(...). It's the same concept but will we have more than two from_*:

class Transformation(Generic[_AccValueType, _ValueType]):
    def map(
        self,
        function: Callable[[_ValueType], _NewValueType])
    ) -> 'Transformation[_AccValueType, _NewValueType]':
        ...

    @classmethod
    def from_map(
        cls,
        function: Callable[[_ValueType], _NewValueType])
    ) -> 'Transformation[_AccValueType, _NewValueType]':
        ...

    def filter(
        self,
        function: Callable[[_ValueType], _ValueType])
    ) -> 'Transformation[_AccValueType, _ValueType]':
        ...

    @classmethod
    def from_filter(
        cls,
        function: Callable[[_ValueType], _ValueType])
    ) -> 'Transformation[_AccValueType, _ValueType]':
        ...

    # OTHERS TRANSDUCERS THAT HAVE DIFFERENT FUNCTIONS SIGNATURES

thepabloaguilar avatar Apr 28 '21 13:04 thepabloaguilar

Oh, wait. I think I cannot do that, the idea behind transducers is to remove the rule from the process. Transducers don't care about the user input or output. If the users make Transformations[List[str], int] they will be putting this rule again! 🤔

thepabloaguilar avatar Apr 28 '21 14:04 thepabloaguilar

This is what I will do, continue with the first approach and then I can think about improvements!

thepabloaguilar avatar May 02 '21 19:05 thepabloaguilar

@sobolevn I need your help a little, I'm planning to finish the implementation this weekend!

We have the transduce function right here:

def transduce(
    xform: Callable[
        [Callable[[_AccValueType, _ValueType], _AccValueType]],
        Callable[[_AccValueType, _ValueType], _AccValueType],
    ],
    reducing_function: Callable[[_AccValueType, _ValueType], _AccValueType],
    initial: _AccValueType,
    iterable: Iterable[_ValueType],
) -> _AccValueType:
    ...

Sometimes I just want to pass an empty list to the initial value, like:

transduce(
    xform=tfilter(is_even),
    reducing_function=append,
    initial=[],
    iterable=my_list,
)

And I receive this error from mypy:

error: Argument "reducing_function" to "transduce" has incompatible type "Callable[[List[int], int], List[int]]"; expected "Callable[[List[<nothing>], int], List[<nothing>]]"

To fix this issue I need to type hint that list:

initial_list: List[int]

transduce(
    xform=tfilter(is_even),
    reducing_function=append,
    initial=initial_list,
    iterable=my_list,
)

Do you know a way to avoid that??

thepabloaguilar avatar May 08 '21 02:05 thepabloaguilar

Do you know a way to avoid that??

No, this is pretty annoying bug in how mypy works, the thing is: [] is sometimes List[never] and sometimes List[any]. Right now it is not working this way 😞

sobolevn avatar May 08 '21 07:05 sobolevn

Sad 😭 😭 So, I'll continue as is!

thepabloaguilar avatar May 09 '21 05:05 thepabloaguilar

I'm continuing this issue and after a lot of thoughts I end up deciding that transducers is a cool thing but we don't need them in Python in the same way that's implemented in Clojure! The thing is, Python has generators which chained result in the same behavior of the transducer, instead of creating a lot of intermediate iterable they process each element from the start to the end!

>>> l = [1, 2, 3, 4, 5]
>>> def mapf(num: int) -> int:
...     print(num)
...     return num
...
>>> def filterf(num: int) -> bool:
...     print(num)
...     return True
...
>>> list(filter(filterf, map(mapf, l)))
1
1
2
2
3
3
4
4
5
5
[1, 2, 3, 4, 5]

But I do think it's useful for us to provide a better tooling for generators like we have on Clojure transducers! So my idea is to implement that tolling now!

WDYT @sobolevn??

thepabloaguilar avatar Aug 30 '22 04:08 thepabloaguilar

The idea sounds interesting! Please, feel free to send a prototype.

sobolevn avatar Aug 30 '22 08:08 sobolevn