Table aggregate does not work with len.

Open shashigharti opened this issue 3 years ago • 2 comments

Overview

The "table-aggregate" step doesn't work when len is used as the aggregation function for a field.

from frictionless import Resource, transform, steps

source = Resource(path="784/transform.csv")
target = transform(
    source,
    steps=[
        steps.table_normalize(),
        steps.table_aggregate(
            group_name="name", aggregation={"min": ("population", len)}
        ),
    ],
)
print(target.schema)
print(target.to_view())

FrictionlessException: [step-error] Step is not valid: "table_aggregate" raises "object of type 'generator' has no len()"

shashigharti avatar Nov 07 '22 11:11 shashigharti

Hi @roll,

The error occurs because we are applying the len function to a generator object. One solution is to pass all the data at once, but I am not sure that is a good solution. For that, we would also have to use petl's Table instance (petl.Table) directly. What do you suggest?
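To illustrate the cause in isolation, here is a minimal sketch in plain Python with no frictionless or petl involved. The sample rows and the `column_values` helper are hypothetical and only mimic an aggregation function being fed a generator of field values; they are not the library's actual internals.

```python
# Hypothetical sample rows; not taken from the real transform.csv.
rows = [
    {"name": "germany", "population": 83},
    {"name": "germany", "population": 44},
]


def column_values(rows, field):
    """Hypothetical helper: lazily yield one field's values as a generator."""
    return (row[field] for row in rows)


# min() accepts any iterable, so it works on a generator.
smallest = min(column_values(rows, "population"))

# len() needs a sized object, so it raises TypeError on a generator.
try:
    len(column_values(rows, "population"))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Two ways to count anyway: materialize the values, or count while iterating.
count_materialized = len(list(column_values(rows, "population")))
count_streamed = sum(1 for _ in column_values(rows, "population"))

print(smallest, error_message, count_materialized, count_streamed)
```

Materializing with `list(...)` holds all values of a group in memory, while the `sum(1 for _ in ...)` form counts without storing them, which is the trade-off behind the "pass all the data at once" concern.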

shashigharti avatar Nov 18 '22 12:11 shashigharti

Let's park it for now; we will handle it for v6.

roll avatar Nov 21 '22 08:11 roll

The right way to get the length for each group is:

transform.csv

id,name,population
1,germany,83
2,france,66
3,spain,47
4,germany,44
5,france,80
6,spain,9

from frictionless import Resource, transform, steps

source = Resource(path="transform.csv")
target = transform(
    source,
    steps=[
        steps.table_normalize(),
        steps.table_aggregate(
            group_name="name", aggregation={"len": len}
        ),
    ],
)
print(target.schema)
print(target.to_view())

output

{'fields': [{'name': 'name', 'type': 'string'}, {'name': 'len', 'type': 'any'}]}
+-----------+-----+
| name      | len |
+===========+=====+
| 'france'  |   2 |
+-----------+-----+
| 'germany' |   2 |
+-----------+-----+
| 'spain'   |   2 |
+-----------+-----+
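As a cross-check that does not depend on frictionless, the same per-group counts can be reproduced with the standard library. This is only a sketch: the CSV text is inlined from the sample data in this comment, and the names `CSV_TEXT` and `group_sizes` are chosen here for illustration.

```python
import csv
import io
from collections import Counter

# Sample data copied from the CSV listing in this comment.
CSV_TEXT = """id,name,population
1,germany,83
2,france,66
3,spain,47
4,germany,44
5,france,80
6,spain,9
"""

# Count how many rows fall into each "name" group.
reader = csv.DictReader(io.StringIO(CSV_TEXT))
group_sizes = Counter(row["name"] for row in reader)

# Print the groups in alphabetical order, like the table above.
for name in sorted(group_sizes):
    print(name, group_sizes[name])
```

Each of the three groups should come out with a size of 2, matching the len column.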

shashigharti avatar Jun 19 '23 01:06 shashigharti