
Data management framework for Python that provides functionality to describe, extract, validate, and transform tabular data

Results 245 framework issues

# Overview

We have some functions that require collecting data in memory, such as:

- `checks.duplicate_row`
- `checks.deviated_cell/value`
- `resource.analyze`
- etc.

We might provide an internal cache system (switching to...
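To illustrate why a check like `checks.duplicate_row` has to hold state in memory, here is a minimal standard-library sketch. It is not the framework's implementation: the function name `find_duplicate_rows` and the sample rows are hypothetical. It shows that the set of rows already seen grows with the number of unique rows.

```python
import hashlib


def find_duplicate_rows(rows):
    """Return (row_number, first_seen_row_number) pairs for repeated rows.

    Hypothetical helper for illustration only, not part of the framework.
    """
    seen = {}  # row digest -> first row number; grows with every unique row
    duplicates = []
    for number, row in enumerate(rows, start=1):
        # Store a fixed-size digest rather than the row itself
        digest = hashlib.sha256(repr(tuple(row)).encode("utf-8")).hexdigest()
        if digest in seen:
            duplicates.append((number, seen[digest]))
        else:
            seen[digest] = number
    return duplicates


# Hypothetical sample data
rows = [["1", "london"], ["2", "paris"], ["1", "london"]]
print(find_duplicate_rows(rows))
```

Even with fixed-size digests, the `seen` mapping has one entry per unique row, which is the memory pressure an internal cache would have to address.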

enhancement

# Overview

Parallelization can be added to some steps, etc.

feature

# Overview

The "table-aggregate" step doesn't work when used with `len`.

```
source = Resource(path="784/transform.csv")
target = transform(
    source,
    steps=[
        steps.table_normalize(),
        steps.table_aggregate(
            group_name="name", aggregation={"min": ("population", len)}
        ),
    ],
)
print(target.schema)
print(target.to_view())...
```
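For reference, the result one would presumably expect from grouping by `name` and applying `len` to the `population` values can be sketched in plain Python. This does not use the framework's API: the `aggregate` helper and the sample rows are hypothetical, and only the `aggregation` mapping shape is taken from the snippet above.

```python
from collections import defaultdict


def aggregate(rows, group_name, aggregation):
    """Group rows by `group_name` and apply each (field, func) pair per group.

    Hypothetical stand-in for illustration, not the framework's step.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_name]].append(row)
    result = []
    for key, members in groups.items():
        out = {group_name: key}
        for target, (field, func) in aggregation.items():
            # func receives the list of field values in the group
            out[target] = func([member[field] for member in members])
        result.append(out)
    return result


# Hypothetical sample data, not the contents of 784/transform.csv
rows = [
    {"id": 1, "name": "germany", "population": 83},
    {"id": 2, "name": "france", "population": 66},
    {"id": 3, "name": "france", "population": 54},
]
result = aggregate(rows, "name", {"min": ("population", len)})
print(result)
```

With `len` as the aggregation function, the output column (named `min` here, as in the report) holds the number of rows per group.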

bug

# Overview

We need the ability to save metadata + data (package + all resources).

feature

# Overview

This is part of v6's transform work. We probably need to make it immutable (a proxy for cells) for performance.

general

# Overview

We need the ability to save metadata + data.

feature

# Overview

At the moment, it doesn't match. Shall we normalize line endings etc? It's complicated because `python.csv` requires opening files without universal newline translation. On the other hand, the...
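The effect of universal newline translation can be seen with the standard library alone. This is a minimal sketch on hypothetical in-memory bytes, not the framework's loader: `newline=None` rewrites `\r\n` to `\n`, while `newline=""` (the mode recommended for the `csv` module) leaves line endings untouched, so the decoded text no longer matches byte for byte between the two modes.

```python
import csv
import io

# Hypothetical CSV bytes with Windows-style line endings
raw = b"id,name\r\n1,english\r\n"

# Universal newlines: "\r\n" is translated to "\n" while reading
universal = io.TextIOWrapper(
    io.BytesIO(raw), encoding="utf-8", newline=None
).read()

# newline="": line endings are passed through unchanged
untranslated = io.TextIOWrapper(
    io.BytesIO(raw), encoding="utf-8", newline=""
).read()

# The csv module parses the untranslated text and handles "\r\n" itself
parsed = list(csv.reader(io.StringIO(untranslated, newline="")))

print(repr(universal))
print(repr(untranslated))
print(parsed)
```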

bug

# Overview

@pwalsh wrote:

> sleep:
>
> it is a killer if you can't force a sleep between runs. This was a crude way to work around API...
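A forced pause between runs, as requested in the quote, could look roughly like the following. This is a sketch under stated assumptions, not an existing framework feature: `run_with_pause`, its `pause_seconds` parameter, and the injectable `sleep` argument are all hypothetical names.

```python
import time


def run_with_pause(tasks, pause_seconds, sleep=time.sleep):
    """Run callables in order, sleeping between consecutive runs.

    Hypothetical helper; `sleep` is injectable so the pause can be
    replaced in tests.
    """
    results = []
    for index, task in enumerate(tasks):
        if index > 0:
            # Pause only between runs, not before the first one
            sleep(pause_seconds)
        results.append(task())
    return results


results = run_with_pause(
    [lambda: "first", lambda: "second"], pause_seconds=0.01
)
print(results)
```

Sleeping only between runs avoids an unnecessary delay before the first task and after the last one.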

feature

# Overview

The migration from `tabulator/tableschema/datapackage/goodtables` gave a good speed improvement, but we can still make it faster, especially when working with numbers - https://github.com/frictionlessdata/frictionless-py/issues/461

# Tasks

- [ ] create...

general

# Overview

We only need to wrap the corresponding PETL functions.

feature