bubbles
[NOT MAINTAINED] Bubbles – Python ETL framework
Allow parts of compound or indexable field types, such as dates and arrays, to be used in operations. Example:

```
p.filter_by_value(FieldPart("event_date", "year"), 2013)
```

Advantages:
- fewer steps, no need for explicit extraction...
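A minimal sketch of how a field part reference could be resolved, using plain dicts as rows. `FieldPart`, `resolve` and `filter_by_value` below are hypothetical stand-ins written for illustration; they are not the Bubbles API, and the rule that string parts name attributes while integer parts index into sequences is an assumption.

```python
from collections import namedtuple
from datetime import date

# Hypothetical stand-in for the proposed FieldPart; not the Bubbles API.
FieldPart = namedtuple("FieldPart", ["field", "part"])


def resolve(row, ref):
    """Return the value that `ref` points at in `row` (a dict)."""
    if isinstance(ref, FieldPart):
        value = row[ref.field]
        # Assumption: integer parts index into sequences,
        # string parts name attributes (e.g. "year" on a date).
        if isinstance(ref.part, int):
            return value[ref.part]
        return getattr(value, ref.part)
    # A plain string is treated as a whole-field reference.
    return row[ref]


def filter_by_value(rows, ref, expected):
    """Yield rows whose field (or field part) equals `expected`."""
    return (row for row in rows if resolve(row, ref) == expected)


rows = [
    {"id": 1, "event_date": date(2013, 5, 1)},
    {"id": 2, "event_date": date(2014, 1, 9)},
]
matched = list(filter_by_value(rows, FieldPart("event_date", "year"), 2013))
```

Here `matched` keeps only the row dated in 2013, with no separate extraction step.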
Operations that are composed of other operations have no mechanism for dealing with consumable objects, as the Pipeline and ExecutionEngine do. If an object is to be consumed multiple times, the...
`join_details` should work without explicit keys: columns with the same name in both objects should be used as keys. Variation: only one column with the same name is used, if more than one...
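The base behaviour could be sketched as follows, assuming each object exposes its field names as a list of strings. `infer_join_keys` is a hypothetical helper, not part of Bubbles, and the variation mentioned above is deliberately not covered.

```python
def infer_join_keys(master_fields, detail_fields):
    """Return the column names present in both objects, in master order.

    Hypothetical helper: both arguments are plain lists of field names.
    """
    detail = set(detail_fields)
    common = [name for name in master_fields if name in detail]
    if not common:
        raise ValueError("no common column names to use as join keys")
    return common


keys = infer_join_keys(
    ["order_id", "customer_id", "amount"],
    ["customer_id", "name", "country"],
)
```

Raising an error when nothing matches is a design choice made for this sketch; silently producing a cross join would be harder to notice.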
Operations such as `string_to_date` should use the same format patterns that SQL databases use (see [PostgreSQL](http://www.postgresql.org/docs/9.2/static/functions-formatting.html#FUNCTIONS-FORMATTING-DATETIME-TABLE) for an example). Reason: they are more human-readable than the `strptime()` format with its `%` directives. Note that the SQL format...
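One way this could work is to translate the SQL-style pattern into a `strptime()` format before parsing. The sketch below covers only a handful of PostgreSQL patterns (`YYYY`, `MM`, `DD`, `HH24`, `MI`, `SS`); the function names are hypothetical, and a real implementation would need the full pattern table and quoting rules.

```python
import re
from datetime import datetime

# Deliberately incomplete mapping of PostgreSQL-style patterns
# to strptime directives; extend as needed.
SQL_TO_STRPTIME = {
    "YYYY": "%Y",
    "MM": "%m",
    "DD": "%d",
    "HH24": "%H",
    "MI": "%M",
    "SS": "%S",
}

# Try longer tokens first so a long pattern is never split into shorter ones.
_TOKEN_RE = re.compile(
    "|".join(sorted(SQL_TO_STRPTIME, key=len, reverse=True))
)


def sql_format_to_strptime(sql_format):
    """Translate a SQL-style date pattern into a strptime format string."""
    # Escape literal percent signs so strptime does not read them as directives.
    escaped = sql_format.replace("%", "%%")
    return _TOKEN_RE.sub(lambda m: SQL_TO_STRPTIME[m.group(0)], escaped)


def parse_with_sql_format(value, sql_format):
    """Parse `value` using a SQL-style pattern (hypothetical helper)."""
    return datetime.strptime(value, sql_format_to_strptime(sql_format))


parsed = parse_with_sql_format("2013-05-01 13:45:00", "YYYY-MM-DD HH24:MI:SS")
```

The caller writes `YYYY-MM-DD HH24:MI:SS` rather than `%Y-%m-%d %H:%M:%S`, which is the readability gain the issue asks for.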
Define a consumable retention policy. Currently the object is expected to provide retention itself, which is in most cases sub-optimal, such as consuming all data into a list...
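As an illustration of one possible policy (not necessarily the one this issue has in mind), a one-shot iterator can be branched for a known number of consumers with the standard library's `itertools.tee`, instead of being copied into a list up front. The `branch` helper and the sample rows are hypothetical.

```python
import itertools


def branch(iterator, consumers):
    """Split a one-shot iterator into `consumers` independent iterators."""
    return itertools.tee(iterator, consumers)


# A consumable source: it can be iterated only once.
rows = iter([(1, "a"), (2, "b"), (3, "c")])

first, second = branch(rows, 2)
ids = [row[0] for row in first]
labels = [row[1] for row in second]
```

`tee` buffers only the items that the slowest consumer has not yet read, so it saves memory when consumers advance roughly together. If one consumer is exhausted before another starts, as in this small example, the buffer ends up holding every row, which is no better than a list.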
Create a quick reference manual (preferably PDF) with:
- list of operations
- list of stores
- list of object types and their representations