Andrus Adamchik issues

Results: 73 issues of Andrus Adamchik

Transformation over a subset of rows

Allow running `map` (or whatever other type-specific ops we come up with per #77) on a subset of rows in a Series or DataFrame, and then merge the result...
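
As a rough illustration of the idea, the sketch below applies an operation to selected row positions only and leaves the other rows untouched, so the result lines up with the original column. It uses plain `java.util` collections rather than the DFLib API; the class `SubsetMapSketch` and the method `mapSubset` are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of DFLib: models a column as a List.
final class SubsetMapSketch {

    // Applies `op` only to the positions accepted by `rowFilter`.
    // All other values are copied unchanged, so the result has the
    // same length and row order as the input ("merge" by position).
    static <T> List<T> mapSubset(List<T> column, IntPredicate rowFilter, UnaryOperator<T> op) {
        List<T> result = new ArrayList<>(column.size());
        for (int i = 0; i < column.size(); i++) {
            T value = column.get(i);
            result.add(rowFilter.test(i) ? op.apply(value) : value);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = List.of("ann", "bob", "cat", "dan");
        // Upper-case only rows 0 and 2
        System.out.println(mapSubset(names, i -> i % 2 == 0, String::toUpperCase));
        // prints [ANN, bob, CAT, dan]
    }
}
```

Merging by position keeps the sketch simple; an actual implementation might select rows by a condition on values instead of an index.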

Type-specific column operations

We need a way to define type-specific operations in DataFrame and Series. A good example is various String operations, such as `split` that can create multiple Series from a single...
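
To make the `split` example concrete, here is a minimal sketch of how one String column could be turned into several. It is written against plain Java lists, not DFLib types; `SplitColumnSketch`, `split`, and the `parts` parameter are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DFLib: models each column as a List<String>.
final class SplitColumnSketch {

    // Splits every value on the literal `delimiter` and returns `parts` columns.
    // Column k holds the k-th token of each row, or null when the row is null
    // or has fewer than k + 1 tokens. The last column keeps any remainder.
    static List<List<String>> split(List<String> column, String delimiter, int parts) {
        List<List<String>> columns = new ArrayList<>();
        for (int k = 0; k < parts; k++) {
            columns.add(new ArrayList<>());
        }
        String regex = Pattern.quote(delimiter); // treat the delimiter literally
        for (String value : column) {
            String[] tokens = value == null ? new String[0] : value.split(regex, parts);
            for (int k = 0; k < parts; k++) {
                columns.get(k).add(k < tokens.length ? tokens[k] : null);
            }
        }
        return columns;
    }
}
```

For instance, splitting the values `"a,b"` and `"c"` on `","` into two parts would give a first column of `a, c` and a second column of `b, null`.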

dflib-jdbc: TableSaver to create target table if missing

`TableSaver` should optionally create the target table if it is missing. There are a few challenges to solve here:

* DDL for table creation is often DB-specific (particularly column types,...

dflib-jupyter: integrate Logback, allow to change log levels in notebook

We need an easy way to change log levels when working in Jupyter:

* Set the default level to `WARN` to avoid logging things like SQL statements to the console....

HTML rendering for dflib-jupyter

Currently we are using our default tabular renderer within the Jupyter environment. Let's design an HTML renderer similar to what pandas does (or better!). The main problem with the plaintext...

DataFrame.unique - produce DataFrame with non-repeating rows

A method to remove duplicate rows from the DataFrame. It should allow comparing either every column in a row or only a subset of columns.
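
One possible way to implement this is sketched below with plain Java collections instead of the DFLib API. The names `UniqueRowsSketch` and `unique` are hypothetical, and keeping the first occurrence of each duplicate is an assumption of this sketch, not something the issue specifies.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of DFLib: models a row as a List<Object>.
final class UniqueRowsSketch {

    // Keeps the first occurrence of each row. Two rows are duplicates when
    // they are equal at every position listed in `keyColumns`; when no key
    // columns are given, all columns are compared.
    static List<List<Object>> unique(List<List<Object>> rows, int... keyColumns) {
        Set<List<Object>> seen = new HashSet<>();
        List<List<Object>> result = new ArrayList<>();
        for (List<Object> row : rows) {
            List<Object> key;
            if (keyColumns.length == 0) {
                key = new ArrayList<>(row); // copy so later row changes cannot alter the key
            } else {
                key = new ArrayList<>(keyColumns.length);
                for (int c : keyColumns) {
                    key.add(row.get(c));
                }
            }
            if (seen.add(key)) {
                result.add(row);
            }
        }
        return result;
    }
}
```

Calling `unique(rows)` compares whole rows, while `unique(rows, 0)` treats rows with the same first column as duplicates.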

Implement Cayenne-free lightweight JdbcConnector

In 99% of cases we don't really need Cayenne to read *source* data for LM jobs. Pure JDBC should suffice. It will be much more lightweight and can implement...

Extractor attributes and "dumb" extractors

I recently wrote a non-open source extractor (and a matching connector) that generates a stream of data by listing S3 bucket contents. S3 object keys are parsed by a stage...

JDBC-level pagination/streaming of large queries

When reading large datasets (e.g. inside `CreateOrUpdateTask.getRowReader()` / `JdbcExtractor.getReader()`), we use a Cayenne iterator, yet the underlying DB may still read the entire ResultSet into memory. This causes two problems: *...

link-move-json does not complain about invalid target attributes

In LM 2.7 I noticed that attributes mapped with incorrect "targets" are simply skipped without any warning. We should probably add a warning to the logs.

```xml
int @.id invalidProperty
```