Andrus Adamchik

Results 73 issues of Andrus Adamchik

Allow running `map` (or whatever other type-specific ops we come up with per #77) on a subset of rows in a Series or DataFrame, and then merge the result...
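As a rough illustration of the idea, here is a minimal plain-Java sketch that maps only the values matching a condition and leaves the rest in place, so the "merge" step is implicit. It does not use the DataFrame or Series API; the `ConditionalMap` class and `mapWhere` method are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of any library API.
class ConditionalMap {

    // Applies "mapper" only to the values that satisfy "condition".
    // All other values are copied unchanged, preserving the original order.
    static <T> List<T> mapWhere(List<T> values, Predicate<T> condition, UnaryOperator<T> mapper) {
        List<T> result = new ArrayList<>(values.size());
        for (T value : values) {
            result.add(condition.test(value) ? mapper.apply(value) : value);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = List.of("a", "bb", "c");
        // Upper-case only the single-character strings.
        System.out.println(mapWhere(names, s -> s.length() == 1, s -> s.toUpperCase()));
    }
}
```

Returning a new list rather than mutating the input keeps the sketch in line with an immutable-column style, which is an assumption about how such an operation would be exposed.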

We need a way to define type-specific operations in DataFrame and Series. A good example is various String operations, such as `split` that can create multiple Series from a single...

`TableSaver` should optionally create the target table if it is missing. There are a few challenges to solve here: * DDL for table creation is often DB-specific (particularly column types,...

We need an easy way to change log levels when working in Jupyter: * Set the default level to `WARN` to avoid logging things like SQL statements to the console....

Currently we are using our default tabular renderer within the Jupyter environment. Let's design an HTML renderer similar to what pandas does (or better!). The main problem with the plaintext...

A method to remove duplicate rows from the DataFrame. It can compare either every column in a row or a subset of columns.
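The requested behavior can be sketched in plain Java, assuming rows are represented as lists of values and columns are addressed by position. This is only an illustration of the semantics (keep the first occurrence, compare all columns or a chosen subset); `DistinctRows` and `distinct` are hypothetical names, not the DataFrame API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of any library API.
class DistinctRows {

    // Keeps the first occurrence of each row. When no column positions are given,
    // rows are compared on every column; otherwise only on the listed positions.
    static List<List<Object>> distinct(List<List<Object>> rows, int... columns) {
        Set<List<Object>> seen = new HashSet<>();
        List<List<Object>> result = new ArrayList<>();
        for (List<Object> row : rows) {
            List<Object> key;
            if (columns.length == 0) {
                key = row;
            } else {
                key = new ArrayList<>(columns.length);
                for (int c : columns) {
                    key.add(row.get(c));
                }
            }
            // Set.add returns false when an equal key was already seen.
            if (seen.add(key)) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Object>> rows = List.of(
                List.of("a", 1),
                List.of("a", 2),
                List.of("a", 1));
        System.out.println(distinct(rows));     // compares every column
        System.out.println(distinct(rows, 0));  // compares the first column only
    }
}
```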

In 99% of cases we don't really need Cayenne to read *source* data for LM jobs. Pure JDBC should suffice. It will be much more lightweight and can implement...

I recently wrote a non-open source extractor (and a matching connector) that generates a stream of data by listing S3 bucket contents. S3 object keys are parsed by a stage...

When reading large datasets (e.g. inside `CreateOrUpdateTask.getRowReader()` / `JdbcExtractor.getReader()`), we use a Cayenne iterator, yet the underlying DB may still read the entire ResultSet into memory. This causes two problems: *...

In LM 2.7 I noticed that attributes mapped with incorrect "targets" are simply skipped without any warning. Should probably add something to the logs. ```xml int @.id invalidProperty ```