lr4d
### Problem description

When working with datetime columns and using `dates_as_objects=False`, I expect null values to be `np.datetime64` objects (i.e. `np.datetime64("NaT")`). However, kartothek is allowing an update with a schema...
- Make it clearer that `DatasetMetadata` can be instantiated from `DatasetFactory` (current method: `from_dataset`). Suggestion: make this the default and deprecate uuid/store combinations
- `DatasetMetadataBase` and `DatasetMetadata` do not support...
# Description:

The following two statements are equivalent:

- `x in [1,2,3]`
- `(x == 1) or (x == 2) or (x == 3)`

This approach simplifies the function by...
### Problem description

We use the `in` operator internally in predicate parsing, but we can just re-write the predicates to use a disjunction of `==` terms. e.g. ` [[('A', 'in',...
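The rewrite of `in` terms into a disjunction of `==` terms can be sketched as follows. This is a minimal illustration, not kartothek's implementation: the helper name `expand_in_terms` is hypothetical, and it assumes predicates are given in disjunctive normal form, i.e. an outer list of conjunctions that are OR-ed together, each a list of `(column, op, value)` tuples that are AND-ed together.

```python
from itertools import product


def expand_in_terms(predicates):
    """Rewrite every (column, 'in', values) term as '==' terms.

    Hypothetical helper. Assumes `predicates` is a list of conjunctions
    (OR-ed), each a list of (column, op, value) tuples (AND-ed).
    """
    expanded = []
    for conjunction in predicates:
        # Collect the alternatives for each term: an `in` term has one
        # alternative per listed value, any other term has exactly one.
        alternatives = []
        for column, op, value in conjunction:
            if op == "in":
                alternatives.append([(column, "==", v) for v in value])
            else:
                alternatives.append([(column, op, value)])
        # Distribute AND over OR: emit one conjunction per combination.
        expanded.extend(list(combo) for combo in product(*alternatives))
    return expanded


print(expand_in_terms([[("A", "in", [1, 2]), ("B", ">", 0)]]))
```

Note that an `in` term with an empty value list removes its whole conjunction, which matches `x in []` never being true. How a caller should treat a result that ends up empty is left open here.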
### Problem description

For a first-time user, the expected input of the cube functions is currently not very clear. For example, for the function https://github.com/JDASoftwareGroup/kartothek/blob/4008de4436dde947faa3979f3ccf035f99c955c0/kartothek/io/dask/bag_cube.py#L451 we don't define the expected type...
### Problem description

As commented by @NeroCorleone in #397:

> We have seen weird IOErrors on long running ktk/dask computations that have caused incidents

These errors happen while reading Parquet...
Complexity reduction. **Related issues** https://github.com/JDASoftwareGroup/kartothek/issues/213
With the intention of improving resilience, > instead of storing the intermediate results (partition metadata) on dask workers we could submit those to a more central instance (an event bus...
`DatasetMetadata` / `DatasetFactory` refactoring:

- [ ] https://github.com/JDASoftwareGroup/kartothek/issues/320
- [ ] https://github.com/JDASoftwareGroup/kartothek/issues/311
- [ ] https://github.com/JDASoftwareGroup/kartothek/issues/282, specifically also https://github.com/JDASoftwareGroup/kartothek/issues/282#issuecomment-631278716

On `delete_scope` / `DatasetMetadataBase.query` (all very related):

- [ ] https://github.com/JDASoftwareGroup/kartothek/issues/196...
The `DatasetFactory` still has certain design flaws, and it was not intended to be used for write paths, primarily because it only makes sense if the dataset actually exists since...