desbordante-core
Desbordante is a high-performance data profiler that can discover many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algor...
Fixes floating-point comparison errors when comparing values with a parameter and adds a test case to cover this issue. Fixes a potential issue with access to highlight_calculator before it is created.
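To illustrate the kind of floating-point comparison error mentioned above, here is a minimal Python sketch of a tolerance-based comparison. It is not the code from the fix: the function name `meets_threshold` and the tolerance values are assumptions chosen for illustration only.

```python
import math


def meets_threshold(value: float, threshold: float,
                    rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """Return True if value >= threshold, treating near-equal values as equal.

    Hypothetical helper: the name and tolerances are illustrative assumptions.
    """
    return value > threshold or math.isclose(
        value, threshold, rel_tol=rel_tol, abs_tol=abs_tol
    )


# A naive comparison fails because 0.1 + 0.2 is slightly larger than 0.3
# in binary floating point.
naive = 0.3 >= 0.1 + 0.2
tolerant = meets_threshold(0.3, 0.1 + 0.2)
print(naive, tolerant)
```

The naive comparison rejects a value that is equal to the parameter up to rounding error, while the tolerant version accepts it.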
Add methods for separator validation in .csv tables
Added a generic destruct method, `Type::Destruct`. Added a method `Destruct(TypeId, std::byte*)` to typed_column_data and mixed_type (maybe moving this method to a separate header file to avoid code duplication is a better...
This regex doesn't match `1.1E10` or `1E+10`, which are valid numbers in scientific notation. So if we're going to support scientific notation, let's support it fully (and add corresponding tests)....
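As a rough illustration of what fuller scientific-notation support could look like, here is a Python sketch. The pattern `SCI_NUMBER` and the helper `is_number` are hypothetical and are not the regex under review; they only show one way to accept an optional sign, an optional fractional part, and an optional signed exponent.

```python
import re

# Hypothetical pattern (an assumption, not the project's regex):
#   optional sign, then digits with an optional fractional part
#   (or a leading-dot fraction), then an optional exponent such as E10 or e+10.
SCI_NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def is_number(text: str) -> bool:
    """Return True if the whole string is a decimal or scientific-notation number."""
    return SCI_NUMBER.fullmatch(text) is not None


for sample in ["1.1E10", "1E+10", "1e-3", "42", "1E", "abc"]:
    print(sample, is_number(sample))
```

Both `1.1E10` and `1E+10` are accepted by this sketch, while an incomplete exponent such as `1E` is rejected. Cases like these would be natural candidates for the corresponding tests.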
We have a separate option `Table` for CSV tables because the table can be specified for the algorithm not only as a path, but also as a pandas DataFrame....
Added date support to AC. Added a test for AC on a dataset with dates. Added dates to the Python example.
In the case of functional dependencies in Desbordante, it is possible to explicitly set a limit on the number of attributes on the left side of the dependency (`--max_lhs`). However,...
See [the comment](https://github.com/Mstrutov/Desbordante/pull/295#discussion_r1404541005) for more details.