datacompy
Fugue support for extra helper functions from core
The core Pandas code currently has a number of helper functions that I think are generally very helpful. We need to spend some time exploring them and seeing which ones can be mirrored in, or included via, the Fugue implementation.
This is a list of most of the functions. Not all of them will make sense to move over, but we should investigate which ones do:
- [x] df1_unq_columns (#217)
- [x] df2_unq_columns (#217)
- [x] intersect_columns (#217)
- [x] all_columns_match (#219)
- [x] all_rows_overlap (#244)
- [x] count_matching_rows (#294)
- [ ] intersect_rows_match
- [x] matches (is_match)
- [ ] subset
- [ ] sample_mismatch
- [ ] all_mismatch
- [ ] columns_equal
- [ ] compare_string_and_date_columns
Need to look into these a bit more. Low priority for now.
- [ ] get_merged_columns
- [ ] temp_column_name
- [ ] calculate_max_diff
- [ ] generate_id_within_group
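
For context, the column-level helpers at the top of the list only need the column names of the two frames, not the row data, so the same logic can run regardless of backend. Below is a minimal, backend-agnostic sketch of that idea in plain Python. It is not the datacompy or Fugue implementation: `unique_columns` is a hypothetical helper name, and the other two functions only borrow the names from the list above for illustration.

```python
from typing import List, Sequence


def unique_columns(left: Sequence[str], right: Sequence[str]) -> List[str]:
    """Columns present in `left` but not in `right`, in `left` order.

    Hypothetical helper: calling it with (df1, df2) mirrors the idea of
    df1_unq_columns, and with (df2, df1) the idea of df2_unq_columns.
    """
    right_set = set(right)
    return [col for col in left if col not in right_set]


def intersect_columns(left: Sequence[str], right: Sequence[str]) -> List[str]:
    """Columns present in both inputs, in `left` order."""
    right_set = set(right)
    return [col for col in left if col in right_set]


def all_columns_match(left: Sequence[str], right: Sequence[str]) -> bool:
    """True when neither side has a column the other lacks."""
    return not unique_columns(left, right) and not unique_columns(right, left)


# Illustrative column names only; any backend that exposes a list of
# column names could feed these functions.
df1_cols = ["id", "name", "amount"]
df2_cols = ["id", "amount", "currency"]

df1_only = unique_columns(df1_cols, df2_cols)
df2_only = unique_columns(df2_cols, df1_cols)
shared = intersect_columns(df1_cols, df2_cols)
columns_match = all_columns_match(df1_cols, df2_cols)
```

The row-level helpers further down the list (for example `sample_mismatch` or `all_mismatch`) need the joined data itself, which is presumably why they take more thought to port.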
@goodwanghan @kvnkho FYI, no pressure to contribute, but this is something in our backlog that I'm thinking about to ensure full parity in terms of functionality.
Sounds good, let's chat about it