datacompy

Fugue support for extra helper functions from core

Open fdosani opened this issue 2 years ago • 2 comments

The core Pandas code currently has some helper functions that I think are generally very helpful. We need to spend some time exploring them and seeing which ones can be mirrored or included in the Fugue implementation.

This is a list of most of the functions. Not all of them will make sense to move over, but we should investigate which ones do:

  • [x] df1_unq_columns (#217)
  • [x] df2_unq_columns (#217)
  • [x] intersect_columns (#217)
  • [x] all_columns_match (#219)
  • [x] all_rows_overlap (#244)
  • [x] count_matching_rows (#294)
  • [ ] intersect_rows_match
  • [x] matches (is_match)
  • [ ] subset
  • [ ] sample_mismatch
  • [ ] all_mismatch
  • [ ] columns_equal
  • [ ] compare_string_and_date_columns

Need to look into these a bit more. Low priority for now.

  • [ ] get_merged_columns
  • [ ] temp_column_name
  • [ ] calculate_max_diff
  • [ ] generate_id_within_group
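For context on the column-level items at the top of the first checklist (`df1_unq_columns`, `df2_unq_columns`, `intersect_columns`, `all_columns_match`), here is a minimal sketch of what such helpers compute. It is an illustration only, not the datacompy or Fugue implementation: the function names `unq_columns`, `intersect_columns`, and `all_columns_match` below are hypothetical stand-ins defined in the snippet, and they operate on plain lists of column names rather than on dataframes.

```python
from typing import Iterable, List


def unq_columns(cols1: Iterable[str], cols2: Iterable[str]) -> List[str]:
    """Return the columns in cols1 that are missing from cols2, in cols1 order."""
    other = set(cols2)
    return [c for c in cols1 if c not in other]


def intersect_columns(cols1: Iterable[str], cols2: Iterable[str]) -> List[str]:
    """Return the columns present on both sides, in cols1 order."""
    other = set(cols2)
    return [c for c in cols1 if c in other]


def all_columns_match(cols1: Iterable[str], cols2: Iterable[str]) -> bool:
    """Return True when neither side has a column the other lacks."""
    left, right = list(cols1), list(cols2)
    return not unq_columns(left, right) and not unq_columns(right, left)


# Hypothetical column names, chosen only for illustration.
df1_cols = ["id", "name", "amount"]
df2_cols = ["id", "amount", "currency"]

df1_only = unq_columns(df1_cols, df2_cols)      # columns only in the first frame
df2_only = unq_columns(df2_cols, df1_cols)      # columns only in the second frame
shared = intersect_columns(df1_cols, df2_cols)  # columns in both frames
same_schema = all_columns_match(df1_cols, df2_cols)
```

Because these helpers depend only on column names, the same logic could in principle be applied to any backend that exposes its schema. The row-level items (`all_rows_overlap`, `count_matching_rows`, and so on) need an actual join and are not covered by this sketch.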

fdosani · Jun 06 '23 14:06

@goodwanghan @kvnkho FYI, no pressure to contribute, but this is something in our backlog that I'm thinking about to ensure full parity in terms of functionality.

fdosani · Jun 19 '23 21:06

Sounds good, let's chat about it.

goodwanghan · Jun 21 '23 16:06