Helper method to nest per dictionary

Open koaning opened this issue 4 years ago • 3 comments

Let's say that I have the monopoly dataset, with rows such as:

{'name': 'Boardwalk',
  'rent': '50',
  'house_1': '200',
  'house_2': '600',
  'house_3': '1400',
  'house_4': '1700',
  'hotel': '2000',
  'deed_cost': '400',
  'house_cost': '200',
  'color': 'blue',
  'tile': '39'}

Let's suppose that I want to change that to:

{'name': 'Boardwalk',
  'color': 'blue',
  'tile': '39',
  'costs': {'deed': '400', 'house': '200'},
  'income': {'rent': '50',
   'hotel': '2000',
   'house_1': '200',
   'house_2': '600',
   'house_3': '1400',
   'house_4': '1700'}}

Then you currently need to run this:

from clumper import Clumper

(Clumper.read_csv("tests/data/monopoly.csv")
  # Build the two nested dictionaries from the flat keys of each row.
  .mutate(costs=lambda d: {"deed": d["deed_cost"], "house": d["house_cost"]},
          income=lambda d: {"rent": d["rent"], "hotel": d["hotel"],
                            **{f"house_{i}": d[f"house_{i}"] for i in [1, 2, 3, 4]}})
  # Remove the flat keys that now live inside `costs` and `income`.
  .drop("house_1", "house_2", "house_3", "house_4", "rent", "hotel", "deed_cost", "house_cost")
  .collect())

It feels like there should be an easier way to do this, and this issue is a place to discuss it. Since it is a row-wise operation, we might come up with a helper function for mutate. But since we also want to drop the original keys afterwards, we might be able to come up with something more general.

koaning, Aug 29 '20 08:08

Maybe something like:

# If you want to gather the data into a dictionary.
clump.mutate(costs=lambda d: gather(d, "deed_cost", "house_cost"))

I am wondering whether we can also come up with a nice inverse. Also, you would still need a drop call after this...
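A minimal sketch of what such a gather helper could look like, written against a plain row dictionary. The name `gather`, its signature, and the `suffix` argument are assumptions for illustration, not an existing part of the library.

```python
def gather(d, *keys, suffix=""):
    """Collect the given keys of row `d` into a new dictionary.

    Hypothetical helper: if `suffix` is given, it is stripped from the
    end of each key name in the result.
    """
    def short(key):
        # Strip the suffix only when one was requested and the key ends with it.
        if suffix and key.endswith(suffix):
            return key[: -len(suffix)]
        return key

    return {short(key): d[key] for key in keys}


row = {"name": "Boardwalk", "deed_cost": "400", "house_cost": "200"}
costs = gather(row, "deed_cost", "house_cost", suffix="_cost")
print(costs)
```

Because the helper only builds the nested value, it would be called from inside a lambda passed to mutate, and the flat keys would still have to be dropped separately.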

koaning, Aug 29 '20 08:08

Maybe this as an inverse?

clump.mutate(spread("costs", suffix="", prefix=""))
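A rough sketch of how the inverse could behave on a single row, assuming it returns a whole new row rather than a single value. The function name `spread` and the `prefix`/`suffix` arguments follow the proposal above; the row-in, row-out shape is an assumption.

```python
def spread(d, key, prefix="", suffix=""):
    """Return a copy of row `d` with the nested dict under `key` flattened.

    Hypothetical helper: each inner key becomes a top-level key named
    prefix + inner_key + suffix, and the nested key itself is removed.
    """
    flat = {k: v for k, v in d.items() if k != key}
    for inner_key, value in d[key].items():
        flat[f"{prefix}{inner_key}{suffix}"] = value
    return flat


row = {"name": "Boardwalk", "costs": {"deed": "400", "house": "200"}}
flat_row = spread(row, "costs", suffix="_cost")
print(flat_row)
```

Since this version replaces the whole row, it removes the nested key by itself, which is something a value-returning helper inside mutate cannot do.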

koaning, Aug 29 '20 08:08

Hmm ... I suppose we could offer spread/gather as verbs. If we want the original keys to be dropped automatically, then making them verbs may even be a must, because a helper used inside mutate cannot remove keys. Another option is to offer a helper function that users can apply via pipe. That might keep the library a lot simpler by keeping the number of verbs down.
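A minimal sketch of the helper-function route, working on a plain list of row dictionaries instead of a Clumper object. The name `nest` and its keyword signature are hypothetical, and how it would be wired into pipe depends on what pipe hands to the function, which is not settled here.

```python
def nest(rows, **groups):
    """Move groups of keys into nested dictionaries and drop the originals.

    Hypothetical helper: `groups` maps each new key to the list of
    existing keys it should hold. Works on a plain list of dicts.
    """
    # Every key that ends up inside a nested dict is removed from the top level.
    moved = {key for keys in groups.values() for key in keys}
    result = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k not in moved}
        for name, keys in groups.items():
            new_row[name] = {k: row[k] for k in keys}
        result.append(new_row)
    return result


rows = [{"name": "Boardwalk", "rent": "50", "hotel": "2000",
         "deed_cost": "400", "house_cost": "200",
         "color": "blue", "tile": "39"}]

nested = nest(rows,
              costs=["deed_cost", "house_cost"],
              income=["rent", "hotel"])
print(nested)
```

Doing the nesting and the dropping in one function is what keeps this out of the verb list: the library only needs to pass its data through, and the inner keys keep their original names unless a renaming step is added.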

koaning, Aug 29 '20 08:08