
Option to ignore fields and specify certain fields to flatten

Open mcarans opened this issue 8 years ago • 4 comments

It would be great to be able to specify fields that should not be flattened, since sometimes one or two fields produce a huge number of columns when expanded.

It would also be great to be able to do the opposite, i.e. specify only certain fields to flatten and leave the rest as is.

A related option would be for lists: flatten only the first n elements of a list and ignore the rest (or perhaps put the rest, unflattened, in one column).

Thx for a great utility.

mcarans avatar Mar 09 '17 07:03 mcarans

@mcarans sounds like a great idea and should be relatively easy to implement. I'll take a stab at it over the weekend. Feel free to do a PR if you want to.

amirziai avatar Mar 09 '17 07:03 amirziai

That's great @amirziai, I look forward to trying it. One more option I just thought of is the ability to drop certain fields altogether, and possibly the opposite: specify a list of fields to output. So the full list would be:

  1. List of fields to include in the output
  2. List of fields to exclude from the output
  3. List of fields to flatten
  4. List of fields to not flatten
  5. Specify n, where n is the number of elements to flatten in a field containing a list, and either:
     a. Put the rest unflattened in one column
     b. Don't output the rest

In case you are interested, I am running this script to extract data from the Humanitarian Data Exchange (HDX).

mcarans avatar Mar 09 '17 09:03 mcarans

How about passing a custom function which acts as a filter for each element? Something like:

def is_exported(name):
    # Keep every field except the one named 'exclude'
    return name != 'exclude'
...
result = flatten({}, filter=is_exported)
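
To make the idea concrete, here is a minimal, self-contained sketch of a flattener that accepts such a callback. It is not the library's code: `flatten_with_filter`, its `key_filter` parameter, and the `_` separator default are hypothetical names chosen for illustration, and the filter is assumed to receive the bare key name rather than the full path.

```python
def flatten_with_filter(nested, key_filter=lambda name: True, separator="_"):
    """Flatten nested dicts/lists, skipping any dict key rejected by key_filter.

    Hypothetical helper for illustration; not part of any existing library.
    """
    out = {}

    def walk(value, prefix):
        if isinstance(value, dict):
            for key, child in value.items():
                if key_filter(key):  # the filter sees the key name only
                    walk(child, f"{prefix}{separator}{key}" if prefix else str(key))
        elif isinstance(value, list):
            for index, child in enumerate(value):
                walk(child, f"{prefix}{separator}{index}" if prefix else str(index))
        else:
            out[prefix] = value  # leaf value: emit one column

    walk(nested, "")
    return out


def is_exported(name):
    return name != "exclude"


result = flatten_with_filter(
    {"a": {"b": 1, "exclude": 2}, "exclude": 3},
    key_filter=is_exported,
)
```

In this sketch the field named `exclude` is dropped at every nesting level, because the callback is consulted for each dictionary key as it is visited.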

aquilax avatar Mar 09 '17 11:03 aquilax

@aquilax I think there would need to be 2 filters, one for inclusion/exclusion (1 and 2) and another for flattening/not flattening (3 and 4), and then another parameter for 5, e.g. something like: result = flatten({}, included_filter=is_included, flattened_filter=is_flattened, maxlistsize=n) They could have defaults (i.e. include everything, flatten everything, no max list size).

I am open as to the best way to do it, as long as options 1-5 above are possible.
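
As a rough sketch of how those three parameters could fit together (an assumption about one possible design, not an existing API): `flatten_with_options`, `max_list_size`, and the `rest` column suffix below are hypothetical names, both filters are assumed to receive the bare key name, and the list handling follows option 5a (overflow kept unflattened in one column).

```python
def flatten_with_options(nested, included_filter=None, flattened_filter=None,
                         max_list_size=None, separator="_"):
    """Hypothetical flattener covering options 1-5 with permissive defaults."""
    include = included_filter or (lambda name: True)   # default: include everything
    expand = flattened_filter or (lambda name: True)   # default: flatten everything
    out = {}

    def join(prefix, part):
        return f"{prefix}{separator}{part}" if prefix else str(part)

    def walk(value, prefix, name):
        is_container = isinstance(value, (dict, list))
        # Leaves, and containers the caller asked not to flatten, are stored as is.
        if not is_container or (name is not None and not expand(name)):
            out[prefix] = value
            return
        if isinstance(value, dict):
            for key, child in value.items():
                if include(key):
                    walk(child, join(prefix, key), key)
        else:
            limit = len(value) if max_list_size is None else max_list_size
            for index, child in enumerate(value[:limit]):
                walk(child, join(prefix, index), name)
            if len(value) > limit:
                # Option 5a: keep the overflow unflattened in a single column.
                out[join(prefix, "rest")] = value[limit:]

    walk(nested, "", None)
    return out


record = {
    "id": 7,
    "secret": "x",
    "meta": {"a": 1, "b": {"c": 2}},
    "tags": ["p", "q", "r"],
}
flat = flatten_with_options(
    record,
    included_filter=lambda name: name != "secret",
    flattened_filter=lambda name: name != "b",
    max_list_size=2,
)
```

Here `secret` is dropped, `meta` is expanded but its `b` field is left as a nested dict, and only the first two `tags` get their own columns. Option 5b would simply omit the `rest` column instead of writing it.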

mcarans avatar Mar 09 '17 12:03 mcarans