python-benedict

Add a SuperFlatten feature: flatten a multi-level dictionary to a single level

Open NOBB2333 opened this issue 11 months ago • 3 comments

When using the flatten feature, I found that for data with multiple levels of nested dictionaries and lists, flattening stops at the dictionary level and does not descend into lists. I also looked at the function in 'benedict/core/flatten.py' and wrote a "super" expansion that expands even lists of dictionaries nested inside dictionaries.

from pprint import pprint
from benedict import benedict
import json

# Original data with multiple levels of nested dictionaries and lists
complex_data = {
    'person': {
        'name': 'John',
        'age': 30,
        'addresses': [
            {'street': '123 Main St', 'city': 'New York'},
            {'street': '456 Elm St', 'city': 'San Francisco'}
        ]
    },
    'company': {
        'name': 'ABC Inc.',
        'employees': [
            {'name': 'Alice', 'department': 'HR'},
            {'name': 'Bob', 'department': 'Engineering'}
        ]
    }
}

# Flatten the complex data
flat_data = benedict(complex_data).flatten(separator='_')

# Print the flattened data
print('Flattened Data:')
pprint(dict(flat_data))


# Flattened Data:
# {'company_employees': [{'name': 'Alice', 'department': 'HR'},
#                        {'name': 'Bob', 'department': 'Engineering'}],
#  'company_name': 'ABC Inc.',
#  'person_addresses': [{'street': '123 Main St', 'city': 'New York'},
#                       {'street': '456 Elm St', 'city': 'San Francisco'}],
#  'person_age': 30,
#  'person_name': 'John'}

With SuperFlatten, the items of a list are concatenated with "\n" (see the attached file ___Json结果展开解析.txt):


[{'company_employees__department': 'HR\nEngineering',
  'company_employees__name': 'Alice\nBob',
  'company_name': 'ABC Inc.',
  'person_addresses__city': 'New York\nSan Francisco',
  'person_addresses__street': '123 Main St\n456 Elm St',
  'person_age': '30',
  'person_name': 'John'}]
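
The SuperFlatten implementation itself is not shown in this issue. As a rough illustration only, the behaviour described above could be sketched as follows. The function name `super_flatten` and its parameters are hypothetical and are not part of python-benedict; the sketch assumes that only lists made up entirely of dictionaries are joined, and that every value is converted to a string, as in the output above.

```python
def super_flatten(data, separator="_", list_separator="__", joiner="\n"):
    """Hypothetical helper (not a python-benedict API).

    Flattens nested dicts, and joins the fields of a list of dicts
    into one newline-separated string per field.
    """
    flat = {}

    def walk(value, prefix):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{prefix}{separator}{key}" if prefix else str(key))
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            # Collect one "column" per field across all list items.
            # Caveat: if the dicts have different keys, rows no longer line up.
            columns = {}
            for item in value:
                for key, child in item.items():
                    columns.setdefault(key, []).append(str(child))
            for key, cells in columns.items():
                flat[f"{prefix}{list_separator}{key}"] = joiner.join(cells)
        else:
            # Scalars (and any other list) are simply stringified.
            flat[prefix] = str(value)

    walk(data, "")
    return flat


sample = {
    "person": {"name": "John", "age": 30},
    "company": {
        "name": "ABC Inc.",
        "employees": [
            {"name": "Alice", "department": "HR"},
            {"name": "Bob", "department": "Engineering"},
        ],
    },
}
result = super_flatten(sample)
```

Note that this transformation is lossy: once the values are joined and stringified, the original lists and types cannot be recovered reliably.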

NOBB2333, Mar 08 '24 09:03

@NOBB2333 thank you for this suggestion. I'm open to adding support for this feature if it meets all the requirements below:

  • [ ] this feature should be optional and backward-compatible; the ideal thing is to support a new argument for enabling it, e.g. deep=True
  • [ ] list items should be flattened like mydict.mylist[0].mysublist[0] and so on ...
  • [ ] list items must not be concatenated by any separator, this seems to be really a personal/project-specific need.
  • [ ] the unflatten method must be updated too, doing unflatten(flatten(dict)) should return the initial dict.
  • [ ] add a couple of test cases to avoid regressions

Do you want to work on a Pull Request? That would be great!
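
A minimal sketch of what the requirements above might look like in practice. It is not python-benedict's implementation: `deep_flatten` and `deep_unflatten` are hypothetical names, the separator is fixed to "." for simplicity, and the sketch assumes a non-empty top-level dict whose keys are strings containing no ".", "[" or "]".

```python
import re

# Matches either a list index like "[0]" or a plain dict key.
_TOKEN = re.compile(r"\[(\d+)\]|([^.\[\]]+)")


def deep_flatten(value, prefix=""):
    """Flatten dicts and lists into keys like 'mydict.mylist[0].mysublist[0]'."""
    if isinstance(value, dict) and value:
        items = [(f"{prefix}.{k}" if prefix else str(k), v) for k, v in value.items()]
    elif isinstance(value, list) and value:
        items = [(f"{prefix}[{i}]", v) for i, v in enumerate(value)]
    else:
        # Scalars and empty containers are kept as leaf values.
        return {prefix: value}
    flat = {}
    for key, child in items:
        flat.update(deep_flatten(child, key))
    return flat


def _get_or_create(node, token, default):
    """Return node[token], creating it with `default` if it is missing."""
    if isinstance(node, list):
        while len(node) <= token:
            node.append(None)
        if node[token] is None:
            node[token] = default
        return node[token]
    return node.setdefault(token, default)


def deep_unflatten(flat):
    """Rebuild the nested structure produced by deep_flatten."""
    root = {}
    for flat_key, value in flat.items():
        # int tokens are list indexes, str tokens are dict keys.
        path = [int(index) if index else key for index, key in _TOKEN.findall(flat_key)]
        node = root
        for token, next_token in zip(path, path[1:]):
            default = [] if isinstance(next_token, int) else {}
            node = _get_or_create(node, token, default)
        last = path[-1]
        if isinstance(node, list):
            while len(node) <= last:
                node.append(None)
        node[last] = value
    return root


sample = {
    "person": {
        "name": "John",
        "addresses": [
            {"street": "123 Main St", "city": "New York"},
            {"street": "456 Elm St", "city": "San Francisco"},
        ],
    },
    "matrix": [[1, 2], [3]],
    "empty": {},
}
flat = deep_flatten(sample)
restored = deep_unflatten(flat)
```

Encoding list indexes in the key is what makes the round trip possible: `deep_unflatten(deep_flatten(d))` can rebuild lists because each index is still present in the key, which a joined string cannot offer.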

fabiocaccamo, Mar 12 '24 15:03

I think your suggestion is very good, but I still have some questions:

  • deep=True is fine.
  • mydict.mylist[0].mysublist[0]: this is in fact the notation pandas uses. I did not use it because it makes the structured data harder to work with. One of my use cases is exporting to xlsx, where the result would be very long and unreadable.
  • I don't understand this point. If the previous requirement (mydict.mylist[0].mysublist[0]) is adopted, it is hardly needed.
  • unflatten is not implemented yet. I have recently run into some parsing problems and would like some help.
  • I will add test cases once everything is complete.

Clarification of the problem

Why did I choose "\n" to join the list objects? In fact it is not really joining. As I said, mydict.mylist[0].mysublist[0] is not practical when a list is very long. My solution applies when the list is the last-level object and its children are plain dict objects, meaning they all represent the same kind of thing. For example, if a company has 50 employees, this approach makes the data easy to parse and use, whereas mydict.mylist[0].mysublist[0] would be very long. If the list is not at the last level, this joining is not used.

Problem

I would like some help with the following:

  • I am using json.loads, and I found that some values are parsed as objects, such as True, False, None and null. My current workaround is to call data.replace("True",' "True" ') before loading. There are cases I cannot solve, though: when a key or value also contains one of these words, parsing fails, for example {"keu ":"the result is a True value"}. I have not been able to solve this for several days. As a result, "\n".join( list ) always raises an error saying that an xxxx object cannot be concatenated with str. I am trying to get some inspiration here.
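
If the problem above is that the input is a Python-style literal (single quotes, True/False/None) rather than strict JSON, one possible approach, offered only as a sketch under that assumption, is to fall back to `ast.literal_eval` instead of rewriting the text with `replace`. The helper name `parse_payload` and the sample string are made up for illustration. Converting each item with `str()` before joining avoids the join error on non-string values.

```python
import ast
import json


def parse_payload(text):
    """Hypothetical helper: try strict JSON first, then a Python literal."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # literal_eval only evaluates literals (dict, list, str, numbers,
        # True/False/None), so the word "True" inside a string is left alone.
        # Caveat: it does not understand JSON's lowercase true/false/null.
        return ast.literal_eval(text)


raw = "{'key': 'the result is a True value', 'active': True, 'parent': None}"
data = parse_payload(raw)

# str.join only accepts strings, so convert every item first.
values = [1, None, True, "x"]
joined = "\n".join(str(v) for v in values)
```

Input that mixes both styles in one string (for example True together with null) would still need dedicated handling.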

After all, this may really be a niche demand. Here is my project background: when using other services, they always return JSON, and this JSON is not formatted. Take https://openapi.qcc.com/dataApi/213, a company information search website, as an example: I added one list object in json_data['Result']['PartnerList']['KeyNo'] to make it more realistic. In ___Json结果展开解析.txt, in the dict json_data, ChangeList has 4 results but json_data['Result']['PartnerList']['KeyNo'] has 2. To save the data, I would have to either store the whole JSON in the DB or parse it again, so I join the last-level lists with "\n".

And so on. I'm happy to work on a Pull Request.

NOBB2333, Mar 17 '24 04:03

@NOBB2333 I'm sorry, but I can't fully understand what you mean. Could you try using Google Translate or a similar service, please?

fabiocaccamo, Mar 19 '24 16:03