treelib icon indicating copy to clipboard operation
treelib copied to clipboard

How to order by data?

Open daphnawegner opened this issue 7 years ago • 22 comments

Is there a good way to order the dictionary by a property found in the data properties?

I am trying to order the children of each node according to the volume field in the data dict

 "Root": {
            "data": {
                "volume": 88829.0,
                "display_name": "Root",
            },
            "children": [
                {
                    "softw": {
                        "data": {
                            "volume": 82229.0,
                            "display_name": "software",
                        },
                        "children": [
                            {
                                "best": {
                                    "data": {
                                        "volume": 139.0,
                                        "display_name": "best",
                                    },
                                    "children": [
                                        {
                                            "songwrit_139": {
                                                "data": {
                                                    "volume": 139.0,
                                                    "display_name": "songwrite", 
                                                }
                                            }
                                        }
                                    ]
                                }
                            },

daphnawegner avatar Nov 10 '16 18:11 daphnawegner

I have tried doing something like this but it didn't change the order of the nodes in the tree:

def sort_tree(tree,node):
    if len(tree.children(node.identifier)) == 0:
        return None
    else:
        children = tree.children(node.identifier)
        for child in children:
            children.sort(key=lambda node: node.data["volume"], reverse=True)
            sort_tree(tree,child)

    return None

daphnawegner avatar Nov 10 '16 18:11 daphnawegner

You know the tree nodes are stored as Python dict, which only reserves hashed order. It cann't be ordered on the tree per se. I think there is a small mistake in ur code. children is a local variable as a list returned by tree.children. Its sorted and then dropped before entering next iteration.

caesar0301 avatar Nov 11 '16 02:11 caesar0301

A potential solution. Node children are stored as a list in its _fpointer. You can implement a tree.sort_children() to get sorted _fpointer permanently.

caesar0301 avatar Nov 11 '16 03:11 caesar0301

yes i see what you mean but doing this doesn't sort either

def sort_tree(tree,node):
    if len(tree.children(node.identifier)) == 0:
        return None
    else:
        children = tree.children(node.identifier)
        for child in children:
            tree.children(node.identifier).sort(key=lambda node: node.data["volume"], reverse=True)
            sort_tree(tree,child)

    return None

I am actually entering the nodes in the order I want them to be but for some reason when I print the tree I see its alphabetically ordered, is that intentional?

daphnawegner avatar Nov 11 '16 03:11 daphnawegner

@daphnawegner could you please explain why you are sorting the tree. Do you just want the children sorted when you reference them? Or, is it more complicated than that?

BebeSparkelSparkel avatar Nov 11 '16 04:11 BebeSparkelSparkel

I need to pass a sorted json representation of the tree

daphnawegner avatar Nov 11 '16 06:11 daphnawegner

Since you only need a json representation you probably don't need to sort the actual tree and only need to sort the json dictionary.

def sort_tree(tree_dict, key_value_function, reverse=None):

  def child_key_value(child_tag):
    if isinstance(child_tag, dict): child_tag = next(iter(child_tag.keys()))
    return key_value_function(next(tree.filter_nodes(lambda node: node.tag == child_tag)))

  if isinstance(tree_dict, dict):
    tree_dict[next(iter(tree_dict.keys()))]['children'].sort(key=child_key_value, reverse=reverse)
    for child in tree_dict[next(iter(tree_dict.keys()))]['children']:
      sort_tree(key_value_function, key_value_function)

  return tree_dict

You can then create your sort function lambda node: node.data['volume'].
Lastly, just take the sorted dictionary object and convert it to a json:

import json
json.dumps(sort_tree(
     tree.to_dict(),
    lambda node: node.data['volume']
))

Here is the test file that I created this in if you want it:

from treelib import Node, Tree
from pprint import pprint
import json

tree = Tree()
tree.create_node("Harry", "harry")  # root node
tree.create_node("Jane", "jane", parent="harry")
tree.create_node("Bill", "bill", parent="harry")
tree.create_node("Diane", "diane", parent="jane")
tree.create_node("Mary", "mary", parent="diane")
tree.create_node("Mark", "mark", parent="jane")


def sort_tree(tree_dict, key_value_function, reverse=None):

  def child_key_value(child_tag):
    if isinstance(child_tag, dict): child_tag = next(iter(child_tag.keys()))
    return key_value_function(next(tree.filter_nodes(lambda node: node.tag == child_tag)))

  if isinstance(tree_dict, dict):
    tree_dict[next(iter(tree_dict.keys()))]['children'].sort(key=child_key_value, reverse=reverse)
    for child in tree_dict[next(iter(tree_dict.keys()))]['children']:
      sort_tree(key_value_function, key_value_function)

  return tree_dict


pprint(tree.to_dict())
print()
print(json.dumps(sort_tree(tree.to_dict(), lambda node: node.tag, True)))

BebeSparkelSparkel avatar Nov 11 '16 17:11 BebeSparkelSparkel

Thanks! is filter_nodes on the latest version? or its not released yet?

daphnawegner avatar Nov 11 '16 17:11 daphnawegner

Running your example I get the following error:

return key_value_function(next(tree.filter_nodes(lambda node: node.tag == child_tag)))
TypeError: list object is not an iterator

daphnawegner avatar Nov 11 '16 18:11 daphnawegner

You must be using python 2.
I don't think filter_nodes is released yet on pypy so I switched to use filter.

This example should work for python 2 now.

from treelib import Node, Tree
from pprint import pprint
import json

tree = Tree()
tree.create_node("Harry", "harry")  # root node
tree.create_node("Jane", "jane", parent="harry")
tree.create_node("Bill", "bill", parent="harry")
tree.create_node("Diane", "diane", parent="jane")
tree.create_node("Mary", "mary", parent="diane")
tree.create_node("Mark", "mark", parent="jane")


def sort_tree(tree_dict, key_value_function, reverse=None):

  def child_key_value(child_tag):
    if isinstance(child_tag, dict): child_tag = next(iter(child_tag.keys()))
    return key_value_function(tree.filter_nodes(lambda node: node.tag == child_tag)[0])
    return key_value_function(filter(lambda node: node.tag == child_tag, tree.nodes)[0])

  if isinstance(tree_dict, dict):
    tree_dict[next(iter(tree_dict.keys()))]['children'].sort(key=child_key_value, reverse=reverse)
    for child in tree_dict[tree_dict.keys()[0]]['children']:
      sort_tree(key_value_function, key_value_function)

  return tree_dict


pprint(tree.to_dict())
print()
print(json.dumps(sort_tree(tree.to_dict(), lambda node: node.tag, True)))

BebeSparkelSparkel avatar Nov 11 '16 19:11 BebeSparkelSparkel

yes i am on python 2, it resolved that problem however it seems like node.tag is breaking now

return key_value_function(filter(lambda node: node.tag == child_tag, tree.nodes)[0])
AttributeError: 'str' object has no attribute 'tag'

daphnawegner avatar Nov 11 '16 19:11 daphnawegner

Unfortunately I not getting that error when I run the example in python 2.7 or 2.6, so I'm not sure what is going wrong on your system.

Did you tweak the example at all?

I've added my test file to https://github.com/BebeSparkelSparkel/treelib/blob/ordered_tree/order_tree.py so try running that file.

BebeSparkelSparkel avatar Nov 11 '16 19:11 BebeSparkelSparkel

The only thing I tweaked is removing the first return line:

def child_key_value(child_tag):
    if isinstance(child_tag, dict): child_tag = next(iter(child_tag.keys()))
    **return key_value_function(tree.filter_nodes(lambda node: node.tag == child_tag)[0])**
    return key_value_function(filter(lambda node: node.tag == child_tag, tree.nodes)[0])

daphnawegner avatar Nov 11 '16 20:11 daphnawegner

Ah. That would be the problem. I've removed the second in the file and it works. So, put back the first and remove the second.

BebeSparkelSparkel avatar Nov 11 '16 20:11 BebeSparkelSparkel

thanks a lot it works now!

daphnawegner avatar Nov 11 '16 20:11 daphnawegner

You're welcome. Good luck.

BebeSparkelSparkel avatar Nov 11 '16 20:11 BebeSparkelSparkel

It seems to work for the first level but not recursively on all children, running this:

from treelib import Node, Tree
from pprint import pprint
import json

tree = Tree()
tree.create_node("Harry", "harry",data={"volume":570}) # root node
tree.create_node("Jane", "jane", parent="harry",data={"volume":270})
tree.create_node("Bill", "bill", parent="harry" ,data={"volume":300})
tree.create_node("Diane", "diane", parent="jane",data={"volume":120})
tree.create_node("Mary", "mary", parent="diane",data={"volume":10})
tree.create_node("Mark", "mark", parent="jane",data={"volume":150})


def sort_tree(tree_dict, key_value_function, reverse=None):

  def child_key_value(child_tag):
    if isinstance(child_tag, dict): child_tag = next(iter(child_tag.keys()))
    return key_value_function(tree.filter_nodes(lambda node: node.tag == child_tag)[0])
    # return key_value_function(filter(lambda node: node.tag == child_tag, tree.nodes)[0])

  if isinstance(tree_dict, dict):
    tree_dict[next(iter(tree_dict.keys()))]['children'].sort(key=child_key_value, reverse=reverse)
    for child in tree_dict[tree_dict.keys()[0]]['children']:
      sort_tree(key_value_function, key_value_function)

  return tree_dict


pprint(tree.to_dict())
print()
print(json.dumps(sort_tree(tree.to_dict(with_data=True), lambda node: node.data['volume'], True)))

sorts the children of "Harry" but not "Jane" children

daphnawegner avatar Nov 11 '16 21:11 daphnawegner

Hello, is there any news on this? I am using treelib to order Reddit comments (who have a tree-like structure), and I would like to be able to order them by date, by number of votes... Being able to sort nodes on the same level by a key in the nodes would be an awesome feature.

Ailothaen avatar May 16 '20 18:05 Ailothaen

For now, I guess the best chance is putting something like "{:0>5}_".format(i) in front of every id in your desired order. That way the default sorter takes 00001...00015 as order. Not quite nice but it works.

Duoquote avatar Feb 15 '21 12:02 Duoquote

For me tree.show(key=False) , worked for tree view, it wont sort alphabetically.

salman0149 avatar Aug 19 '21 15:08 salman0149

@salman0149 thank you so much! How is this not in the documentation? :( I had to use zfill() to get it to sort properly which was turned out to be a nightmare when I was inserting new nodes at a later stage. Only after a long while I figured out that the tree actually has proper sorting, just the .show() methods returns alphabetical order.

@caesar0301 would you be able to upgrade the documentation? Treelib is a fantastic library but there's so much one has to figure out on their own to make good use of it.

tomol012 avatar Feb 13 '22 19:02 tomol012

@salman0149 thank you so much! How is this not in the documentation? :( I had to use zfill() to get it to sort properly which was turned out to be a nightmare when I was inserting new nodes at a later stage. Only after a long while I figured out that the tree actually has proper sorting, just the .show() methods returns alphabetical order.

@caesar0301 would you be able to upgrade the documentation? Treelib is a fantastic library but there's so much one has to figure out on their own to make good use of it.

Documentation PRs are always welcome! :)

liamlundy avatar Mar 31 '22 14:03 liamlundy