
Re-implement NLTK tree

Open andreasvc opened this issue 12 years ago • 0 comments

Re-implementing the NLTK tree class would allow a potentially significant speedup for treebank transformations and grammar extraction.

Wishlist:

  • represent all treebank information: functions, morphology, lemmas, etc.
  • combine indices and words in one data structure
  • parent pointers, sibling pointers
  • store the yield of each node, i.e., tree.leaves(); modifying the tree triggers an update in all ancestors.
  • automatic canonicalization
  • mutable and immutable versions; the immutable version could use C arrays/structs.
  • perhaps a specific optimized version for binary trees.
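
To make a few of the wishlist items concrete, here is a minimal pure-Python sketch of a mutable node with parent pointers, sibling lookup, and a cached yield that is refreshed in all ancestors on modification. The class name `Node`, its methods, and the choice of `(index, word)` tuples as leaves are assumptions made for illustration only; this is not the NLTK or disco-dop API, and a real implementation would presumably be written in Cython.

```python
class Node:
    """Hypothetical mutable tree node; not the NLTK or disco-dop API."""

    def __init__(self, label, children=()):
        self.label = label
        self.parent = None   # parent pointer; None for the root
        self.children = []   # Node objects or (index, word) leaf tuples
        self._leaves = []    # cached yield of this node
        for child in children:
            self.append(child)

    def append(self, child):
        """Attach a child and refresh the cached yield up to the root."""
        if isinstance(child, Node):
            child.parent = self
        self.children.append(child)
        self._refresh()

    def _refresh(self):
        """Recompute the cached yield of this node and of every ancestor."""
        node = self
        while node is not None:
            leaves = []
            for child in node.children:
                if isinstance(child, Node):
                    leaves.extend(child._leaves)
                else:
                    leaves.append(child)
            node._leaves = leaves
            node = node.parent

    def leaves(self):
        """Return the cached yield without traversing the subtree."""
        return list(self._leaves)

    def right_sibling(self):
        """Return the next sibling, or None if there is none."""
        if self.parent is None:
            return None
        siblings = self.parent.children
        idx = next(i for i, c in enumerate(siblings) if c is self)
        return siblings[idx + 1] if idx + 1 < len(siblings) else None


# Leaves combine index and word in one tuple (an assumption of this sketch).
np = Node('NP', [(0, 'the'), (1, 'cat')])
vp = Node('VP', [(2, 'sleeps')])
s = Node('S', [np, vp])
vp.append((3, 'soundly'))  # the update propagates to the root's cached yield
print(s.leaves())
```

The trade-off in this sketch is that every modification costs time proportional to the yields of all ancestors, in exchange for `leaves()` needing no traversal. Sibling lookup here scans the parent's child list; explicit sibling pointers, as the wishlist suggests, would avoid that scan.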

andreasvc, Dec 18 '12 16:12