
Python bindings for Stanford CoreNLP's protobufs.

Stanford CoreNLP Python Bindings
================================

.. image:: https://travis-ci.org/stanfordnlp/python-corenlp-protobuf.svg?branch=master
    :target: https://travis-ci.org/stanfordnlp/python-corenlp-protobuf

This package contains Python bindings for `Stanford CoreNLP <https://github.com/stanfordnlp/CoreNLP>`_'s protobuf specifications, as generated by ``protoc``. These bindings can be used to parse binary data produced by, e.g., the `Stanford CoreNLP server <https://stanfordnlp.github.io/CoreNLP/corenlp-server.html>`_.


Usage:

.. code-block:: python

    from corenlp_protobuf import Document, parseFromDelimitedString

    # document.dat contains a serialized Document.
    # Open it in binary mode: protobuf data is bytes, not text.
    with open('document.dat', 'rb') as f:
        buf = f.read()
    doc = Document()
    parseFromDelimitedString(doc, buf)

    # You can access the sentences from doc.sentence.
    sentence = doc.sentence[0]

    # You can access any property within a sentence.
    print(sentence.text)

    # Likewise for tokens.
    token = sentence.token[0]
    print(token.lemma)
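
The "delimited" in ``parseFromDelimitedString`` presumably refers to protobuf's length-delimited framing, where each message is preceded by its size encoded as a base-128 varint (the format Java's ``writeDelimitedTo`` emits). The following is a minimal sketch of that framing in plain Python, not this package's implementation; ``read_varint`` and ``split_delimited`` are hypothetical helper names introduced here for illustration only.

.. code-block:: python

    def read_varint(buf, pos=0):
        """Decode a base-128 varint from buf at pos; return (value, next_pos)."""
        value = 0
        shift = 0
        while True:
            if pos >= len(buf):
                raise ValueError("truncated varint")
            byte = buf[pos]
            pos += 1
            # The low 7 bits carry data, least-significant group first.
            value |= (byte & 0x7F) << shift
            # A clear high bit marks the last byte of the varint.
            if not byte & 0x80:
                return value, pos
            shift += 7


    def split_delimited(buf):
        """Yield the raw bytes of each length-prefixed message in buf."""
        pos = 0
        while pos < len(buf):
            size, pos = read_varint(buf, pos)
            end = pos + size
            if end > len(buf):
                raise ValueError("truncated message")
            yield buf[pos:end]
            pos = end


    # Two hand-built frames: a 3-byte payload and a 300-byte payload.
    # 300 encodes as the two-byte varint 0xAC 0x02.
    stream = b"\x03abc" + b"\xac\x02" + b"x" * 300
    frames = list(split_delimited(stream))

Under this assumption, each yielded frame could then be handed to a generated message's ``ParseFromString`` method.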

See ``test_read.py`` for more examples.