spaCy Add visualisations for parsed documents

Open richardpaulhudson opened this issue 3 years ago • 1 comments

Description

This will add three visualisations of information from parsed documents:

a table with one row per consecutive token in the document; columns show feature values and/or dependency trees
a text interspersed with specific feature labels for individual tokens
a table with rows for tokens in the document that have specific features or feature values; rows for the tokens before and after each matching token can optionally be rendered as well; columns show feature values

All three visualisations are highly configurable with respect to colours, spacing, etc.
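To make the first visualisation concrete, here is a minimal sketch of a token table in plain Python. It is not the code from this PR: the function name `render_token_table`, the dictionary-based token representation and the column names are all assumptions made for illustration, and it leaves out colours, dependency trees and every other configuration option.

```python
from typing import Dict, List


def render_token_table(tokens: List[Dict[str, str]], columns: List[str]) -> str:
    """Render one row per token and one column per feature (hypothetical helper)."""
    # Each column is as wide as its header or its longest value, whichever is longer.
    widths = {
        c: max([len(c)] + [len(t.get(c, "")) for t in tokens]) for c in columns
    }
    header = "  ".join(c.ljust(widths[c]) for c in columns)
    rule = "  ".join("-" * widths[c] for c in columns)
    rows = [
        "  ".join(t.get(c, "").ljust(widths[c]) for c in columns) for t in tokens
    ]
    # Strip the padding that ljust() leaves at the end of each line.
    return "\n".join(line.rstrip() for line in [header, rule, *rows])


# Hand-written feature values standing in for a parsed document (assumed data).
example_tokens = [
    {"text": "I", "pos": "PRON", "dep": "nsubj"},
    {"text": "saw", "pos": "VERB", "dep": "ROOT"},
]
print(render_token_table(example_tokens, ["text", "pos", "dep"]))
```

In a real integration the rows would presumably be built from the tokens of a parsed Doc rather than from hand-written dictionaries.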

Remaining TODOs are:

Enable the specification of parts of a document to render
Enable config-file based configuration of the three visualisations
Develop a standard default configuration for each visualisation
Integrate the functionality into the CLI

Types of change

It is a group of new features.

Checklist

[x] I confirm that I have the right to submit this contribution under the project's MIT license.
[ ] I ran the tests, and all new and existing tests passed.
[ ] My changes don't require a change to the documentation, or if they do, I've added all required information.

Dec 23 '21 15:12 richardpaulhudson

One difficulty with this PR is that the proliferation of ANSI control characters and carriage returns means that the tests are mostly not human-readable. They can, however, be easily understood by adding a print() statement to each test to display the output and running pytest with an appropriate option such as -rP to display the captured standard output.

It would be non-standard, but I wonder whether it might even make sense to retain the print()s in the repository?
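As a rough illustration of the workflow described above, the sketch below shows a test whose expected value is full of ANSI escape sequences, with a print() added so that the rendered output can be inspected. The renderer `render_bold` is a hypothetical stand-in and not a function from this PR.

```python
def render_bold(text: str) -> str:
    # Hypothetical renderer: wraps the text in the ANSI "bold" and "reset" codes.
    return f"\x1b[1m{text}\x1b[0m"


def test_render_bold():
    rendered = render_bold("token")
    # The escaped string in the assertion is hard to read; printing the value
    # lets pytest display it as a terminal would render it.
    print(rendered)
    assert rendered == "\x1b[1mtoken\x1b[0m"
```

Running `pytest -rP` includes the captured standard output of passing tests in the summary, so the printed rendering appears even when the test succeeds.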

Mar 25 '22 06:03 richardpaulhudson

I think text visualizations will be a great addition!

It feels like this should fit under displacy rather than as a method on Doc, but I'm not sure about the API/naming.

Jan 31 '23 09:01 adrianeboyd

Closing for now & adding to our internal backlog

Jun 15 '23 22:06 svlandeg
