litbank
Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities.
Hello, thank you very much for such a great open-source dataset. I wonder if you could provide a simple coreference resolution experiment?
Bumps [numpy](https://github.com/numpy/numpy) from 1.16.4 to 1.22.0. Release notes Sourced from numpy's releases. v1.22.0 NumPy 1.22.0 Release Notes NumPy 1.22.0 is a big release featuring the work of 153 contributors spread...
BIO format has been transformed to BILOU format within the CoNLL format using [bio2bilou.py](https://github.com/ufal/acl2019_nested_ner/blob/master/bio2bilou.py). Lemmatization and POS tagging were done using [UDPipe](http://ufal.mff.cuni.cz/udpipe). The columns are as follows. * FORM: Word...
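For readers unfamiliar with the BIO to BILOU conversion mentioned above, here is a minimal sketch of the idea. It is not the linked bio2bilou.py script, and the function name `bio_to_bilou` is hypothetical. It assumes a single flat column of well-formed BIO tags, so nested entities stored in several columns would need the conversion applied to each column separately.

```python
def bio_to_bilou(tags):
    """Convert one flat sequence of BIO tags to BILOU tags.

    Illustrative sketch only; assumes well-formed input such as
    ["B-PER", "I-PER", "O"] for a single annotation column.
    """
    bilou = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bilou.append("O")
            continue
        prefix, label = tag.split("-", 1)
        # Look one token ahead; the end of the sequence behaves like "O".
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            # A single-token entity becomes U (unit), otherwise it stays B.
            bilou.append(f"B-{label}" if continues else f"U-{label}")
        else:
            # The final token of a multi-token entity becomes L (last).
            bilou.append(f"I-{label}" if continues else f"L-{label}")
    return bilou


if __name__ == "__main__":
    print(bio_to_bilou(["B-PER", "I-PER", "O", "B-LOC"]))
```

The only information BILOU adds over BIO is whether a token ends its entity, which is why a one-token lookahead is enough.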
Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.25.3 to 1.26.5. Release notes Sourced from urllib3's releases. 1.26.5 :warning: IMPORTANT: urllib3 v2.0 will drop support for Python 2: Read more in the v2.0 Roadmap Fixed...
spaCy is quickly becoming the most popular NLP module in Python, but alas, I cannot locate any models for spaCy specifically trained on literary data. Are there any plans to...
Hello! Thank you for making this really cool dataset publicly available :) I'm trying to align the annotations and the original text, could you please specify what tokenizer was used...
It would help to have the conf files for brat as well, if you have those too...
Hi, we noticed that the quotation data does not contain many lines; for many files it doesn't contain any (in the brat format). Is this example data, and...
Bumps [certifi](https://github.com/certifi/python-certifi) from 2019.6.16 to 2022.12.7. Commits 9e9e840 2022.12.07 b81bdb2 2022.09.24 939a28f 2022.09.14 aca828a 2022.06.15.2 de0eae1 Only use importlib.resources's new files() / Traversable API on Python ≥3.11 ... b8eb5e9 2022.06.15.1...
Hi, can you give some more explanation on how to use the cl-coref annotator to generate coreference resolution training data? Thanks in advance!