ConvoKit
ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the use...
Introduces a new layer of abstraction between Corpus components (`Utterance`, `Speaker`, `Conversation`, `ConvoKitMeta`) and concrete data storage. Data storage is now handled by a `StorageManager` instance variable in...
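To illustrate the general idea of separating corpus components from concrete storage, here is a minimal, self-contained sketch. It does not use ConvoKit and is not its `StorageManager` implementation. Every name in it (`StorageBackend`, `InMemoryBackend`, `UtteranceView`, and the `get`/`put` methods) is hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class StorageBackend(ABC):
    """Hypothetical interface: components read and write through this only."""

    @abstractmethod
    def get(self, kind: str, obj_id: str) -> Dict[str, Any]:
        ...

    @abstractmethod
    def put(self, kind: str, obj_id: str, data: Dict[str, Any]) -> None:
        ...


class InMemoryBackend(StorageBackend):
    """Hypothetical backend that keeps everything in nested dicts."""

    def __init__(self) -> None:
        self._tables: Dict[str, Dict[str, Dict[str, Any]]] = {}

    def get(self, kind: str, obj_id: str) -> Dict[str, Any]:
        return self._tables[kind][obj_id]

    def put(self, kind: str, obj_id: str, data: Dict[str, Any]) -> None:
        # Copy the dict so later changes by the caller do not leak in.
        self._tables.setdefault(kind, {})[obj_id] = dict(data)


class UtteranceView:
    """Hypothetical component that holds an id and a backend, not the data."""

    def __init__(self, storage: StorageBackend, utt_id: str) -> None:
        self._storage = storage
        self.id = utt_id

    @property
    def text(self) -> str:
        # Every read goes through the backend, so the backend can be swapped.
        return self._storage.get("utterance", self.id)["text"]


backend = InMemoryBackend()
backend.put("utterance", "u1", {"text": "hello", "speaker": "alice"})
utt = UtteranceView(backend, "u1")
print(utt.text)
```

Because the component holds only an identifier and a reference to the backend, a different backend (for example, one backed by a database) could be substituted without changing the component class.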
Allows users to create an empty corpus by calling `Corpus()`. Also implements further utilities such as adding individual utterances and adding individual...
For a recent project I'm working on, we're using [ConvoKit's implementation of the Supreme Court Oral Argument Corpus](https://convokit.cornell.edu/documentation/supreme.html). However, we'd really like to include data from after 2019. How difficult...
Implement .summarize() in Pairer.
We want to automate the running of all Jupyter notebooks on every change to ensure that notebooks run cleanly (and to serve as additional validation for the new changes). Some...
The current practice of leaving in deprecated constructor arguments is bad practice because it can cause confusion when an IDE provides argument hints. For example: We...
The original WikiConv paper says that conversations in Chinese were collected. Are these available through ConvoKit?
Hi Caleb @calebchiam, I'm trying to perform politeness prediction using the example notebook given [here](https://github.com/CornellNLP/Cornell-Conversational-Analysis-Toolkit/blob/master/examples/politeness-strategies/politeness_demo.ipynb). I run into some errors while adding dependency parses. Currently, I'm doing ``` from convokit...
It would be a nice quality-of-life update to have tqdm used by default, especially when processing larger corpora. This could be a default argument in the `iter_()` methods.
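A rough sketch of what such a default argument might look like is shown below. It uses a toy class rather than ConvoKit itself: `ToyCorpus`, `iter_utterances`, and the `progress` parameter are hypothetical names used only for illustration, and the sketch falls back to plain iteration if tqdm is not installed.

```python
from typing import Iterator, List

try:
    from tqdm import tqdm  # optional dependency
except ImportError:  # fall back to plain iteration if tqdm is missing
    tqdm = None


class ToyCorpus:
    """Hypothetical stand-in for a corpus; holds utterance texts only."""

    def __init__(self, utterances: List[str]) -> None:
        self._utterances = list(utterances)

    def iter_utterances(self, progress: bool = True) -> Iterator[str]:
        # `progress` is the assumed default argument: on unless disabled.
        items = self._utterances
        if progress and tqdm is not None:
            items = tqdm(items, desc="utterances")
        yield from items


corpus = ToyCorpus(["hi", "hello", "bye"])
# Callers who want quiet output can opt out explicitly.
lengths = [len(u) for u in corpus.iter_utterances(progress=False)]
print(lengths)
```

Making the progress bar opt-out keeps existing call sites unchanged while still letting scripts and tests silence the output.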