Resources

Open dariusk opened this issue 8 years ago • 16 comments

This is an open issue where you can comment and add resources that might come in handy for NaNoGenMo.

There are already a ton of resources on the old resources threads for the 2013 edition, the 2014 edition, and the 2015 edition.

dariusk avatar Oct 22 '16 00:10 dariusk

Drawing on last year's projects, there are some things to learn from:

Emily Short's Annals of the Parrigues was an interesting project, but from a Resources perspective her end notes on writing for the generator are full of excellent suggestions for how to write your text and when to use different approaches.

Another project I was impressed with was A Time For Destiny by @cpressey. The "Story Compiler" write-up certainly suggests some avenues for future exploration. (And the discussion around it already inspired @enkiv2 to make a goal-driven planner plot generator.)

The Deserts of the West by @mewo2 has also been a source of inspiration; this National Geographic writeup is particularly illuminating, as are the blog posts about its language generator and map generator.

I could go on; there are a ton of interesting projects to learn from. But I also want to emphasize that you don't need to get that fancy: if you're looking for an accessible way to jump into making your first book generator, you might want to check out Tracery. (There's also a Python port, if you need that.)
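For anyone who hasn't seen a Tracery-style grammar before, here is a rough sketch of the core idea in plain Python: symbols wrapped in `#...#` are replaced recursively with a randomly chosen rule. This is not the Tracery library or its Python port; the grammar contents and the `expand` helper are made up for illustration, and real Tracery adds modifiers (such as `.capitalize`) and actions that this leaves out.

```python
import random
import re

# Hypothetical example grammar; the symbol names and phrases are invented.
GRAMMAR = {
    "origin": ["The #adjective# #animal# crossed the #place#."],
    "adjective": ["weary", "curious", "silent"],
    "animal": ["heron", "fox", "tortoise"],
    "place": ["salt flats", "river", "old road"],
}

# A symbol reference looks like #name#.
SYMBOL = re.compile(r"#(\w+)#")


def expand(text, grammar, rng):
    """Replace every #symbol# in text with a random rule, recursively."""
    def pick(match):
        options = grammar[match.group(1)]
        # The chosen rule may itself contain symbols, so expand it too.
        return expand(rng.choice(options), grammar, rng)
    return SYMBOL.sub(pick, text)


rng = random.Random(2016)  # seeded so a run can be reproduced
sentence = expand("#origin#", GRAMMAR, rng)
print(sentence)
```

Calling `expand("#origin#", GRAMMAR, rng)` in a loop gives you as many sentences as you need; growing the word lists is usually the quickest way to make the output less repetitive.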

Also, the Gutenberg Python library is back in active development, should you need texts or metadata from Project Gutenberg.

ikarth avatar Oct 22 '16 15:10 ikarth

I wrote a series of blog posts on dev.to that focuses on NaNoGenMo and text generation. The emphasis of these blog posts is on how to produce "readable" computer-generated text that a human may theoretically like.

As a side note, all of these blog posts are computer-generated too (a link to the source code is provided with each post), though the techniques I used are hard to scale, since I needed to handwrite the corpus beforehand. Still useful as proofs of concept.

I also provided links to several NaNoGenMo novels, so you could use these blog posts as a reference guide.

tra38 avatar Oct 23 '16 14:10 tra38

I should probably also mention these libraries:

  • spaCy is a library for natural language processing in Python that I've been using instead of NLTK lately.
  • WaveFunctionCollapse has just become a thing in the past month or so. The original is in C#; there's a JavaScript port, and other implementations will probably follow. It was originally intended for tiled images, but there's been promising work with text.

ikarth avatar Oct 24 '16 01:10 ikarth

This sense2vec thing (https://explosion.ai/blog/sense2vec-with-spacy, using spaCy + word2vec) seems very promising.

dariusk avatar Oct 28 '16 04:10 dariusk

Emily Short posted a resource list yesterday on her blog: https://emshort.wordpress.com/2016/10/27/casual-procgen-text-tools/

enkiv2 avatar Oct 28 '16 13:10 enkiv2

Ooh, @greg-kennedy just added to Corpora a list of adjectives to describe people. https://github.com/dariusk/corpora/commit/ccf894cfdcaf4894dfb4d77abb2fe9d69cbf77f1

dariusk avatar Nov 01 '16 16:11 dariusk

Created for last year's NaNoGenMo, I've updated this JSON of Project Gutenberg metadata:

https://github.com/hugovk/gutenberg-metadata

hugovk avatar Nov 01 '16 19:11 hugovk

If you need meter or rhyme for your poetic projects, some resources (mostly found after Emily Short asked a question on Twitter and many people replied with suggestions):

A set of word lists, organized by rhythmic feet.

The CMU Pronouncing Dictionary is a dictionary of 134K English words and their pronunciations, including stress. NLTK and Allison Parrish's pronouncingpy library provide Python interfaces to it.
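To give a feel for what the dictionary lets you do, here is a small sketch that parses a few entries in CMUdict's plain-text layout and uses the stress digits on the vowels (1 primary, 2 secondary, 0 unstressed) for meter and rhyme checks. The four sample entries are hand-typed for illustration, so check them against the real file; the helper names are my own, not the NLTK or pronouncingpy API, and alternate pronunciations (written like `WORD(1)` in the real file) are not handled.

```python
# Sample entries in CMUdict's "WORD  PHONEME PHONEME ..." layout.
# Hand-typed for illustration; load the real file for actual use.
SAMPLE = """\
CAT  K AE1 T
HAT  HH AE1 T
HELLO  HH AH0 L OW1
BELOW  B IH0 L OW1
"""


def parse(text):
    """Map each lowercased word to its list of phonemes."""
    entries = {}
    for line in text.splitlines():
        # Skip blank lines and ';;;' comment lines.
        if not line or line.startswith(";;;"):
            continue
        word, phones = line.split(None, 1)
        entries[word.lower()] = phones.split()
    return entries


def stresses(phones):
    """Return the stress digits in order, one per vowel."""
    return "".join(p[-1] for p in phones if p[-1].isdigit())


def rhyming_part(phones):
    """Phonemes from the last primary-stressed vowel to the end."""
    for i in range(len(phones) - 1, -1, -1):
        if phones[i].endswith("1"):
            return phones[i:]
    return phones  # no primary stress found: fall back to the whole word


def rhymes(a, b, entries):
    """Two different words rhyme if their rhyming parts match."""
    return a != b and rhyming_part(entries[a]) == rhyming_part(entries[b])


entries = parse(SAMPLE)
print(stresses(entries["hello"]))          # 01
print(rhymes("cat", "hat", entries))       # True
print(rhymes("hello", "below", entries))   # True
```

A stress string like `01` is an iamb, so filtering words by their stress pattern is one simple way to assemble metered lines.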

poem-gen, a 2014 NaNoGenMo project by Camden Segal, may also be of interest, as are NaPoGenMo 2015 and NaPoGenMo 2016.

Some other resources (which may or may not have been mentioned in previous years):

textacy: higher-level NLP built on spaCy: streaming documents, filter linguistic elements, vectorized and semantic network representations, topic models, language identification...

TextBlob is another Python option for processing textual data and NLP.

KoNLPy: Korean NLP in Python

RiTa: JavaScript/Processing/Node NLP tools for computational literature

Pressagio text prediction system: word completions in Python, etc. (A Python port of Presage)

Lexeme: A constructed language word database, generation, and declension program.

Naive Text Summary Tool

moby: JavaScript interface for the Moby Thesaurus

ikarth avatar Nov 01 '16 20:11 ikarth

I have a pattern recognizer I'm working on that will look at text and create 'templates' for phrases, with 'variables' where you can insert names, etc.

Example: "Hello, Bob" -> "Hello, {1}".

Currently it's only able to generate templates given two lines of text, but I'm working on expanding it so it can scan an entire corpus, find the best template candidates, and convert them to templates. I'll post it here when I'm done.
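For readers curious how the two-line case might work, here is one possible sketch: tokenize both lines, align them with the standard library's `difflib.SequenceMatcher`, keep the tokens they share, and replace each differing stretch with a numbered slot. This is only a guess at the general approach, not the actual implementation; `tokenize` and `make_template` are hypothetical names.

```python
import difflib
import re


def tokenize(line):
    """Split into words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", line)


def make_template(line_a, line_b):
    """Build a template from the tokens two lines have in common."""
    a, b = tokenize(line_a), tokenize(line_b)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    parts, slot = [], 0
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            # Shared tokens are kept verbatim.
            parts.extend(a[i1:i2])
        else:
            # Any differing stretch becomes one numbered variable.
            slot += 1
            parts.append("{%d}" % slot)
    # Rejoin with spaces, then close the gap before punctuation.
    text = " ".join(parts)
    return re.sub(r"\s+([,.!?;:])", r"\1", text)


template = make_template("Hello, Bob", "Hello, Alice")
print(template)  # Hello, {1}
```

Scaling this to a whole corpus would need some way to choose which line pairs to compare and how to score candidate templates, which this sketch does not attempt.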

superMDguy avatar Nov 03 '16 12:11 superMDguy

I should note that my scene-sequel project from last year has been broken out & generalized so that it can be used as a component in a larger project (say, by having some other piece of code generate the world-model), and so anybody who has an interest in using the fuzzy goal-follower code absolutely should: https://github.com/enkiv2/scene-sequel

enkiv2 avatar Nov 03 '16 12:11 enkiv2

For those of you working in Java (or the JVM) the Stanford CoreNLP library just released a beta of version 3.7.0. (Interfaces also exist for many other programming languages.)

ikarth avatar Nov 03 '16 14:11 ikarth

@enkiv2 made ggc (Generative Grammar Compiler). This is useful for writing story templates.

superMDguy avatar Nov 03 '16 15:11 superMDguy

If anybody is working on poetry (or something where meter matters), this list of words grouped by part of speech and syllable count might be useful: http://www.ashley-bovan.co.uk/words/partsofspeech.html

enkiv2 avatar Nov 07 '16 15:11 enkiv2

Need character names? Here's a NodeJS module that spits out different names of characters from Infinite Jest by David Foster Wallace: https://github.com/accraze/infinitejest-names

accraze avatar Nov 22 '16 05:11 accraze

@accraze what is the license on this? It might be a good addition to dariusk/corpora in the names/ or literature/ section.

enkiv2 avatar Nov 22 '16 14:11 enkiv2

From https://github.com/NaNoGenMo/2016/issues/114#issuecomment-264015433:

A set of blog posts about writing Annales:

Annales: the gory details in three parts

  1. Vocabularies: using a neural network, Python and regular expressions to generate a nonsense vocabulary
  2. TextGen: a Haskell combinator library for making up randomised sentences (plus a one-paragraph explanation of how the State monad works!)
  3. Events: in which I get bogged down writing a succession algorithm, but also figure out how to correct a typo in a randomly-generated text

hugovk avatar Dec 10 '16 08:12 hugovk