Emil Hvitfeldt
This would let the user keep the structure of the sentences, but still have a filtering effect. Should default to `NULL` for backwards compatibility.
textclean is a nice package for cleaning otherwise messy strings. Show how it can be used together with {textrecipes} by using `step_mutate()`.
This step will take a string/tokenlist and replace any emoji with a natural language label, which can then be used more easily in downstream steps. Should have a `pre` and `post`...
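A minimal sketch of what such an example might look like, assuming the {recipes}, {textrecipes}, and {textclean} packages are installed. The data frame and the choice of `textclean::replace_contraction()` are illustrative assumptions; any vectorised textclean function could be used in its place.

```r
library(recipes)
library(textrecipes)

# Made-up example data, for illustration only
messy <- data.frame(
  text = c("I'm sure it isn't broken.", "We've tested it, haven't we?")
)

rec <- recipe(~ text, data = messy) |>
  # Clean the raw strings with textclean before tokenizing
  step_mutate(text = textclean::replace_contraction(text)) |>
  step_tokenize(text)

baked <- bake(prep(rec), new_data = NULL)
```

Because `step_mutate()` is an ordinary recipe step, the same cleaning is applied again when the recipe is baked on new data.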
This step should take a tokenlist and two vectors of strings, `pattern` and `replacement`. Any token that matches `pattern` should be replaced with the corresponding `replacement`.
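The intended behaviour could be sketched in base R as follows. This is only an illustration of the idea, not the step itself: `replace_tokens()` is a hypothetical helper, a plain list of character vectors stands in for a tokenlist, and the use of regular expressions for `pattern` is an assumption.

```r
# Hypothetical helper, not part of textrecipes.
# `tokens` is a list of character vectors (a stand-in for a tokenlist);
# `pattern` and `replacement` are character vectors of equal length.
replace_tokens <- function(tokens, pattern, replacement) {
  stopifnot(length(pattern) == length(replacement))
  lapply(tokens, function(x) {
    for (i in seq_along(pattern)) {
      # Replace the whole token whenever it matches the i-th pattern
      hit <- grepl(pattern[i], x)
      x[hit] <- replacement[i]
    }
    x
  })
}

tokens <- list(c("the", "colour", "grey"), c("gray", "color"))
out <- replace_tokens(
  tokens,
  pattern = c("^colou?r$", "^gr[ae]y$"),
  replacement = c("color", "gray")
)
```

Note that the patterns are applied in order, so a later pattern can match a token produced by an earlier replacement. Whether that is desirable is a design choice the real step would need to settle.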
Suppose you have some long sequences. Have a step that will turn an observation with 1000 tokens into 10 observations with 100 tokens each, or more if the user allows overlapping.
Once https://github.com/tidymodels/textrecipes/issues/147 is finished, it is going to be easier to write messages to the user related to the use of steps. We don't generally do this for recipe steps, but...
This could also be returned by the `tidy.step_tf()` family.
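The windowing logic might look something like this in base R. `window_tokens()` and its `size` and `overlap` arguments are hypothetical names chosen for illustration; a real step would also need to duplicate the other columns of each observation, which this sketch ignores.

```r
# Hypothetical helper, not part of textrecipes.
# Splits one token vector into windows of at most `size` tokens,
# where consecutive windows share `overlap` tokens.
window_tokens <- function(tokens, size, overlap = 0) {
  stopifnot(size >= 1, overlap >= 0, overlap < size)
  n <- length(tokens)
  if (n == 0) return(list())
  stride <- size - overlap
  # Stop generating start positions once the previous window reaches the end
  starts <- seq(1, max(1, n - overlap), by = stride)
  lapply(starts, function(s) tokens[s:min(s + size - 1, n)])
}

tokens <- as.character(seq_len(1000))
no_overlap <- window_tokens(tokens, size = 100)
half_overlap <- window_tokens(tokens, size = 100, overlap = 50)
```

With 1000 tokens and `size = 100`, this gives 10 windows without overlap and 19 windows with an overlap of 50. The last window is shorter than `size` when the sequence length does not divide evenly; padding or dropping it would be another option.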
use `word2vec()` from https://github.com/EmilHvitfeldt/wordsalad
use `fasttext()` from https://github.com/EmilHvitfeldt/wordsalad