Oliver Sherouse
If the cache is corrupted, it should be deleted instead of raising an error, to prevent problems like #62.
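A minimal sketch of that behavior, assuming the cache is a single pickle file. The `load_cache` name and the pickle format are illustrative assumptions, not the project's actual API.

```python
import pickle
from pathlib import Path

# Errors that pickle.load commonly raises on truncated or garbled data.
CORRUPTION_ERRORS = (
    pickle.UnpicklingError,
    EOFError,
    AttributeError,
    ImportError,
    IndexError,
    ValueError,
)


def load_cache(path):
    """Return the cached object, or None if the cache is missing or corrupted.

    Hypothetical helper: a corrupted cache file is deleted rather than
    allowed to raise, so the caller can simply rebuild it.
    """
    path = Path(path)
    if not path.exists():
        return None
    try:
        with path.open("rb") as handle:
            return pickle.load(handle)
    except CORRUPTION_ERRORS:
        # Drop the unreadable file so the next run starts from a clean state.
        path.unlink()
        return None
```

A caller would treat `None` as "no usable cache" and regenerate it.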
Allow sequences of length 2 for dates even though it should *really* be a tuple. Contributes to #71.
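One way this could look, as a sketch only: accept any two-item sequence and normalize it to a tuple. The `normalize_date_range` name is hypothetical, and the element type of the dates is left open because the description does not specify it.

```python
from collections.abc import Sequence


def normalize_date_range(dates):
    """Coerce a two-item sequence (list, tuple, ...) into a tuple.

    Hypothetical helper: strings are rejected even though they are
    sequences, since a two-character string is almost certainly a mistake.
    """
    if isinstance(dates, (str, bytes)) or not isinstance(dates, Sequence):
        raise TypeError("dates must be a sequence of length 2")
    if len(dates) != 2:
        raise ValueError("dates must have exactly 2 items, got %d" % len(dates))
    return tuple(dates)
```

With this, `normalize_date_range([2000, 2010])` and `normalize_date_range((2000, 2010))` both yield the same tuple.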
Corpus.builtins => analysis.nlp; Estimator prediction => analysis.ml
https://github.com/QuantGov/quantgov/blob/5c22c6ee9cd1c702561b881bcf5dd38097d7ff99/quantgov/corpora/builtins.py#L123 We need the `term_counts` dictionary keys to be the actual regexes, not the string matches. We should map the one to the other and then generate results from the...
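To illustrate the intended result (keys are the regex patterns, not whatever text they happened to match), here is a simplified sketch. It counts matches per pattern directly rather than mapping matched strings back to patterns; the `count_terms` function and the sample patterns are assumptions for illustration, not the code at the linked line.

```python
import re


def count_terms(text, patterns, flags=re.IGNORECASE):
    """Count matches per pattern, keyed by the pattern string itself.

    Hypothetical helper: each pattern is compiled and searched on its own,
    so variant matches ("Shall", "SHALL") all land under one key.
    """
    term_counts = {}
    for pattern in patterns:
        regex = re.compile(pattern, flags)
        term_counts[pattern] = sum(1 for _ in regex.finditer(text))
    return term_counts


counts = count_terms(
    "Shall not; SHALL do; must",
    [r"\bshall\b", r"\bmust( not)?\b"],
)
```

Keying by pattern keeps the output columns stable across documents, because they no longer depend on which surface forms appear in a given text.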
At the very least, should have a live check that warns if the index isn't unique. Should do this on loading when that's cheap.
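A rough sketch of such a check using only the standard library. It assumes the index values are available as an iterable of hashable items at load time; `warn_on_duplicate_index` is a hypothetical name.

```python
import warnings
from collections import Counter


def warn_on_duplicate_index(index_values):
    """Warn (rather than fail) if any index value occurs more than once.

    Hypothetical helper: returns the duplicated values so the caller can
    report or handle them.
    """
    counts = Counter(index_values)
    duplicates = [value for value, n in counts.items() if n > 1]
    if duplicates:
        warnings.warn(
            "Index is not unique: %d duplicated value(s), e.g. %r"
            % (len(duplicates), duplicates[0]),
            UserWarning,
            stacklevel=2,
        )
    return duplicates
```

This is a single pass over the index, so running it once at load time should add little overhead.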
Include a way to get raw text out of a variety of formats. [Textract](https://textract.readthedocs.io/en/stable/) may be the way, but I've had trouble with it in Anaconda.