orangetext
Data uses multiple encodings
It appears the speech files are stored in several different encodings:
$ file -I orangetext/data/speeches/*
orangetext/data/speeches/2016-01-19-presidential-candidacy-anouncement-NewYorkCity-NY.txt: text/plain; charset=unknown-8bit
orangetext/data/speeches/2016-08-31-immigration-Phoenix-AZ.txt: text/plain; charset=unknown-8bit
orangetext/data/speeches/2016-10-13-addressing-sexual-assault-WestPalmBeach-FL.txt: text/plain; charset=us-ascii
orangetext/data/speeches/2017-01-20-inaugural.txt: text/plain; charset=utf-8
orangetext/data/speeches/2017-01-21-cia.txt: text/plain; charset=utf-8
orangetext/data/speeches/2017-01-28-may.txt: text/plain; charset=utf-8
orangetext/data/speeches/2017-01-29-weekly-address.txt: text/plain; charset=us-ascii
orangetext/data/speeches/2017-01-31-gorsuch.txt: text/plain; charset=utf-8
orangetext/data/speeches/2017-02-01-black-history-month.txt: text/plain; charset=unknown-8bit
orangetext/data/speeches/2017-02-03-weekly-address.txt: text/plain; charset=utf-8
Would you be open to standardizing on a single encoding?
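
If it helps, here is a rough sketch of how the files could be normalized to UTF-8. It rests on an assumption I have not verified: that the files reported as `unknown-8bit` are Windows-1252. The names `decode_bytes`, `normalize_to_utf8`, and `FALLBACK_ENCODING` are made up for illustration and are not part of the repository.

```python
from pathlib import Path

# Assumption (unverified): files that are not valid UTF-8 are Windows-1252.
FALLBACK_ENCODING = "cp1252"


def decode_bytes(raw: bytes, fallback: str = FALLBACK_ENCODING) -> str:
    """Decode as UTF-8 when possible (this also covers US-ASCII),
    otherwise decode with the fallback encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes the fallback codec cannot map become U+FFFD instead of raising.
        return raw.decode(fallback, errors="replace")


def normalize_to_utf8(directory: str) -> None:
    """Rewrite every .txt file in `directory` as UTF-8, in place."""
    for path in sorted(Path(directory).glob("*.txt")):
        text = decode_bytes(path.read_bytes())
        # Write bytes directly so line endings are left untouched.
        path.write_bytes(text.encode("utf-8"))
```

Calling `normalize_to_utf8("orangetext/data/speeches")` would rewrite the files in place, so it is worth running on a copy first and reviewing the diff, since a wrong fallback guess would silently produce incorrect characters.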