markov-sentence-correction
markov-sentence-correction copied to clipboard
Markov Chains and Hidden Markov Models to generate and correct sentences
Sentence Generation and Correction
This is a submission to the second assignment of McGill University's ECSE 526 - Artificial Intelligence course. Details can be found here.
Setup
To run this, all one needs is Python 3.4 or above and pip3.
To install dependencies, simply run:
pip3 install -r requirements.txt
Details
This assignment has two parts:
Sentence Generation
Sentence are randomly generated using Markov models defined in data.
To run, simply execute:
python3 generator.py
More detailed output can be displayed using th e -v or --verbose flags.
Sentence Correction
Sentence are corrected using a Hidden Markov Model (HMM) and Viterbi's
algorithm. The intput is read from stdin, and the output is the most likely
sentence based on a first-order Markov chain and the Levenshtein distance.
To run, simply execute:
python3 corrector.py
Input can also be piped in as follows:
echo "Tell moi" | python3 corrector.py
# Tell me
More detailed output can be displayed using th e -v or --verbose flags.
Accuracy
Note that the accuracy of both the sentences generated and the corrections is only dependent on the data set used. Valid English words will be "corrected" if they are not found in the data set. Take a look here for details on how to use your own dataset.