spellcorrect How to train my custom data?

Open GabrielLin opened this issue 7 years ago • 5 comments

Thanks.

Mar 03 '18 10:03 GabrielLin

All the data is stored in .data files. You can modify them, update them, or replace them with your own data, probably using a simple script to convert it to the expected format. Unfortunately, I didn't spend much time back then defining the models declaratively, but the format should be easy to work out by inspecting the files. Once your data matches the expected format, the script should train itself at startup.

Mar 09 '18 03:03 jbhoosreddy

Hi @jbhoosreddy, thanks for the repo and the data. However, can you throw some light on how to create the confusion matrices (dictionary) if I have a list of unigrams (from Google 1T) with their frequencies?

Dec 05 '18 13:12 acerock6

Thanks for your solution, @jbhoosreddy. Sorry for the late reply.

Apr 08 '19 10:04 GabrielLin

> Hi @jbhoosreddy, thanks for the repo and the data. However, can you throw some light on how to create the confusion matrices (dictionary) if I have a list of unigrams (from Google 1T) with their frequencies?

The confusion matrices used in this program come from the paper "A Spelling Correction Program Based on a Noisy Channel Model" (Kernighan, Church, and Gale, 1990).
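A unigram frequency list on its own is not enough to build such matrices: the counts in that paper describe typing errors, so you also need a corpus of (misspelling, correction) pairs. Below is a minimal sketch of how the four tables (deletion, insertion, substitution, transposition) could be counted from such pairs. The function names, the `#` word-start marker, and the key layout (correct character first, typed character second for substitutions) are my own assumptions for illustration, not necessarily the layout of this repo's .data files, so check those files before reusing the output.

```python
from collections import Counter


def single_edit(typo, correct):
    """Classify the one edit that turns `correct` into `typo`.

    Returns (kind, key) or None if the words are identical or differ by
    more than one edit. '#' stands for the start of the word.
    """
    if typo == correct:
        return None
    # Length of the longest common prefix.
    i = 0
    while i < min(len(typo), len(correct)) and typo[i] == correct[i]:
        i += 1
    prev = correct[i - 1] if i > 0 else "#"
    # Deletion: correct[i] is missing from the typo.
    if len(typo) == len(correct) - 1 and typo[i:] == correct[i + 1:]:
        return ("del", prev + correct[i])
    # Insertion: typo[i] is an extra character typed after `prev`.
    if len(typo) == len(correct) + 1 and typo[i + 1:] == correct[i:]:
        return ("ins", prev + typo[i])
    if len(typo) == len(correct):
        # Substitution: key is (correct character, typed character).
        if typo[i + 1:] == correct[i + 1:]:
            return ("sub", correct[i] + typo[i])
        # Transposition: two adjacent characters swapped.
        if (i + 1 < len(typo)
                and typo[i] == correct[i + 1]
                and typo[i + 1] == correct[i]
                and typo[i + 2:] == correct[i + 2:]):
            return ("trans", correct[i:i + 2])
    return None


def build_confusion(pairs):
    """Count single-edit errors over (typo, correct) pairs."""
    tables = {kind: Counter() for kind in ("del", "ins", "sub", "trans")}
    for typo, correct in pairs:
        edit = single_edit(typo.lower(), correct.lower())
        if edit is not None:
            kind, key = edit
            tables[kind][key] += 1
    return tables


# Hypothetical sample data; replace with your own error corpus.
pairs = [("aple", "apple"), ("appple", "apple"),
         ("applr", "apple"), ("appel", "apple")]
tables = build_confusion(pairs)
```

In the noisy channel model these raw counts are then normalised by how often the corresponding character or character pair occurs in correctly spelled text, and the unigram frequencies supply the prior probability of each candidate word, which is where a list like Google 1T comes in.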

Jul 27 '19 12:07 yhshu

Hey @lzw429! Thanks for identifying where this data came from.

My earliest recollection is that I saw this data in a textbook and attempted to recreate the data and pseudocode to validate for myself that the spell correct approach works.

May 19 '20 21:05 jbhoosreddy
