OpenCog Unsupervised Language Learning

Results 28 language-learning issues

from @alexei-gl in https://github.com/singnet/language-learning/pull/243 : "please make unit test for .db dictionaries in grammar-tester"

test

Implement a "hybrid" parser blending sequential information and MI, so that the extent of blending can be made configurable, with "maximum sequential" mode producing "sequential parse" and "maximum MI" mode producing "plain...
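
One way to picture the configurable blending described in this issue is a single link weight that interpolates between an adjacency bonus and pairwise MI. The sketch below is illustrative only and is not the project's implementation: the names `blended_weight`, `max_spanning_tree`, and `alpha`, the 0/1 adjacency score, and the MI values are all assumptions, and it builds an unconstrained maximum spanning tree without the no-crossing-links check a real parser would need.

```python
def blended_weight(i, j, mi, alpha):
    """Blend an adjacency bonus with pairwise MI (hypothetical scoring).

    alpha = 0.0 uses only sequential information, alpha = 1.0 uses only MI.
    """
    sequential = 1.0 if abs(i - j) == 1 else 0.0
    pair = (min(i, j), max(i, j))
    return (1.0 - alpha) * sequential + alpha * mi.get(pair, 0.0)


def max_spanning_tree(n, mi, alpha):
    """Greedy (Prim-style) maximum spanning tree over n word positions."""
    in_tree = {0}
    links = []
    while len(in_tree) < n:
        # Pick the heaviest link joining a word in the tree to one outside it.
        best = max(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda link: blended_weight(link[0], link[1], mi, alpha),
        )
        links.append((min(best), max(best)))
        in_tree.add(best[1])
    return sorted(links)


# Toy MI table for a 4-word sentence; the values are invented for illustration.
toy_mi = {(0, 1): 2.0, (1, 2): 1.5, (0, 3): 3.0, (2, 3): 0.5, (0, 2): 0.1, (1, 3): 0.2}

sequential_parse = max_spanning_tree(4, toy_mi, alpha=0.0)
mi_parse = max_spanning_tree(4, toy_mi, alpha=1.0)
```

With `alpha=0.0` only adjacent words receive weight, so the tree reduces to a chain of neighbouring words; with `alpha=1.0` the links follow the MI table alone. Intermediate values trade one against the other.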

1. Fix the bug skipping unparsed words in test parses 2. Re-evaluate all parses in MWC-Study tab and update the links and numbers in the sheet (keep updating progress for...

doing

The goal of the challenge is to have a parser trained without supervision create parses that approximate the "expected" English parses as closely as possible - using cleaned Gutenberg Children corpus data as...

doing

When running the parse-evaluator in sequential or random mode, the parameter -t specifies where the sequential/random parses will be written. There is a bug and a theoretical problem with this:...

bug
enhancement

Study why tokenization differs between LG English and LG ANY, which problems this may cause, and how they could be solved. Examples from Andres - specifically...

enhancement

**Problem:** Currently, the Identical Lexical Entries (ILE) algorithm builds single-germ/multi-disjunct lexical entries (LE) first, and then aggregates identical ones based on unique combinations of disjuncts. That leads to the fact that rarely...

enhancement
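
The aggregation step described in the ILE issue above (one lexical entry per word, then merging entries whose disjunct sets are identical) can be sketched as follows. This is a minimal illustration, not the Grammar Learner code: the function name `aggregate_identical_entries`, the dictionary layout, and the toy disjunct strings are all assumptions.

```python
from collections import defaultdict


def aggregate_identical_entries(word_disjuncts):
    """Group words (germs) whose sets of disjuncts are exactly equal.

    word_disjuncts maps each word to an iterable of disjunct strings.
    Returns a dict mapping each unique disjunct set to its sorted words.
    """
    clusters = defaultdict(list)
    for word, disjuncts in word_disjuncts.items():
        # frozenset makes the combination of disjuncts hashable and
        # independent of the order in which disjuncts were observed.
        clusters[frozenset(disjuncts)].append(word)
    return {key: sorted(words) for key, words in clusters.items()}


# Toy single-germ lexical entries; the disjuncts are invented for illustration.
toy_entries = {
    "cat": {"the- & runs+", "a- & sleeps+"},
    "dog": {"the- & runs+", "a- & sleeps+"},
    "bird": {"the- & flies+"},
}

merged = aggregate_identical_entries(toy_entries)
```

Because two words are merged only when every disjunct matches, a word with a single extra or missing disjunct stays in its own entry.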

Cluster tags and words in tagged grammar .dict and cat_tree files. Either tagging or input parses filtering issue, OR issues in corpus preventing correct link extraction? Jupyter notebook -- [Iterative-clustering-ILE-POCE-CDS-2019-02-27.ipynb](https://github.com/singnet/language-learning/blob/master/notebooks/Iterative-clustering-ILE-POCE-CDS-2019-02-27.ipynb)...

bug

A few problems: 1. During iterative grammar learning, tagging words in the input corpus and input parses may face ambiguity if the words with at signs (@) in parses and corpus are translated...

We eventually need to refactor the Grammar Learner internal formats, based on the code review by @OlegBaskov: https://docs.google.com/document/d/1yauyi9Y9OD1Cefow197OTnGqm6ZDSqK1T-v6bR5CUHI/edit#heading=h.37kbmfpxjcy0

enhancement