Anton Kolonin
The goals of Unsupervised Segmentation Learning (USL) are: 1) Unsupervised learning of lexicon and tokenisation for languages like Chinese 2) Unsupervised learning of sentence splitting for languages like Chinese 3)...
_This MSc student research task may or may not end up with actual code submitted to this repository_ Main Goal: **Exploration of applicability of explainable artificial intelligence technique for sentiment analysis...
Rules need costs derived from statistics so that parsing can take these costs into account. First, this has to be done for the GL ILE algorithm and we see...
**Specification** This task addresses a complication of the Unsupervised Language Learning (ULL) problem (specifically, Unsupervised Text Parsing) with the following conditions: 1. The input (training) stream of symbols is not segmented...
Two problems: 1) Inconsistent rounding: http://langlearn.singularitynet.io/data/aglushchenko_parses/GCB-NQ-dILEd-MWC-MSL-2019-04-24/GCB-NQ-dILEd-MWC-MSL-summary.txt 0 2 99.60% 1.00 F1 needs to be rounded to 4 decimal places after the period so that the rounding is consistent. 2) Some words in the...
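A minimal sketch of the rounding requirement above, assuming the summary values are produced in Python. The helper name `format_f1` is hypothetical and is not taken from the repository; it only illustrates fixed-width formatting so that a value like `1.00` is always written with four decimal places.

```python
def format_f1(value: float) -> str:
    """Format an F1 score with exactly 4 digits after the decimal point."""
    # The "f" presentation type pads with trailing zeros, unlike round(),
    # which would leave 1.0 as "1.0" when converted to a string.
    return f"{value:.4f}"


if __name__ == "__main__":
    for score in (1.0, 0.5, 0.98766):
        print(format_f1(score))
```

Using a format specifier rather than `round()` keeps the column width constant, which is what makes the summary file look consistent.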
As discussed with @linas in an email thread and @OlegBaskov and @glicerico in Slack: Currently, Grammar Learner (GL) clusters words into word categories or Link Grammar (LG) rules but it does...
Need to see how to make MST-parses on large corpora run faster; the discussion is below: @glicerico : It now takes around 100 hours to perform MST-parses using the...
Implement a "hybrid" parser blending sequential information and MI, so that the extent of blending is configurable, with "maximum sequential" mode producing "sequential parse" and "maximum MI" mode producing "plain...
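One way such blending could look is sketched below. Everything here is an assumption made for illustration, not the project's implementation: the parameter `alpha`, the functions `blended_weight` and `hybrid_parse`, the adjacency-based sequential score, and the use of a plain maximum spanning tree (Prim's algorithm) with no link-crossing constraint. With `alpha = 0` only adjacency counts, so every word links to its neighbour; with `alpha = 1` only the MI values are used.

```python
from typing import Dict, List, Tuple

Link = Tuple[int, int]  # pair of word positions, smaller index first


def blended_weight(i: int, j: int, mi: Dict[Link, float], alpha: float) -> float:
    """Blend a sequential score with MI (alpha=0: sequential only, alpha=1: MI only)."""
    sequential = 1.0 if abs(i - j) == 1 else 0.0  # assumed score: adjacent words only
    pair_mi = mi.get((min(i, j), max(i, j)), 0.0)  # missing pairs count as MI = 0
    return (1.0 - alpha) * sequential + alpha * pair_mi


def hybrid_parse(n_words: int, mi: Dict[Link, float], alpha: float) -> List[Link]:
    """Maximum spanning tree over blended link weights, built with Prim's algorithm."""
    if n_words < 2:
        return []
    in_tree = {0}
    links: List[Link] = []
    while len(in_tree) < n_words:
        # Pick the heaviest link from a word in the tree to a word outside it.
        best = max(
            ((i, j) for i in in_tree for j in range(n_words) if j not in in_tree),
            key=lambda e: blended_weight(e[0], e[1], mi, alpha),
        )
        links.append((min(best), max(best)))
        in_tree.add(best[1])
    return sorted(links)


if __name__ == "__main__":
    # Made-up MI values for a four-word sentence, for demonstration only.
    toy_mi = {(0, 1): 0.5, (0, 2): 3.0, (0, 3): 0.1,
              (1, 2): 2.0, (1, 3): 0.2, (2, 3): 1.0}
    print(hybrid_parse(4, toy_mi, alpha=0.0))  # neighbour-to-neighbour links
    print(hybrid_parse(4, toy_mi, alpha=1.0))  # links chosen by MI alone
```

A single scalar keeps the configuration simple; intermediate values such as `alpha = 0.5` trade adjacency against MI link by link.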
1. Fix the bug skipping unparsed words in test parses 2. Re-evaluate all parses in MWC-Study tab and update the links and numbers in the sheet (keep updating progress for...
The goal of the challenge is to have a parser trained without supervision create parses approximating "expected" English parses to the best extent - using cleaned Gutenberg Children corpus data as...