John Cheung

Results 40 comments of John Cheung

> compose the dashed word features from their component parts

I think that could be a possible way of doing it (a similar technique is implemented in the tagger, see https://github.com/dhowe/ritajs/blob/b5b447c300739928aabb7f6f83577493695a5512/src/tagger.js#L495...

I made an implementation that analyses a hyphenated word as a single word (see the PR above). The performance is not bad (it does not cause an exec-time warning in the larger test pool, and takes ~40ms to execute...

> can you summarize for me what you did here?

So basically this algorithm treats a hyphenated word as a sort of "phrase": it breaks it down into parts and treats each part...

> also, we need to fix tests like this (should be 4 syllables): `eq(feats["syllables"], "s-t-ey-t-ah-v-dh-ah-aa-r-t");`

Just to confirm, the correct output should be `s-t-ey-t/ah-v/dh-ah/aa-r-t`?
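To make the question concrete, here is a minimal sketch of composing the syllables of a hyphenated word from its parts. It assumes `/` separates syllables and `-` separates phones, as in the string above. `PART_SYLLABLES` and `composeSyllables` are hypothetical names with toy data, not the actual RiTa lexicon or API.

```javascript
// Toy per-part syllable strings (illustrative data, not the real lexicon).
const PART_SYLLABLES = {
  state: "s-t-ey-t",
  of: "ah-v",
  the: "dh-ah",
  art: "aa-r-t",
};

// Hypothetical helper: look up each hyphen-separated part and join the
// results with "/". Returns null if any part is unknown.
function composeSyllables(word, table = PART_SYLLABLES) {
  const parts = word.toLowerCase().split("-");
  const syllables = [];
  for (const part of parts) {
    if (!Object.prototype.hasOwnProperty.call(table, part)) return null;
    syllables.push(table[part]);
  }
  return syllables.join("/");
}

const composed = composeSyllables("state-of-the-art");
console.log(composed);
console.log(composed.split("/").length); // number of syllables
```

With this toy table the composed string has 4 syllables, which matches the count expected by the test in the quote.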

Now has 75 tests in 4 pools:
- pool1: all parts in lexicon
- pool2A: some parts not in lexicon but are variants of words in lexicon
- pool2B:...

Sure, I will finish the tagger tests for hyphenated words in sentences first, then sync.

Replaced it with `.replace(/([a-zA-Z]+)-([a-zA-Z]+)/g, "$1 - $2");`, which should work on all browsers now (maybe not IE...)
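For reference, a small sketch of what that replacement does to a string. `splitHyphens` is a hypothetical wrapper name used only for illustration; the regex is the one quoted above.

```javascript
// Hypothetical wrapper around the replacement quoted above.
function splitHyphens(text) {
  // Put spaces around a hyphen that sits between two runs of ASCII letters.
  return text.replace(/([a-zA-Z]+)-([a-zA-Z]+)/g, "$1 - $2");
}

console.log(splitHyphens("a well-known fact")); // "a well - known fact"
console.log(splitHyphens("state-of-the-art"));  // "state - of-the - art"
```

Note that matches cannot overlap, so in a word with more than two parts a single pass leaves every second hyphen untouched, as in the second call. If all hyphens need to be split, a different pattern or a second pass would be required.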

hmm

> adding 'nn' as a 2nd tag for 'there' in the lexicon

That doesn't work... I think 'there' should actually be tagged as 'rb' most frequently? Like - "She...

Sorry, in the deleted comment I forgot to check whether the word is also correctly tagged. Here are the past participles that need to be added: `const IRREG_PAST_PART_NOT_IN_DICT = ["abode","begotten","bidden","borne","chlung","could","mown","pled","relaid","shod","smelt","spelt","spolit","taight","wrung"];`...

I generated the list simply by going over `IRREG_PAST_PART` in conjugator.js and checking whether the lexicon has that entry and whether it is tagged as `vbd` or `vbn`. I will check...
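A rough sketch of that check, under assumptions: the lexicon is modelled here as a plain object mapping each word to a `[phones, tags]` pair with space-separated tags, and both the word list and the lexicon entries below are toy data. `findMissingOrMistagged` is a hypothetical name, not a function in the repo.

```javascript
// Toy stand-ins (assumed shapes, made-up entries).
const IRREG_PAST_PART_SAMPLE = ["begun", "shod", "wrung"];
const toyLexicon = {
  begun: ["b-ih g-ah-n", "vbn"], // present and tagged as a past participle
  shod: ["sh-aa-d", "jj"],       // present but not tagged vbd/vbn
  // "wrung" is intentionally absent
};

// Hypothetical helper: keep words that are missing from the lexicon
// or that carry neither the "vbd" nor the "vbn" tag.
function findMissingOrMistagged(words, lexicon) {
  return words.filter((word) => {
    if (!Object.prototype.hasOwnProperty.call(lexicon, word)) return true;
    const tags = lexicon[word][1].split(" ");
    return !tags.includes("vbd") && !tags.includes("vbn");
  });
}

console.log(findMissingOrMistagged(IRREG_PAST_PART_SAMPLE, toyLexicon));
// [ 'shod', 'wrung' ]
```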