
[Bug]: TypeError: Cannot read properties of undefined (reading '0') in cspell-trie-lib

Open 1j01 opened this issue 1 year ago • 3 comments

Kind of Issue

Crash / Error

Tool or Library

cspell-trie

Version

8.3.2

Supporting Library

cspell-trie-lib

OS

All of them

OS Version

No response

Description

With a multilingual word list, the CSpell CLI throws an error while constructing a prefix tree from the dictionary. See the repro repo (linked under Example Repository below) for more info.

Isaiah@Cardboard MINGW64 ~/Projects/cspell-bug-repro (main)
$ npx cspell-cli lint .
TypeError: Cannot read properties of undefined (reading '0')
    at new FastTrieBlobINode (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-trie-lib/dist/lib/TrieBlob/FastTrieBlobIRoot.js:17:27)
    at FastTrieBlobINode.child (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-trie-lib/dist/lib/TrieBlob/FastTrieBlobIRoot.js:77:16)
    at nodeWalker (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-trie-lib/dist/lib/ITrieNode/walker/walker.js:58:30)
    at nodeWalker.next (<anonymous>)
    at get size [as size] (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-dictionary/dist/SpellingDictionary/SpellingDictionaryFromTrie.js:43:51)      
    at file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-dictionary/dist/SpellingDictionary/SpellingDictionaryCollection.js:21:64
    at Array.sort (<anonymous>)
    at new SpellingDictionaryCollectionImpl (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-dictionary/dist/SpellingDictionary/SpellingDictionaryCollection.js:21:47)
    at createCollection (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-dictionary/dist/SpellingDictionary/SpellingDictionaryCollection.js:98:12)      
    at _getDictionaryInternal (file:///C:/Users/Isaiah/Projects/cspell-bug-repro/node_modules/cspell-lib/dist/esm/SpellingDictionary/Dictionaries.js:50:12)

Steps to Reproduce

  • Run cspell-cli lint . with the given configuration file, and it throws an error.
  • Also, open the cspell.json file in VS Code, and it reports a misspelling for one of the words in the accepted words list. Before trimming the word list, it reported even more words within the word list as misspelled.

Expected Behavior

  • cspell-cli lint . should not error.
  • No words in the words array in cspell.json should be underlined in VS Code.

Additional Information

A much smaller reproduction is probably possible, but with the given configuration, removing any single word makes the bug stop reproducing. I have not tried simplifying the reproduction by modifying the words themselves, although doing so might be illuminating.

cspell.json

{
	"ignorePaths": [
		".history", // VS Code "Local History" extension
		"node_modules"
	],
	"words": [
		"Æвзаг",
		"ajeļ",
		"allowfullscreen",
		"apng",
		"APNGs",
		"appinstalled",
		"Aragonés",
		"Asụsụ",
		"Avañe'ẽ",
		"Azərbaycan",
		"bepis",
		"bgcolor",
		"Bokmål",
		"Český",
		"Čeština",
		"classid",
		"cmaps",
		"ctype",
		"Cueŋƅ",
		"d'Òc",
		"desaturated",
		"DIALOGEX",
		"Divehi",
		"draggable",
		"ellipticals",
		"endonym",
		"eqeqeq",
		"equivalize",
		"ertical",
		"esque",
		"Eʋegbe",
		"eyedrop",
		"focusring",
		"Føroyskt",
		"fudgedness",
		"fullscreen",
		"Gàidhlig",
		"gazemouse",
		"GIFs",
		"Gikuyu",
		"grayscale",
		"headmouse",
		"hilight",
		"Hrvatski",
		"icns",
		"IFDs",
		"Íslenska",
		"Język",
		"jnordberg",
		"jspaint",
		"Kreyòl",
		"Kurdî",
		"Latviešu",
		"Lëtzebuergesch",
		"libtess",
		"Lietuvių",
		"Lingála",
		"llpaper",
		"localdomain",
		"localforage",
		"localizable",
		"lookpath",
		"lors",
		"ltres",
		"Macromedia",
		"nomine",
		"nostri",
		"nowrap",
		"occluder",
		"octree",
		"Oʻzbek",
		"oleobject",
		"orizontal",
		"ovaloids",
		"oviforms",
		"pako",
		"palettized",
		"paypal",
		"pointermove",
		"pointerup",
		"Português",
		"proxied",
		"pseudorandomly",
		"psppalette",
		"rbaycan",
		"redoable",
		"reenable",
		"repurposable",
		"rerender",
		"retargeted",
		"Română",
		"rotologo",
		"roundrects",
		"royskt",
		"rrect",
		"sandboxed",
		"scrollable",
		"scrollbars",
		"sketchpalette",
		"slenska",
		"Slovenčina",
		"Slovenščina",
		"Slovenský",
		"sorthweast",
		"soundcloud",
		"subrepo",
		"tbody",
		"themeable",
		"themepack",
		"Tiếng",
		"tileable",
		"timespan",
		"tina",
		"titlebar",
		"Toçikī",
		"togglable",
		"Tshivenḓa",
		"ufeff",
		"undock",
		"unfocusing",
		"uniquify",
		"unmaximize",
		"upiatun",
		"ustom",
		"UTIF",
		"vaporwave",
		"verts",
		"Việt",
		"viewports",
		"Volapük",
		"webglcontextlost",
		"webglcontextrestored",
		"Wikang",
		"WINTRAP",
		"Yângâ",
		"Zhōngwén",
		"zoomable",
		"zoomer",
		"zyk",
		"Ελληνικά",
		"Аҧсшәа",
		"Башҡорт",
		"Беларуская",
		"Език",
		"Ирон",
		"Језик",
		"Коми",
		"Қазақ",
		"Македонски",
		"Нохчийн",
		"Русский",
		"Словѣньскъ",
		"Српски",
		"Тоҷикӣ",
		"Түркмен",
		"Ўзбек",
		"Українська",
		"Чӑваш",
		"Чӗлхи",
		"Ѩзыкъ",
		"Ӏарул",
		"ქართული",
		"Հայերեն",
		"עברית",
		"أۇزبېك",
		"ئۇيغۇرچە",
		"اردو",
		"العربية",
		"بهاس",
		"پنجابی",
		"تاجیکی",
		"سندھی",
		"سنڌي",
		"فارسی",
		"كشميري",
		"ትግርኛ",
		"አማርኛ",
		"ພາສາລາວ",
		"ꦧꦱꦗꦮ",
		"ᐃᓄᒃᑎᑐᑦ",
		"ᐊᓂᔑᓈᐯᒧᐎᓐ",
		"ᓀᐦᐃᔭᐍᐏᐣ"
	]
}

cspell.config.yaml

No response

Example Repository

https://github.com/1j01/cspell-bug-repro

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

1j01, Feb 03 '24 22:02

@1j01,

Thank you! I'll look into it. It should not error like that.

Jason3S, Feb 06 '24 06:02

@1j01,

The error given by cspell is not very informative. The spell checker failed to create an internal dictionary based upon the words found in the config. There is a limit on the number of unique characters in a dictionary. I'll look into a fix to make the limit much larger, but it might take a while.
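
To get a rough sense of how many unique characters a word list contains, a minimal sketch like the one below counts the distinct characters across the list. The exact limit, and how cspell-trie-lib counts characters internally, are not stated in this thread, so the case-sensitive, per-code-point counting here is an assumption. `countUniqueCharacters` is a hypothetical helper, not part of cspell.

```typescript
// Hypothetical helper (not a cspell API): counts distinct Unicode code points
// across a word list. Counting is case-sensitive, which is an assumption.
function countUniqueCharacters(words: string[]): number {
    const seen = new Set<string>();
    for (const word of words) {
        // Array.from splits a string into whole code points,
        // not UTF-16 code units.
        for (const ch of Array.from(word)) {
            seen.add(ch);
        }
    }
    return seen.size;
}

// A few entries taken from the "words" array in the report above.
const sampleWords: string[] = ["apng", "Ελληνικά", "Русский", "اردو"];
console.log(`unique characters: ${countUniqueCharacters(sampleWords)}`);
```

Running this over the full "words" array from the report would show how quickly a list that mixes many scripts accumulates distinct characters compared with a single-script list.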

The workaround is to have multiple dictionaries:

cspell.json

{
    "dictionaryDefinitions": [
        {
            "name": "words-latin",
            "path": "words-latin.txt"
        },
        {
            "name": "words-greek",
            "path": "words-greek.txt"
        },
        {
            "name": "words-cyrillic",
            "path": "words-cyrillic.txt"
        },
        {
            "name": "words-arabic",
            "path": "words-arabic.txt"
        },
        {
            "name": "words-inline",
            "words": [
                "DIALOGEX",
                "GIFs",
                "WINTRAP"
            ]
        }
    ],
    "dictionaries": [
        "words-latin",
        "words-greek",
        "words-cyrillic",
        "words-arabic",
        "words-inline"
    ]
}
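
Splitting an existing word list into per-script files can be scripted. The following is a sketch under stated assumptions, not a cspell feature: `groupWordsByScript` and the `words-other` bucket are hypothetical, the other bucket names only mirror the dictionary names in the config above, and the rule that the first matching script decides the bucket is an arbitrary choice for mixed-script words.

```typescript
// Hypothetical bucket table: names mirror the dictionary names in the
// workaround config; the order decides where mixed-script words land.
const scriptBuckets: Array<[string, RegExp]> = [
    ["words-greek", /\p{Script=Greek}/u],
    ["words-cyrillic", /\p{Script=Cyrillic}/u],
    ["words-arabic", /\p{Script=Arabic}/u],
    ["words-latin", /\p{Script=Latin}/u],
];

// Hypothetical helper (not a cspell API): assigns each word to the first
// bucket whose script appears in it, or to "words-other" if none match.
function groupWordsByScript(words: string[]): Map<string, string[]> {
    const groups = new Map<string, string[]>();
    for (const word of words) {
        const match = scriptBuckets.find(([, pattern]) => pattern.test(word));
        const bucket = match ? match[0] : "words-other";
        const list = groups.get(bucket) ?? [];
        list.push(word);
        groups.set(bucket, list);
    }
    return groups;
}

// A few entries taken from the "words" array in the original report.
const groups = groupWordsByScript(["apng", "Ελληνικά", "Русский", "اردو", "ქართული"]);
groups.forEach((list, bucket) => {
    // One word per line, as it would appear in the bucket's .txt file.
    console.log(`${bucket}.txt:\n${list.join("\n")}\n`);
});
```

Each group could then be saved to the matching `.txt` file, one word per line. Nothing in this thread confirms that every resulting group stays under the character limit, so a very mixed `words-other` bucket might need further splitting.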

Jason3S, Feb 06 '24 09:02

I'm re-opening this issue since I had to revert the changes in #5233 with #5281.

Jason3S, Feb 20 '24 07:02

The fix has been in for a while. Closing.

Jason3S, May 30 '24 10:05

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

github-actions[bot], Jun 30 '24 05:06