english-words
Several non-English words made it into the list
German words ending in -schen: boeschen, goschen, groschen, guldengroschen, hamantaschen, hanschen, kischen, leschen, mariengroschen, menschen, neugkroschen, neugroschen, silbergroschen.
German words ending in -ung: anschauung, aufklarung, delundung, aufklrung, gelandesprung, geldesprung, gelndesprung (the last four are also spelled wrongly), gterdmerung (probably ASCII-filtered from a wrongly-spelled "Götterdämmerung"), kaolikung, lautverschiebung, quellung, quersprung, sturmabteilung, verwanderung, vorstellung.
German-sounding place names that end in -berg: aaberg, amberg, arlberg, baden-wtemberg (should be Baden-Württemberg), bamberg, beberg, bemberg, berg, bloxberg, bromberg, bundaberg, clayberg, cohberg, desberg, drakensberg, dusenberg, egeberg, ehrenberg, eisenberg (probably Heisenberg), faberg, feinberg, fineberg, flamberg, floeberg, freberg, frederiksberg, freudberg, friedberg, fromberg, ginsberg, ginzberg, godesberg, goldberg, goldenberg, gomberg, greenberg, grosberg, gruenberg, grunberg, gutenberg, guttenberg, hamberg, hardenberg, hedberg, heidelberg, heisenberg, hertberg, herzberg, hollenberg, houlberg, ingaberg, ingeberg, inselberg, judenberg, kapfenberg, knigsberg (probably Konigsberg), koenigsberg, konigsberg, kornberg, landenberg, lansberg, lederberg, lemberg, lichtenberg, lindberg, lindeberg, lundberg, marshallberg, memberg, mengelberg, moberg, mollberg, mossberg, msterberg (probably Musterberg), muhlenberg, nberg (probably Nueberg), newberg, nyberg, nieberg, noonberg, nuremberg, oberg, overberg, ramberg, rehnberg, reichenberg, rydberg, romberg, rosenberg, rotberg, rothberg, rothenberg, schberg, schoenberg, schonberg, schulberg, shimberg, shinberg, shirberg, sjoberg, slosberg, solberg, spitzenberg, steinberg, sternberg, stormberg, strasberg, strindberg, stromberg, sundberg, svedberg, taberg, tamberg, tanberg, tannenberg, tuneberg, vandenberg, venusberg, vilberg, vorarlberg, waterberg, wattenberg, weinberg, weisberg, weissberg, westberg, wittenberg, wtemberg, wurttemberg.
Swedish-sounding names ending in -borg: aalborg, bjneborg, carlsborg, friborg, goteborg, gteborg, helsingborg, hsingborg, ingaborg, ingeborg, kreymborg, lindsborg, seaborg, swedeborg, swedenborg, valborg, viborg, vyborg, volborg, wiborg.
German words with -rsch- or -wasser: goldwasser, kirschwasser, beterschap, borsch, borsches, bursch, burschenschaft, burschenschaften, clairschach, clairschacher, dauerschlaf, hersch, herschel, herscher, hirsch, hirschfeld, kirsches, kirschner, kursch, lautverschiebung, moersch.
German words that might be considered controversial: sieg, heil, hitler, mein, fuehrer, fuhrer, gott, mit, uns, SchutzStaffel.
French-sounding words containing "aux": aboideaux, aboiteaux, agneaux, auxf, auxier, auxil, auxvasse, bandeaux, bateaux, batteaux, beaux, beaux-arts, beaux-esprits, beauxite (should be bauxite), boyaux, boisseaux, bordereaux, boudreaux, capiteaux, carpeaux, castrop-rauxel, chalumeaux, chapeaux, chateaux, cheneaux, chevaux, chevaux-de-frise, ciseaux, clervaux, clitoridauxe, colauxe, coteaux, couteaux, cryptoglaux, cristineaux, dermatauxe, eaux, enterauxe, esquimaux, fabliaux, faux, fauxbourdon, faux-bourdon, faux-na, flambeaux, fricandeaux, gateaux, glaux, hanotaux, hemiauxin, hepatauxe, jambeaux, jouhaux, kastrop-rauxel, knisteneaux, kristinaux, lascaux, laux, malraux, mantappeaux, manteaux, margaux, margeaux, marivaux, mastauxe, maux, meraux, michaux, myelauxe, morceaux, moureaux, nephrauxe, nouveaux, oophorauxe, paravauxite, pauxi, plateaux, portmanteaux, proces-verbaux, prostatauxe, radeaux, raveaux, reseaux, rinceaux, roncevaux, rondeaux, rouleaux, salteaux, splenauxe, subbureaux, tableaux, thibodaux, tonneaux, torteaux, trichauxis, trousseaux, trumeaux, vassaux, vauxite, veneaux, vitraux, wibaux, bureaux.
Loan words that are probably okay but, strictly, still not English: brehmsstrahlung, weltanschauung, volkerwanderung, ubermensch, borscht, borschts, kirsch, meerschaum, meerschaums, Messerschmitt, Rorschach, bordeaux.
Southeast Asian words: Wa-palaung, Telukbetung, bagong, Ronggeng.
Chinese place names: Tzekung, Kaolikung.
Korean name: Kyung, Kyaung, Keung.
Kyrgyz name: Issyk-Kul
Icelandic name: Jokul (should be Jokull)
Not sure if this counts: mallangong (Australian name for the platypus), wobbegong (Australian name of the carpet shark)
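Several of the entries above (knigsberg, gteborg, aufklrung, gelndesprung) look like what is left when non-ASCII letters are simply dropped instead of transliterated. A minimal Python sketch of that guess follows; `strip_non_ascii` is a hypothetical helper, and this is an assumption about how the list was produced, not a confirmed description of it.

```python
def strip_non_ascii(word: str) -> str:
    """Lowercase a word and silently drop every character outside ASCII."""
    return word.lower().encode("ascii", "ignore").decode("ascii")


# Correctly spelled originals paired with the damaged forms found in the list.
suspects = {
    "Königsberg": "knigsberg",
    "Göteborg": "gteborg",
    "Aufklärung": "aufklrung",
    "Geländesprung": "gelndesprung",
}

for original, damaged in suspects.items():
    status = "matches" if strip_non_ascii(original) == damaged else "differs"
    print(f"{original} -> {strip_non_ascii(original)} ({status})")
```

Entries such as gterdmerung or bjneborg lose more than the accented letter, so this simple filter does not explain every damaged word.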
@lserni What was your methodology for collecting this?
Funny timing, as I sat down today to write a script to remove every single non-word / non-English word from this list.
In addition to the language issues you pointed out, I have noticed a number of other non-English words and 'artifacts' of English: guttural sounds, etc.
I stumbled across one word due to a misspelling (it might have been 'kischen'), realized it was German, and started looking for that suffix ("-schen"). Then one thing led to another (e.g. finding "kirschen" led me to look for "kirsch", and so on).
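For anyone who wants to repeat that kind of search with a script, here is a rough Python sketch. `find_by_pattern` and the sample words are made up for illustration; in practice the words would be read from the list file.

```python
import re
from typing import Iterable


def find_by_pattern(words: Iterable[str], pattern: str) -> list[str]:
    """Return, sorted, every word in which the regular expression matches."""
    rx = re.compile(pattern)
    return sorted(w for w in words if rx.search(w))


# Tiny illustrative sample, not the real word list.
sample = ["kischen", "menschen", "kitchen", "kirsch", "kirschwasser", "church"]

schen_words = find_by_pattern(sample, r"schen$")   # words ending in -schen
kirsch_words = find_by_pattern(sample, r"kirsch")  # words containing "kirsch"

print(schen_words)
print(kirsch_words)
```

A suffix match only produces candidates, so each hit still needs a manual look.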
@lserni Gotcha. So from a processing/scripting standpoint, it was more or less a "manual/organic" search process?
I plan to programmatically parse this to get a pure list of "guaranteed English" words.
I've thought up two general strategies so far:
- Use ChatGPT to ask whether xxxxx is an English word. (I've tried this before, and maybe my prompt was bad, but ChatGPT wasn't great at that task. English "borrows" a lot of words, and ChatGPT didn't seem to know where to draw the boundary.)
- Write a script to look up every single word in an existing dictionary (Google, Oxford, Wiktionary, etc.) and keep the words that have entries.
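A rough Python sketch of the second strategy, assuming the reference dictionary is already available locally as a plain collection of words. `keep_known_words` and the sample data are hypothetical; no particular dictionary API is implied.

```python
from typing import Iterable


def normalize(word: str) -> str:
    """Trim whitespace and lowercase so lookups are case-insensitive."""
    return word.strip().lower()


def keep_known_words(
    candidates: Iterable[str], reference: Iterable[str]
) -> tuple[list[str], list[str]]:
    """Split candidates into (kept, rejected) by membership in the reference."""
    known = {normalize(w) for w in reference}
    kept: list[str] = []
    rejected: list[str] = []
    for raw in candidates:
        word = normalize(raw)
        if not word:
            continue  # skip blank lines
        (kept if word in known else rejected).append(word)
    return kept, rejected


# Illustrative data only; real input would come from the word list and
# from whatever reference dictionary is chosen.
candidates = ["menschen", "kitchen", "Platypus", "gterdmerung", ""]
reference = ["kitchen", "platypus", "church"]

kept, rejected = keep_known_words(candidates, reference)
print(kept)
print(rejected)
```

Keeping the rejected words in a separate list makes it possible to review them by hand, since the result is only as complete as the reference dictionary.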
Please lmk if you have any ideas 💯
I tried using some Python dictionary modules, but even the ones backed by online databases are missing some words I know are real.