pythainlp
PyThaiNLP 3.1 change log
Schedule
- First development release: 31 August 2022
- Beta release: 20 September 2022
- Production release: 24 September 2022
See 3.1 Milestone.
What is new?
Deprecation and other API changes
#687 Remove deprecated functions
- pythainlp.word_vector: doesnt_match, get_model, most_similar_cosmul, sentence_vectorizer, similarity. Use the WordVector class instead.
- pythainlp.util.delete_tone. Use pythainlp.util.remove_tonemark instead.
- pythainlp.util.time_time. Use pythainlp.util.time_to_thaiword instead.
- pythainlp.tokenize.syllable_tokenize. Use pythainlp.tokenize.subword_tokenize instead.
Dependency Parsing
- PyThaiNLP now supports dependency parsing 🎉 Add pythainlp.parse.dependency_parsing (https://github.com/PyThaiNLP/pythainlp/pull/706)
Named Entity Tagging
- #665 Add Thai-NNER (pythainlp.tag.NNER)
- #658 Add LST20NER ONNX model. This is the LST20 NER model, fine-tuned from WangchanBERTa, converted to ONNX format.
Transliteration
- #659 Add ISO 11940 transliteration
- #660 Add Thai W2P v0.2
- #686 Add wunsen
- #694 Wunsen Mandarin and Japanese update
PyThaiNLP Corpus downloader
- #656 Add support for zip and tar.gz archives when downloading a corpus
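As a rough illustration of what handling both archive types involves, the sketch below unpacks a downloaded archive with the Python standard library. The helper name `unpack_corpus_archive` is hypothetical and this is not PyThaiNLP's own downloader code; it only assumes that the archive format can be inferred from the file extension.

```python
import shutil
from pathlib import Path


def unpack_corpus_archive(archive_path, dest_dir):
    """Unpack a .zip or .tar.gz archive and return the extracted file paths.

    Hypothetical helper for illustration; not part of the PyThaiNLP API.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # shutil.unpack_archive picks the format from the extension
    # (.zip, .tar.gz, .tgz, ...) and raises shutil.ReadError otherwise.
    shutil.unpack_archive(str(archive_path), str(dest))
    # Report paths relative to the destination, with forward slashes.
    return sorted(
        p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file()
    )
```

Only archives from a trusted source should be unpacked this way, since archive members can carry unexpected paths.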
Text normalization
- #673 Add a normalising rule for Lakkhangyao ๅ
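The exact rule added in #673 is not spelled out here. As a minimal sketch of what such a rule could look like, the example below assumes that Lakkhangyao (U+0E45) is only legitimate after ฤ or ฦ and that anywhere else it is a mistyped Sara Aa (U+0E32). The function name `fix_lakkhangyao` is hypothetical and is not part of PyThaiNLP.

```python
import re

LAKKHANGYAO = "\u0e45"  # ๅ
SARA_AA = "\u0e32"      # า
RU = "\u0e24"           # ฤ
LU = "\u0e26"           # ฦ

# Assumption: a Lakkhangyao that does not directly follow ฤ or ฦ is a typo.
_MISUSED_LAKKHANGYAO = re.compile(f"(?<![{RU}{LU}]){LAKKHANGYAO}")


def fix_lakkhangyao(text):
    """Replace misplaced Lakkhangyao with Sara Aa (illustrative rule only)."""
    return _MISUSED_LAKKHANGYAO.sub(SARA_AA, text)
```

Under this assumption, a Lakkhangyao typed after an ordinary consonant is rewritten to Sara Aa, while ฤๅ and ฦๅ are left unchanged.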
Translate
- #674 Add GPU option
Text summarize
- #679 Add the mT5 CPE KMUTT Thai sentence summarization model
Util
- #682 Add live-dead syllable classification
- #684 Add live dead syllable classify
- #690 Add tone detector
Soundex
- #699 Add Thai-English Cross-Language Transliterated Word Retrieval using Soundex Technique
Other
- #689 Map NG tag to PART
- #691 Remove TinyDB as a dependency
- #692 Fix notifications that newer versions of corpora are available
- Add warning about LST20 license