PyThaiNLP 3.1 change log

Open wannaphong opened this issue 3 years ago • 0 comments

Schedule

  • First development release: 31 August 2022
  • Beta release: 20 September 2022
  • Production release: 24 September 2022

See 3.1 Milestone.

What is new?

Deprecation and other API changes

#687 Remove deprecated functions

  • pythainlp.word_vector: doesnt_match, get_model, most_similar_cosmul, sentence_vectorizer, similarity. Use the WordVector class instead.
  • pythainlp.util.delete_tone. Use pythainlp.util.remove_tonemark instead.
  • pythainlp.util.time_time. Use pythainlp.util.time_to_thaiword instead.
  • pythainlp.tokenize.syllable_tokenize. Use pythainlp.tokenize.subword_tokenize instead.
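
For code that still imports the removed names, a small checker can flag them before upgrading. The sketch below is not part of PyThaiNLP: `REMOVED` and `find_removed_imports` are hypothetical names, the mapping is copied from the list above, and only `from ... import ...` statements are detected.

```python
import ast
from typing import List, Tuple

# Removed name -> suggested replacement, copied from the list above.
# (Hypothetical helper table; not shipped with PyThaiNLP.)
_WV = "pythainlp.word_vector"
REMOVED = {
    (_WV, "doesnt_match"): _WV + ".WordVector",
    (_WV, "get_model"): _WV + ".WordVector",
    (_WV, "most_similar_cosmul"): _WV + ".WordVector",
    (_WV, "sentence_vectorizer"): _WV + ".WordVector",
    (_WV, "similarity"): _WV + ".WordVector",
    ("pythainlp.util", "delete_tone"): "pythainlp.util.remove_tonemark",
    ("pythainlp.util", "time_time"): "pythainlp.util.time_to_thaiword",
    ("pythainlp.tokenize", "syllable_tokenize"): "pythainlp.tokenize.subword_tokenize",
}


def find_removed_imports(source: str) -> List[Tuple[int, str, str]]:
    """Return (line number, removed name, replacement) for each removed import."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only `from module import name` statements are inspected.
        if isinstance(node, ast.ImportFrom) and node.module:
            for alias in node.names:
                key = (node.module, alias.name)
                if key in REMOVED:
                    old = node.module + "." + alias.name
                    hits.append((node.lineno, old, REMOVED[key]))
    return sorted(hits)


if __name__ == "__main__":
    sample = "from pythainlp.util import delete_tone\n"
    for lineno, old, new in find_removed_imports(sample):
        print(f"line {lineno}: {old} was removed; use {new}")
```

Attribute-style access such as `pythainlp.util.delete_tone(...)` after a plain `import pythainlp` is not caught by this sketch and would need a separate check.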

Dependency Parsing

  • PyThaiNLP now supports dependency parsing 🎉 Add pythainlp.parse.dependency_parsing (https://github.com/PyThaiNLP/pythainlp/pull/706)

Named Entity Tagging

  • #665 Add Thai-NNER pythainlp.tag.NNER
  • #658 Add LST20NER ONNX model: the LST20 NER model fine-tuned from WangchanBERTa, converted to ONNX.

Transliteration

  • #659 Add ISO 11940 transliteration
  • #660 Add Thai W2P v0.2
  • #686 Add wunsen
  • #694 Wunsen Mandarin and Japanese update

PyThaiNLP Corpus downloader

  • #656 Add support for zip/tar.gz archives when downloading a corpus
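
To illustrate what handling both archive types involves, here is a minimal standard-library sketch. It is not the code from #656: `extract_corpus_archive` is a hypothetical helper, and choosing the format by file extension is an assumption.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_corpus_archive(archive_path, dest_dir):
    """Extract a .zip or .tar.gz archive into dest_dir and return dest_dir.

    Hypothetical helper for illustration; the format is chosen by file
    extension, which is an assumption rather than PyThaiNLP's behaviour.
    """
    archive = Path(archive_path)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    name = archive.name.lower()
    if name.endswith(".zip"):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
    elif name.endswith(".tar.gz"):
        with tarfile.open(archive, "r:gz") as tf:
            # Member paths are not validated here; do that for untrusted files.
            tf.extractall(dest)
    else:
        raise ValueError(f"unsupported archive type: {archive.name}")
    return dest
```

A call such as `extract_corpus_archive("corpus.tar.gz", "data/")` would unpack the files under `data/`; any other extension raises `ValueError`.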

Text normalization

  • #673 Add a normalising rule for Lakkhangyao ๅ
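
As an illustration of what such a rule could look like, the sketch below rewrites Lakkhangyao (ๅ, U+0E45) as Sara Aa (า, U+0E32) unless it follows ฤ or ฦ, where Lakkhangyao is the expected spelling. This is an assumption about the rule, not the code from #673, and `normalize_lakkhangyao` is a hypothetical name.

```python
import re

LAKKHANGYAO = "\u0e45"  # ๅ
SARA_AA = "\u0e32"      # า
RU_LU = "\u0e24\u0e26"  # ฤ and ฦ, after which ๅ is legitimate

# A Lakkhangyao preceded by any character other than ฤ or ฦ.
_MISUSED = re.compile("([^" + RU_LU + "])" + LAKKHANGYAO)


def normalize_lakkhangyao(text):
    """Replace a misused Lakkhangyao with Sara Aa (assumed rule, see above)."""
    return _MISUSED.sub(lambda m: m.group(1) + SARA_AA, text)
```

Under this assumed rule, a consonant followed by ๅ is rewritten with า, while ฤๅ and ฦๅ are left untouched.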

Translate

  • #674 Add GPU option

Text summarize

  • #679 Add mt5-cpe-kmutt-thai-sentence-sum, an mT5-based Thai sentence summarization model

Util

  • #682 Add live-dead syllable classification
  • #684 Add live dead syllable classify
  • #690 Add tone detector

Soundex

  • #699 Add Thai-English Cross-Language Transliterated Word Retrieval using Soundex Technique

Other

  • #689 Map NG tag to PART
  • #691 Remove TinyDB as a dependency
  • #692 Fix notifications that newer versions of corpora are available
  • Add warning about LST20 license

wannaphong, Jan 29 '22 18:01