language-model topic
nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
haystack
🔍 AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your d...
zeroth
Kaldi-based Korean ASR (Korean speech recognition) open-source project
bert_language_understanding
Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN
bluebert
BlueBERT, pre-trained on PubMed abstracts and clinical notes (MIMIC-III).
AzureML-BERT
End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service
RobBERT
A Dutch RoBERTa-based language model
tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
keras-bert
Implementation of BERT that can load official pre-trained models for feature extraction and prediction
spacy-transformers
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy