malteos
awesome-document-similarity
A curated list of resources on document similarity measures (papers, tutorials, code, ...)
aspect-document-similarity
Implementation, trained models and result data for the paper "Aspect-based Document Similarity for Research Papers" #COLING2020
legal-document-similarity
Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal Literature Recommendations"
pytorch-bert-document-classification
Enriching BERT with Knowledge Graph Embedding for Document Classification (PyTorch)
scincl
Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper)
semantic-document-relations
Implementation, trained models and result data for the paper "Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles"
llm-datasets
A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling.
clp-transfer
Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning