Sentence Transformers v5.0 - Sparse Encoder Models

Open tomaarsen opened this issue 5 months ago • 0 comments

Hello!

Pull Request overview

  • Add the Sentence Transformers v5.0 blogpost introducing Sparse Encoder Models

Details

This blogpost is the latest "blogpostification" of our "Training Overview" documentation, much like https://huggingface.co/blog/train-sentence-transformers and https://huggingface.co/blog/train-reranker, which I wrote for v3 and v4, respectively. These blogposts are meant to be solid for SEO: each of the prior ones shows up when searching for how to train embedding models or rerankers.

Because of this approach, the text has already been reviewed a few times, so a very thorough review may not be necessary.

Preparing the Article

You're not quite done yet, though. Please make sure to follow this process (as documented here):

  • [x] Add an entry to _blog.yml.
  • [x] Add a thumbnail. ~~There are no requirements here, but there is a template if it's helpful.~~ I'm reusing the older ones, as I've done for the v3.0 and v4.0 blogposts.
  • [x] Check you use a short title and blog path. Short enough, I hope.
  • Upload any additional assets (such as images) to the Documentation Images repo. This is to reduce bloat in the GitHub base repo when cloning and pulling. Try to have small images to avoid a slow or expensive user experience.
  • [x] Add metadata (such as authors) to your md file. You can also specify guest or org for the authors.
  • [x] Ensure the publication date is correct.
  • [x] Preview the content. A quick way is to paste the markdown content in https://huggingface.co/new-blog. Do not click publish; this is just a way to do an early check.

Here is an example of a complete PR: https://github.com/huggingface/blog/pull/2382

Getting a Review

cc @pcuenca As mentioned above, a thorough review may not be necessary as the content is very similar to e.g. https://huggingface.co/blog/train-reranker. A double-check of the _blog.yml and metadata above the blog itself is definitely appreciated, though!

cc @arthurbr11 my co-author.

  • Tom Aarsen

tomaarsen, Jun 27 '25 11:06