rig refactor(fastembed): update version, remove unnecessary method

Fixes #390. The source of the issue was a breaking change.

EDIT: Note that this PR also removes the method for manually fetching ndims.

Apr 11 '25 10:04 joshua-mo-143

Potential clarifying Qs before we review this:

Do we want to create a version of the EmbeddingsBuilder for images? It currently supports Strings at the moment and that's it.
Do we want to extend ImageEmbeddingModel to additionally support embedding images straight from files/URLs? This seems like something that would be a relatively easy win and improve DX significantly.

Apr 12 '25 21:04 joshua-mo-143

De-scoping this PR for now, I think there is enough useful work here that we can meaningfully merge it and the image embeddings builder is likely to be a larger piece of work that will significantly delay this PR being able to get merged. Not particularly helpful as this PR was originally just meant to be a bug fix.

Apr 18 '25 11:04 joshua-mo-143

Do we want to create a version of the EmbeddingsBuilder for images? It currently supports Strings at the moment and that's it.

We could create a different version with a similar API, or use a typestate pattern for it. E.g.:

trait EmbeddingModel {
    type Document;
    
    // other stuff
}

// text embedding models would have `Document = String` while image embedding models 
// would have `Document = Vec<u8>` or something like that

struct EmbeddingBuilder<M> {
    pub fn document(mut self, document: M::Document) -> Self {...}
}

Not sure how feasible this is with the current EmbeddingBuilder<T: Embed> trait bound but something to think about.

Do we want to extend ImageEmbeddingModel to additionally support embedding images straight from files/URLs? This seems like something that would be a relatively easy win and improve DX significantly.

Would that just be a wrapper that loads the file in-memory? Or does the model provider actually accept a URL?

Apr 18 '25 15:04 cvauclair

Do we want to extend ImageEmbeddingModel to additionally support embedding images straight from files/URLs? This seems like something that would be a relatively easy win and improve DX significantly.

Would that just be a wrapper that loads the file in-memory? Or does the model provider actually accept a URL?

That would depend. The tl;dr is that the rig-fastembed library currently accepts straight up Vec<u8> (or &[u8] if we're being pendantic) but I imagine when it comes to managed models, we will need to do some further exploration. We currently don't actually support any other providers that support image embeddings directly.

Apr 19 '25 20:04 joshua-mo-143

Superceded by #480 (although we can re-use this PR for implementing image embeddings)

Jun 19 '25 22:06 joshua-mo-143