Embeddings abstraction ADR
Motivation and Context
Currently embeddings abstraction is limited. It doesn't allow passing metadata or additional execution settings. For example: VertexAI embeddings model has optional parameters, returns metadata and generates embeddings from multi-modal content.
Description
Related to #5276
@dmytrostruk sounds good, I will update the ADR later on 😄
@Krzysztof318 After further brainstorming about this design, I have more ideas to share. But I will need a little bit more time to collect all the information and discuss it with team before making any decisions. So, let's wait with this ADR for a bit, and I will get back to you as soon as more information is available. Thank you!
@dmytrostruk no problem, currently I am focused on gemini PR, so I'd like to take care for this ADR next week
@dmytrostruk can we finish this adr? What else you thoughts have?
@dmytrostruk can we finish this adr? What else you thoughts have?
Hi @Krzysztof318 ! Based on latest discussions we agreed on the following:
- Keep generic parameter for Embedding type (e.g. int, float or double) for future extensibility. Although, currently it has
unmanagedconstraint, so we should remove it. That will mean to keepEmbeddingContent<TEmbedding>generic as well. - Data input should be
IEnumerable<TContent>rather thanIList<TContent>to avoid forcing users to performToListin case they have enumerable. - Generic type for content should have a constraint to be derived from
KernelContentand it should be contravariant (e.g.IEmbeddingGenerationService<in TContent, TEmbedding>. It will allow users to castIEmbeddingGenerationService<KernelContent, float>to more specificIEmbeddingGenerationService<TextContent, float>.
So, the final abstraction should look like this:
public interface IEmbeddingGenerationService<in TContent, TEmbedding> : IAIService
where TContent : KernelContent
{
Task<IReadOnlyList<EmbeddingContent<TEmbedding>>> GenerateEmbeddingsAsync(
IEnumerable<TContent> data,
PromptExecutionSettings? executionSettings = null,
Kernel? kernel = null,
CancellationToken cancellationToken = default);
}
public class EmbeddingContent<TEmbedding> : KernelContent
{
public EmbeddingContent(
ReadOnlyMemory<TEmbedding> data,
string? modelId = null,
object? innerContent = null,
IReadOnlyDictionary<string, object?>? metadata = null)
: base(innerContent, modelId, metadata)
{
this.Data = data;
}
public ReadOnlyMemory<TEmbedding> Data { get; set; }
}
Please let me know if you have any questions. Thank you!
@dmytrostruk It's ok, I have updated ADR but added only sample under "decision outcome" section. Look please and update ADR ; pros, cons, information because you are better oriented what should be there written.
Hi @dmytrostruk when will you have time to finish that?
Hi @dmytrostruk when will you have time to finish that?
Hi @Krzysztof318 , this issue should be prioritized based on other memory-related work which is currently in progress. cc: @markwallace-microsoft @matthewbolanos
@Krzysztof318 The plan is to include this as part of the feature graduation work which is being tracked here #8094