[API Proposal]: Streaming methods for IEmbeddingGenerator

Open azchohfi opened this issue 1 year ago • 0 comments

Background and motivation

The IEmbeddingGenerator interface doesn't support streaming. Streaming makes the most sense with batching (for remote/cloud implementations) or with local embedding models, which run much slower on the CPU, for example.

API Proposal

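using System.Linq;
using System.Runtime.CompilerServices;
using System.Threading;
using Microsoft.Extensions.AI;

namespace System.Collections.Generic;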

public class LocalEmbeddingGenerator : IEmbeddingGenerator<string, Embedding<float>>
{
  public async IAsyncEnumerable<Embedding<float>> GenerateStreamingAsync(
      IEnumerable<string> values,
      EmbeddingGenerationOptions? options = null,
      [EnumeratorCancellation] CancellationToken cancellationToken = default)
  {
      // Embed the input in fixed-size chunks so results are yielded as each chunk completes.
      const int chunkSize = 128;

      var chunks = values.Chunk(chunkSize);
  
      foreach (var chunk in chunks)
      {
          cancellationToken.ThrowIfCancellationRequested();
  
          GeneratedEmbeddings<Embedding<float>> embeddings = await GenerateAsync(chunk, options, cancellationToken).ConfigureAwait(false);
  
          foreach (var embedding in embeddings)
          {
              cancellationToken.ThrowIfCancellationRequested();
  
              yield return embedding;
          }
      }
  }
...
}

API Usage

IEmbeddingGenerator<string, Embedding<float>> localEmbeddings = new LocalEmbeddingGenerator();
await foreach (var embedding in localEmbeddings.GenerateStreamingAsync(largeEnumerable, null, cts.Token))
{
   ...
}

Alternative Designs

Such a method is likely to look much the same across implementations, so an extension method might suffice.
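
A rough sketch of that alternative, assuming the existing GenerateAsync signature from Microsoft.Extensions.AI. The class name EmbeddingGeneratorStreamingExtensions and the chunkSize parameter are hypothetical, added only for illustration, and this is not an existing API:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Threading;
using Microsoft.Extensions.AI;

// Hypothetical helper class; not part of Microsoft.Extensions.AI.
public static class EmbeddingGeneratorStreamingExtensions
{
    public static async IAsyncEnumerable<TEmbedding> GenerateStreamingAsync<TInput, TEmbedding>(
        this IEmbeddingGenerator<TInput, TEmbedding> generator,
        IEnumerable<TInput> values,
        EmbeddingGenerationOptions? options = null,
        int chunkSize = 128, // assumed default, mirroring the proposal above
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
        where TEmbedding : Embedding
    {
        // Enumerable.Chunk lazily splits the input into arrays of at most chunkSize items.
        foreach (TInput[] chunk in values.Chunk(chunkSize))
        {
            cancellationToken.ThrowIfCancellationRequested();

            // One GenerateAsync call per chunk; the generator itself is unchanged.
            GeneratedEmbeddings<TEmbedding> embeddings =
                await generator.GenerateAsync(chunk, options, cancellationToken).ConfigureAwait(false);

            foreach (TEmbedding embedding in embeddings)
            {
                yield return embedding;
            }
        }
    }
}
```

With something like this, any existing generator would get chunked streaming without changes, though an implementation that can stream natively would have no way to override it.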

Risks

It makes implementations slightly more complex, and existing implementations might simply call GenerateAsync once, without leveraging chunking or a similar approach.

azchohfi, Oct 21 '24 19:10