
ImageGenerator in Microsoft.DotNet.Interactive.AI encounters Semantic Kernel failures.

Open IntegerMan opened this issue 1 year ago • 2 comments

Describe the bug

Using the image generation notebook hosted at https://github.com/dotnet/interactive/blob/main/samples/notebooks/ai/image%20generation.ipynb, I provided my own API key and an endpoint pointing to an Azure OpenAI resource in East US. I then updated the deployment name to match my deployed DALL-E-2 or DALL-E-3 instance and ran the notebook with no other changes.

I consistently received the following stack trace when trying to generate images:

Error: Microsoft.SemanticKernel.Diagnostics.SKException: Reached maximum retry attempts
at Microsoft.SemanticKernel.Connectors.AI.OpenAI.ImageGeneration.AzureOpenAIImageGeneration.GetImageGenerationResultAsync(String operationId, CancellationToken cancellationToken)
at Microsoft.SemanticKernel.Connectors.AI.OpenAI.ImageGeneration.AzureOpenAIImageGeneration.GenerateImageAsync(String description, Int32 width, Int32 height, CancellationToken cancellationToken)
at Microsoft.DotNet.Interactive.AI.ImageGenerationKernel.Microsoft.DotNet.Interactive.IKernelCommandHandler<Microsoft.DotNet.Interactive.Commands.SubmitCode>.HandleAsync(SubmitCode submitCode, KernelInvocationContext context)
at Microsoft.DotNet.Interactive.Kernel.HandleAsync(KernelCommand command, KernelInvocationContext context) in D:\a\_work\1\s\src\Microsoft.DotNet.Interactive\Kernel.cs:line 330
at Microsoft.DotNet.Interactive.KernelCommandPipeline.<BuildPipeline>b__6_0(KernelCommand command, KernelInvocationContext context, KernelPipelineContinuation _) in D:\a\_work\1\s\src\Microsoft.DotNet.Interactive\KernelCommandPipeline.cs:line 60
at Microsoft.DotNet.Interactive.KernelCommandPipeline.SendAsync(KernelCommand command, KernelInvocationContext context) in D:\a\_work\1\s\src\Microsoft.DotNet.Interactive\KernelCommandPipeline.cs:line 41

I suspect something is out of date in the AI package, Semantic Kernel, and/or SK's image generation connectors.

Updating Microsoft.DotNet.Interactive.AI with the *-* version wildcard causes the installed version to jump from 1.0.0-beta.23604.2 to 1.0.0-beta.24074.1, but image generation still fails, now with a different stack trace:

Error: Microsoft.SemanticKernel.KernelException: Service of type 'Microsoft.SemanticKernel.TextToImage.ITextToImageService' and key 'image_image_generator' not registered.
at Microsoft.SemanticKernel.Kernel.GetRequiredService[T](Object serviceKey)
at Microsoft.DotNet.Interactive.AI.ImageGenerationKernel.Microsoft.DotNet.Interactive.IKernelCommandHandler<Microsoft.DotNet.Interactive.Commands.SubmitCode>.HandleAsync(SubmitCode submitCode, KernelInvocationContext context)
at Microsoft.DotNet.Interactive.Kernel.HandleAsync(KernelCommand command, KernelInvocationContext context) in D:\a\_work\1\s\src\Microsoft.DotNet.Interactive\Kernel.cs:line 330
at Microsoft.DotNet.Interactive.KernelCommandPipeline.<BuildPipeline>b__6_0(KernelCommand command, KernelInvocationContext context, KernelPipelineContinuation _) in D:\a\_work\1\s\src\Microsoft.DotNet.Interactive\KernelCommandPipeline.cs:line 60
at Microsoft.DotNet.Interactive.KernelCommandPipeline.SendAsync(KernelCommand command, KernelInvocationContext context) in D:\a\_work\1\s\src\Microsoft.DotNet.Interactive\KernelCommandPipeline.cs:line 41

If it impacts the urgency of this, I was hoping to include this feature in my upcoming book and am prepping to write the chapter on AI in Polyglot Notebooks.

Please complete the following:

Which version of .NET Interactive are you using? 1.0.522904+cdfa48b2ea1a27dfe0f545c42a34fd3ec7119074

  • OS
    • [ ] Windows 11
    • [ ] Windows 10
    • [ ] macOS
    • [x] Linux - Pop!_OS (Ubuntu 22.06)
    • [ ] iOS
    • [ ] Android
  • Browser
    • [ ] Chrome
    • [ ] Edge
    • [ ] Firefox
    • [ ] Safari
  • Frontend
    • [ ] Jupyter Notebook
    • [ ] Jupyter Lab
    • [ ] nteract
    • [x] Visual Studio Code
    • [ ] Visual Studio Code Insiders
    • [ ] Visual Studio
    • [ ] Other (please specify)

Screenshots

If applicable, add screenshots to help explain your problem.

IntegerMan · Jul 02 '24 03:07

I have now verified this occurs on Windows environments as well as Linux.

IntegerMan · Jul 08 '24 03:07

While this issue is open, I recommend working with DALL-E directly through the Azure.AI.OpenAI package for the time being. Assuming you have an AzureOpenAIClient named azureClient, use the following code:

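// Get an image client for your DALL-E deployment and generate a single image
var imageClient = azureClient.GetImageClient("your-dall-e-deployment-name");
var imgResult = imageClient.GenerateImage("A picture of a bug in the DALL-E integration");

// Set up a .NET Interactive formatter for OpenAI.Images.GeneratedImage to render the image as HTML
Microsoft.DotNet.Interactive.Formatting.Formatter.Register<OpenAI.Images.GeneratedImage>((image, writer) =>
{
    writer.Write($"<img src=\"{image.ImageUri}\" />");
    // HTML-encode the revised prompt so characters such as < or & cannot break the markup
    writer.Write($"<p>{System.Net.WebUtility.HtmlEncode(image.RevisedPrompt)}</p>");
}, "text/html");

// Leave the generated image as the cell's final expression so the formatter above renders it
imgResult.Value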
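
The workaround above assumes an AzureOpenAIClient named azureClient already exists. For readers who do not have one yet, a minimal sketch of creating it in a notebook cell might look like the following. It assumes a 2.x release of the Azure.AI.OpenAI package, where the client accepts a System.ClientModel.ApiKeyCredential (older previews used different credential types), and the environment variable names AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY are placeholders chosen for illustration, not names from the original notebook.

```csharp
#r "nuget: Azure.AI.OpenAI"

using System;
using System.ClientModel;
using Azure.AI.OpenAI;

// Assumption: the endpoint and key are stored in these (hypothetical) environment variables.
string endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")
    ?? throw new InvalidOperationException("Set AZURE_OPENAI_ENDPOINT before running this cell.");
string apiKey = Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")
    ?? throw new InvalidOperationException("Set AZURE_OPENAI_API_KEY before running this cell.");

// Create the client used by the image generation code above.
// No request is sent until a method such as GenerateImage is called.
var azureClient = new AzureOpenAIClient(new Uri(endpoint), new ApiKeyCredential(apiKey));
```

Run this cell before the image generation code so that azureClient is in scope. Reading the key from the environment rather than pasting it into the notebook keeps it out of any saved or shared .ipynb file.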

IntegerMan · Jul 14 '24 18:07