idea: Inline multimedia playback in conversational threads

Open halr9000 opened this issue 2 months ago • 0 comments

Problem Statement

Inline playback would let users review generated audio and video in context, directly in the chat. Currently, users must navigate away from the workspace to view video results, which disrupts the creative process and diminishes the impact of motion media generation. Inline multimedia support would extend the existing inline display of static images via Markdown.

Feature Idea

  • Seamless In-Line Playback: Implement HTML5 audio and video rendering functionality to allow direct playback of media files within the chat interface.
  • Essential Support for Key Models: Directly address the output of powerful content creation models like Kling, Wan, Sora, and Veo.
  • Align with Existing Image Support: Leverage the successful precedent of inline image display using Markdown (`![Alt Text](url)`), extending this functionality to video files for a consistent media experience.
  • Streamlined Iteration: Eliminate the friction of external review, enabling users to instantly check a video, provide feedback, and make changes without leaving the conversation.
  • High-Value Workflow Completion: Complete the last mile of the image-to-video and text-to-video workflow, transforming link outputs into immediate, high-impact media results.
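
To make the idea concrete, here is a minimal sketch of how a link in a chat message could be mapped to an HTML5 `<video>` or `<audio>` element. Everything in it is an assumption for illustration: `classifyMedia`, `renderMedia`, and the extension list are hypothetical names, not part of Jan's codebase, and a real implementation would likely hook into the existing Markdown renderer and sanitizer instead of building HTML strings.

```typescript
// Hypothetical helpers for illustration only; not taken from Jan's codebase.
type MediaKind = "image" | "audio" | "video" | "link" | "text";

// Assumed extension list; a real implementation might inspect MIME types instead.
const EXTENSIONS = new Map<string, MediaKind>([
  ["png", "image"], ["jpg", "image"], ["jpeg", "image"], ["gif", "image"], ["webp", "image"],
  ["mp3", "audio"], ["wav", "audio"], ["ogg", "audio"],
  ["mp4", "video"], ["webm", "video"], ["mov", "video"],
]);

// Decide how a URL should be displayed, based on its path extension.
function classifyMedia(url: string): MediaKind {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return "text"; // not a valid absolute URL
  }
  // Only http(s) sources are embedded or linked; anything else is shown as text.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") return "text";
  const path = parsed.pathname; // excludes the query string, e.g. signed-URL tokens
  const dot = path.lastIndexOf(".");
  if (dot === -1) return "link";
  return EXTENSIONS.get(path.slice(dot + 1).toLowerCase()) ?? "link";
}

// Escape a value for safe use inside a double-quoted HTML attribute or text node.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Produce the HTML for one media URL found in a message.
function renderMedia(url: string): string {
  const src = escapeHtml(url);
  switch (classifyMedia(url)) {
    case "video":
      return `<video controls preload="metadata" src="${src}"></video>`;
    case "audio":
      return `<audio controls preload="metadata" src="${src}"></audio>`;
    case "image":
      return `<img src="${src}" alt="">`;
    case "link":
      return `<a href="${src}">${src}</a>`;
    default:
      return src; // plain, escaped text
  }
}
```

With this sketch, a link returned by a video generation tool would render as a player with native controls, while unrecognized links stay as ordinary links. The `preload="metadata"` attribute is a design choice to avoid downloading every video in a long thread up front.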

Here are screenshots showing how image generation can work well today using a Fal.ai MCP tool. Adding support for video would round this out very well. Note: the assistants often struggle with the fine points of tool usage. Success depends heavily on the model and system prompt.

[Two screenshots: inline image generation results in the chat]

halr9000, Oct 05 '25 18:10