jan icon indicating copy to clipboard operation
jan copied to clipboard

epic: Jan can See MVP

Open hiro-v opened this issue 1 year ago • 4 comments

Objectives

  • Allow users to insert images and generate responses using the LLaVa model.
  • Enhance user experience by providing a seamless interface for image insertion and prompt creation.

Leads

  • Product: @imtuyethan
  • Engineering: @vuonghoainam @louis-jan @urmauur @tikikun

User Stories

In Scope

  1. As a user, I want to easily insert images into Jan's interface, either by dropping files or using buttons:

    • Scenario: I aim to effortlessly insert images.
    • Acceptance Criteria: The interface should support drag-and-drop functionality or provide clear buttons for uploading images.
  2. As a user, I want a user-friendly interface to create text prompts for the LLaVa model linked to the inserted images:

    • Scenario: I seek a seamless way to input prompts for the LLaVa model connected to the inserted images.
    • Acceptance Criteria: The interface should offer a clear space to input text prompts directly related to the inserted images.
  3. As a user, I want to view and interact with LLaVa model responses based on the inserted images and associated prompts:

    • Scenario: After inserting images and prompts, I expect to access and interact with the LLaVa model's generated responses.
    • Acceptance Criteria: The generated responses should be visible and easily accessible within the interface, showcasing accurate output based on the inserted image and associated prompts.

Out-of-Scope

  • As a user, I want to see prompts suggestions based on the capabilities of the LLaVa model.
  • As a user, I want to attach many images at the same time.

Design Wireframes

Key Considerations

Figma link

Image

Engineering & Architecture

In Scope

  • For @vuonghoainam to input.
  • Nitro supports image capability @tikikun

Out-of-Scope

Tasklist

  • [ ] https://github.com/janhq/jan/pull/1023
  • [x] https://github.com/janhq/jan/issues/1049
  • [x] https://github.com/janhq/jan/issues/1088
  • [x] https://github.com/orgs/janhq/projects/5/views/26?pane=issue&itemId=47514666

Resources

https://twitter.com/nathanlands/status/1709539003312259172?s=46&t=osxIAvq8ztXuDbNAm11thA https://twitter.com/LMStudioAI/status/1734640355318944190

Out of scope

Nitro supports speech/hear capability

hiro-v avatar Oct 07 '23 03:10 hiro-v

FYI, archive Hiro's comment here:

Issues

https://github.com/orgs/janhq/projects/5/views/7?filterQuery=milestone%3A%22Jan+can+See%22

Problem

  • I want to see Jan supports plugins for Computer vision models but for augmenting the current capabilities (think with LLM)
  • See https://twitter.com/nathanlands/status/1709539003312259172?s=46&t=osxIAvq8ztXuDbNAm11thA

Success Criteria

  • ADR
  • Some plugins

Additional context None

Tasks

  • [ ] https://github.com/janhq/jan/pull/1023
  • [x] https://github.com/janhq/jan/issues/1049
  • [x] https://github.com/janhq/jan/issues/1088
  • [ ] https://github.com/orgs/janhq/projects/5/views/26?pane=issue&itemId=47514666

imtuyethan avatar Dec 18 '23 13:12 imtuyethan

imtuyethan avatar Dec 19 '23 04:12 imtuyethan

Can this be turned on for nightly/experimental? IF we have it then can we use it?

freelerobot avatar Feb 14 '24 08:02 freelerobot

Goal: turn it on for experimental/nightly

freelerobot avatar Feb 16 '24 03:02 freelerobot

Tested and looking good on Jan 0.4.7-304 ✅

Van-QA avatar Mar 11 '24 03:03 Van-QA