Draft an API for computing embedding

Open jrieken opened this issue 1 year ago • 3 comments

Being able to compute embeddings is often important when dealing with NLP tasks. There are plenty of models out there, and we should have an API to consume/provide embedding computation logic in VS Code.

jrieken · May 06 '24 09:05

@jrieken any plan to include this embeddings API in the release?

qiaoleiatms · Sep 05 '24 07:09

As I mentioned in #237769, having an API for embeddings would also be helpful for local semantic search and RAG scenarios.

waldekmastykarz · Jan 13 '25 09:01

One scenario that would benefit from this is building an extension that helps developers build on a specific platform, such as Microsoft 365, SAP, or Salesforce, to name a few. Developing on a platform typically requires platform-specific knowledge, and information about that platform changes regularly, so the information included in the language model is typically outdated. With RAG based on the latest documentation, developers benefit from up-to-date information and get contextually relevant answers. If extension developers didn't need to stand up their own embeddings API, it would be easier to build extensions with pre-computed vector DBs containing the latest knowledge, which can then be used fully locally to provide relevant answers.

waldekmastykarz · Jan 15 '25 12:01

I'm developing a chat participant for my VS Code extension and expected the models endpoint to include an embeddings API.

Use case is effectively a RAG pattern:

  • I precompute embedding vectors for all the paragraphs in my documentation for the extension
  • User asks a question about how to use a specific feature
  • Question is converted to an embedding using an embeddings endpoint
  • Use a similarity calculation to fetch the closest matches in the docs
  • Results are sent back to the chat model to answer the question

I would need to dictate the embeddings model and vector length via the API.
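
The retrieval steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: VS Code does not expose the embeddings API discussed in this issue, so the `EmbedFn` callback, the `DocChunk` shape, and the `retrieveTopK` and `buildPrompt` helpers are hypothetical names, and the embedding function is supplied by the caller (it must use the same model and vector length as the precomputed vectors).

```typescript
// Hypothetical: stands in for whatever embeddings endpoint ends up being available.
type EmbedFn = (text: string) => Promise<number[]>;

// One documentation paragraph with its precomputed embedding vector.
interface DocChunk {
  text: string;
  vector: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Vector length mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Embed the question, score every chunk, and return the k closest matches.
async function retrieveTopK(
  question: string,
  chunks: DocChunk[],
  embed: EmbedFn,
  k = 3
): Promise<DocChunk[]> {
  const queryVector = await embed(question);
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Assemble the text that would be sent to the chat model along with the question.
function buildPrompt(question: string, matches: DocChunk[]): string {
  const context = matches.map((m, i) => `[${i + 1}] ${m.text}`).join("\n");
  return `Answer using only the documentation below.\n\n${context}\n\nQuestion: ${question}`;
}
```

A linear scan like this is usually adequate for the documentation of a single extension; a larger corpus would likely call for a proper vector index.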

tonybaloney · Jan 24 '25 01:01

I am also developing a chat participant for my VS Code extension, Peacock. I expected to have either an API that pulls in docs from a source (like GitHub or a URL to crawl) or the ability to use an in-memory vector database and integrate it with the chat participant. Neither seems to work well today.

Use case is similar to what @tonybaloney says above.

Another challenge is that if we use an AOAI or OAI service and need a key, it's not clear how to use that key without including it in the vsix. Since there is no server for this Node.js app (the vsix runs locally on the machine), guidance is needed on where to put ENV variables.
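
One possible direction for the key question, offered only as a sketch: ask the user for their own key on first use and keep it in the extension's secret storage rather than shipping it in the vsix. The `SecretStoreLike` interface below is a structural subset (`get`/`store`) of VS Code's `SecretStorage`; the key name `myExtension.embeddingsApiKey`, the `PromptFn` type, and the `getOrRequestApiKey` helper are hypothetical names introduced for illustration.

```typescript
// Structural subset of vscode.SecretStorage, declared locally so the sketch
// does not depend on the extension host.
interface SecretStoreLike {
  get(key: string): PromiseLike<string | undefined>;
  store(key: string, value: string): PromiseLike<void>;
}

// Hypothetical: any function that asks the user for the key (undefined if cancelled).
type PromptFn = () => PromiseLike<string | undefined>;

// Hypothetical storage key; any extension-unique string would do.
const API_KEY_NAME = "myExtension.embeddingsApiKey";

// Return the stored key, or prompt once and persist what the user entered.
async function getOrRequestApiKey(
  secrets: SecretStoreLike,
  promptUser: PromptFn
): Promise<string | undefined> {
  const existing = await secrets.get(API_KEY_NAME);
  if (existing) {
    return existing;
  }
  const entered = (await promptUser())?.trim();
  if (!entered) {
    return undefined; // user cancelled or entered nothing
  }
  await secrets.store(API_KEY_NAME, entered);
  return entered;
}
```

Inside an extension, `context.secrets` from the `ExtensionContext` could be passed as `secrets`, and the prompt could be something like `() => vscode.window.showInputBox({ password: true, ignoreFocusOut: true })`. This means each user supplies their own key, which may not suit every extension.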

I believe it would be infinitely better to have an API that automatically integrates docs for the extension, as this would be a common request for all extensions. It would also improve the quality and experience of using extensions in VS Code across the board.

Example of proposed API:

myChatParticipant.documentationRepo = 'github url goes here';

johnpapa · Jan 24 '25 14:01

> I am also developing a chat participant for my VS Code extension, Peacock. I expected to have either an API that pulls in docs from a source (like GitHub or a URL to crawl) or the ability to use an in-memory vector database and integrate it with the chat participant. Neither seems to work well today.
>
> Use case is similar to what @tonybaloney says above.
>
> Another challenge is that if we use an AOAI or OAI service and need a key, it's not clear how to use that key without including it in the vsix. Since there is no server for this Node.js app (the vsix runs locally on the machine), guidance is needed on where to put ENV variables.
>
> I believe it would be infinitely better to have an API that automatically integrates docs for the extension, as this would be a common request for all extensions. It would also improve the quality and experience of using extensions in VS Code across the board.
>
> Example of proposed API:
>
> myChatParticipant.documentationRepo = 'github url goes here';

That is exactly what I need and what I would like to implement. In my case, I was thinking about local docs (*.md files), but in reality these docs are available in one of our repositories, so it would be even better if it could access the online version.

In a future version, I would like to retrieve information from API services (like Slack, JIRA, etc.).

Do we know if and when this API is going to be included in the stable version of VS Code?

andreagrandi · Feb 08 '25 09:02