Consider adding support for calling a local (or remote) LLM API to manipulate article contents

Open Jaifroid opened this issue 3 months ago • 4 comments

This is highly speculative in terms of usefulness, and the UI would need to be considered carefully. The use case would be summarizing articles retrieved from the ZIM. Over time, it might be possible to allow a local LLM to use the app as a research tool, thereby providing a natural-language interface to informational ZIM contents.

It would be relatively easy to make the calls in JS. We would just need to use the Hugging Face Agents library. Something like (for the Anthropic API):

npm install @huggingface/agents

import { AnthropicAgent } from "@huggingface/agents";
const ANTHROPIC_API_KEY = "YOUR_API_KEY";
const agent = new AnthropicAgent(ANTHROPIC_API_KEY);
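// textContent on the <iframe> element itself is empty, so read the framed document's body (assumes a same-origin iframe)
const prompt = "Summarize the following article in no more than 500 words: \n\n" + articleIframe.contentDocument.body.textContent;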
const generatedCode = await agent.generateCode(prompt);
const evaluationResult = await agent.evaluateCode(generatedCode);
const messages = evaluationResult.messages;

(N.B. Untested)

To use this offline, the user would need to run a local LLM using KoboldCpp or possibly Mozilla's llamafile, and set up an API key, which they would need to provide to the Kiwix app. Ergo, this would only be a solution for enthusiasts and tinkerers.
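As a rough illustration of the local-server route, here is a minimal sketch that assumes the local LLM exposes an OpenAI-compatible chat-completions endpoint (llamafile serves one on port 8080 by default). The function names `buildSummaryRequest` and `summarizeArticle`, the default base URL, and the `local-model` name are assumptions made for this example, not part of Kiwix or of any library.

```javascript
// Sketch only: assumes an OpenAI-compatible endpoint on a local server.
// The base URL, model name and function names are illustrative assumptions.
const DEFAULT_BASE_URL = 'http://localhost:8080/v1';

// Build the URL and fetch() options for a summarization request.
function buildSummaryRequest(articleText, options = {}) {
  const {
    baseUrl = DEFAULT_BASE_URL,
    apiKey = '',
    model = 'local-model',
    maxWords = 500
  } = options;
  const headers = { 'Content-Type': 'application/json' };
  // Send the key only if the user supplied one.
  if (apiKey) headers.Authorization = 'Bearer ' + apiKey;
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: baseUrl.replace(/\/+$/, '') + '/chat/completions',
    init: {
      method: 'POST',
      headers,
      body: JSON.stringify({
        model,
        messages: [
          { role: 'system', content: 'You summarize encyclopedia articles accurately.' },
          {
            role: 'user',
            content: `Summarize the following article in no more than ${maxWords} words:\n\n${articleText}`
          }
        ]
      })
    }
  };
}

// Send the request and return the summary text from the first choice.
async function summarizeArticle(articleText, options = {}) {
  const { url, init } = buildSummaryRequest(articleText, options);
  const response = await fetch(url, init);
  if (!response.ok) throw new Error('LLM server returned HTTP ' + response.status);
  const data = await response.json();
  return data.choices[0].message.content;
}
```

In the app, the article text would come from the article iframe's document, and the base URL and key would come from a user setting. Keeping the request builder separate from the network call makes it easy to point the same code at a remote service instead.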

To provide the LLM in-app would require running a WASM inferencer such as https://github.com/mlc-ai/web-llm. But supporting a model with a large enough context to ingest a Wikipedia article pulled from the ZIM would likely need a PC with a graphics card (and would only work on Chrome) or an Apple M1 Pro.
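One way to work around a limited context window would be to split a long article into pieces that each fit, summarize the pieces, then summarize the summaries. The sketch below covers only the splitting step. It assumes a crude estimate of four characters per token, which is a heuristic rather than a real tokenizer, and the names `estimateTokens` and `chunkForContext` are made up for this example.

```javascript
// Rough heuristic (an assumption, not a tokenizer): about 4 characters per token.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Split text into chunks that each fit within maxTokens,
// breaking on paragraph boundaries where possible.
function chunkForContext(text, maxTokens) {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  const chunks = [];
  let current = '';
  for (const paragraph of text.split(/\n{2,}/)) {
    let rest = paragraph.trim();
    if (!rest) continue;
    // Hard-split any single paragraph that is too long on its own.
    while (rest.length > maxChars) {
      if (current) {
        chunks.push(current);
        current = '';
      }
      chunks.push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    const candidate = current ? current + '\n\n' + rest : rest;
    if (candidate.length > maxChars) {
      // Adding this paragraph would overflow, so start a new chunk.
      chunks.push(current);
      current = rest;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

The budget passed as `maxTokens` should leave room for the prompt wrapper and the model's reply, so it would normally be well below the model's advertised context size.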

Would this be useful, or would it just be bloat?

Jaifroid avatar Apr 02 '24 15:04 Jaifroid

Someone also posted this: https://gist.github.com/hyrumsdolan/2aa3338f3005e9b468ff350c8f5929d9 (this one is specifically for using Claude AI in JS).

Jaifroid avatar Apr 10 '24 13:04 Jaifroid

I don't know if you've seen the recent release of the new Llama3 models, but they're much better than the previous models, and beat models many times their size; the 8B one in particular performs very efficiently.

If we made some way to spawn the model remotely and connect to it (maybe via some other external application that lets you do that; I know there are some interesting projects), then we could let users use it to, for example, summarize web pages. This would, as you said, require a pretty powerful machine though, so it would be more for users who have the luxury.

We could also perhaps allow the use of API keys for other types of models that offer such a thing; ChatGPT comes to mind.

As for the UI, it shouldn't be too difficult to make something out of the way that doesn't interfere unless interacted with, like a floating chat button or a button in the navbar.

D3V-D avatar May 04 '24 18:05 D3V-D

Yes, but at this stage I think any work on this wouldn't be for merging; it would merely be a proof of concept, because there has to be agreement within the org as to which direction they want to go in with AI integration (if at all). Personally, I think that for JS apps, the best integration would be with https://webllm.mlc.ai/, which is a WASM inference engine that works very well with a Llama 3 8B Instruct Q4 model that, as you say, is really impressive for its size (about 4GB). In the context of full English Wikipedia, 4GB isn't too bad to gain natural-language search capability.

I envisage one use being that the AI could be instructed to come up with search terms for a vague user query that would link to articles in the ZIM. Here is an experiment I did a few days ago (reverse search engine - screenshot below). The idea is that the terms in square brackets would be links to the relevant Wikipedia article for more details. The problem is ensuring that the search terms it comes up with are actually in the ZIM!

image
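To make the square-bracket idea concrete, here is a minimal sketch of pulling the bracketed terms out of a model response so that each can later be turned into a link. It assumes the model has been prompted to wrap candidate article titles in `[...]`. The function name `extractBracketedTerms` is invented for this example.

```javascript
// Collect the unique terms the model wrapped in [square brackets],
// in order of first appearance. Duplicates are detected case-insensitively.
function extractBracketedTerms(responseText) {
  const terms = [];
  const seen = new Set();
  for (const match of responseText.matchAll(/\[([^\[\]]+)\]/g)) {
    const term = match[1].trim();
    const key = term.toLowerCase();
    // Skip empty brackets and repeats.
    if (term && !seen.has(key)) {
      seen.add(key);
      terms.push(term);
    }
  }
  return terms;
}
```

Each returned term would then be a candidate title to look up in the ZIM before being shown as a link.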

Jaifroid avatar May 05 '24 11:05 Jaifroid

Yeah, makes sense.

Also, that's a great demo; I think search could be a useful feature, but hallucination would be an issue - we would probably need to run its response through some other code that filters out non-existent pages.
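The filtering step could look something like the sketch below. It assumes the model marks candidate titles with square brackets, and it takes a caller-supplied `titleExists` function. That function is hypothetical (for instance, a wrapper around a title lookup in the ZIM) and is not an existing kiwix-js API, and the other names are likewise invented for this example.

```javascript
const BRACKETED = /\[([^\[\]]+)\]/g;

// Pure step: keep brackets only around terms present in validTitles (a Set);
// unverified terms are left in the text as plain words.
function keepVerifiedLinks(responseText, validTitles) {
  return responseText.replace(BRACKETED, (whole, raw) => {
    const term = raw.trim();
    return validTitles.has(term) ? `[${term}]` : term;
  });
}

// Async step: check each distinct term once with the supplied lookup,
// then strip the brackets from anything that was not found.
// titleExists is a hypothetical function: (title) => Promise<boolean>.
async function filterHallucinatedTitles(responseText, titleExists) {
  const terms = [...new Set(Array.from(responseText.matchAll(BRACKETED), m => m[1].trim()))];
  const results = await Promise.all(terms.map(term => titleExists(term)));
  const valid = new Set(terms.filter((term, i) => results[i]));
  return keepVerifiedLinks(responseText, valid);
}
```

This only does exact matching on whatever the lookup accepts. A real implementation might want to fall back to a prefix or fuzzy title search before discarding a term.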

D3V-D avatar May 05 '24 11:05 D3V-D