LibreChat
Enhancement: Improve Browser Plugin
Contact Details
What features would you like to see added?
I would like to suggest the addition of a browser plugin for LibreChat that is capable of crawling websites, even when their content is loaded using JavaScript. Similar to the WebPilot plugin in ChatGPT, this plugin would utilize Puppeteer, a library that functions like a browser.
More details
I believe that such a plugin and the ability to upload documents are essential for the project in order to increase its reach. These features would greatly enhance the capabilities of LibreChat and make it a better ChatGPT. The reaction of users on social media clearly demonstrates the importance of this browser plugin, as they actually express their disappointment when it is disabled.
Which components are impacted by your request?
General
Pictures
No response
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
I understand. My initial thought is that any JavaScript-capable browsing solution will have to work much like it does in ChatGPT: it will have to be a dedicated mode, because the LLM needs a wide variety of instructions on how to work with pages. Essentially, a mode that disables all other plugins, since browsing needs its own special suite of plugins.
Uploading documents is in the works here. There are a lot of exciting features to take on, and I've been paving the way with under-the-hood changes to optimize for this.
Thank you for your work, I'm really looking forward to the first results with the documents.
It can work differently from OpenAI's browsing. Unfortunately, LangChain's standard browser tool is very simple and not much fun to use, because many requests don't work.
Can the browser improvement item be put on the to-do list? It doesn't have to be implemented immediately, but it should be kept in mind.
Instead of using the regular WebBrowser tool in LangChain, you can also create your own tool using DynamicTool.
Instead of having the code like this:
    ...
    import { WebBrowser } from "langchain/tools/webbrowser";
    ...
    const tools = [
      new WebBrowser({ model, embeddings }),
    ];
    ...
You can solve the whole thing like this:
    ...
    import puppeteer from "puppeteer";
    import { DynamicTool } from "langchain/tools";
    ...
    const tools = [
      new DynamicTool({
        name: "FETCH_WEBPAGE",
        description:
          "call this to fetch a webpage. input should be the full URL of the webpage with https:// .",
        func: async (url) => {
          const browser = await puppeteer.launch();
          try {
            const page = await browser.newPage();
            // wait until the network is mostly idle so JS-rendered content has loaded
            await page.goto(url, { waitUntil: "networkidle2" });
            // look for an EU cookie banner and remove the first match
            const cookieBannerElements = await page.$x(
              '//*[contains(@class, "cookie") or contains(@class, "banner") or contains(@id, "cookie") or contains(@id, "banner")]'
            );
            if (cookieBannerElements.length > 0) {
              await cookieBannerElements[0].evaluate((element) => element.remove());
            }
            // return only the visible text, without any HTML
            return await page.evaluate(() => document.body.innerText);
          } finally {
            // close the browser even if navigation or extraction throws
            await browser.close();
          }
        },
      }),
    ];
    ...
You can still include the OpenAI embedding part; I've removed the entire HTML stuff just for simplicity.
Here's an example: example.zip
npm init
npm run test
Source: https://js.langchain.com/docs/modules/agents/tools/how_to/dynamic
Have you tested it? I think there may already be several implementations of this; I'll have to look around.
Also, if we look at the official ChatGPT browsing instructions, it seems like it's just ChatGPT primed with a few different functions it can use, so the best approach would probably be a suite of browsing tools.
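To make the "suite of browsing tools" idea a bit more concrete, here is a rough sketch. Everything in it is an assumption for illustration: the tool names (`OPEN_URL`, `SCROLL_DOWN`, `FIND_IN_PAGE`), the `createBrowsingTools` helper, and the `loadText` callback (a stand-in for a Puppeteer-backed page loader) are hypothetical and are not taken from ChatGPT's browsing instructions or from LibreChat. Each entry only reuses the `{ name, description, func }` shape from the DynamicTool snippet above.

```javascript
// Hypothetical sketch: `loadText` stands in for a Puppeteer-backed loader and
// must return a Promise that resolves to the page's visible text.
function createBrowsingTools(loadText, chunkSize = 2000) {
  // Shared state so the tools operate on the same open page.
  const state = { text: "", offset: 0 };

  // Return the chunk at the current offset, or a marker when nothing is left.
  const currentChunk = () =>
    state.text.slice(state.offset, state.offset + chunkSize) || "(end of page)";

  return [
    {
      name: "OPEN_URL",
      description: "Open a page. Input should be the full URL including https://.",
      func: async (url) => {
        state.text = await loadText(url);
        state.offset = 0;
        return currentChunk();
      },
    },
    {
      name: "SCROLL_DOWN",
      description: "Read the next part of the page that is currently open. Input is ignored.",
      func: async () => {
        state.offset += chunkSize;
        return currentChunk();
      },
    },
    {
      name: "FIND_IN_PAGE",
      description: "Check whether the open page contains the given text.",
      func: async (needle) =>
        state.text.toLowerCase().includes(needle.toLowerCase()) ? "found" : "not found",
    },
  ];
}
```

Each object could be passed to `new DynamicTool(...)` as in the earlier snippet. Keeping the page state in one closure is what lets the model open a page once and then read it piece by piece.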
I have already tested it successfully as a standalone script and have also made attempts to build a really good browser. That is, I have built in functions such as:
- automatic detection of cookie banners
- Text extraction without HTML
- Text extraction only if the div has more than 200 characters
- scroll to bottom
- etc
I can do that in here tomorrow. Unfortunately, the browser limitation is annoying; I noticed it again today when I wanted to summarize a long Reddit post, but LibreChat was not able to crawl it.
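Two of the functions listed above (cookie banner detection and extracting text only from blocks longer than 200 characters) can be sketched as plain helpers. This is only a guess at how they might look, not the actual implementation: the names `selectContentBlocks` and `looksLikeCookieBanner` are hypothetical, and unlike the XPath in the earlier snippet, the banner check here is case-insensitive.

```javascript
// Hypothetical helper: keep only text blocks long enough to be real content.
// `minLength` defaults to the 200-character threshold from the list above.
function selectContentBlocks(blocks, minLength = 200) {
  return blocks
    .map((text) => text.replace(/\s+/g, " ").trim()) // collapse whitespace
    .filter((text) => text.length > minLength); // strictly more than minLength
}

// Hypothetical helper: heuristic check on an element's id and class,
// matching "cookie" or "banner" regardless of case.
function looksLikeCookieBanner({ id = "", className = "" }) {
  const haystack = `${id} ${className}`.toLowerCase();
  return haystack.includes("cookie") || haystack.includes("banner");
}
```

Inside Puppeteer, the input for `selectContentBlocks` could come from something like collecting `innerText` for each `div` in `page.evaluate`. Keeping the filtering logic outside the browser makes it easy to test without launching one.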
Do you mean the current browser/native LangChain solution? That would make sense, since Reddit relies heavily on JS and fetching the raw HTML won't capture its content.
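A tiny illustration of why raw HTML falls short on JS-heavy pages, under simplifying assumptions: `visibleTextFromStaticHtml` is a hypothetical helper that uses crude regexes instead of a real HTML parser, and the `shell` markup is a made-up example, not Reddit's actual HTML.

```javascript
// Hypothetical helper: approximate the text a non-JS scraper would see.
// Regex-based stripping is a simplification for this illustration only.
function visibleTextFromStaticHtml(html) {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, "") // scripts are never executed
    .replace(/<style[\s\S]*?<\/style>/gi, "")
    .replace(/<[^>]+>/g, " ") // drop the remaining tags
    .replace(/\s+/g, " ")
    .trim();
}

// Made-up page shell: the content only exists after the script runs.
const shell = `<html><body><div id="root"></div>
<script>document.getElementById("root").textContent = "Long post body";</script>
</body></html>`;

console.log(JSON.stringify(visibleTextFromStaticHtml(shell))); // prints ""
```

The post text only appears once a real browser runs the script, which is the gap a Puppeteer-based tool would fill.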
awesome! will you submit a PR?
https://github.com/danny-avila/LibreChat/pull/758