Enhancement: Improve Browser Plugin

Open SphaeroX opened this issue 2 years ago • 9 comments

Contact Details

[email protected]

What features would you like to see added?

I would like to suggest adding a browser plugin for LibreChat that can crawl websites even when their content is loaded with JavaScript. Similar to the WebPilot plugin in ChatGPT, this plugin would use Puppeteer, a library that drives a real browser programmatically.

More details

I believe that such a plugin and the ability to upload documents are essential for the project in order to increase its reach. These features would greatly enhance the capabilities of LibreChat and make it a better ChatGPT. The reaction of users on social media clearly demonstrates the importance of this browser plugin, as they actually express their disappointment when it is disabled.

Which components are impacted by your request?

General

Pictures

No response

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

SphaeroX avatar Jul 04 '23 08:07 SphaeroX

I understand. My initial thoughts are that any JavaScript browser solution will have to work much the way it does in ChatGPT: it will have to be a specific mode, because the LLM will need a wide variety of instructions on how to work with browsed pages. Essentially, a mode that disables all other plugins, since it needs its own special suite of plugins.

danny-avila avatar Jul 04 '23 18:07 danny-avila

Uploading documents is in the works here. There are a lot of exciting features to take on, and I've been paving the way with under-the-hood changes to optimize for this.

danny-avila avatar Jul 04 '23 18:07 danny-avila

Thank you for your work, I'm really looking forward to the first results with the documents.

The OpenAI-style browser mode can be a separate matter. The standard LangChain browser is unfortunately very simple and not much fun to use, because many requests don't work.

Can the browser improvement item be put on the to-do list? It doesn't have to be implemented immediately, but it should be kept in mind.

SphaeroX avatar Jul 04 '23 21:07 SphaeroX

Instead of using the regular WebBrowser tool in LangChain, you can also create your own tool using DynamicTool.

Instead of having the code like this:

...

import { WebBrowser } from "langchain/tools/webbrowser";

...

const tools = [
  new WebBrowser({ model, embeddings }),
];

...

You can solve the whole thing like this:

...

import puppeteer from "puppeteer";
import { DynamicTool } from "langchain/tools";

...

const tools = [
  new DynamicTool({
    name: "FETCH_WEBPAGE",
    description: "call this to fetch a webpage. input should be the full URL of the webpage with https:// .",
    func: async (url) => {
      const browser = await puppeteer.launch();
      try {
        const page = await browser.newPage();
        // Wait until network activity has settled so JS-rendered content is present
        await page.goto(url, { waitUntil: "networkidle2" });

        // Look for an EU cookie banner and remove the first matching element
        const cookieBannerElements = await page.$x(
          '//*[contains(@class, "cookie") or contains(@class, "banner") or contains(@id, "cookie") or contains(@id, "banner")]'
        );
        if (cookieBannerElements.length > 0) {
          await cookieBannerElements[0].evaluate((element) => element.remove());
        }

        // Return only the visible text, without HTML markup
        return await page.evaluate(() => document.body.innerText);
      } finally {
        // Close the browser even if navigation or extraction throws
        await browser.close();
      }
    },
  }),
];

...

You can still include the OpenAI embedding part; I've removed the entire HTML stuff just for simplicity.

Here's an example: example.zip

npm init
npm run test

Source: https://js.langchain.com/docs/modules/agents/tools/how_to/dynamic
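
The tool description above asks for a full URL with https://, but an agent may still pass a bare domain or a quoted string. A minimal sketch of a guard that could run before page.goto, assuming a hypothetical helper named normalizeUrl (it is not part of LangChain or Puppeteer) and relying only on the built-in URL class:

```javascript
// Hypothetical helper for illustration; not part of LangChain or Puppeteer.
function normalizeUrl(input) {
  // Agents sometimes wrap the input in quotes or angle brackets, or add whitespace.
  const trimmed = String(input).trim().replace(/^["'<]+|["'>]+$/g, "");

  // Prepend https:// when the input has no scheme at all.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const candidate = hasScheme ? trimmed : `https://${trimmed}`;

  let url;
  try {
    url = new URL(candidate);
  } catch {
    throw new Error(`Not a valid URL: ${input}`);
  }

  // Only allow web pages; reject schemes such as ftp: or file:.
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }
  return url.href;
}
```

Inside func, the navigation call would then become page.goto(normalizeUrl(url), { waitUntil: "networkidle2" }), so malformed input fails early with a readable message instead of a browser error.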

SphaeroX avatar Jul 28 '23 11:07 SphaeroX

Have you tested it? I think there may already be several implementations of this; I'll have to look around.

danny-avila avatar Aug 04 '23 18:08 danny-avila

Also, if we look at the official ChatGPT browsing instructions, it seems like it's just ChatGPT primed with a few different functions it can use, so the best approach would probably be a suite of browsing tools.
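
A rough sketch of what such a suite could look like: several small tools sharing one browsing session, each in the same { name, description, func } shape that DynamicTool accepts. The tool names, the session object, and the injected fetchText function are assumptions made for illustration; they are not taken from ChatGPT or LibreChat.

```javascript
// Hypothetical sketch: tool names, session shape, and fetchText are assumptions.
// fetchText(url) is any async function returning the page text, e.g. a Puppeteer wrapper.
function createBrowsingTools(fetchText, pageSize = 1000) {
  // State shared by every tool in the suite.
  const session = { url: null, text: "", offset: 0 };

  // The slice of the page the model currently "sees".
  const currentChunk = () =>
    session.text.slice(session.offset, session.offset + pageSize);

  return [
    {
      name: "OPEN_URL",
      description: "Open a page. Input: the full URL. Returns the first chunk of text.",
      func: async (url) => {
        session.url = url;
        session.text = await fetchText(url);
        session.offset = 0;
        return currentChunk();
      },
    },
    {
      name: "SCROLL_DOWN",
      description: "Return the next chunk of the open page. Input is ignored.",
      func: async () => {
        if (session.url === null) return "No page is open.";
        if (session.offset + pageSize >= session.text.length) return "End of page.";
        session.offset += pageSize;
        return currentChunk();
      },
    },
    {
      name: "FIND_TEXT",
      description: "Search the open page. Input: a phrase. Returns text around the first match.",
      func: async (phrase) => {
        const needle = String(phrase);
        const index = session.text.toLowerCase().indexOf(needle.toLowerCase());
        if (index === -1) return "Not found.";
        return session.text.slice(Math.max(0, index - 40), index + needle.length + 40);
      },
    },
  ];
}
```

Each entry could be passed to new DynamicTool(...) as is. Returning the page in chunks keeps long pages within the model's context window, which is one reason a suite may work better than a single fetch tool.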

danny-avila avatar Aug 04 '23 18:08 danny-avila

I have already tested it successfully standalone and have also made attempts to build a really good browser. That means I have built in functions such as:

  • automatic detection of cookie banners
  • Text extraction without HTML
  • Text extraction only if the div has more than 200 characters
  • scroll to bottom
  • etc
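
The length-based extraction in the list above could be sketched roughly as follows. This is an illustrative guess, not the code from the standalone test: the helper name keepLongBlocks and the de-duplication step are assumptions.

```javascript
// Hypothetical helper: keeps only text blocks longer than minLength characters.
// blocks is an array of strings, e.g. the innerText of every div on a page.
function keepLongBlocks(blocks, minLength = 200) {
  const seen = new Set();
  const kept = [];
  for (const raw of blocks) {
    // Collapse whitespace so layout-only differences do not affect the length check.
    const text = String(raw).replace(/\s+/g, " ").trim();
    // Assumption: skip exact repeats, since wrapper divs often contain
    // the same text as their only child.
    if (text.length > minLength && !seen.has(text)) {
      seen.add(text);
      kept.push(text);
    }
  }
  return kept.join("\n\n");
}
```

With Puppeteer, the input could be collected with page.$$eval("div", (divs) => divs.map((d) => d.innerText)). The threshold drops navigation links, buttons, and other short fragments while keeping the main body text.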

I can do that in here tomorrow. Unfortunately, the browser thing is annoying. I noticed it again today when I wanted to summarize a long Reddit post and LibreChat was not able to crawl it.

SphaeroX avatar Aug 04 '23 18:08 SphaeroX

Do you mean the current browser/native LangChain solution? That would make sense, since Reddit relies heavily on JS and fetching the raw HTML can't scrape it.
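
To illustrate why raw HTML falls short on such sites: a client-rendered page often ships little more than an empty container plus scripts. The sketch below strips markup from a raw HTML string with crude regular expressions (fine for a rough check, not a real HTML parser) and flags pages that contain almost no text. Both function names and the 200-character threshold are assumptions for illustration.

```javascript
// Hypothetical helpers; regex-based stripping is a rough approximation only.
function staticVisibleText(html) {
  return String(html)
    // Drop script, style, and noscript elements together with their contents.
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1>/gi, " ")
    // Drop all remaining tags.
    .replace(/<[^>]+>/g, " ")
    // Collapse whitespace.
    .replace(/\s+/g, " ")
    .trim();
}

// True when the raw HTML carries too little text to be worth summarizing,
// which suggests the content is rendered by JavaScript in the browser.
function looksClientRendered(html, minChars = 200) {
  return staticVisibleText(html).length < minChars;
}
```

A check like this could decide when to fall back from a plain HTTP fetch to a headless browser such as Puppeteer.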

Awesome! Will you submit a PR?

danny-avila avatar Aug 04 '23 20:08 danny-avila

https://github.com/danny-avila/LibreChat/pull/758

SphaeroX avatar Aug 05 '23 08:08 SphaeroX