unfurl icon indicating copy to clipboard operation
unfurl copied to clipboard

Extract rich metadata from URLs

Unfurl

NPM version NPM downloads Build status Build coverage

Extract rich metadata from URLs.

Installation

npm install @borderless/unfurl --save

Usage

Unfurl attempts to parse and extract rich structured metadata from URLs.

import { scraper, urlScraper } from "@borderless/unfurl";
import * as plugins from "@borderless/unfurl/dist/plugins";

Scraper

Accepts a request function and a list of plugins to use. The request is expected to return a "page" object, which is the same shape as the input to scrape(page).

const scrape = scraper({
  request,
  plugins: [plugins.htmlmetaparser, plugins.exifdata],
});

const res = await fetch("http://example.com"); // E.g. `popsicle`.

await scrape({
  url: res.url,
  status: res.status,
  headers: res.headers.asObject(),
  body: res.stream(), // Must stream the request instead of buffering to support large responses.
});

URL Scraper

Simpler wrapper around scraper that automatically makes a request(url) for the page.

const scrape = urlScraper({ request });

await scrape("http://example.com");

License

Apache 2.0