deno-dom
attribute .href doesn't work
console.log(body);
const doc = new DOMParser().parseFromString(body, "text/html");
const links = [...doc.querySelectorAll('a.result-title')];
for (const link of links) {
  const title = link.innerText;
  const url = link.href;
  console.log({ title, url });
}
does this library not support standard dom methods? link.href should give me the href attribute.
So far Deno DOM only implements the Element class. The .href property belongs to HTMLAnchorElement, one of many more specific DOM element implementations that I haven't got around to implementing yet. So for now you can use the getAttribute("href") method of Element.
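For reference, the loop from the question could be adapted along these lines. This is only a sketch: `AnchorLike` and `collectLinks` are hypothetical names that are not part of Deno DOM, and `textContent` is used in place of `innerText` as an assumption. With Deno DOM the nodes returned by `querySelectorAll` may need to be narrowed to Element before being passed in.

```typescript
// Hypothetical structural type: only the two members this sketch relies on.
interface AnchorLike {
  textContent: string | null;
  getAttribute(name: string): string | null;
}

/** Collects { title, url } pairs, reading href via getAttribute. */
function collectLinks(
  anchors: Iterable<AnchorLike>,
): { title: string; url: string | null }[] {
  const results: { title: string; url: string | null }[] = [];
  for (const anchor of anchors) {
    results.push({
      title: (anchor.textContent ?? "").trim(),
      // Raw attribute value: not resolved against any base URL.
      url: anchor.getAttribute("href"),
    });
  }
  return results;
}
```

Note that the `url` values are whatever was written in the markup, so relative links stay relative.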
ok no worries.
One complication here is that, in a browser, the document has a location property that is used to resolve relative URLs into fully-qualified ones when accessing properties like HTMLAnchorElement.href.
When a document is parsed from an HTML string using a DOMParser instance, there is no way to attach the location information to the resulting document with the current API.
This makes it non-trivial to get fully-qualified URLs from properties on elements within the trees of such parsed documents.
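To illustrate why the base location matters, here is a small sketch that uses only the standard `URL` constructor (no Deno DOM involved). The two base URLs and the sample href values are made up for the example; the same raw href can resolve differently depending on the base.

```typescript
// Example base URLs (assumed for illustration).
const pageUrl = "https://example.com/page/hello";
const dirUrl = "https://example.com/page/hello/"; // note the trailing slash

// Resolve the same raw href values against each base.
const resolved = ["about", "/account", "#top"].map((href) => ({
  href,
  fromPage: new URL(href, pageUrl).href,
  fromDir: new URL(href, dirUrl).href,
}));

console.log(resolved);
// "about"    -> https://example.com/page/about  vs  https://example.com/page/hello/about
// "/account" -> https://example.com/account     in both cases
// "#top"     -> https://example.com/page/hello#top  vs  https://example.com/page/hello/#top
```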
However, this is both desirable and a common task, so I want to share two workaround approaches for resolving URLs from anchor element href attributes:
Functional approach
This is safer and has better type compatibility. Here's a commented example:
href-example.ts:
import {
  DOMParser,
  type Element,
} from "https://deno.land/x/[email protected]/deno-dom-wasm.ts";
import { assert } from "https://deno.land/[email protected]/testing/asserts.ts";

/** Functional form of `element.href` */
function resolveHref(element: Element, url: string | URL): string | undefined {
  const href = element.getAttribute("href");
  //    ^? const href: string | null
  if (!href) return undefined;
  return new URL(href, url).href;
}

function main() {
  const url = new URL("https://example.com/page/hello");

  // Imagine the following HTML came from a fetch request to the URL above:
  // const html = await (await fetch(url)).text();
  const html = `
    <!doctype html>
    <html lang="en">
      <head>
        <meta charset="utf-8" />
        <title>hello world</title>
      </head>
      <body>
        <h1>hello world</h1>
        <a class="external" href="https://en.wikipedia.org/wiki/Hello">wikipedia</a>
        <a class="relative" href="about">about</a>
        <a class="root" href="/account">account</a>
      </body>
    </html>
  `;

  const document = new DOMParser().parseFromString(html, "text/html");
  //    ^? const document: HTMLDocument | null
  assert(document, "The document could not be parsed");

  for (const className of ["external", "relative", "root"]) {
    const anchor = document.querySelector(`a.${className}`);
    //    ^? const anchor: Element | null
    assert(anchor, "Anchor element not found");

    const hrefRaw = anchor.getAttribute("href");
    //    ^? const hrefRaw: string | null
    const href = resolveHref(anchor, url);
    //    ^? const href: string | undefined
    console.log({ hrefRaw, href });
  }
}

if (import.meta.main) main();
% deno run href-example.ts
{
  hrefRaw: "https://en.wikipedia.org/wiki/Hello",
  href: "https://en.wikipedia.org/wiki/Hello"
}
{ hrefRaw: "about", href: "https://example.com/page/about" }
{ hrefRaw: "/account", href: "https://example.com/account" }
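One thing to keep in mind with the functional approach: the `URL` constructor throws a `TypeError` when the href (or the base) can't be parsed, which can happen with scraped markup. A possible defensive variant is sketched below; `tryResolve` is a hypothetical helper name, and it takes the raw attribute value rather than an element so that it has no dependency on Deno DOM.

```typescript
/** Like `new URL(href, base).href`, but returns undefined instead of throwing. */
function tryResolve(
  href: string | null,
  base: string | URL,
): string | undefined {
  if (!href) return undefined;
  try {
    return new URL(href, base).href;
  } catch {
    // The URL constructor throws a TypeError for unparseable input.
    return undefined;
  }
}
```

It could be called as `tryResolve(anchor.getAttribute("href"), url)` in place of `resolveHref(anchor, url)`.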
Prototype manipulation
The previous approach could become tedious if there are lots of hrefs that need to be accessed. This approach defines the href property on the prototype of a created anchor element, setting its getter and setter at the time the document is parsed.
It allows for obtaining a URL string by directly accessing the href property on an element (like in browser code), but requires using a type assertion when doing so.
href-hack.ts:
import {
  DOMParser,
  type Element,
  type HTMLDocument,
} from "https://deno.land/x/[email protected]/deno-dom-wasm.ts";
import { assert } from "https://deno.land/[email protected]/testing/asserts.ts";

type HrefAttr = { href: string };
type ElementWithHref = Element & Partial<HrefAttr>;
type HTMLDocumentWithHref = HTMLDocument & { location: HrefAttr };

function createDocumentWithHref(
  html: string,
  url: string | URL,
): HTMLDocumentWithHref {
  const document = new DOMParser().parseFromString(html, "text/html");
  assert(document, "The document could not be parsed");
  (document as HTMLDocumentWithHref).location = new URL(url);

  const elementProto = Object.getPrototypeOf(document.createElement("a"));
  Object.defineProperty(elementProto, "href", {
    configurable: true,
    enumerable: false,
    get() {
      const baseUrl = this.ownerDocument?.location?.href as string | undefined;
      const href = this.getAttribute("href") as string | null;
      if (!(baseUrl && href)) return undefined;
      return new URL(href, baseUrl).href;
    },
    set(url: string) {
      this.setAttribute("href", url);
    },
  });

  return document as HTMLDocumentWithHref;
}

function main() {
  const url = new URL("https://example.com/page/hello");

  // Imagine the following HTML came from a fetch request to the URL above:
  // const html = await (await fetch(url)).text();
  const html = `
    <!doctype html>
    <html lang="en">
      <head>
        <meta charset="utf-8" />
        <title>hello world</title>
      </head>
      <body>
        <h1>hello world</h1>
        <a class="external" href="https://en.wikipedia.org/wiki/Hello">wikipedia</a>
        <a class="relative" href="about">about</a>
        <a class="root" href="/account">account</a>
      </body>
    </html>
  `;

  const document = createDocumentWithHref(html, url);

  for (const className of ["external", "relative", "root"]) {
    const anchor = document.querySelector(`a.${className}`);
    //    ^? const anchor: Element | null
    assert(anchor, "Anchor element not found");

    const hrefRaw = anchor.getAttribute("href");
    //    ^? const hrefRaw: string | null

    // The `.href` property doesn't exist on type Element,
    // so trying to access it will create a compiler diagnostic error:
    //
    //   anchor.href;
    //          ~~~~
    //   Property 'href' does not exist on type 'Element'.deno-ts(2339)

    // Instead, you must assert that the Element is type ElementWithHref:
    const href = (anchor as ElementWithHref).href;
    //                                       ^ (property) href?: string | undefined
    console.log({ hrefRaw, href });
  }
}

if (import.meta.main) main();
% deno run href-hack.ts
{
  hrefRaw: "https://en.wikipedia.org/wiki/Hello",
  href: "https://en.wikipedia.org/wiki/Hello"
}
{ hrefRaw: "about", href: "https://example.com/page/about" }
{ hrefRaw: "/account", href: "https://example.com/account" }
Both approaches produce the same output.
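One design consideration for the prototype approach: if anchors share a prototype with every other element (which seems likely given that only the Element class is implemented so far, though I haven't verified it), then the patched href accessor becomes visible on all elements, not just anchors. The sketch below illustrates the mechanism with a hypothetical stand-in class, `FakeElement`, rather than Deno DOM itself.

```typescript
// Hypothetical stand-in for a DOM implementation with a single element class.
class FakeElement {
  constructor(private attrs: Record<string, string> = {}) {}
  getAttribute(name: string): string | null {
    return this.attrs[name] ?? null;
  }
}

type WithHref = FakeElement & { href?: string };

const anchor = new FakeElement({ href: "about" });
const div = new FakeElement();

// Patching the prototype obtained from one instance...
Object.defineProperty(Object.getPrototypeOf(anchor), "href", {
  configurable: true,
  get() {
    return this.getAttribute("href") ?? undefined;
  },
});

// ...adds the accessor to every instance that shares that prototype.
const anchorHref = (anchor as WithHref).href;
const divHasHref = "href" in div;
const divHref = (div as WithHref).href;
console.log({ anchorHref, divHasHref, divHref });
```

Here the `div` stand-in also reports an `href` property (with an `undefined` value), so code that checks `"href" in element` to detect links would be misled.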