[Enhancement] Robust Snapshot Architecture: Dual-Source Accessibility, Defensive Href Extraction & Download Link Detection
Related Issues
- #363 - Accessibility tree/element(s) snapshot
- #284 - Download folder configuration for automation
- Puppeteer #6311 - URL attribute for links
Problem Statement
While working on browser automation for AI agents, we identified several reliability gaps in snapshot/extraction that affect real-world usage:
- Href extraction edge cases - Some dynamically-rendered links or SPAs don't expose `href` reliably through the accessibility tree alone
- No proactive download identification - Agents must parse snapshot text manually to find downloadable files
- Single-source accessibility - Relying solely on Puppeteer's snapshot can miss semantics (as discussed in #363)
- Snapshot fragility - Individual element failures can break the entire snapshot
Proposed Improvements
We've implemented and deployed solutions for these (running in Azure production):
1. Dual-Fallback Href Extraction
```js
// Runtime.callFunctionOn with fallback
return this.href || this.getAttribute('href') || '';
```
This handles edge cases where the standard property read fails (related to the discussion in Puppeteer #6311).
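For context, a minimal sketch of how this fallback could be wired up over CDP; the `extractHref` helper and the use of `DOM.resolveNode` on a `backendNodeId` are illustrative assumptions, not the exact production code:

```ts
// Minimal sketch (assumed helper, not the production code): resolve a backendNodeId
// from the accessibility tree to a remote object, then read href with a fallback.
import type { CDPSession } from 'puppeteer';

async function extractHref(client: CDPSession, backendNodeId: number): Promise<string> {
  // Turn the DOM node into a Runtime remote object we can call a function on.
  const { object } = await client.send('DOM.resolveNode', { backendNodeId });
  if (!object.objectId) return '';

  // Prefer the resolved property, then fall back to the raw attribute for links
  // that SPAs render without a fully resolved href.
  const { result } = await client.send('Runtime.callFunctionOn', {
    objectId: object.objectId,
    functionDeclaration: `function () {
      return this.href || this.getAttribute('href') || '';
    }`,
    returnByValue: true,
  });
  return typeof result.value === 'string' ? result.value : '';
}
```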
2. Explicit `downloadLinks` Field
```ts
interface SnapshotResult {
  // ... existing fields
  downloadLinks: Array<{
    url: string;
    filename: string;
    extension: string;
  }>;
}
```
Automatically identifies downloadable files by extension (.csv, .xlsx, .zip, .pdf, .json, etc.). Agents no longer need to parse text manually.
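As a rough illustration, the detection can be as simple as the following sketch; the extension list and the `toDownloadLink` helper are hypothetical, not the deployed implementation:

```ts
// Rough sketch only: extension-based classification of hrefs into download links.
// The extension list and helper name are illustrative, not the deployed code.
const DOWNLOAD_EXTENSIONS = new Set(['csv', 'xlsx', 'zip', 'pdf', 'json']);

function toDownloadLink(href: string): { url: string; filename: string; extension: string } | null {
  try {
    const { pathname } = new URL(href);
    const filename = decodeURIComponent(pathname.split('/').pop() ?? '');
    const extension = filename.includes('.') ? filename.split('.').pop()!.toLowerCase() : '';
    return DOWNLOAD_EXTENSIONS.has(extension) ? { url: href, filename, extension } : null;
  } catch {
    return null; // Relative or malformed hrefs would need resolving against the page URL first.
  }
}
```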
3. Dual-Source Accessibility Tree
| Source | Purpose |
|---|---|
| Puppeteer `page.accessibility.snapshot()` | Semantic structure |
| CDP `backendNodeId` | Precise DOM element mapping |
This addresses the gaps @BogdanCerovac identified in #363 - combining semantic accessibility with precise DOM mapping.
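A minimal sketch of the dual-source idea, assuming Puppeteer's `page.accessibility.snapshot()` for semantics plus the raw CDP `Accessibility.getFullAXTree` (whose nodes carry `backendDOMNodeId`) for DOM mapping; the merged shape is illustrative:

```ts
// Minimal sketch of combining both sources; the merged shape is illustrative.
import type { Page } from 'puppeteer';

async function dualSourceSnapshot(page: Page) {
  // Source 1: Puppeteer's semantic snapshot (roles, names, "interesting" nodes only).
  const semanticTree = await page.accessibility.snapshot();

  // Source 2: the raw CDP AX tree, whose nodes carry backendDOMNodeId for DOM mapping.
  const client = await page.createCDPSession();
  const { nodes } = await client.send('Accessibility.getFullAXTree');

  const byBackendNodeId = new Map<number, (typeof nodes)[number]>();
  for (const node of nodes) {
    if (node.backendDOMNodeId !== undefined) {
      byBackendNodeId.set(node.backendDOMNodeId, node);
    }
  }

  return { semanticTree, byBackendNodeId };
}
```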
4. Resilient Error Handling
```js
// Continue on individual element failures
for (const node of nodes) {
  try {
    await extractNodeData(node);
  } catch (e) {
    console.warn(`Skipping node: ${e.message}`);
    continue; // Don't fail entire snapshot
  }
}
```
Implementation
We have a working implementation deployed in production. Happy to:
- [ ] Submit a PR with these improvements
- [ ] Provide more technical details on any specific aspect
- [ ] Discuss alternative approaches
Questions for Maintainers
- Would you prefer these as separate PRs or one consolidated change?
- For `downloadLinks` - should this be opt-in via a parameter or always included?
- Any concerns about the dual-source approach adding complexity?
/cc @OrKoN
Thanks for filing a feature request first. Generally, we would want to extend the snapshot with DOM data for some use cases like styling debugging, but we want to engineer it in a way that performance and stability do not degrade (e.g., by exposing the required capabilities via CDP or by using isolated worlds instead of script evaluations in the main world).
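For illustration only, a minimal sketch of the isolated-world route mentioned above, using the CDP `Page.createIsolatedWorld` and `Runtime.evaluate` commands; the world name and helper are assumptions:

```ts
// Minimal sketch of evaluating in an isolated world, assuming a CDPSession and a known
// frameId; the world name and helper are illustrative.
import type { CDPSession } from 'puppeteer';

async function evaluateInIsolatedWorld(client: CDPSession, frameId: string, expression: string) {
  // Create a separate execution context that page scripts cannot see or interfere with.
  const { executionContextId } = await client.send('Page.createIsolatedWorld', {
    frameId,
    worldName: '__snapshot_helpers__',
  });

  // Run the extraction script in that context instead of the main world.
  const { result } = await client.send('Runtime.evaluate', {
    expression,
    contextId: executionContextId,
    returnByValue: true,
  });
  return result.value;
}
```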
The use cases you describe, though, might not justify the added complexity yet. The goal of this MCP server is to aid debugging and testing of applications to ensure they work well and follow best practices. Therefore, my first suggestion would be to fix the apps in question and make the download links accessible. Second, another alternative I would suggest is teaching your agent to extract the links and hrefs the way you want using the generic script execution tool. WDYT? cc @natorion
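For example, a page-side snippet along these lines (a minimal sketch; the selector and extension list are illustrative) could be run through the generic script execution tool to collect download links without extending the snapshot:

```ts
// Minimal sketch of a script an agent could evaluate in the page itself; the selector
// and extension list are illustrative.
const collectDownloadLinks = () => {
  const extensions = ['.csv', '.xlsx', '.zip', '.pdf', '.json'];
  return Array.from(document.querySelectorAll<HTMLAnchorElement>('a[href]'))
    .map((a) => a.href)
    .filter((href) => {
      try {
        const pathname = new URL(href).pathname.toLowerCase();
        return extensions.some((ext) => pathname.endsWith(ext));
      } catch {
        return false; // Skip hrefs that are not valid absolute URLs.
      }
    });
};
```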
P.S. Doing this node by node looks inefficient. We should at least pre-filter potential DOM nodes and then collect data for all of them with at most one script eval.
```js
// Continue on individual element failures
for (const node of nodes) {
  try {
    await extractNodeData(node);
  } catch (e) {
    console.warn(`Skipping node: ${e.message}`);
    continue; // Don't fail entire snapshot
  }
}
```
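To make the batching idea concrete, here is a rough sketch under assumed names (the `collectLinkData` helper and the text-based matching are illustrative; a real version would match on `backendNodeId` instead):

```ts
// Rough sketch of the batched alternative: pre-filter AX nodes, then collect link data
// for the whole page in one evaluation. Matching by accessible name is illustrative;
// a real implementation would match on backendNodeId instead.
import type { Page, SerializedAXNode } from 'puppeteer';

async function collectLinkData(page: Page, axNodes: SerializedAXNode[]) {
  // Pre-filter: only nodes that can plausibly carry an href.
  const linkNodes = axNodes.filter((node) => node.role === 'link');

  // One page-side evaluation for all links instead of one call per node.
  const pageLinks = await page.evaluate(() =>
    Array.from(document.querySelectorAll<HTMLAnchorElement>('a[href]')).map((a) => ({
      text: a.textContent?.trim() ?? '',
      href: a.href || a.getAttribute('href') || '',
    })),
  );

  return linkNodes.map((node) => ({
    name: node.name,
    href: pageLinks.find((link) => link.text === node.name)?.href ?? '',
  }));
}
```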