Accessibility tree/snapshot for a specific DOM element, including the whole hierarchy below it (custom "root" element)
Is your feature request related to a problem? Please describe.
With #363, we can extract the whole structure, but only the whole structure.
This is great, but can also cause context "overflows" or extreme token usage for complex pages where we perhaps just wanted to check a specific element or a sub-tree of specific elements.
Describe the solution you'd like
As discussed in https://github.com/ChromeDevTools/chrome-devtools-mcp/issues/363#issuecomment-3456531920
To make things flexible and effective, I suggest adding the ability to "extract" only a sub-tree, based on a specific root element that I can define.
For example, if I want to get just the navigation and not the whole page, I could use nav#main-nav or just #main-nav as the root, and the MCP would return the nav and all elements under it.
That would help me isolate the results when needed and save time, tokens and other resources.
Describe alternatives you've considered
Sure, we can post-process the whole tree after it is extracted. But that means we always spend a lot of time and tokens, and can sometimes hit the limits of the context window, especially with smaller LLMs.
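To make the cost of this alternative concrete, here is a minimal sketch of such post-processing. The `AXNode` shape, the `findSubtree` and `countNodes` helpers and the sample tree are all hypothetical and only loosely modeled on an accessibility snapshot. Note that a node in this shape carries a role and a name but no DOM id or class, so a selector like #main-nav cannot be matched directly; the sketch matches on role and name instead.

```typescript
// Hypothetical node shape, loosely modeled on an accessibility snapshot.
interface AXNode {
  role: string;
  name?: string;
  children?: AXNode[];
}

// Depth-first search for the first node that satisfies the predicate.
function findSubtree(node: AXNode, match: (n: AXNode) => boolean): AXNode | null {
  if (match(node)) return node;
  for (const child of node.children ?? []) {
    const found = findSubtree(child, match);
    if (found) return found;
  }
  return null;
}

// Count a node plus all of its descendants.
function countNodes(node: AXNode): number {
  return 1 + (node.children ?? []).reduce((sum, child) => sum + countNodes(child), 0);
}

// Made-up sample standing in for a full-page snapshot.
const fullPage: AXNode = {
  role: "RootWebArea",
  name: "Example",
  children: [
    {
      role: "banner",
      children: [
        {
          role: "navigation",
          name: "Main",
          children: [
            { role: "link", name: "Home" },
            { role: "link", name: "About" },
          ],
        },
      ],
    },
    {
      role: "main",
      children: [{ role: "heading", name: "Welcome" }, { role: "paragraph" }],
    },
  ],
};

const nav = findSubtree(fullPage, (n) => n.role === "navigation" && n.name === "Main");
console.log(`fetched ${countNodes(fullPage)} nodes, needed ${nav ? countNodes(nav) : 0}`);
```

Even in this tiny sample the full tree has to be fetched and walked before the part of interest can be isolated, which is the overhead a root parameter would avoid.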
Additional context
In my experience as a developer, I often work on components in isolation and less in the context of a full page. This means that getting the whole tree every time wastes time and resources and makes me less effective.
I guess that most developers work more on single components than on whole pages, so I think that isolated results for a specific component would be far more effective, especially on larger sites with lots of components.
Example prompt: get snapshot for #main-nav, which would then return (pseudo):
uid=1_0 navigation
  uid=1_1 ul
    uid=1_2 li
      uid=1_3 link "Home" current page
...
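As an illustration of the pseudo output, here is a minimal sketch that renders a sub-tree in that style. Everything in it is an assumption made for the example: the `AXNode` shape, the `formatSnapshot` helper, the `uid=<snapshot>_<counter>` numbering and the two-space indentation used to show nesting. It is not the MCP's actual serializer.

```typescript
// Hypothetical node shape; `attributes` holds extra state such as "current page".
interface AXNode {
  role: string;
  name?: string;
  attributes?: string[];
  children?: AXNode[];
}

// Render a sub-tree as one line per node, numbering nodes in document order.
function formatSnapshot(root: AXNode, snapshotId = 1): string {
  const lines: string[] = [];
  let counter = 0;
  const visit = (node: AXNode, depth: number): void => {
    const parts = [`uid=${snapshotId}_${counter++}`, node.role];
    if (node.name) parts.push(JSON.stringify(node.name)); // quoted accessible name
    if (node.attributes) parts.push(...node.attributes);
    lines.push("  ".repeat(depth) + parts.join(" "));
    for (const child of node.children ?? []) visit(child, depth + 1);
  };
  visit(root, 0);
  return lines.join("\n");
}

// Made-up sub-tree for #main-nav, mirroring the pseudo output.
const navTree: AXNode = {
  role: "navigation",
  children: [
    {
      role: "ul",
      children: [
        {
          role: "li",
          children: [{ role: "link", name: "Home", attributes: ["current page"] }],
        },
      ],
    },
  ],
};

console.log(formatSnapshot(navTree));
```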
cc @natorion w.r.t. the token reduction
I think it makes sense to have something like this. I wonder how an AI agent would make use of it though. They would need to be instructed to prefer this mechanism.
@BogdanCerovac can you share some more concrete use cases and example prompts you are using?
@natorion - leaving it to the AI to translate the DOM structure (HTML, CSS and JS layers together) into an accessibility/semantic structure (I like to use the term accessibility tree) is prone to misinterpretation, deprecated information and, as always, hallucinations. Having MCPs provide the reality, including automatic accessibility testing tools (old-school static analysis) and an MCP that provides the semantic structure, gives the AI the best possible context (and enables further automation, of course).
That is the motivation for this exact (sub-)feature; the whole tree is already possible thanks to you folks, thanks.
Getting the whole tree of a complex application (or page) is not efficient if we only work on a single sub-tree (often the case for developers). Sure, we can build our own abstractions, regexes etc. to derive the info from the whole tree, but that is not optimal and is prone to other issues.
Two use cases come to mind, but I believe this opens the door to many more:
- As mentioned: AI-augmented development with classical static analysis and linting, plus this (and ideally RAG):
  get snapshot of #header-nav a.current -> aria-current page potentially missing, added -> get snapshot of #header-nav a.current -> fixed state
- Automatic testing and remediation suggestions for complex flows (end to end, with states of the app), where AI can try to fix things a bit more effectively thanks to proper context:
  get snapshot of #header-nav .hamburger -> potential button detected, missing role and name, fixing... -> get snapshot of #header-nav .hamburger -> fixed, div changed to button, added screen reader only label, TODO checking design MCP for design tokens...
Basically, this means having the Puppeteer root feature directly in the MCP, to save tokens and avoid parsing the whole tree to find the desired part.
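For reference, a minimal sketch of how that could look on the Puppeteer side. It relies on Puppeteer's `page.$(selector)` and on the `root` option of `page.accessibility.snapshot()`. The `snapshotForSelector` helper and the `PageLike` / `ElementHandleLike` / `AXNode` types are hypothetical, narrowed stand-ins so the block compiles without the library; a real Puppeteer `Page` is expected to fit this shape, but that is an assumption. How the MCP would expose this as a tool parameter is left open.

```typescript
// Narrowed, hypothetical stand-ins for the parts of Puppeteer used below.
interface AXNode {
  role: string;
  name?: string;
  children?: AXNode[];
}

interface ElementHandleLike {
  dispose(): Promise<void>;
}

interface PageLike {
  $(selector: string): Promise<ElementHandleLike | null>;
  accessibility: {
    snapshot(options?: { root?: ElementHandleLike; interestingOnly?: boolean }): Promise<AXNode | null>;
  };
}

// Resolve the selector to an element, then snapshot only the sub-tree below it.
async function snapshotForSelector(page: PageLike, selector: string): Promise<AXNode | null> {
  const root = await page.$(selector);
  if (!root) {
    throw new Error(`No element matches selector: ${selector}`);
  }
  try {
    return await page.accessibility.snapshot({ root });
  } finally {
    await root.dispose(); // release the handle once the snapshot is taken
  }
}
```

With a real page this would be called as `await snapshotForSelector(page, "#main-nav")`, and only the returned sub-tree would need to be serialized for the agent.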