MDN can now automatically lie to people seeking technical information
Summary
MDN's new "ai explain" button on code blocks generates human-like text that may be correct by happenstance, or may contain convincing falsehoods. this is a strange decision for a technical reference.
URL
https://developer.mozilla.org/en-US/docs/Web/CSS/grid
Reproduction steps
As soon as I heard about this, I visited the first MDN article in my address bar history (for the grid property), hit "AI Explain" on the first code block encountered (the syntax summary), and received the following information:
grid: "a" 100px "b" 1fr;: This value sets the grid template to have two rows and two columns. The first row has a height of 100 pixels and the second row has a height of 1 fraction unit (1fr). The columns are named "a" and "b".
which is deeply but subtly incorrect: this creates only one column (more would require a slash), and the quoted strings are names of areas, not columns. But it's believable, and it's interwoven with explanations of other property values that are correct. This is especially bad since grid is a complex property with a complex shorthand syntax, which is exactly the sort of thing someone might want to hit an "explain" button on.
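For reference, here is a minimal sketch of what that shorthand actually expands to, next to what a genuinely two-column version would look like (the selectors are illustrative only, not taken from the MDN page):

```css
/* grid: "a" 100px "b" 1fr; defines two named areas stacked in two rows,
   with a single column, because there is no slash and no column track list. */
.one-column {
  display: grid;
  grid-template-areas:
    "a"
    "b";
  grid-template-rows: 100px 1fr;
  /* grid-template-columns is left at its initial value, so the only column
     is the one created by the single-cell area strings */
}

/* Two explicit columns require a slash followed by the column tracks: */
.two-columns {
  display: grid;
  grid: "a a" 100px "b b" 1fr / auto 1fr;
}
```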
The generated text appears to be unreviewed, unreliable, unaccountable, and even unable to be corrected. At least if the text were baked into a repository, it could be subject to human oversight and pull requests, but as best I can tell it's just in a cache somewhere? It seems like this feature was conceived, developed, and deployed without even considering that an LLM might generate convincing gibberish, even though that's precisely what they're designed to do.
And far from disclaiming that the responses might be confidently wrong, you have called it a "trusted companion". I don't understand this.
Expected behavior
I would like MDN to contain correct information.
Actual behavior
MDN has generated a convincing-sounding lie and there is no apparent process for correcting it
Device
Desktop
Browser
Firefox
Browser version
Stable
Operating system
Linux
Screenshot
No response
Anything else?
No response
Validations
- [X] I have read the Community Participation Guidelines.
- [X] I have verified that there isn't already an issue that reports the same bug to avoid creating a duplicate.
- [X] I have checked that this is a concrete bug. For Q&A open a GitHub Discussion.
Confirming. This "AI" snake oil is worse than useless for the reasons described above; other examples are trivial to create. It makes MDN worse just being there.
Generated code (without any human vetting for correctness and human curation for relevance) is a hazard since it can produce plausible-sounding disinformation.
I strongly feel that the AI Help feature is likely to cause far more damage than it could possibly help.
Edit: To clarify, I think the best path forward is to offer only documentation written by humans, ideally reviewed by people who have domain expertise; for example, pay technical writers and keep documentation open to community-suggested edits.
You don't explain code blocks with plausible-sounding plain text; you explain them visually by linking to a JsFiddle page with that code.
This is a deeply misguided feature that will produce disinformation at scale, like all other LLM applications.
I don't agree that linking to a fiddle is enough; you need a plain-text explanation of what the code does. But the explanation needs to be peer-reviewed and fact-checked, not plausible-sounding garbage generated by an AI.
And it's hardly an isolated occurrence. This is pervasively impossible to trust.
Releasing this alongside the long-awaited Play was also a choice. We should be empowering the community to create examples built on MDN itself.
As another example, https://developer.mozilla.org/en-US/docs/Web/HTML/Element/portal
From the page text (emphasis mine):
A <portal> is similar to an <iframe>. An <iframe> allows a separate browsing context to be embedded. However, the embedded content of a <portal> is more limited than that of an <iframe>. It cannot be interacted with, and therefore is not suitable for embedding widgets into a document. Instead, the <portal> acts as a preview of the content of another page. It can be navigated into, therefore allowing for seamless transition to the embedded content.
The "AI Explain" feature:
<portal id="exampleportal" src="https://example.com/"></portal>
The code example you provided is an HTML code snippet that demonstrates the usage of the <portal> element.
The <portal> element is a new HTML element introduced in the HTML5.3 specification. It allows you to embed a remote document into your current document, creating a portal-like experience. In the code example, there is a <portal> element with an id attribute set to "exampleportal" and a src attribute set to "https://example.com/". The id attribute is used to uniquely identify the portal element, while the src attribute specifies the URL of the remote document that will be displayed within the portal. When the page is loaded, the browser will fetch the remote document specified by the src attribute and display it within the <portal> element. The remote document can be any valid HTML document, and it can contain interactive content such as forms, videos, or other web components. The <portal> element provides a way to seamlessly integrate remote content into your web page, allowing users to interact with it as if it were part of the current document. This can be useful for scenarios such as displaying live data from another website or embedding a third-party widget into your page. It's important to note that the <portal> element is still an experimental feature and may not be supported by all browsers. Therefore, it's recommended to check the browser compatibility before using it in production.
In this case, the AI explanation directly contradicts the page itself!
As someone who occasionally dips his toes in CSS when no one else is available and a thing needs to be fixed, I depend on these docs to be dependable. This is even more true today than it was in the past, as LLM generated chum spreads across the web. Please keep MDN as a trusted source of human generated developer documentation.
This is both sad and deeply embarrassing for the MDN team...
The inclusion of this feature poses a great deal of risk to folks' ability to learn code effectively, especially where the generated explanation has the potential to perpetuate bias and misunderstanding from the content the LLM was trained on.
I would also like to note the ethical and environmental concerns surrounding how LLMs are constructed. It saddens me to see this feature as a former MDN editor.
I didn't spend a decade trying to convince people to use MDN over the shovelfuls of low-quality SEO-farming craptext on W3Schools, only for them to be presented with shovelfuls of low-quality AI craptext on MDN.
The next generation of AI will be trained on this. Just sayin'...
Considering the fact that MDN's "AI Help" feature is a semi-paid service, this is a huge let down to both see and use.
This new feature claims to be powered by OpenAI's GPT-3.5, yet ChatGPT is purely a language model, not a knowledge model. Its job is to generate outputs that seem like they were written by a human, not to be right about everything.
In the context of web development as a whole, we cannot count on LLMs to "facilitate our learning". I cannot overstate how terrible and drastic this blow to customer trust is. ❌
MDN has been one of the leading resources for aspiring and current professional developers in the web world. This new beta "help" feature is taking away from the integrity and trustworthiness of a once fantastic site to learn from.
Thank you OP for opening this issue, MDN's team needs to be better.
I use MDN because it's a comprehensive and accurate source of documentation with no fluff. I fail to see how LLM output prone to egregious inaccuracies improves that. It dramatically weakens my confidence in MDN and I fear that its inclusion will promote an over-reliance on cheap but unreliable text generation.
We've come full circle and we've learned nothing.
Another example, from the Accessibility concerns section of <s>: The Strikethrough element, which offers this CSS:
s::before,
s::after {
clip-path: inset(100%);
clip: rect(1px, 1px, 1px, 1px);
height: 1px;
overflow: hidden;
position: absolute;
white-space: nowrap;
width: 1px;
}
s::before {
content: " [start of stricken text] ";
}
s::after {
content: " [end of stricken text] ";
}
The AI wraps up its explanation with this:
Overall, this code creates a strikethrough effect by hiding the content of the "s" element and adding visible text before and after it.
That is demonstrably wrong. There is no demo of that code showing it in action. A developer who uses this code and expects the outcome the AI said to expect would be disappointed (at best).
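For readers who don't recognize it, the quoted rules are the common "visually hidden" pattern. Here is a minimal annotated sketch of what that code actually does (the comments are mine, not from the MDN page):

```css
/* The strikethrough itself comes from the browser's default rendering of <s>;
   none of this CSS touches the element's own content. It only styles the
   ::before/::after pseudo-elements, whose generated text is clipped down to a
   1px box: invisible on screen, yet generally still announced by screen readers. */
s::before {
  content: " [start of stricken text] "; /* text intended for assistive technology */
  clip-path: inset(100%);
  clip: rect(1px, 1px, 1px, 1px);
  height: 1px;
  width: 1px;
  overflow: hidden;
  position: absolute;
  white-space: nowrap;
}
```

In other words, the AI's summary inverts the pattern: nothing visible is added, and the element's own content is never hidden.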
That was from the very first page I hit that had an accessibility note. Which means I am wary of what genuine user-harming advice this tool will offer on more complex concepts than simple stricken text.
To @aardrian's point: Utilizing inaccessible code may have legal ramifications, to say nothing of the ethical problems of restricting others' access. What risks and responsibilities does MDN incur if an organization incorporates inaccessible code suggestions and advice provided by this feature?
As a person working towards becoming a web developer, I trust MDN to contain accurate, fact-checked information. For every minute this may save someone, it would surely cost hours of troubleshooting for another, especially newer developers who utilize MDN as a learning and reference tool extensively. This is damaging both to the developer community and the reputation of MDN as a trusted resource; while I might not have extensive experience as a web developer, I hope that a newbie perspective might also be helpful.
Deciding to implement this feature implies a fundamental misunderstanding about what LLMs do. MDN users are looking for authoritative, correct information, not for plausible-looking autogenerated fiction. This puts the good judgment of MDN's team in question.
I am warning my team about this feature and letting them know not to trust it.
This feature does not seem to be well-targeted at the problem it is meant to solve. Writing technical documentation is time-consuming and difficult, so wanting to automate it is understandable, but the target audience is precisely the people who do not have the requisite knowledge to spot mistakes, so the "Was this answer useful?" feedback buttons don't seem likely to weed out bad explanations quickly or reliably enough to avoid problems.
There is already some work on reasoning about where and how to automate tasks appropriately and effectively, and I recommend using it as a starting point for designing features like this. It may be more appropriate in this case, for example, to build a tool at Sheridan and Verplank's level of automation (LOA) 3 by using AI to generate text assets which are then reviewed and edited by a human expert before publication.
Placing GPT-based generations on a website that used to be for accurate documentation is so incredibly off-brand that I find it just...confusing. Newbies will find this, they will use this, and they will be fed misinformation that they cannot reasonably be expected to discern.
There's nothing really to be gained by this feature; it just smells like chasing trends with no thought given to the actual downsides. Not to mention the legal issues that stem from generated code matching publicly licensed code, which remains an unsolved problem.
It is beyond bizarre that I will now have to recommend people avoid MDN and use w3schools instead.
This from the <mark> element page gets the same CSS concept wrong in a fun new way.
mark::before,
mark::after {
clip-path: inset(100%);
clip: rect(1px, 1px, 1px, 1px);
height: 1px;
overflow: hidden;
position: absolute;
white-space: nowrap;
width: 1px;
}
mark::before {
content: " [highlight start] ";
}
mark::after {
content: " [highlight end] ";
}
From the fake-AI:
Overall, this code example creates a highlight effect by using pseudo-elements to add invisible elements before and after the content of the <mark> element. These invisible elements are positioned absolutely and have a small size, effectively hiding them from view. The content property is used to add visible text before and after the <mark> element's content, creating the highlight effect.
Essentially the same code from the <del> element page gets this explanation:
Overall, this code example creates a visual representation of a deleted text by hiding the content of the <del> element and adding " [deletion start] " before the hidden content and " [deletion end] " after the hidden content.
I will spare you the same advice for the same code on the <ins> page.
The point is, the LLM in use does not understand CSS. Nor accessibility.
I mean, at this stage they should at the very least add a big fat "this explanation may actually be complete toss" warning in front of it. Or, you know, reevaluate what the actual point of having this "feature" is, if it's a crap-shoot whether it's useful or just a pile of hallucinated rubbish.
What is this feature even meant to offer? It's taking documentation and examples authored by thinking human beings who are capable of comprehending things, and bolting on clumsily-generated nonsense written by an uncomprehending automaton. That is: there's already an explanation; the entire page is an explanation; and if that explanation is insufficient, it should be edited by a thinking human being to improve it; sloppily bolting an AI-generated addendum onto it is not the right approach.
Even just looking at more of the code blocks on the article for grid: I clicked "AI Explain" on the HTML for one of the examples, a code block with a #container element and several empty divs. Predictably, the LLM spat out three or four paragraphs of "middle schooler padding for word count"-tier dross about how the example "demonstrates how to add child elements," because the LLM couldn't comprehend the context of the code block. It couldn't and didn't properly explain the code block in the context of the grid property, the broader thing that that HTML was meant to demonstrate. "The HTML was setting up a single grid container, and a set of divs that would be rendered as colored blocks to visually illustrate the grid layout." If an explanation is actually necessary, that's a proper explanation.
Everything about this is blatantly, obviously wrong. What understanding of LLMs and of documentation as a concept could possibly lead to someone thinking this is a good idea?
the thinking of "actual human writers are expensive (if actually employed) ... we can save money through the power of AI"
I view this "feature" as a fundamental betrayal of the core MDN mission. That this "feature" would make it past the concept stage to even begin implementation demonstrates either a total lack of understanding of how LLM machine learning works and what "genAI" is capable of and/or a total disregard of MDN's mission. Having either of those happen in the process from concept to implementation is a complete failure.
By implementing and deploying this "feature", MDN has convinced me to stop contributing to MDN and cease donating to the Mozilla Foundation, because I am completely unwilling to participate in perpetuating the massive disinformation which this "feature" presents to users and the dramatic confusion and waste of people's time which it will cause.
Obviously, I will also stop recommending MDN as a good source of documentation. I will also need to remove links to MDN from everything I've written which can be edited.
I am so very, very disappointed in Mozilla/MDN.
This was very disappointing as a now-former MDN contributor and subscriber. The whole point of MDN was authoritative content, but until there are some fundamental improvements in LLMs, I might as well be supporting W3 Schools.
Can confirm, the "AI" buttons seem to be on every page on the site.
Aside from the ethical, legal, and reputational issues here, practically speaking: until I have been assured that all "AI" integration and content has been permanently removed from MDN, I cannot trust, or use, MDN for any purpose. If you put it in one place, how do I know you have not put it in another? The "AI" corruption is already interleaved with the content. Currently it seems you have to click in specific marked places to get the "AI" content to generate, but how can I be sure that this will remain the case in the future?
The function that MDN serves for me is to be an authoritative source. Lots of websites can tell me how to use, I dunno, the vertical-align property, but (other than the spec itself, which is not always practical as a day-to-day reference) developer.mozilla is the one place I can go to look something up and know that it is unequivocally accurate and grounded in the spec and browser practice. If developer.mozilla is now to be an admixture of verified information and speech-like strings randomly generated by a text statistical model, then developer.mozilla no longer serves that function (being an authoritative source). Either I have to double-check what I'm reading or I don't.