Fix(html): Handle `<br>` elements to insert line breaks in text
Fixes #1090 by updating the DOM parser to handle `<br>` elements and insert line breaks (`\n`) when converting HTML content to plain text.
Initially, I was not sure that adding a simple condition would be a reliable solution, so I checked how Chromium handles HTML-to-text conversion and found that it takes a similar approach. Here's the link.
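For readers following along, here is a minimal sketch of the idea, not the actual diff in this PR: walk the node tree, concatenate the text nodes, and emit `\n` whenever a `<br>` element is encountered. `textWithLineBreaks` is a hypothetical helper name used only for illustration; the sketch assumes the public `Node`, `Text`, and `Element` classes from `package:html/dom.dart` and `parseFragment` from `package:html/parser.dart`.

```dart
import 'package:html/dom.dart';
import 'package:html/parser.dart' show parseFragment;

/// Hypothetical helper (not part of package:html): returns the text of
/// [node] and its descendants, writing '\n' for every <br> element.
String textWithLineBreaks(Node node) {
  final buffer = StringBuffer();

  void visit(Node current) {
    if (current is Text) {
      // Text nodes contribute their character data unchanged.
      buffer.write(current.data);
    } else if (current is Element && current.localName == 'br') {
      // A <br> has no children; represent it as a single line break.
      buffer.write('\n');
    } else {
      // Any other node (element, fragment, document): recurse into children.
      current.nodes.forEach(visit);
    }
  }

  visit(node);
  return buffer.toString();
}

void main() {
  final fragment = parseFragment('Hello<br>world');
  print(textWithLineBreaks(fragment)); // prints "Hello" and "world" on separate lines
}
```

Keeping the `<br>` check as its own branch means every other element is handled by plain recursion into its children, exactly as before.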
- [x] I’ve reviewed the contributor guide and applied the relevant portions to this PR.
Contribution guidelines:
- See our contributor guide for general expectations for PRs.
- Larger or significant changes should be discussed in an issue before creating a PR.
- Contributions to our repos should follow the Dart style guide and use dart format.
- Most changes should add an entry to the changelog and may need to rev the pubspec package version.
- Changes to packages require corresponding tests.
Note that many Dart repos have a weekly cadence for reviewing PRs - please allow for some latency before initial review feedback.
PR Health
Breaking changes :warning:
| Package | Change | Current Version | New Version | Needed Version | Looking good? |
|---|---|---|---|---|---|
| html | Breaking | 0.15.6 | 0.15.7-wip | 0.16.0 (got "0.15.7-wip", expected >= "0.16.0": breaking changes) | :warning: |
This check can be disabled by tagging the PR with skip-breaking-check.
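To illustrate why the check asks for 0.16.0, here is a small sketch assuming `package:pub_semver` is available. For a pre-1.0 package, the next breaking version bumps the minor component, so a breaking change on top of 0.15.6 needs at least 0.16.0. This only illustrates the version arithmetic; it is not the health check's own implementation.

```dart
import 'package:pub_semver/pub_semver.dart';

void main() {
  final current = Version.parse('0.15.6');
  final proposed = Version.parse('0.15.7-wip');

  // For 0.x versions, nextBreaking increments the minor component.
  final needed = current.nextBreaking;
  print(needed); // 0.16.0

  // The proposed version is lower than the required one, hence the warning.
  print(proposed >= needed); // false
}
```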
Changelog Entry :heavy_check_mark:
| Package | Changed Files |
|---|---|
Changes to files need to be accounted for in their respective changelogs.
This check can be disabled by tagging the PR with skip-changelog-check.
Coverage :heavy_check_mark:
| File | Coverage |
|---|---|
| pkgs/html/lib/dom.dart | :green_heart: 65 % :arrow_up: 1 % |
This check for test coverage is informational (issues shown here will not fail the PR).
This check can be disabled by tagging the PR with skip-coverage-check.
API leaks :warning:
The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.
| Package | Leaked API symbol | Leaking sources |
|---|---|---|
| html | HtmlTokenizer | html/parser.dart::HtmlParser::tokenizer |
| html | Token | tokenizer.dart::HtmlTokenizer tokenizer.dart::HtmlTokenizer::tokenQueue tokenizer.dart::HtmlTokenizer::currentToken tokenizer.dart::HtmlTokenizer::currentToken token.dart::TagToken token.dart::DoctypeToken token.dart::StringToken tokenizer.dart::HtmlTokenizer::current token.dart::StartTagToken token.dart::CommentToken html/parser.dart::Phase::processComment html/parser.dart::Phase::processDoctype token.dart::CharactersToken html/parser.dart::Phase::processCharacters token.dart::SpaceCharactersToken html/parser.dart::Phase::processSpaceCharacters html/parser.dart::Phase::processStartTag html/parser.dart::Phase::startTagHtml token.dart::EndTagToken html/parser.dart::Phase::processEndTag html/parser.dart::HtmlParser::inForeignContent::token html/parser.dart::HtmlParser::parseRCDataRawtext::token html/parser.dart::BeforeHeadPhase::startTagOther html/parser.dart::BeforeHeadPhase::endTagImplyHead html/parser.dart::InHeadPhase::startTagOther html/parser.dart::InHeadPhase::endTagHtmlBodyBr html/parser.dart::AfterHeadPhase::startTagOther html/parser.dart::AfterHeadPhase::endTagHtmlBodyBr html/parser.dart::InBodyPhase::startTagProcessInHead html/parser.dart::InBodyPhase::startTagButton html/parser.dart::InBodyPhase::startTagOther html/parser.dart::InBodyPhase::endTagHtml html/parser.dart::InTablePhase::startTagCol html/parser.dart::InTablePhase::startTagImplyTbody html/parser.dart::InTablePhase::startTagTable html/parser.dart::InTablePhase::startTagStyleScript html/parser.dart::InCaptionPhase::startTagTableElement html/parser.dart::InCaptionPhase::startTagOther html/parser.dart::InCaptionPhase::endTagTable html/parser.dart::InCaptionPhase::endTagOther html/parser.dart::InColumnGroupPhase::startTagOther html/parser.dart::InColumnGroupPhase::endTagOther html/parser.dart::InTableBodyPhase::startTagTableCell html/parser.dart::InTableBodyPhase::startTagTableOther html/parser.dart::InTableBodyPhase::startTagOther html/parser.dart::InTableBodyPhase::endTagTable html/parser.dart::InTableBodyPhase::endTagOther html/parser.dart::InRowPhase::startTagTableOther html/parser.dart::InRowPhase::startTagOther html/parser.dart::InRowPhase::endTagTable html/parser.dart::InRowPhase::endTagTableRowGroup html/parser.dart::InRowPhase::endTagOther html/parser.dart::InCellPhase::startTagTableOther html/parser.dart::InCellPhase::startTagOther html/parser.dart::InCellPhase::endTagImply html/parser.dart::InCellPhase::endTagOther html/parser.dart::InSelectPhase::startTagInput html/parser.dart::InSelectPhase::startTagScript html/parser.dart::InSelectPhase::startTagOther html/parser.dart::InSelectInTablePhase::startTagTable html/parser.dart::InSelectInTablePhase::startTagOther html/parser.dart::InSelectInTablePhase::endTagTable html/parser.dart::InSelectInTablePhase::endTagOther html/parser.dart::AfterBodyPhase::startTagOther html/parser.dart::AfterBodyPhase::endTagHtml::token html/parser.dart::AfterBodyPhase::endTagOther html/parser.dart::InFramesetPhase::startTagNoframes html/parser.dart::InFramesetPhase::startTagOther html/parser.dart::AfterFramesetPhase::startTagNoframes html/parser.dart::AfterAfterBodyPhase::startTagOther html/parser.dart::AfterAfterFramesetPhase::startTagNoFrames |
| html | HtmlInputStream | tokenizer.dart::HtmlTokenizer::stream |
| html | TagToken | tokenizer.dart::HtmlTokenizer::currentTagToken token.dart::StartTagToken token.dart::EndTagToken html/parser.dart::InTableBodyPhase::startTagTableOther::token html/parser.dart::InTableBodyPhase::endTagTable::token |
| html | DoctypeToken | tokenizer.dart::HtmlTokenizer::currentDoctypeToken treebuilder.dart::TreeBuilder::insertDoctype::token html/parser.dart::Phase::processDoctype::token |
| html | StringToken | tokenizer.dart::HtmlTokenizer::currentStringToken token.dart::StringToken::add treebuilder.dart::TreeBuilder::insertComment::token token.dart::CommentToken token.dart::CharactersToken token.dart::SpaceCharactersToken html/parser.dart::InBodyPhase::processSpaceCharactersDropNewline::token html/parser.dart::InTableTextPhase::characterTokens |
| html | TreeBuilder | html/parser.dart::HtmlParser::tree html/parser.dart::Phase::tree html/parser.dart::HtmlParser::new::tree |
| html | ActiveFormattingElements | treebuilder.dart::TreeBuilder::activeFormattingElements |
| html | StartTagToken | treebuilder.dart::TreeBuilder::insertRoot::token treebuilder.dart::TreeBuilder::createElement::token treebuilder.dart::TreeBuilder::insertElement::token treebuilder.dart::TreeBuilder::insertElementNormal::token treebuilder.dart::TreeBuilder::insertElementTable::token html/parser.dart::Phase::processStartTag::token html/parser.dart::Phase::startTagHtml::token html/parser.dart::HtmlParser::adjustMathMLAttributes::token html/parser.dart::HtmlParser::adjustSVGAttributes::token html/parser.dart::HtmlParser::adjustForeignAttributes::token html/parser.dart::BeforeHeadPhase::startTagHead::token html/parser.dart::BeforeHeadPhase::startTagOther::token html/parser.dart::InHeadPhase::startTagHead::token html/parser.dart::InHeadPhase::startTagBaseLinkCommand::token html/parser.dart::InHeadPhase::startTagMeta::token html/parser.dart::InHeadPhase::startTagTitle::token html/parser.dart::InHeadPhase::startTagNoScriptNoFramesStyle::token html/parser.dart::InHeadPhase::startTagScript::token html/parser.dart::InHeadPhase::startTagOther::token html/parser.dart::AfterHeadPhase::startTagBody::token html/parser.dart::AfterHeadPhase::startTagFrameset::token html/parser.dart::AfterHeadPhase::startTagFromHead::token html/parser.dart::AfterHeadPhase::startTagHead::token html/parser.dart::AfterHeadPhase::startTagOther::token html/parser.dart::InBodyPhase::addFormattingElement::token html/parser.dart::InBodyPhase::startTagProcessInHead::token html/parser.dart::InBodyPhase::startTagBody::token html/parser.dart::InBodyPhase::startTagFrameset::token html/parser.dart::InBodyPhase::startTagCloseP::token html/parser.dart::InBodyPhase::startTagPreListing::token html/parser.dart::InBodyPhase::startTagForm::token html/parser.dart::InBodyPhase::startTagListItem::token html/parser.dart::InBodyPhase::startTagPlaintext::token html/parser.dart::InBodyPhase::startTagHeading::token html/parser.dart::InBodyPhase::startTagA::token html/parser.dart::InBodyPhase::startTagFormatting::token html/parser.dart::InBodyPhase::startTagNobr::token html/parser.dart::InBodyPhase::startTagButton::token html/parser.dart::InBodyPhase::startTagAppletMarqueeObject::token html/parser.dart::InBodyPhase::startTagXmp::token html/parser.dart::InBodyPhase::startTagTable::token html/parser.dart::InBodyPhase::startTagVoidFormatting::token html/parser.dart::InBodyPhase::startTagInput::token html/parser.dart::InBodyPhase::startTagParamSource::token html/parser.dart::InBodyPhase::startTagHr::token html/parser.dart::InBodyPhase::startTagImage::token html/parser.dart::InBodyPhase::startTagIsIndex::token html/parser.dart::InBodyPhase::startTagTextarea::token html/parser.dart::InBodyPhase::startTagIFrame::token html/parser.dart::InBodyPhase::startTagRawtext::token html/parser.dart::InBodyPhase::startTagOpt::token html/parser.dart::InBodyPhase::startTagSelect::token html/parser.dart::InBodyPhase::startTagRpRt::token html/parser.dart::InBodyPhase::startTagMath::token html/parser.dart::InBodyPhase::startTagSvg::token html/parser.dart::InBodyPhase::startTagMisplaced::token html/parser.dart::InBodyPhase::startTagOther::token html/parser.dart::InTablePhase::startTagCaption::token html/parser.dart::InTablePhase::startTagColgroup::token html/parser.dart::InTablePhase::startTagCol::token html/parser.dart::InTablePhase::startTagRowGroup::token html/parser.dart::InTablePhase::startTagImplyTbody::token html/parser.dart::InTablePhase::startTagTable::token html/parser.dart::InTablePhase::startTagStyleScript::token html/parser.dart::InTablePhase::startTagInput::token html/parser.dart::InTablePhase::startTagForm::token html/parser.dart::InTablePhase::startTagOther::token html/parser.dart::InCaptionPhase::startTagTableElement::token html/parser.dart::InCaptionPhase::startTagOther::token html/parser.dart::InColumnGroupPhase::startTagCol::token html/parser.dart::InColumnGroupPhase::startTagOther::token html/parser.dart::InTableBodyPhase::startTagTr::token html/parser.dart::InTableBodyPhase::startTagTableCell::token html/parser.dart::InTableBodyPhase::startTagOther::token html/parser.dart::InRowPhase::startTagTableCell::token html/parser.dart::InRowPhase::startTagTableOther::token html/parser.dart::InRowPhase::startTagOther::token html/parser.dart::InCellPhase::startTagTableOther::token html/parser.dart::InCellPhase::startTagOther::token html/parser.dart::InSelectPhase::startTagOption::token html/parser.dart::InSelectPhase::startTagOptgroup::token html/parser.dart::InSelectPhase::startTagSelect::token html/parser.dart::InSelectPhase::startTagInput::token html/parser.dart::InSelectPhase::startTagScript::token html/parser.dart::InSelectPhase::startTagOther::token html/parser.dart::InSelectInTablePhase::startTagTable::token html/parser.dart::InSelectInTablePhase::startTagOther::token html/parser.dart::InForeignContentPhase::adjustSVGTagNames::token html/parser.dart::AfterBodyPhase::startTagOther::token html/parser.dart::InFramesetPhase::startTagFrameset::token html/parser.dart::InFramesetPhase::startTagFrame::token html/parser.dart::InFramesetPhase::startTagNoframes::token html/parser.dart::InFramesetPhase::startTagOther::token html/parser.dart::AfterFramesetPhase::startTagNoframes::token html/parser.dart::AfterFramesetPhase::startTagOther::token html/parser.dart::AfterAfterBodyPhase::startTagOther::token html/parser.dart::AfterAfterFramesetPhase::startTagNoFrames::token html/parser.dart::AfterAfterFramesetPhase::startTagOther::token |
| html | TagAttribute | token.dart::StartTagToken::attributeSpans |
| html | CommentToken | html/parser.dart::Phase::processComment::token |
| html | CharactersToken | html/parser.dart::Phase::processCharacters::token html/parser.dart::InTablePhase::insertText::token |
| html | SpaceCharactersToken | html/parser.dart::Phase::processSpaceCharacters::token |
| html | EndTagToken | html/parser.dart::Phase::processEndTag::token html/parser.dart::Phase::popOpenElementsUntil::token html/parser.dart::BeforeHeadPhase::endTagImplyHead::token html/parser.dart::BeforeHeadPhase::endTagOther::token html/parser.dart::InHeadPhase::endTagHead::token html/parser.dart::InHeadPhase::endTagHtmlBodyBr::token html/parser.dart::InHeadPhase::endTagOther::token html/parser.dart::AfterHeadPhase::endTagHtmlBodyBr::token html/parser.dart::AfterHeadPhase::endTagOther::token html/parser.dart::InBodyPhase::endTagP::token html/parser.dart::InBodyPhase::endTagBody::token html/parser.dart::InBodyPhase::endTagHtml::token html/parser.dart::InBodyPhase::endTagBlock::token html/parser.dart::InBodyPhase::endTagForm::token html/parser.dart::InBodyPhase::endTagListItem::token html/parser.dart::InBodyPhase::endTagHeading::token html/parser.dart::InBodyPhase::endTagFormatting::token html/parser.dart::InBodyPhase::endTagAppletMarqueeObject::token html/parser.dart::InBodyPhase::endTagBr::token html/parser.dart::InBodyPhase::endTagOther::token html/parser.dart::TextPhase::endTagScript::token html/parser.dart::TextPhase::endTagOther::token html/parser.dart::InTablePhase::endTagTable::token html/parser.dart::InTablePhase::endTagIgnore::token html/parser.dart::InTablePhase::endTagOther::token html/parser.dart::InCaptionPhase::endTagCaption::token html/parser.dart::InCaptionPhase::endTagTable::token html/parser.dart::InCaptionPhase::endTagIgnore::token html/parser.dart::InCaptionPhase::endTagOther::token html/parser.dart::InColumnGroupPhase::endTagColgroup::token html/parser.dart::InColumnGroupPhase::endTagCol::token html/parser.dart::InColumnGroupPhase::endTagOther::token html/parser.dart::InTableBodyPhase::endTagTableRowGroup::token html/parser.dart::InTableBodyPhase::endTagIgnore::token html/parser.dart::InTableBodyPhase::endTagOther::token html/parser.dart::InRowPhase::endTagTr::token html/parser.dart::InRowPhase::endTagTable::token html/parser.dart::InRowPhase::endTagTableRowGroup::token html/parser.dart::InRowPhase::endTagIgnore::token html/parser.dart::InRowPhase::endTagOther::token html/parser.dart::InCellPhase::endTagTableCell::token html/parser.dart::InCellPhase::endTagIgnore::token html/parser.dart::InCellPhase::endTagImply::token html/parser.dart::InCellPhase::endTagOther::token html/parser.dart::InSelectPhase::endTagOption::token html/parser.dart::InSelectPhase::endTagOptgroup::token html/parser.dart::InSelectPhase::endTagSelect::token html/parser.dart::InSelectPhase::endTagOther::token html/parser.dart::InSelectInTablePhase::endTagTable::token html/parser.dart::InSelectInTablePhase::endTagOther::token html/parser.dart::AfterBodyPhase::endTagOther::token html/parser.dart::InFramesetPhase::endTagFrameset::token html/parser.dart::InFramesetPhase::endTagOther::token html/parser.dart::AfterFramesetPhase::endTagHtml::token html/parser.dart::AfterFramesetPhase::endTagOther::token |
This check can be disabled by tagging the PR with skip-leaking-check.
License Headers :warning:
// Copyright (c) 2025, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
| Files |
|---|
| pkgs/html/lib/dom.dart |
| pkgs/html/test/parser_feature_test.dart |
All source files should start with a license header.
Unrelated files missing license headers
| Files |
|---|
| pkgs/bazel_worker/benchmark/benchmark.dart |
| pkgs/benchmark_harness/integration_test/perf_benchmark_test.dart |
| pkgs/boolean_selector/example/example.dart |
| pkgs/clock/lib/clock.dart |
| pkgs/clock/lib/src/clock.dart |
| pkgs/clock/lib/src/default.dart |
| pkgs/clock/lib/src/stopwatch.dart |
| pkgs/clock/lib/src/utils.dart |
| pkgs/clock/test/clock_test.dart |
| pkgs/clock/test/default_test.dart |
| pkgs/clock/test/stopwatch_test.dart |
| pkgs/clock/test/utils.dart |
| pkgs/coverage/lib/src/coverage_options.dart |
| pkgs/html/example/main.dart |
| pkgs/html/lib/dom_parsing.dart |
| pkgs/html/lib/html_escape.dart |
| pkgs/html/lib/parser.dart |
| pkgs/html/lib/src/constants.dart |
| pkgs/html/lib/src/encoding_parser.dart |
| pkgs/html/lib/src/html_input_stream.dart |
| pkgs/html/lib/src/list_proxy.dart |
| pkgs/html/lib/src/query_selector.dart |
| pkgs/html/lib/src/token.dart |
| pkgs/html/lib/src/tokenizer.dart |
| pkgs/html/lib/src/treebuilder.dart |
| pkgs/html/lib/src/utils.dart |
| pkgs/html/test/dom_test.dart |
| pkgs/html/test/parser_test.dart |
| pkgs/html/test/query_selector_test.dart |
| pkgs/html/test/selectors/level1_baseline_test.dart |
| pkgs/html/test/selectors/level1_lib.dart |
| pkgs/html/test/selectors/selectors.dart |
| pkgs/html/test/support.dart |
| pkgs/html/test/tokenizer_test.dart |
| pkgs/html/test/trie_test.dart |
| pkgs/html/tool/generate_trie.dart |
| pkgs/pubspec_parse/test/git_uri_test.dart |
| pkgs/stack_trace/example/example.dart |
| pkgs/watcher/test/custom_watcher_factory_test.dart |
| pkgs/yaml_edit/example/example.dart |
This check can be disabled by tagging the PR with skip-license-check.
Hey, thanks for reviewing this! 🙌 It’s been a few months since I worked on it, and I was still getting familiar with the codebase at the time — so I’ll need to refresh myself on the changes. I'll take a look as soon as I can. Appreciate your feedback!
@Dhruv-Maradiya Just a friendly ping as I am looking through PRs - is there intention to land this?
Hey @mosuem, sorry for the delay! I’ll try to wrap this up ASAP, most likely today.
Friendly ping :) (No pressure, just happened to walk by this tab in my browser)
Package publishing
| Package | Version | Status | Publish tag (post-merge) |
|---|---|---|---|
| package:bazel_worker | 1.1.4 | already published at pub.dev | |
| package:benchmark_harness | 2.4.0-wip | WIP (no publish necessary) | |
| package:boolean_selector | 2.1.2 | already published at pub.dev | |
| package:browser_launcher | 1.1.3 | already published at pub.dev | |
| package:cli_config | 0.2.1-wip | WIP (no publish necessary) | |
| package:cli_util | 0.5.0-wip | WIP (no publish necessary) | |
| package:clock | 1.1.3-wip | WIP (no publish necessary) | |
| package:code_builder | 4.11.0 | already published at pub.dev | |
| package:coverage | 1.15.0 | already published at pub.dev | |
| package:csslib | 1.0.2 | already published at pub.dev | |
| package:extension_discovery | 2.1.0 | already published at pub.dev | |
| package:file | 7.0.2-wip | WIP (no publish necessary) | |
| package:file_testing | 3.1.0-wip | WIP (no publish necessary) | |
| package:glob | 2.1.3 | already published at pub.dev | |
| package:graphs | 2.3.3-wip | WIP (no publish necessary) | |
| package:html | 0.15.7-wip | WIP (no publish necessary) | |
| package:io | 1.1.0-wip | WIP (no publish necessary) | |
| package:json_rpc_2 | 4.0.0 | already published at pub.dev | |
| package:markdown | 7.3.1-wip | WIP (no publish necessary) | |
| package:mime | 2.0.0 | already published at pub.dev | |
| package:oauth2 | 2.0.4 | ready to publish | oauth2-v2.0.4 |
| package:package_config | 2.3.0-wip | WIP (no publish necessary) | |
| package:pool | 1.5.2 | already published at pub.dev | |
| package:process | 5.0.5 | already published at pub.dev | |
| package:pub_semver | 2.2.0 | already published at pub.dev | |
| package:pubspec_parse | 1.5.1-wip | WIP (no publish necessary) | |
| package:source_map_stack_trace | 2.1.3-wip | WIP (no publish necessary) | |
| package:source_maps | 0.10.14-wip | WIP (no publish necessary) | |
| package:source_span | 1.10.1 | already published at pub.dev | |
| package:sse | 4.1.8 | already published at pub.dev | |
| package:stack_trace | 1.12.1 | already published at pub.dev | |
| package:stream_channel | 2.1.4 | already published at pub.dev | |
| package:stream_transform | 2.1.2-wip | WIP (no publish necessary) | |
| package:string_scanner | 1.4.1 | already published at pub.dev | |
| package:term_glyph | 1.2.3-wip | WIP (no publish necessary) | |
| package:test_reflective_loader | 0.4.0 | already published at pub.dev | |
| package:timing | 1.0.2 | already published at pub.dev | |
| package:unified_analytics | 8.0.6 | ready to publish | unified_analytics-v8.0.6 |
| package:watcher | 1.1.5-wip | WIP (no publish necessary) | |
| package:yaml | 3.1.3 | already published at pub.dev | |
| package:yaml_edit | 2.2.2 | already published at pub.dev | |
Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation.
@HosseinYousefi could you take another look?