Maunz-Discord
Bump org.jsoup:jsoup from 1.17.2 to 1.18.1
Bumps org.jsoup:jsoup from 1.17.2 to 1.18.1.
Release notes
Sourced from org.jsoup:jsoup's releases.
jsoup-1.18.1
https://jsoup.org/news/release-1.18.1
Changelog
Sourced from org.jsoup:jsoup's changelog.
1.18.1 (Pending)
Improvements
- Stream Parser: A `StreamParser` provides a progressive parse of its input. As each `Element` is completed, it is emitted via a `Stream` or `Iterator` interface. Elements returned will be complete with all their children, and an (empty) next sibling, if applicable. Elements (or their children) may be removed from the DOM during the parse, for e.g. to conserve memory, providing a mechanism to parse an input document that would otherwise be too large to fit into memory, yet still providing a DOM interface to the document and its elements. Additionally, the parser provides a `selectFirst(String query)` / `selectNext(String query)`, which will run the parser until a hit is found, at which point the parse is suspended. It can be resumed via another `select()` call, or via the `stream()` or `iterator()` methods. 2096
- Download Progress: added a Response Progress event interface, which reports progress as URLs are downloaded (and parsed). Supported on both a session and a single connection level. 2164, 656
- Added `Path` accepting parse methods: `Jsoup.parse(Path)`, `Jsoup.parse(path, charsetName, baseUri, parser)`, etc. 2055
- Updated the `button` tag configuration to include a space between multiple button elements in the `Element.text()` method. 2105
- Added support for the `ns|*` all elements in namespace Selector. 1811
- When normalising attribute names during serialization, invalid characters are now replaced with `_`, vs being stripped. This should make the process clearer, and generally prevent an invalid attribute name being coerced unexpectedly. 2143
Changes
- Removed previously deprecated internal classes and methods. 2094
- Build change: the built jar's OSGi manifest no longer imports itself. 2158
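The attribute-name item in the Improvements list is easier to see with a concrete case. The following is a minimal sketch in plain Java, not jsoup's implementation: the class name `AttributeNameSketch`, both helper methods, and the simplified set of "valid" characters are assumptions made for illustration. It shows how stripping can silently turn an invalid name into a different, valid one, while replacing with `_` keeps the two apart.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; jsoup's real validity rules differ.
final class AttributeNameSketch {
    // Assumption: treat anything outside letters, digits, '_', ':', '.', '-' as invalid.
    private static final Pattern INVALID = Pattern.compile("[^A-Za-z0-9_:.-]");

    // Old-style behaviour: drop invalid characters entirely.
    static String stripInvalid(String name) {
        return INVALID.matcher(name).replaceAll("");
    }

    // New-style behaviour: substitute each invalid character with '_'.
    static String replaceInvalid(String name) {
        return INVALID.matcher(name).replaceAll("_");
    }

    public static void main(String[] args) {
        String bad = "data\"x";   // contains an invalid quote character
        String good = "datax";    // already valid

        // Stripping makes the invalid name collide with the valid one.
        System.out.println(stripInvalid(bad).equals(good));   // true

        // Replacing keeps them distinct, so the coercion is visible.
        System.out.println(replaceInvalid(bad));               // data_x
        System.out.println(replaceInvalid(bad).equals(good));  // false
    }
}
```

With stripping, `data"x` and `datax` serialize to the same attribute name; with replacement, the first becomes `data_x`, so the substitution is visible in the output.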
Bug Fixes
- When tracking source positions, if the first node was a TextNode, its position was incorrectly set to `-1`. 2106
- When connecting (or redirecting) to URLs with characters such as `{`, `}` in the path, a Malformed URL exception would be thrown (if in development), or the URL might otherwise not be escaped correctly (if in production). The URL encoding process has been improved to handle these characters correctly. 2142
- When using `W3CDom` with a custom output Document, a Null Pointer Exception would be thrown. 2114
- The `:has()` selector did not match correctly when using sibling combinators (like e.g.: `h1:has(+h2)`). 2137
- The `:empty` selector incorrectly matched elements that started with a blank text node and were followed by non-empty nodes, due to an incorrect short-circuit. 2130
- `Element.cssSelector()` would fail with "Did not find balanced marker" when building a selector for elements that had a `(` or `[` in their class names. And selectors with those characters escaped would not match as expected. 2146
- Updated `Entities.escape(string)` to make the escaped text suitable for both text nodes and attributes (previously was only for text nodes). This does not impact the output of `Element.html()` which correctly applies a minimal escape depending on if the use will be for text data or in a quoted attribute. 1278
... (truncated)
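The `{` / `}` URL fix in the list above concerns characters that are not legal in a raw URL path and must be percent-encoded. As a rough sketch of that idea using only the JDK (this is not jsoup's encoding code; the class `PathEncodingSketch`, its methods, and the `example.com` URL are hypothetical), the multi-argument `java.net.URI` constructor quotes such characters, whereas the single-string constructor rejects them:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration using java.net.URI, not jsoup internals.
final class PathEncodingSketch {
    // Builds a URI from parts; the multi-argument constructor percent-encodes
    // characters that are illegal in a path, such as '{' and '}'.
    static String encode(String scheme, String host, String path) {
        try {
            return new URI(scheme, host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Reports whether the raw string is accepted as-is by the strict parser.
    static boolean parsesRaw(String url) {
        try {
            new URI(url);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The unescaped form is rejected by the single-string constructor.
        System.out.println(parsesRaw("https://example.com/api/{id}")); // false

        // Building from components escapes the braces.
        System.out.println(encode("https", "example.com", "/api/{id}"));
        // https://example.com/api/%7Bid%7D
    }
}
```

After upgrading, a quick check against any URLs in this project that contain such characters would confirm they now resolve as expected.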
Commits
- 19e8539 [maven-release-plugin] prepare release jsoup-1.18.1
- c8b6f2e Progress javadoc tweaks
- 6cbe7e4 Replace attribute invalid characters with `_`, vs stripping
- 68f6f9c Bump jetty.version from 9.4.54.v20240208 to 9.4.55.v20240627 (#2168)
- 6423e65 Relaxed the multi-thread w/o newRequest test
- 6c55f01 Bump org.codehaus.mojo:animal-sniffer-maven-plugin from 1.23 to 1.24 (#2167)
- e1bfee9 Shh
- b4b3fd1 Added test of partial fetch in Stream Parser
- 9ba6dc7 Make Entities.escape(string) suitable for both text and attributes
- a0537c7 Handle escaped characters in consumeSubQuery
- Additional commits viewable in compare view
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)