Fix setImageFromContent regex to extract full image URI
Issue
Because setImageFromContent was only matching URLs up to the file extension, it stripped off any query parameters. In practice this means:
- All Reddit feeds (e.g. https://www.reddit.com/r/minecraft.rss) would fall back to a link without its query string.
- In the official demo app PR, and in any client application using this library, those links no longer resolve correctly, resulting in broken or non-functional article URLs.
- The bug affects any RSS or Atom feed where a valid link must include query parameters.
For example, given the URL:
https://www.reddit.com/media?url=https%3A%2F%2Fpreview.redd.it%2Fhg67qgm2ii6f1.png%3Fwidth%3D640%26crop%3Dsmart%26auto%3Dwebp%26s%3D8af0554e2f97a5f9d0d744c5b542071265948a52
the old regex would truncate it to:
https://www.reddit.com/media?url=https%3A%2F%2Fpreview.redd.it%2Fhg67qgm2ii6f1.png
which drops the percent-encoded query parameters and therefore no longer resolves correctly.
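The truncation can be reproduced with a minimal Python sketch. The pattern below is a hypothetical stand-in for the old regex, not the library's actual expression; it only assumes that matching stopped at the first image file extension.

```python
import re

# Hypothetical stand-in for the old pattern (an assumption, not the
# library's real regex): the match ends at the first image extension.
OLD_IMAGE_URL = re.compile(r"https?://\S+?\.(?:png|jpe?g|gif|webp)", re.IGNORECASE)

# The example URL from this description, split only for readability.
full_url = (
    "https://www.reddit.com/media?url=https%3A%2F%2Fpreview.redd.it%2Fhg67qgm2ii6f1.png"
    "%3Fwidth%3D640%26crop%3Dsmart%26auto%3Dwebp%26s%3D8af0554e2f97a5f9d0d744c5b542071265948a52"
)

# Everything after ".png" is lost because the pattern has nothing to match it.
truncated = OLD_IMAGE_URL.search(full_url).group(0)
print(truncated)
```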
Fix
- Added URL decoding: before running the URL regex, we now unescape the most common XML/HTML entities (&amp;, &quot;, &lt;, &gt;) so that ampersands and other characters are restored to their literal form.
- Extended the URL regex to capture query parameters.
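The two steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code in this PR: the helper names, the entity table, the extension list, and the pattern itself are all hypothetical.

```python
import re
from typing import Optional

# Assumed entity table. "&amp;" is decoded last so that a sequence such as
# "&amp;lt;" is unescaped exactly one level (to "&lt;"), not two.
ENTITIES = {"&quot;": '"', "&lt;": "<", "&gt;": ">", "&amp;": "&"}

# Assumed pattern: after the image extension, keep consuming URL characters
# (query string included) until whitespace, a quote, or an angle bracket.
IMAGE_URL = re.compile(
    r"""https?://[^\s"'<>]+?\.(?:png|jpe?g|gif|webp)[^\s"'<>]*""",
    re.IGNORECASE,
)


def unescape_entities(text: str) -> str:
    """Replace the handful of entities listed in ENTITIES with literal characters."""
    for entity, char in ENTITIES.items():
        text = text.replace(entity, char)
    return text


def extract_image_url(content: str) -> Optional[str]:
    """Return the first image URL in content, query parameters included."""
    match = IMAGE_URL.search(unescape_entities(content))
    return match.group(0) if match else None


# Hypothetical feed snippet with an escaped query string.
snippet = '<img src="https://preview.redd.it/abc.png?width=640&amp;crop=smart&amp;s=8af0" />'
print(extract_image_url(snippet))
```

Decoding before matching matters here: without it, the captured URL would still contain the escaped `&amp;` separators from the feed markup.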