Rumble 2.0
Preface
I’ve come up with an idea for an improved version of the RumbleBridge to make it more modern and robust, incorporating additional data like embed URLs, thumbnails, and durations from Rumble. However, I’m inexperienced with GitHub and its workflows, so I’m submitting this as a request rather than a direct pull request. Below is the changelog detailing the enhancements and the updated feed output. I’d appreciate any feedback or assistance in getting this integrated!
Changelog: Improvements to RumbleBridge
Changed
- Caching Behavior:
  - Changed CACHE_TIMEOUT from 60 * 60 (1 hour, fixed) to 0 (no caching by default).
  - Added new parameter cache_timeout (type: number, default: 0) to PARAMETERS, dynamically configurable via getCacheTimeout().
  - Impact: Users can now control caching through the UI (e.g., set to 0 for always fresh data).
- Description:
  - Updated from 'Fetches the latest channel/user videos and livestreams.' to 'Fetches detailed channel/user videos and livestreams from Rumble.'.
  - Impact: Reflects the enhanced data output.
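As a minimal sketch of how the cache_timeout parameter resolves to an effective timeout: resolveCacheTimeout() below is a hypothetical standalone helper (not part of RSS-Bridge) that mirrors the `?:` fallback used in the proposed getCacheTimeout(), and $input stands in for the raw form value.

```php
<?php
// Hypothetical helper mirroring the fallback in the proposed getCacheTimeout().
// $input stands in for the raw cache_timeout value from the bridge form.
function resolveCacheTimeout($input, int $default): int
{
    // Empty, non-numeric, or zero input casts to 0, which is falsy,
    // so the short ternary falls through to the default.
    return (int) $input ?: $default;
}

var_dump(resolveCacheTimeout('300', 0));  // a positive value is used as-is
var_dump(resolveCacheTimeout('', 3600));  // empty input falls back to the default
var_dump(resolveCacheTimeout('0', 3600)); // 0 also falls back to the default
```

With this fallback, an explicit 0 only disables caching when the default is itself 0, as it is in the proposed class.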
Added
- Data Sources:
  - Added primary extraction of JSON-LD data using getContents() and preg_match() to parse <script type="application/ld+json">.
  - Retained HTML parsing with getSimpleHTMLDOM() as a fallback when JSON-LD is unavailable.
  - Impact: More stable and modern data extraction with JSON-LD; fallback improves reliability.
- Feed Item Content:
  - JSON-LD (Primary Source):
    - Added embedUrl as an <iframe> in content.
    - Added description as a <p> in content.
    - Added thumbnailUrl to enclosures.
    - Added duration as text in content and as seconds in itunes:duration.
  - HTML Fallback:
    - Kept minimal extraction similar to the original, using defaultLinkTo() for cleaner links.
  - Impact: Richer feed items with embed URLs, thumbnails, descriptions, and durations when JSON-LD is available.
- Parameter:
  - Added cache_timeout to PARAMETERS for user-configurable caching.
  - Impact: Enhances flexibility for feed refresh rates.
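The JSON-LD extraction listed under Data Sources can be sketched in isolation as follows. This is an illustration only: the sample markup and values are invented, extractVideoObjects() is a hypothetical helper name, and it assumes the page embeds schema.org VideoObject data, which is not verified here.

```php
<?php
// Sample page fragment; the markup and values are invented for illustration.
$html = <<<'HTML'
<html><head>
<script type="application/ld+json">
{"@type": "VideoObject", "name": "My New Video",
 "url": "https://rumble.com/v12345-my-video.html",
 "embedUrl": "https://rumble.com/embed/v12345",
 "duration": "PT15M30S"}
</script>
</head></html>
HTML;

// Hypothetical helper: return every VideoObject found in the first JSON-LD block.
function extractVideoObjects(string $html): array
{
    $pattern = '/<script[^>]*application\/ld\+json[^>]*>(.*?)<\/script>/s';
    if (!preg_match($pattern, $html, $matches)) {
        return []; // no JSON-LD: the caller would fall back to HTML parsing
    }
    $data = json_decode($matches[1], true);
    if (!is_array($data)) {
        return []; // malformed JSON is treated like missing JSON-LD
    }
    // A page may wrap several entities in "@graph" or expose a single object.
    $candidates = $data['@graph'] ?? [$data];
    $videos = [];
    foreach ($candidates as $candidate) {
        if (is_array($candidate) && ($candidate['@type'] ?? null) === 'VideoObject') {
            $videos[] = $candidate;
        }
    }
    return $videos;
}

$videos = extractVideoObjects($html);
echo count($videos), ' video(s): ', $videos[0]['name'], PHP_EOL;
```

An empty result is the signal to use the HTML fallback, which keeps the two data sources cleanly separated.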
Improved
- Error Handling:
  - Replaced throw new \Exception with API-compliant returnServerError() for invalid account, type, or failed requests.
  - Impact: Better integration with RSS-Bridge error handling conventions.
- Code Structure:
  - Split logic into helper methods:
    - createItemFromJsonLd(): Handles JSON-LD data.
    - createItemFromHtml(): Handles HTML fallback.
    - parseDurationToSeconds(): Converts PT1H2M3S to seconds.
  - Impact: Improved readability and maintainability.
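To make the duration conversion concrete, here is a small standalone variant of the idea behind parseDurationToSeconds(). The function name durationToSeconds(), the anchored pattern, and the null return for unparseable input are choices of this sketch, not the behavior of the proposed method.

```php
<?php
// Hypothetical standalone version of the duration helper described above.
// Handles only the time part (PT..H..M..S); other inputs yield null.
function durationToSeconds(string $duration): ?int
{
    if (!preg_match('/^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$/', $duration, $m)) {
        return null;
    }
    // Unmatched optional groups are missing or empty; both cast to 0.
    $hours = (int) ($m[1] ?? 0);
    $minutes = (int) ($m[2] ?? 0);
    $seconds = (int) ($m[3] ?? 0);
    return $hours * 3600 + $minutes * 60 + $seconds;
}

var_dump(durationToSeconds('PT1H2M3S')); // int(3723)
var_dump(durationToSeconds('PT15M30S')); // int(930)
var_dump(durationToSeconds('15:30'));    // NULL
```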
Feed Output
The feed items ($this->items) now contain the following fields, depending on the data source:
When JSON-LD is Available (Primary Source):
- title: Video title from name (e.g., "My New Video"), fallback to "Untitled".
- author: $account . '@rumble.com' (the account name followed by @rumble.com).
- uri: Video URL from url (e.g., "https://rumble.com/v12345-my-video.html").
- timestamp: Unix timestamp from uploadDate (e.g., 1710278400), fallback to current time.
- content: HTML string with:
  - <iframe src="[embedUrl]" frameborder="0" allowfullscreen></iframe> (e.g., <iframe src="https://rumble.com/embed/v12345" ...>), if embedUrl exists.
  - <p>[description]</p> (e.g., <p>This is my new video about Bitcoin.</p>), if description exists.
  - <p>Duration: [duration]</p> (e.g., <p>Duration: PT15M30S</p>), if duration exists.
- enclosures: Array with thumbnail URL from thumbnailUrl (e.g., ["https://rumble.com/thumbs/v12345.jpg"]), if available.
- itunes:duration: Duration in seconds from duration (e.g., "930" for 15m30s), if available.
When JSON-LD is Unavailable (HTML Fallback):
- title: Title from <h3> (e.g., "My New Video"), fallback to "Untitled".
- author: $account . '@rumble.com' (the account name followed by @rumble.com).
- uri: Video URL from <a href> (e.g., "https://rumble.com/v12345-my-video.html").
- timestamp: Unix timestamp from <time datetime> (e.g., 1710278400), if available.
- content: Cleaned HTML content with links to self::URI (e.g., <a href="https://rumble.com/v12345-my-video.html">My New Video</a>).
Code Rumble Bridge 2.0
<?php
class RumbleBridge extends BridgeAbstract
{
    const NAME = 'Rumble.com Bridge';
    const URI = 'https://rumble.com/';
    const DESCRIPTION = 'Fetches detailed channel/user videos and livestreams from Rumble.';
    const MAINTAINER = 'dvikan, NotsoanoNimus';
    const CACHE_TIMEOUT = 0;
    const PARAMETERS = [
        [
            'account' => [
                'name' => 'Account',
                'type' => 'text',
                'required' => true,
                'title' => 'Name of the target account (e.g., 21UhrBitcoinPodcast)',
            ],
            'type' => [
                'name' => 'Account Type',
                'type' => 'list',
                'title' => 'The type of profile to create a feed from.',
                'values' => [
                    'Channel (All)' => 'channel',
                    'Channel Videos' => 'channel-videos',
                    'Channel Livestreams' => 'channel-livestream',
                    'User (All)' => 'user',
                ],
            ],
            'cache_timeout' => [
                'name' => 'Cache Timeout (seconds)',
                'type' => 'number',
                'defaultValue' => 0,
                'title' => 'How long to cache the feed (0 for no caching)',
            ],
        ]
    ];
    public function collectData()
    {
        $account = $this->getInput('account');
        $type = $this->getInput('type');
        $url = self::URI;
        if (!preg_match('#^[\w\-_.@]+$#', $account) || strlen($account) > 64) {
            returnServerError('Invalid target account.');
        }
        switch ($type) {
            case 'user':
                $url .= "user/$account";
                break;
            case 'channel':
                $url .= "c/$account";
                break;
            case 'channel-videos':
                $url .= "c/$account/videos";
                break;
            case 'channel-livestream':
                $url .= "c/$account/livestreams";
                break;
            default:
                returnServerError('Invalid media type.');
        }
        $html = $this->getContents($url);
        if (!$html) {
            returnServerError("Failed to fetch $url");
        }
        $items = [];
        if (preg_match('/<script.*?application\/ld\+json.*?>(.*?)<\/script>/s', $html, $matches)) {
            $jsonData = json_decode($matches[1], true);
            if ($jsonData) {
                $videos = isset($jsonData['@graph']) ? $jsonData['@graph'] : [$jsonData];
                foreach ($videos as $item) {
                    if (isset($item['@type']) && $item['@type'] === 'VideoObject') {
                        $items[] = $this->createItemFromJsonLd($item, $account);
                    }
                }
            }
        }
        if (empty($items)) {
            $dom = $this->getSimpleHTMLDOM($url);
            if ($dom) {
                foreach ($dom->find('ol.thumbnail__grid div.thumbnail__grid--item') as $video) {
                    $items[] = $this->createItemFromHtml($video, $account);
                }
            } else {
                returnServerError("Failed to parse HTML from $url");
            }
        }
        $this->items = $items;
    }
    private function createItemFromJsonLd(array $json, string $account): array
    {
        $item = [
            'title' => html_entity_decode($json['name'] ?? 'Untitled', ENT_QUOTES, 'UTF-8'),
            'author' => $account . '@rumble.com',
            'uri' => $json['url'] ?? '',
            'timestamp' => (new DateTime($json['uploadDate'] ?? 'now'))->getTimestamp(),
            'content' => '',
        ];
        if (isset($json['embedUrl'])) {
            $item['content'] .= "<iframe src='{$json['embedUrl']}' frameborder='0' allowfullscreen></iframe>";
        }
        if (isset($json['description'])) {
            $item['content'] .= '<p>' . html_entity_decode($json['description'], ENT_QUOTES, 'UTF-8') . '</p>';
        }
        if (isset($json['thumbnailUrl'])) {
            $item['enclosures'] = [$json['thumbnailUrl']];
        }
        if (isset($json['duration'])) {
            $item['content'] .= "<p>Duration: {$json['duration']}</p>";
            $item['itunes:duration'] = $this->parseDurationToSeconds($json['duration']);
        }
        return $item;
    }
    private function createItemFromHtml($video, string $account): array
    {
        $href = $video->find('a', 0)->href ?? '';
        $item = [
            'title' => $video->find('h3', 0)->plaintext ?? 'Untitled',
            'author' => $account . '@rumble.com',
            'content' => $this->defaultLinkTo($video->innertext, self::URI),
            'uri' => self::URI . ltrim($href, '/'),
        ];
        $time = $video->find('time', 0);
        if ($time) {
            $item['timestamp'] = (new DateTime($time->getAttribute('datetime')))->getTimestamp();
        }
        return $item;
    }
    private function parseDurationToSeconds(string $duration): string
    {
        if (preg_match('/PT(\d+H)?(\d+M)?(\d+S)?/', $duration, $matches)) {
            $hours = (int) str_replace('H', '', $matches[1] ?? 0);
            $minutes = (int) str_replace('M', '', $matches[2] ?? 0);
            $seconds = (int) str_replace('S', '', $matches[3] ?? 0);
            return (string) ($hours * 3600 + $minutes * 60 + $seconds);
        }
        return $duration;
    }
    public function getName()
    {
        return $this->getInput('account') ? "Rumble.com - {$this->getInput('account')}" : self::NAME;
    }

    public function getCacheTimeout()
    {
        return (int) $this->getInput('cache_timeout') ?: self::CACHE_TIMEOUT;
    }
}
Disabling the cache is not a good idea. We want to avoid getting banned by Rumble for excessive requests.
There is already a global setting that enables a user-defined cache TTL on all bridges (the bridge owner can enable this).
What is JSON-LD?
returnServerError() is a relic of the past with a misleading name and serves no purpose, and I'm on a mission to remove that function. Directly throwing an exception is better.
Don't use empty().
Use Json::decode.
Learn to use git and the GitHub workflow. I can create a PR as an example:
vim bridges/RumbleBridge.php
git add bridges/RumbleBridge.php
git ci -m'fix(rumble): improve bridge'
git push
fatal: The current branch feat-rumble-v2 has no upstream branch.
To push the current branch and set the remote as upstream, use
git push --set-upstream origin feat-rumble-v2
To have this happen automatically for branches without a tracking
upstream, see 'push.autoSetupRemote' in 'git help config'.
git push --set-upstream origin feat-rumble-v2
https://github.com/RSS-Bridge/rss-bridge/pull/4487
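The review points about throwing exceptions directly, avoiding empty(), and decoding JSON strictly could look roughly like this. It is a hedged sketch, not the code from the linked PR: buildChannelUrl() and decodeJsonStrict() are hypothetical names, and PHP's built-in json_decode() with JSON_THROW_ON_ERROR is used as a stand-in for RSS-Bridge's Json::decode so that the example runs outside the project.

```php
<?php
// Hypothetical standalone function; in the bridge this logic would live in collectData().
function buildChannelUrl(string $account, string $type): string
{
    if (!preg_match('#^[\w\-.@]+$#', $account) || strlen($account) > 64) {
        throw new \Exception('Invalid target account.');
    }
    $paths = [
        'user' => "user/$account",
        'channel' => "c/$account",
        'channel-videos' => "c/$account/videos",
        'channel-livestream' => "c/$account/livestreams",
    ];
    if (!isset($paths[$type])) {
        throw new \Exception('Invalid media type.');
    }
    return 'https://rumble.com/' . $paths[$type];
}

// Stand-in for Json::decode: invalid JSON raises instead of returning null.
function decodeJsonStrict(string $json): array
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($data)) {
        throw new \Exception('Unexpected JSON payload.');
    }
    return $data;
}

$items = [];
try {
    $items = decodeJsonStrict('{not json');
} catch (\JsonException $e) {
    echo 'decode failed: ', $e->getMessage(), PHP_EOL;
}
// Explicit comparison instead of empty().
if ($items === []) {
    echo 'no items, would fall back to HTML parsing', PHP_EOL;
}
echo buildChannelUrl('21UhrBitcoinPodcast', 'channel-videos'), PHP_EOL;
```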