rss-bridge Project goals and their prioritization

Open dvikan opened this issue 2 years ago • 10 comments

The goal of this issue is to find out where we should place our efforts to maximally improve rss-bridge and its long-term survival and prosperity.

This is an issue in which we discuss midterm/longterm goals for the project and prioritize them.

An initial list from my memory:

1. Find a solution to anti-bot systems (e.g. Cloudflare captcha/DataDome)
2. Bridge discovery from URL (e.g. resolving http://twitter.com/foo to the correct bridge) (https://github.com/RSS-Bridge/rss-bridge/issues/2361)
3. Improve HTML sanitization (https://github.com/RSS-Bridge/rss-bridge/issues/2540)
4. Introduce a new document root to prevent direct access to files in the ./cache folder
5. Improved automatic detection of broken bridges
6. Attract more bridge maintainers so that bridges get repaired faster
7. Make rss-bridge usable as a software library
8. Improve DX (the developer experience of making and maintaining bridges)
9. Add a feature to reconfigure an existing feed
10. Add some protections for instances with high traffic volume (to prevent being banned, throttled etc.; an internal rate limiter)
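
To make the bridge discovery goal concrete, here is a minimal sketch of resolving a URL to a bridge. It is written in Python for brevity rather than in rss-bridge's PHP, and the registry, the `detect_bridge` function and the `u` parameter name are assumptions for illustration, not the project's actual API.

```python
from urllib.parse import urlparse

# Hypothetical registry: maps a host to a bridge name and a function that
# extracts the bridge parameters from the URL path.
BRIDGES = {
    "twitter.com": ("TwitterBridge", lambda path: {"u": path.strip("/").split("/")[0]}),
}


def detect_bridge(url):
    """Return (bridge_name, params) for a URL, or None if no bridge matches."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Treat "www.example.com" and "example.com" as the same site.
    if host.startswith("www."):
        host = host[len("www."):]
    entry = BRIDGES.get(host)
    if entry is None:
        return None
    name, extract = entry
    return name, extract(parsed.path)
```

With this registry, `detect_bridge("http://twitter.com/foo")` resolves to `TwitterBridge` with the parameter `u` set to `foo`. A pure host lookup like this cannot cover sites that are only recognizable from their page content.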

Input from the community is very much appreciated here.

Feel free to suggest more goals and better goal ordering.

May 14 '22 13:05 dvikan

Here is my preferred ordering:

1. (Same as 5 from first comment.) Ideally, bridges would be able to define a list of tests themselves and exampleValue would be used as a last resort (similar to #2280). For example, each of the language options in the Wikipedia Bridge tests a completely separate page with its own classes and ids, but only one is tested. I also think testing functionality should be part of the app as an action, rather than a separate script.
2. Improved documentation. The new site is a good start, but I think the theming and layout could be improved as well. There is also this project.
3. Organized global parameters. We currently have 2 global parameters: _noproxy and _cache_timeout, but there is support for others (show_enclosures, limit). There should be a more standard way of adding these.
4. (Same as 2 from first comment.) Some bridges may only be possible to detect by actually getting the page, though I guess that could be implemented within detectParameters (examples: CachetBridge, WordPressBridge, FDroidRepoBridge).
5. (Same as 4.) This would also need some way to serve content.
6. Bridge categories by language or feed content.

I am fine with (1) and (3) from your list but not too interested in working on them.
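
A rough sketch of the "bridges define their own tests" idea, in Python for brevity rather than PHP. The `Bridge` and `Parameter` classes and the `test_cases` field are hypothetical and only illustrate the fallback to example values; they are not how rss-bridge currently models bridges.

```python
from dataclasses import dataclass, field


@dataclass
class Parameter:
    name: str
    example_value: str


@dataclass
class Bridge:
    name: str
    parameters: list
    # Optional explicit test cases; each one is a dict of parameter values.
    test_cases: list = field(default_factory=list)


def collect_test_cases(bridge):
    """Prefer the bridge's own test cases; fall back to the example values."""
    if bridge.test_cases:
        return list(bridge.test_cases)
    return [{p.name: p.example_value for p in bridge.parameters}]
```

A bridge with several language options could then list one test case per language, while a bridge that declares nothing is still tested once with its example values.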

May 14 '22 14:05 yamanq

One more goal: make rss-bridge usable as a software library. This means being able to install rss-bridge as a dependency in other software so that they can reuse some of rss-bridge features.

Jun 05 '22 17:06 dvikan

One more goal: we need better stacktraces for easier coding and debugging. This touches upon logging and error handling. E.g. when the env is dev (or debug is enabled) we should spit out the entire stacktrace plus some additional info. I'm thinking we should log errors (and their stacktraces) regardless of the env or debug mode.
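
As a sketch of that policy (Python for brevity, not rss-bridge's actual error handler; `render_error` and the logger name are made up for illustration): always log the full stack trace, and only show it to the user when debug is enabled.

```python
import logging
import traceback

logger = logging.getLogger("rss_bridge_sketch")


def render_error(exc, debug):
    """Always log the full stack trace; only return it to the user in debug mode."""
    trace = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    # Logged regardless of env or debug mode.
    logger.error("Unhandled exception:\n%s", trace)
    if debug:
        return trace
    # Outside debug mode the user only sees a short summary.
    return f"{type(exc).__name__}: {exc}"
```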

Jul 07 '22 18:07 dvikan

I really like this project and I want to see it be successful. I've sort of set it and forgot about it until now. The Twitter changes have made me take notice. Plus the RSS reader I use has stopped working.

I think there are a few problems with this project:

It relies on old-school scraping.
It's written in PHP.
More and more walled gardens.
There's really good competition that just works™ (e.g. https://rss.app/).

With the rise of Cloudflare and SPAs using JavaScript, this is less and less useful. I believe that in order to support the more popular bridges you'll need to scrape with a full-blown browser (e.g. puppeteer, Selenium, etc.).

It's also not a bad idea to use or support APIs that render the page and return the HTML:

https://www.scraperapi.com
Phantombuster

I tried and ran into errors.

I'm not here to shit on PHP. I think it's fine, however, it's not one of the most popular languages that people know really well. I don't know any PHP.

I think if the bridges were "No-Code" and abstracted away, there might be more traction on supporting them. Or if they could be written in Python or JS/TS, which are the two most popular languages in the world right now. I would be happy to support one or two if they were in one of these languages.

~Facebook~ Meta is blocking Facebook groups and Instagram more and more. This will continue. Twitter is cutting off everyone and making things more restricted and walled off. The only way to circumvent is #1.

The bridges randomly stop working too much and I'm at the point where I might pay for rss.app. I want something that is more rock solid.

Anyways, this is my 2 cents.

Feb 04 '23 19:02 stevenirby

> I believe in order to support the more popular bridges you'll need to scrape with a full-blown browser. (e.g. puppeteer, Selenium, etc.)

Another option is a regular Chromium browser with a userscript, as I did in my InstagramBridge modifications mentioned in https://github.com/RSS-Bridge/rss-bridge/issues/3128. I haven't prepared them for publishing to the main RSS-Bridge repo, but in general this bridge works, more or less, in a public instance.

> I'm not here to shit on PHP. I think it's fine, however, it's not one of the most popular languages that people know really well. I don't know any PHP.

PHP is still widely used for web applications, so PHP is a popular language. This should be enough.

> The bridges randomly stop working too much and I'm at the point where I might pay for rss.app. I want something that is more rock solid.

Instead of paying for rss.app, I suggest you make an agreement with @dvikan, where you pay him and he provides an RSS-Bridge instance that works as well as rss.app for you. For example, my InstagramBridge modifications and some merged PRs are based on paid customization of InstagramBridge that I did for money.

Feb 05 '23 05:02 em92

Thanks @dvikan for reaching out!

I only use my instance to create some personal feeds for simple(r) services and it works. And because there is not so much traffic, the instance is running publicly.

Regarding the topic, in short: for me it's essential to run maintained code and that the software fits my setup (Nginx + PHP).

Currently I'm happy, but I don't care so much about the availability of my service. (= If I couldn't host it myself anymore, I would miss only ~10 feeds out of 500 …)

Maybe you should think about this more fundamental question: will/should RSS-Bridge be industrial-grade software, or (as it IMHO is now) personal software for easy and secure self-hosting?

BR + 👍

Jul 07 '23 10:07 sokai

Upvote for these:

Find a solution to anti-bot systems (e.g. cloudflare captcha/DataDome)
Bridge discovery from url (e.g. resolving http://twitter.com/foo to the correct bridge) (https://github.com/RSS-Bridge/rss-bridge/issues/2361)
Attract more bridge maintainers so that bridges get repaired faster
Add feature to reconfigure an existing feed
Add some protections to instances with high traffic volume (to prevent being banned, throttled etc., internal rate limiter)

Re: Improve DX, any specifics here? I see there is a devcontainer and I find writing new bridges pretty straightforward. Do you mean additional helper functions, etc.?

Jul 11 '23 19:07 jcgoette

> With the rise of Cloudflare and SPA using JavaScript, this is less and less useful. I believe in order to support the more popular bridges you'll need to scrape with a full-blown browser. (e.g. puppeteer, Selenium, etc.)

Please see #3970.

Feb 09 '24 19:02 hleskien

One thing I did in a local branch for DX is change the caching behavior in debug mode: I want to cache remote resources while not caching the locally generated feeds. The current behavior, which disables the cache entirely, can make things a bit slow and risks running into rate limits.
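
Roughly, the behavior could be sketched like this (Python for brevity; the `DebugAwareCache` class and the "remote"/"feed" kinds are hypothetical names, not the code from that branch, and expiry is left out):

```python
class DebugAwareCache:
    """In debug mode, keep caching remote resources but never cache generated feeds."""

    def __init__(self, debug):
        self.debug = debug
        self._store = {}

    def get_or_compute(self, kind, key, compute):
        # kind is "remote" (fetched pages) or "feed" (locally generated output).
        cacheable = kind == "remote" or not self.debug
        if cacheable and (kind, key) in self._store:
            return self._store[(kind, key)]
        value = compute()
        if cacheable:
            self._store[(kind, key)] = value
        return value
```

With `debug=True`, a second request for the same remote URL is served from the cache, while the feed itself is rebuilt on every request.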

Feb 16 '24 21:02 mdemoss

Another thing might be: clarify how we would like bridges to fail. For example, the FB bridge will return items even if the URL is missing. You could take the position that the bridge ought to try to show you everything it can even if it's incomplete, or that it ought to only return complete items. I favor only returning complete items because a missing URL causes my feed reader to duplicate the items.
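
The "only complete items" position could be sketched as a simple filter (Python for brevity; which fields count as required is an assumption here, and `complete_items` is a made-up helper):

```python
# Assumption: an item counts as complete when it has a URI and a title.
REQUIRED_FIELDS = ("uri", "title")


def complete_items(items):
    """Keep only items in which every required field is present and non-empty."""
    return [item for item in items if all(item.get(f) for f in REQUIRED_FIELDS)]
```

The opposite policy would return every item and leave it to the feed reader to cope with missing fields.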

Feb 17 '24 17:02 mdemoss
