
GUID in RSS feed differs between public instances, causing duplicates

Open votemp opened this issue 3 years ago • 11 comments

To avoid duplicates when switching between public instances (for example when using the 3rd party load balancing services in the wiki) it would perhaps be better if they all had the same GUID for the same tweet. ~Using the original twitter.com URL would be a reasonable choice.~

votemp avatar Oct 02 '21 17:10 votemp

> Using the original twitter.com URL would be a reasonable choice.

I like this idea and had thought of almost exactly the same approach, because every tweet is guaranteed to have a unique ID made up of a long string of digits. I think GUIDs should be calculated from that ID alone, which is located in the URL between status/ and any tracking parameters that may follow it, beginning with a question mark (?). For example, the ID of this tweet is 1443946032350613506.
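A minimal sketch of that extraction, in Python for illustration only (it is not taken from Nitter's code). The function name `extract_tweet_id` and the handle `someuser` are hypothetical; the host is ignored so that Twitter and Nitter links to the same tweet give the same ID.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Matches "/status/<digits>" in the URL path; the host is deliberately ignored.
_STATUS_RE = re.compile(r"/status/(\d+)")


def extract_tweet_id(url: str) -> Optional[str]:
    """Return the numeric tweet ID from a status URL, or None if there is none."""
    # urlparse separates the path from any "?..." query string and "#..." fragment,
    # so tracking parameters never reach the regular expression.
    path = urlparse(url).path
    match = _STATUS_RE.search(path)
    return match.group(1) if match else None


# "someuser" is a placeholder handle, not the real author of the tweet.
print(extract_tweet_id("https://twitter.com/someuser/status/1443946032350613506?s=20"))
print(extract_tweet_id("https://nitter.net/someuser/status/1443946032350613506#m"))
```

Both calls yield the same ID string, which is the property a shared GUID needs.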

acarasimon96 avatar Oct 02 '21 19:10 acarasimon96

Contrary to my earlier suggestion, I now also suggest using only the tweet ID for the GUID. The main reason for not using the full original twitter.com URL is to avoid duplicates when the user changes the account name.

votemp avatar Oct 03 '21 14:10 votemp

https://twitter.com/i/status/[tweet_id] would consistently work even if the account name changes.

0x7D2B avatar Oct 22 '21 13:10 0x7D2B

This is easy, but the main problem is that any change to the GUID format will cause everyone's RSS clients to suddenly think there are 20 new tweets. I can't think of any solution to this other than applying it only to tweets newer than a certain timestamp for a transition period, but I don't know if that's even a good idea. Any feedback is very welcome; I know people complained a lot the last time I changed the GUIDs.

zedeus avatar Dec 20 '21 00:12 zedeus

Looking into tweet IDs, it looks like they're roughly sortable. So another option would be to determine a tweet ID a significant period in the future, and switch over to just using the tweet ID for tweet IDs greater than that.
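A rough sketch of such a cutoff, in Python for illustration. It assumes tweet IDs follow Twitter's Snowflake layout (a millisecond timestamp relative to the epoch 1288834974657, stored above the low 22 bits), which is what makes them roughly sortable. The helper names and the `legacy_guid` argument are hypothetical.

```python
from datetime import datetime, timezone

TWITTER_EPOCH_MS = 1288834974657  # Snowflake epoch in Unix milliseconds (assumed layout)
TIMESTAMP_SHIFT = 22              # low 22 bits hold worker and sequence numbers


def tweet_id_to_datetime(tweet_id: int) -> datetime:
    """Recover the approximate creation time encoded in a Snowflake tweet ID."""
    ms = (tweet_id >> TIMESTAMP_SHIFT) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def cutoff_id_for(moment: datetime) -> int:
    """Smallest Snowflake ID that a tweet created at `moment` could have."""
    ms = int(moment.timestamp() * 1000)
    return (ms - TWITTER_EPOCH_MS) << TIMESTAMP_SHIFT


def guid_for(tweet_id: int, legacy_guid: str, cutoff_id: int) -> str:
    """Use the bare tweet ID at or after the cutoff, the old GUID before it."""
    return str(tweet_id) if tweet_id >= cutoff_id else legacy_guid


# Hypothetical switch-over date; every instance would have to agree on it.
cutoff = cutoff_id_for(datetime(2023, 1, 1, tzinfo=timezone.utc))
print(guid_for(1443946032350613506, "legacy-guid", cutoff))  # older tweet keeps its old GUID
```

Because the cutoff is a plain integer comparison, tweets already in people's feeds keep their GUIDs and only tweets posted after the chosen date get the new format.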

~~Or just make it a configuration option, and let instances decide what representation they want, I guess~~ of course this won't solve the issue of inconsistent GUIDs between instances...

vijfhoek avatar Dec 27 '21 02:12 vijfhoek

> So another option would be to determine a tweet ID a significant period in the future, and switch over to just using the tweet ID for tweet IDs greater than that.

A soft transition maybe? Start doing this, then at some point in the future switch to doing Twitter URLs/IDs exclusively.

0x7D2B avatar Jan 04 '22 11:01 0x7D2B

@zedeus here's how you could switch over without disrupting existing feeds:

  1. If a param ?global=1 is appended to the RSS URL, it will use a unified GUID (like the original Twitter link).
  2. Update the RSS link in the Nitter software to add this param to the default displayed at the top of the UI.
  3. Retain the original behaviour at the existing link.
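
The three steps above could be sketched roughly as follows, in Python for illustration rather than Nitter's own language. The function `choose_guid`, the host `nitter.example`, and the choice of the https://twitter.com/i/status/[tweet_id] form as the unified GUID are all assumptions; only the ?global=1 parameter comes from the proposal.

```python
from urllib.parse import parse_qs, urlparse


def choose_guid(request_url: str, tweet_id: str, legacy_guid: str) -> str:
    """Pick the GUID format based on an opt-in ?global=1 query parameter."""
    query = parse_qs(urlparse(request_url).query)
    # parse_qs returns a list per key; a missing parameter means "not opted in".
    use_global = query.get("global", ["0"])[-1] == "1"
    if use_global:
        # Instance-independent form (assumed here), identical on every instance.
        return f"https://twitter.com/i/status/{tweet_id}"
    # Existing feed URLs keep the existing GUIDs, so nothing is duplicated.
    return legacy_guid


legacy = "https://nitter.example/someuser/status/1443946032350613506#m"
print(choose_guid("https://nitter.example/someuser/rss?global=1", "1443946032350613506", legacy))
print(choose_guid("https://nitter.example/someuser/rss", "1443946032350613506", legacy))
```

Subscribers to the old URL see no change, while anyone who subscribes through the new default link gets GUIDs that survive a switch between instances.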

chr15m avatar Feb 14 '22 13:02 chr15m

With nitter.net becoming increasingly popular and apparently struggling under heavy load, solving this issue seems important to me to make switching to other instances as easy as possible.

If we switch the GUID to the native Twitter ID, I think there would be little reason to change the GUID format again, which would maybe justify the inconvenience of users getting duplicate entries in their feeds again.

haansn08 avatar Jul 23 '22 12:07 haansn08

Just came across this issue, but in case it helps anyone: I wrote nitter-rss-proxy a while back to round-robin between public instances and rewrite GUIDs.
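
A minimal sketch of the GUID-rewriting half of that idea, in Python. It is not the code of nitter-rss-proxy; the function `rewrite_guids`, the sample feed, and the assumption that each instance puts its own status URL in `<guid>` are illustrative only. Fetching from a rotating list of instances is left out.

```python
import re
import xml.etree.ElementTree as ET

_STATUS_RE = re.compile(r"/status/(\d+)")


def rewrite_guids(rss_xml: str) -> str:
    """Replace instance-specific GUIDs with an instance-independent URL."""
    root = ET.fromstring(rss_xml)
    for guid in root.iter("guid"):
        match = _STATUS_RE.search(guid.text or "")
        if match:  # leave GUIDs alone if they do not contain a tweet ID
            guid.text = f"https://twitter.com/i/status/{match.group(1)}"
    return ET.tostring(root, encoding="unicode")


# Hypothetical feed fragment; "nitter.example" and "someuser" are placeholders.
sample = """<rss version="2.0"><channel><item>
<guid>https://nitter.example/someuser/status/1443946032350613506#m</guid>
</item></channel></rss>"""
print(rewrite_guids(sample))
```

A real feed that declares XML namespaces would also need `ET.register_namespace` calls so that the original prefixes are preserved on output.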

derat avatar Nov 14 '22 20:11 derat

> Just came across this issue, but in case it helps anyone: I wrote nitter-rss-proxy a while back to round-robin between public instances and rewrite GUIDs.

This is awesome! Will you support a Docker image? Thanks.

somedevreally avatar Nov 14 '22 21:11 somedevreally

> This is awesome! Will you support a Docker image? Thanks.

My Docker knowledge is rusty/nonexistent at this point, but if you file an issue at https://github.com/derat/nitter-rss-proxy/issues, I'll try to look into it at some point. The proxy requires some runtime configuration (e.g. the list of public instances to talk to).

derat avatar Nov 14 '22 21:11 derat