RSS feeds for repositories owned by organizations have duplicate entries

Open prologic opened this issue 3 years ago • 9 comments

Description

The RSS feeds for repositories owned by organizations have several duplicate entries with different GUIDs, cluttering feed readers and reducing the number of unique items available in a feed. I originally ran into the bug on https://git.mills.io/yarnsocial/yarn.rss and did some testing in different contexts to try and find a pattern. The bug is definitely related to repositories owned by organizations, but the number of duplicates differs between repositories. The results of my tests are displayed below.

| URL | Instance Version | Duplicates? | Notes |
| --- | --- | --- | --- |
| https://git.mills.io/yarnsocial/yarn.rss | 1.17.0 | Yes | Original finding |
| https://git.mills.io/yarnsocial/yarn.social.rss | 1.17.0 | Yes | Same organization, different repository |
| https://git.mills.io/saltyim/salty.im.rss | 1.17.0 | Yes | Same instance, different organization |
| https://git.mills.io/prologic/gonix.rss | 1.17.0 | No | Same instance, but repository owned by a user |
| https://try.gitea.io/0000000000/aaaa.rss | 1.18.0+dev-350-g8bbb622bb | Yes | Demo instance, repository owned by an organization |
| https://try.gitea.io/UpYoursMicrosoft/foobar.rss | 1.18.0+dev-350-g8bbb622bb | No | Demo instance, repository owned by a user |

I had to use an existing repository owned by an organization on the demo instance because I was unable to create an organization myself.
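For anyone who wants to check their own feed, here is a minimal sketch of how such duplicates could be detected. It is not the procedure used for the tests above. It assumes that two items are duplicates when they share the same title, link, and publication date but carry different GUIDs; the sample feed, its URLs, and the function name `find_duplicate_items` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample feed: three items describe the same push but have
# different GUIDs, and one item is unique.
SAMPLE_FEED = """
<rss version="2.0">
  <channel>
    <title>Hypothetical repository feed</title>
    <item>
      <title>alice pushed to main</title>
      <link>https://example.com/org/repo/commit/abc123</link>
      <pubDate>Sun, 28 Aug 2022 23:00:00 +0000</pubDate>
      <guid>101</guid>
    </item>
    <item>
      <title>alice pushed to main</title>
      <link>https://example.com/org/repo/commit/abc123</link>
      <pubDate>Sun, 28 Aug 2022 23:00:00 +0000</pubDate>
      <guid>102</guid>
    </item>
    <item>
      <title>alice pushed to main</title>
      <link>https://example.com/org/repo/commit/abc123</link>
      <pubDate>Sun, 28 Aug 2022 23:00:00 +0000</pubDate>
      <guid>103</guid>
    </item>
    <item>
      <title>bob opened issue #1</title>
      <link>https://example.com/org/repo/issues/1</link>
      <pubDate>Mon, 29 Aug 2022 08:00:00 +0000</pubDate>
      <guid>104</guid>
    </item>
  </channel>
</rss>
""".strip()


def find_duplicate_items(rss_text):
    """Group RSS items by (title, link, pubDate) and return the groups
    that contain more than one GUID.

    The grouping key is an assumption about what counts as a duplicate.
    """
    root = ET.fromstring(rss_text)
    groups = defaultdict(list)
    for item in root.iter("item"):
        key = (
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("pubDate", default=""),
        )
        groups[key].append(item.findtext("guid", default=""))
    # Keep only keys that appear more than once.
    return {key: guids for key, guids in groups.items() if len(guids) > 1}


if __name__ == "__main__":
    for (title, link, _date), guids in find_duplicate_items(SAMPLE_FEED).items():
        print(f"{len(guids)} copies of {title!r} ({link}), GUIDs: {guids}")
```

To run this against a real feed, download the `.rss` file and pass its contents to `find_duplicate_items` in place of `SAMPLE_FEED`.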

Gitea Version

1.17.0, 8bbb622bb

Can you reproduce the bug on the Gitea demo site?

Yes

Log Gist

No response

Screenshots

No response

Git Version

No response

Operating System

No response

How are you running Gitea?

I am not the operator of git.mills.io and I don't have any details on how it's being run.


This bug report was filed on behalf of another. As this is related to my instance, it is run as a Docker container with a SQLite database.

Database

SQLite

prologic avatar Aug 28 '22 23:08 prologic

This fix seems related: https://github.com/go-gitea/gitea/pull/20738, but it was closed without merging because of performance concerns.

yan12125 avatar Oct 02 '22 11:10 yan12125

I can confirm that I've also seen duplicate RSS entries in my own installation, but have not had the opportunity to try and troubleshoot. I had thought to just turn the feature off, but I did not see an environment to do that.

DarrenPIngram avatar Oct 04 '22 11:10 DarrenPIngram

I can verify that I also see multiple RSS entries per commit.

vasvir avatar Apr 24 '23 11:04 vasvir

It is unsolvable at the moment (the action table blocks it) unless a performant solution can be found.

wxiaoguang avatar Apr 24 '23 11:04 wxiaoguang

This is still a problem as seen on https://git.crux.nu/ports/core.rss

TimB87 avatar Mar 12 '24 17:03 TimB87

FYI, I am working on a patch that addresses this issue, and gets rid of the dups in repo feeds, without using dedup or other expensive operations.

You can find my WIP PR at https://codeberg.org/forgejo/forgejo/pulls/3598. It may or may not apply to Gitea cleanly, but the change itself is small enough to port over. Hope this helps!

algernon avatar May 02 '24 00:05 algernon

If there is no bug or design problem, Gitea could port it over. Thank you.

wxiaoguang avatar May 02 '24 00:05 wxiaoguang

FYI, I rewrote the explanation of my fix to better explain what it does, how, why, and why I believe it is correct. There's also a test now. Initial performance testing indicates it has no noticeable negative impact.

algernon avatar May 09 '24 20:05 algernon

@algernon Thank you very much for figuring out the problem.

I proposed a simpler and clearer fix: "Filter out duplicate action (activity) items for a repository" (#30957). Feel free to cherry-pick it if you like it.

PS: I am not a native English speaker, so I can only express myself directly, and I am not sure whether the wording is fine for you. Feel free to suggest improvements to the description if I missed anything.

wxiaoguang avatar May 12 '24 15:05 wxiaoguang