PatreonDownloader

Added post filter to avoid parsing the same posts again after restarting the parsing process

Open VladislavBar opened this issue 9 months ago • 2 comments

Basically, I had an issue where, after a restart, the parser would try to fetch the same content again instead of skipping the post entirely. So, here's my solution for that.

Hopefully, it will work for you as well

VladislavBar avatar Mar 19 '25 21:03 VladislavBar

Your filter implementation uses a nested loop (crawledUrls.RemoveAll(x => _ignorePosts.Any(y => y.Id == x.PostId));) which is inefficient. For every single post crawled, the filter will need to iterate every ignored post ID again, and every post downloaded is added to the list of ignored IDs. Have you considered using a HashSet<> instead for the internal implementation?
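A rough sketch of what the `HashSet<>` variant might look like. This is not the PR's actual code: the tuple shape, the sample IDs and URLs, and the names `ignoredPostIds`, `ignoredSet`, and `removed` are all made up for illustration, and it assumes post IDs are strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the PR's data: IDs of posts that were already
// processed, and the URLs found by the current crawl.
var ignoredPostIds = new List<string> { "101", "102", "103" };
var crawledUrls = new List<(string PostId, string Url)>
{
    ("101", "https://example.com/a.png"),
    ("104", "https://example.com/b.png"),
    ("103", "https://example.com/c.png"),
    ("105", "https://example.com/d.png"),
};

// Build the set once. Each Contains call is then O(1) on average,
// instead of a linear scan through the ignored list.
var ignoredSet = new HashSet<string>(ignoredPostIds);

// Single pass over the crawled URLs; RemoveAll returns how many entries were dropped.
int removed = crawledUrls.RemoveAll(x => ignoredSet.Contains(x.PostId));

// Remember the posts that were kept so a later run skips them too.
// HashSet.Add simply returns false for IDs that are already present.
foreach (var entry in crawledUrls)
{
    ignoredSet.Add(entry.PostId);
}

Console.WriteLine($"Removed {removed}, kept {crawledUrls.Count}");
```

With a list and `Any`, filtering n crawled URLs against m ignored IDs costs roughly n × m comparisons; with the set it is roughly n + m, and the gap keeps widening as more downloaded posts are added to the ignore list.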

VariableVixen avatar Mar 28 '25 15:03 VariableVixen

Oh... I see. That definitely will be better than iterating over and over again. I'll change it

VladislavBar avatar Mar 31 '25 08:03 VladislavBar