Slogger
Any way to prevent duplicate entries?
Is there any way to prevent duplicate entries if running slogger more than once a day?
Not with all plugins, but the -s argument will only run back to the last time Slogger recorded a run.
-Brett
On Sep 6, 2013, at 3:41 PM, Nick Kalkounis wrote:
Is there any way to prevent duplicate entries if running slogger more than once a day?
Is this still an issue?
Something odd for me: Twitter seems to ingest the same item 20-24 times. This happens when it processes a retweet that contains an image. Thoughts?
Somewhere in the code base I saw mention of something to clean up duplicates. Not sure that got finished though.
Correct, never finished to the point where it's safe for public consumption. There's a slogger -u X method that will remove entries created since the timestamp of the run specified. It will also remove manual entries, though, and is really designed for testing purposes.
The --dedup switch calls the dedup function in dayone.rb. This is for cleanup, not prevention. If passed "true" for the similar variable, it:
- looks for orphaned image files (with no matching .doentry)
- looks for entries with identical timestamps
- looks for entries with empty data['Entry Text'] and no matching photo
- then runs a Levenshtein distance comparison between the text of "similar" entries to determine if they're slight variants of each other, deleting the one with shorter text if they're within a threshold.
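For anyone who wants to experiment with the last step, here is a minimal sketch of a Levenshtein-based near-duplicate check. It is not the code from dayone.rb: the method names levenshtein and near_duplicate_to_drop are hypothetical, and the default threshold of 5 edits is an assumption, since the actual threshold is not stated here.

```ruby
# Hypothetical helpers for illustration; not taken from dayone.rb.

# Classic dynamic-programming Levenshtein distance between two strings,
# keeping only the previous row of the edit matrix in memory.
def levenshtein(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      # deletion, insertion, substitution
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Returns the text to discard (the shorter one) when the two texts are
# within `threshold` edits of each other, or nil when they differ more.
# The threshold value is an assumption, not Slogger's setting.
def near_duplicate_to_drop(text_a, text_b, threshold: 5)
  return nil if levenshtein(text_a, text_b) > threshold

  text_a.length <= text_b.length ? text_a : text_b
end
```

Calling near_duplicate_to_drop on the 'Entry Text' of two entries that share a timestamp would then say which of the pair, if either, is the candidate for removal.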
If passed false or null, it does a hashed comparison of each file, looking for exact duplicates.
Deletion isn't destructive; it moves entries out to ~/Desktop/DayOneDuplicates for review.
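As a rough illustration of the exact-duplicate pass and the non-destructive move, something along these lines would work. This is only a sketch under assumptions: the method name quarantine_exact_duplicates and its two directory arguments are hypothetical, and SHA-256 is my choice of hash, since the one used in dayone.rb is not named here.

```ruby
require 'digest'
require 'fileutils'

# Hypothetical helper for illustration; not taken from dayone.rb.
# Hashes every .doentry file in entry_dir and moves each file whose
# content matches an earlier file into review_dir instead of deleting it.
# Returns the new paths of the moved files.
def quarantine_exact_duplicates(entry_dir, review_dir)
  FileUtils.mkdir_p(review_dir)
  seen = {}
  moved = []

  # Sort so that the file kept from each duplicate group is predictable.
  Dir.glob(File.join(entry_dir, '*.doentry')).sort.each do |path|
    digest = Digest::SHA256.file(path).hexdigest # assumed hash algorithm
    if seen.key?(digest)
      target = File.join(review_dir, File.basename(path))
      FileUtils.mv(path, target)
      moved << target
    else
      seen[digest] = path
    end
  end

  moved
end
```

Pointing review_dir at something like File.expand_path('~/Desktop/DayOneDuplicates') mirrors the review folder described above. Try it on a copy of the journal first.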
If anyone ever wanted to help polish the function up, I'm sure it would be of use to more than just developers when plugins go awry.
-Brett
On 25 Jan 2015, at 13:15, Martin Cleaver wrote:
Somewhere in the code base I saw mention of something to clean up duplicates. Not sure that got finished though.
Reply to this email directly or view it on GitHub: https://github.com/ttscoff/Slogger/issues/203#issuecomment-71386934
Ah, I too am seeing duplicates with Twitter now. However, it is not just entries that have embedded pictures (one did, the other just had a link to Flipboard). I see the repeats 11 times, 13 times. Annoying.
I had wondered whether it was related to the retries mechanism, but I see the default is 3, so I assume not: https://github.com/ttscoff/Slogger/blob/e54ab786f30e9e9d9217c51f0ef2b6544569a6ed/slogger.rb#L385
I haven't figured this out yet, but I'm getting the Twitter duplicates with Favorites containing images as well. I'll be looking into it as soon as I can.