
Comment history is skipping comments

Open PixelOrange opened this issue 11 years ago • 10 comments

DeltaBot is consistently missing deltas. I believe this is caused by the comment history feature. As a temporary workaround, I have added code that automatically clears the history and performs a full scan. Once a better history has been devised, we will need to remove the auto-clear. It can be found in the go(self) function.

PixelOrange avatar Jan 28 '14 18:01 PixelOrange

First off, I'd rename the before attribute to something more informative like comment_history or scanned_comments and rename vars like before_id similarly. With the current name it took me a while to figure out exactly what it was.

Next, we need to figure out why this is happening. Here are some questions:

  • Do the missed Deltas have anything in common? For instance: were they edited in after the comment was posted?
  • When did this start happening?

chrisuehlinger avatar Jan 29 '14 06:01 chrisuehlinger

Go ahead and name it whatever you want. You can blame alexames for the poor variable names there :D haha.

  1. Some missed deltas don't register because of encoding but I haven't seen that specific error in a while. Ones that are edited in are never going to work correctly, so that's not an issue.
  2. It's always happened, but it's been happening with increasing frequency as we get more and more posters. The way that reddit stores comments is weird. The ID that gets stored seems to shift somehow. I can't explain it, but the easiest way to replicate it is to let the bot run until it has written a value to prev_id.txt and then disable the bot until >1000 comments have been posted. When you turn it back on, it will get confused because it won't be able to tell whether the last ID is more recent than the new comments.
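
A minimal sketch of the replication steps above. This is not DeltaBot's actual code: `fetch_newest`, `new_since`, and the 1,000-item cap are assumptions made for illustration, with the cap taken from the ">1000 comments" figure in this comment. It shows that once more comments arrive than the listing can hold, the stored ID never shows up in the listing, so the bot cannot tell where it left off.

```python
LISTING_CAP = 1000  # assumed limit on how many recent comments one listing returns


def fetch_newest(all_ids, cap=LISTING_CAP):
    """Hypothetical stand-in for an API call: newest `cap` IDs, newest first."""
    return list(reversed(all_ids[-cap:]))


def new_since(listing, prev_id):
    """Collect IDs that appear before prev_id in a newest-first listing.

    Returns (fresh_ids, marker_found). marker_found is False when prev_id
    has already fallen out of the listing.
    """
    fresh = []
    for comment_id in listing:
        if comment_id == prev_id:
            return fresh, True
        fresh.append(comment_id)
    return fresh, False


# The bot ran for 500 comments and stored the last ID it saw.
all_ids = [f"c{n}" for n in range(500)]
prev_id = all_ids[-1]

# The bot is then disabled while 1,500 more comments are posted.
all_ids += [f"c{n}" for n in range(500, 2000)]

fresh, marker_found = new_since(fetch_newest(all_ids), prev_id)
print(marker_found, len(fresh))
```

In this simulation the marker is never found and only the newest 1,000 IDs are visible, so the 500 comments posted right after the bot stopped are out of reach without a full scan.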

PixelOrange avatar Jan 29 '14 06:01 PixelOrange

Hmm, I'm trying to wrap my head around this. But first, how many comments does CMV get per day, and how long does it usually take to get to 1,000 comments?

chrisuehlinger avatar Jan 29 '14 06:01 chrisuehlinger

I'm not entirely sure how many comments per day. It doesn't appear to track that on the traffic page. We do get 60,000 pageviews and 20,000 uniques per day. If you go by the 1/10/90 rule, that means we're seeing maybe 2,000 comments in 24 hours? It's probably not that high though.

PixelOrange avatar Jan 29 '14 08:01 PixelOrange

Let's go with 2,000 to be cautious. That means that the only way the comment queue should go over 1,000 would be if DeltaBot is down for 12+ hours. Does DeltaBot ever have long periods of downtime?
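
The 12-hour figure follows from simple arithmetic, sketched here under the assumption that comments arrive at a steady rate. `hours_until_backlog` is a hypothetical helper name, not something in DeltaBot.

```python
def hours_until_backlog(comments_per_day, backlog_limit=1000):
    """Hours of downtime before `backlog_limit` comments pile up,
    assuming a constant arrival rate over 24 hours."""
    return backlog_limit * 24 / comments_per_day


# Using the cautious estimate of 2,000 comments per day:
print(hours_until_backlog(2000))  # 12.0
```

A lower real-world rate only lengthens that window; at 1,000 comments per day it would take a full 24 hours of downtime.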

For that matter, where is DeltaBot running and how often does someone check on it?

chrisuehlinger avatar Jan 29 '14 08:01 chrisuehlinger

DeltaBot runs on my PC. I check on it any time I'm on my PC or whenever I get modmail that something is freaking out.

It's been down for large gaps recently but only because there were a few bugs. I ironed them out and it's been stable since then. Argument mismatch stuff.

Other than this week, it's been super stable (I don't think there has been a crash or any downtime in the 2 months prior) but it was still missing deltas. We would then send an "add" command and it would pick it up just fine.

PixelOrange avatar Jan 29 '14 08:01 PixelOrange

Ah, the argument mismatches were due to things I did. I still need to write tests for about half of the code. From now on I'll keep my changes in a different branch until they've been tested.

One last thing, do you ever find that you need to use the "force add" command? Any time someone needs to do that, they should submit a detailed issue, since that means something's wrong with the scan_comments code. The better the details, the better our tests will be and the more bugs we'll catch.

chrisuehlinger avatar Jan 29 '14 08:01 chrisuehlinger

force add is usually only needed when someone fails at posting deltas. I don't think we've needed to use it in a while for anything other than that.

PixelOrange avatar Jan 29 '14 08:01 PixelOrange

Alright cool.

Let's keep this issue open until we figure out what the real problem is.

chrisuehlinger avatar Jan 29 '14 08:01 chrisuehlinger

Update for #58 - Fairly certain this is still an issue, but I haven't been able to find a better way to handle it.

The issue, as far as I can tell, comes from how reddit identifies comments: each one gets a short string of numbers and letters. Inside the same submission, 2b2b2b would be stored later than 1a1a1a, but across two separate submissions that's not necessarily the case. Submission 3c3c3c could have comment 2b2b2b while submission 4d4d4d has comment 1a1a1a. The second submission is later, but its comment looks "earlier", so the bot skips the comment thinking it's old.
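
A minimal sketch of that scenario, taking the premise above as given (it is a hypothesis in this thread, not confirmed reddit behaviour). The function names are hypothetical and this is not DeltaBot's code. It contrasts a "newer than the last ID" check with one possible alternative: remembering the set of already scanned IDs, along the lines of the scanned_comments naming suggested earlier.

```python
def to_number(comment_id):
    """Interpret an ID made of digits and lowercase letters as base 36."""
    return int(comment_id, 36)


def scan_by_ordering(comment_ids, last_id):
    """Keep only IDs that sort after last_id (the behaviour that skips comments)."""
    return [c for c in comment_ids if to_number(c) > to_number(last_id)]


def scan_by_membership(comment_ids, scanned):
    """Keep every ID not seen before, regardless of how the IDs sort."""
    fresh = [c for c in comment_ids if c not in scanned]
    scanned.update(fresh)  # remember them so a rescan does not double-count
    return fresh


# The bot has already handled 2b2b2b (in submission 3c3c3c).
# Comment 1a1a1a then appears in the later submission 4d4d4d.
incoming = ["1a1a1a"]

print(scan_by_ordering(incoming, last_id="2b2b2b"))      # [] -> comment skipped
print(scan_by_membership(incoming, scanned={"2b2b2b"}))  # ['1a1a1a']
```

The membership approach trades a single stored ID for a growing set, so a real implementation would need some way to cap or prune it.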

PixelOrange avatar Apr 17 '14 21:04 PixelOrange