Check network-wide

Open wizzwizz4 opened this issue 8 years ago • 7 comments

Stack Overflow is the biggest site on the SE network, but the other sites have their fair share of plagiarism. Adding these should approximately double the load on the bot, so this might not be feasible at the moment.

wizzwizz4 avatar Feb 25 '17 17:02 wizzwizz4

At the moment, Guttenberg only checks whether the post (the "target") is copied from answers to one of the linked/related questions. Since this doesn't work across SE sites, and doesn't even find many plagiarized posts on SO, we need to find another source, such as search engines.
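To make the idea concrete, here is a minimal sketch of comparing a target against a list of candidate answers. This is not Guttenberg's actual implementation: the function name `best_match`, the use of Python's `difflib`, and the 0.8 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def best_match(target, candidates, threshold=0.8):
    """Return (index, ratio) of the most similar candidate, or None.

    Hypothetical helper: the 0.8 threshold is an arbitrary assumption,
    not a value taken from the bot.
    """
    best = None
    for i, text in enumerate(candidates):
        # ratio() is a similarity score between 0.0 and 1.0
        ratio = SequenceMatcher(None, target, text).ratio()
        if ratio >= threshold and (best is None or ratio > best[1]):
            best = (i, ratio)
    return best
```

Whatever source we end up using (related questions, search engines, ...) would only change where `candidates` comes from; the comparison step stays the same.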

We'll discuss this in our next room meeting: SOBotics/SOBotics.github.io#9

If you have any ideas about where we can find possibly related sources/posts to locate the original post of a "target", feel free to add them here (or in the linked issue for the room meeting).


The bot's API quota would be a problem if we had a way to access possibly related posts via the API. The CPU and RAM load won't be a problem, since the bot will be moved from my Raspberry Pi to a VPS in 6-8 weeks.

FelixSFD avatar Feb 25 '17 17:02 FelixSFD

Expanding to other sites on the network is one of our plans too. https://github.com/SOBotics/SOBotics.github.io/issues/4

Gut is almost production-ready for other sites at the moment. There have been many moderator flags raised from its reports. (There is a strong chance that its accuracy would be better on other sites with text-only answers.) But there are a couple of small issues that we face:

  1. The ChatExchange library that we are using for the bot is not compatible with Stack Exchange Chat.
  2. We are not aware of the number of posts per minute on other sites. The bot uses up 50% of the API calls already.

The initial plan for expansion is to first go ahead with a few other similar sites, like Ask Ubuntu (need to speak to Thomas Ward) and Unix & Linux (need to speak to terdon) before going on to the other sites.

Bhargav-Rao avatar Feb 26 '17 12:02 Bhargav-Rao

@Bhargav-Rao Is that Tuna's Java CE? The original CE (Python) is compatible with any chat server.

ArtOfCode- avatar Feb 26 '17 12:02 ArtOfCode-

Yep, @ArtOfCode-. Tuna's lib was originally written only for SOCVFinder, which was intended to work only on Stack Overflow. In the latest version, there has been some progress towards making it compatible with all 3 hosts, but it still does not support SE and MSE chat.

Bhargav-Rao avatar Feb 26 '17 13:02 Bhargav-Rao

However, this is only related to the chat room; it does not directly influence the implementation. For now I see it as a minor limitation (the room has to be on SO), and I bet Tuna can solve it if the need grows to have rooms on other chats.

jdd-software avatar Feb 27 '17 10:02 jdd-software

I'm sure the problem just revolves around authentication, but I haven't had a chance to look into it yet. I'm betting this is a simple fix. I will raise the priority of this on the todo list :).

Tunaki avatar Feb 27 '17 10:02 Tunaki

@Tunaki has fixed the issue. I've set up Guttenberg for unix.se on my local machine.

http://chat.stackexchange.com/transcript/message/35733717#35733717

Looks like the unix site gets very little traffic, and most of the API calls are made only to check whether a new answer has arrived. After running it for 2 hrs, there were 20 answers. So I think, if we run the check every 20 mins or so, we should be fine.
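As a rough back-of-the-envelope check of those numbers, here is a small sketch. The helper names are made up for illustration, and it only counts the polling calls, not any extra calls made per answer to fetch related posts.

```python
def polls_per_day(interval_minutes):
    """Number of 'any new answers?' checks in 24 hours."""
    return 24 * 60 // interval_minutes


def answers_per_poll(answers_seen, hours_observed, interval_minutes):
    """Average number of new answers expected between two checks."""
    per_hour = answers_seen / hours_observed
    return per_hour * interval_minutes / 60


# Figures from the 2-hour run on unix.se: 20 answers, polling every 20 minutes
print(polls_per_day(20))            # 72 polling calls per day
print(answers_per_poll(20, 2, 20))  # about 3.3 new answers per check
```

So a 20-minute interval means 72 polling calls a day, with three or four new answers to process each time, assuming the traffic during those 2 hours was typical.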

Bhargav-Rao avatar Feb 28 '17 14:02 Bhargav-Rao