gpt-2-output-dataset
Different detection result on localhost and the server
Tested with the second sample of ChatGPT, and the detection result does not match the server's.
Test result from https://openai-openai-detector.hf.space/

Test result with roberta-base model on localhost

Test result with roberta-large model on localhost

@SnoopyDevelops Did you ever make progress on this? Seeing the same issues here.
Unfortunately, no
I can confirm this as well: the server result differs from both the -base and -large models.
Was anyone able to get consistent results between localhost and the website? Tried both models and still get different results.
The trick to getting the same results is apparently to use the same Python dependencies. The model being used on the Hugging Face Space really is the same -base model that we've been downloading and using.
I've written up full instructions in this Ask Ubuntu answer. After using the Hugging Face Space's Dockerfile as a guide, my local version gives the same results as https://openai-openai-detector.hf.space/. Thanks to @CoconutMacaroon for confirming this.
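For reference, reproducing the Space locally via its Dockerfile looks roughly like this. The repo URL is the Space linked in this thread; the image name is arbitrary, and the port is an assumption (check the EXPOSE line in the Dockerfile for the actual one):

```shell
# Clone the detector Space (includes the Dockerfile the hosted version uses)
git clone https://huggingface.co/spaces/openai/openai-detector
cd openai-detector

# Build and run with the same pinned dependencies as the hosted Space
docker build -t openai-detector .
docker run -p 8080:8080 openai-detector

# Then query it the same way the web UI does: text goes in the query string
# curl "http://localhost:8080/?Some%20sample%20text"
```

Building from the same Dockerfile is what pins the dependency versions, which is the whole point of the fix above.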
Note that there are currently a few known issues with the Hugging Face OpenAI version:
- URL encoding errors cause it to fail to detect at all on many inputs
- It will fail to detect at all if the input is too long (16383 characters, including the domain)
- It will not run in the iframe from https://huggingface.co/spaces/openai/openai-detector
The first two of these, at least, are fixed in a change to detector/index.html. Instead of:
req.open('GET', window.location.href + '?' + textbox.value, true);
Use:
const maxCharacters = 16300;
req.open('GET', `?${encodeURIComponent(textbox.value)}`.slice(0,maxCharacters), true);
Thanks to @makyen (Stack Overflow Mod) for assistance with this.
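One caveat with slicing the already-encoded string: the cut can land in the middle of a percent-escape (e.g. %C3%A9), producing an invalid query. A sketch of the same idea that trims back to an escape boundary; `buildQuery` and the overridable `max` parameter are illustrative names, not part of the original page:

```javascript
const maxCharacters = 16300;

function buildQuery(text, max = maxCharacters) {
  let q = encodeURIComponent(text).slice(0, max);
  // A percent-escape is three characters (%XX); if the slice landed
  // inside one, trim back to the last complete character.
  const cut = q.lastIndexOf('%');
  if (cut !== -1 && cut > q.length - 3) q = q.slice(0, cut);
  return '?' + q;
}
```

This keeps the request under the length limit while guaranteeing the query string is always valid percent-encoding.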
I have a Space with these changes that you can use as a basis for your local copy (using the Dockerfile and cloning with git per the Ask Ubuntu answer mentioned above).
Keep in mind that, just like the OpenAI version, you won't be able to use it from that Space URL; you'll need to use the top-level URL instead. Again, it's mainly for you to clone locally (or as a new Space if desired); it's going to be really slow at times on the free CPU tier.