[Analyzer] Zippy

Open mlodic opened this issue 1 year ago • 8 comments

Name

Zippy

Link

https://github.com/thinkst/zippy

Type of analyzer

file analyzer

Why should we use it

Detect AI-generated text [relatively] quickly via compression ratios
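
To make the "compression ratios" idea concrete, here is a rough sketch of the general principle. It is not Zippy's actual code: the function names, the use of zlib, and the sample strings are all assumptions made for illustration. The idea is to compress a reference text, compress the reference plus the sample, and see how many extra bytes the sample costs. Text that resembles the reference adds little.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def added_ratio(reference: str, sample: str) -> float:
    """Extra compressed bytes the sample costs on top of the reference,
    divided by the sample's raw size. Lower means 'more like the reference'."""
    base = compressed_size(reference)
    combined = compressed_size(reference + sample)
    raw_len = max(len(sample.encode("utf-8")), 1)  # avoid division by zero
    return (combined - base) / raw_len


# Hypothetical reference corpus and samples, only to show the call pattern.
reference = "the quick brown fox jumps over the lazy dog. " * 20
similar_sample = "the quick brown fox jumps over the lazy dog. " * 5
different_sample = "Zebras quietly vex jumpy dwarf linguists at dawn, oddly enough!"

print(added_ratio(reference, similar_sample))
print(added_ratio(reference, different_sample))
```

A real detector would still need a reference corpus of known AI-generated text and a decision threshold, neither of which is covered by this sketch.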

Possible implementation

The tool is written in Python, so we could integrate it as a library.

mlodic avatar Oct 19 '23 12:10 mlodic

I want to work on this. Can I take up this issue?

GetRohitansh avatar Oct 31 '23 17:10 GetRohitansh

yep

mlodic avatar Oct 31 '23 17:10 mlodic

Hello @mlodic, since the zippy repo provides a script meant to be used as a CLI command rather than a pre-built library, should we take the zippy script, wrap it with the functionality we need in a single analyzer file, and use that as the new analyzer?

Also I'd like to work on this issue pls :)

g4ze avatar Jan 08 '24 14:01 g4ze

I think you could install it in the same way that they suggest to install it here and then call the functionality expressed here in the analyzer file (leverage the Zippy class).

That said, the best thing would be to help them publish a package on PyPI and then rely on the PyPI installation :P

mlodic avatar Jan 08 '24 15:01 mlodic

Okay, so I did a few things:

  1. Installed zippy by cloning it and running the commands in the guide.
  2. I first tried using zippy by importing it and passing some random values, and it worked just fine. [image]
  3. I took some references from the other file analyzers already implemented and from the contribution doc, and coded an analyzer based on that.
  4. Now I'm trying to integrate it through the admin panel, which requires adding a new Python module, and that throws an error. This probably has something to do with the zippy import I'm doing. I can't find much information about how to proceed in such cases. [image: error thrown]
  5. I did take some reference from other analyzers but couldn't figure out a suitable way to implement relative imports in analyzers.
  6. This is the code I wrote to implement the analyzer:
# This file is a part of IntelOwl https://github.com/intelowlproject/IntelOwl
# See the file 'LICENSE' for copying permission.

from zippy import EnsembledZippy
from api_app.analyzers_manager.classes import FileAnalyzer
from api_app.analyzers_manager.exceptions import AnalyzerRunException
from tests.mock_utils import patch

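class ZippyAnalyser(FileAnalyzer):
    """
    Estimates whether a text file was written by a human or generated by AI
    """

    def run(self):
        binary_data = self.read_file_bytes()
        try:
            # Zippy works on text, so the file has to decode as UTF-8
            text_data = binary_data.decode("utf-8")
            response = EnsembledZippy().run_on_text_chunked(text_data)
        except Exception as e:
            # report the real failure instead of a generic import message
            raise AnalyzerRunException(f"Zippy analysis failed: {e}")
        # the result is a plain Python object, not an HTTP response,
        # so it has no .json() method and is returned as-is
        return response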

    @classmethod
    def _monkeypatch_zippy(cls):
        patches = [
            patch(
                'zippy.EnsembledZippy.run_on_text_chunked',
                return_value={'mocked_response': 'AI 0.62739394'}
            ),
        ]
        return super()._monkeypatch(patches=patches)

Can you point me to a reference explaining why the error in step 4 occurs and how I can solve it?
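
For context on the _monkeypatch_zippy hook in the code above, here is a minimal, self-contained sketch of what patching a method with a canned return value does. It assumes the patch helper behaves like the standard library's unittest.mock.patch. FakeDetector and classify are hypothetical stand-ins, not names from zippy or IntelOwl.

```python
from unittest.mock import patch


class FakeDetector:
    """Hypothetical stand-in for a detector class such as EnsembledZippy."""

    def run_on_text_chunked(self, text: str):
        raise RuntimeError("the real detector would run here")


def classify(text: str):
    # Code under test: calls the detector the same way an analyzer's run() would.
    return FakeDetector().run_on_text_chunked(text)


# While the patch is active, the method returns the canned value
# instead of executing the real implementation.
with patch.object(
    FakeDetector,
    "run_on_text_chunked",
    return_value={"mocked_response": "AI 0.62739394"},
):
    mocked = classify("some text")

print(mocked)
```

Outside the with block the original method is restored, so calling classify again would raise the RuntimeError from the stand-in.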

g4ze avatar Jan 08 '24 19:01 g4ze

Hey, thanks for your efforts :) Please open a PR and set it as "draft". That way we can help you solve this kind of problem more easily, without having to copy/paste things here, which complicates things further. ty

mlodic avatar Jan 09 '24 08:01 mlodic

Oops, I'm sorry, made a mistake in there, opening a new one right away!

g4ze avatar Jan 09 '24 12:01 g4ze

pypi installation now available through issue at installation

g4ze avatar Jan 29 '24 19:01 g4ze