[Code Addition Request]: Pipeline for detecting whether a given PDF is malicious

Open · DarshAgrawal14 opened this issue 1 year ago · 1 comment

Have you completed your first issue?

  • [X] I have completed my first issue

Guidelines

  • [X] I have read the guidelines
  • [X] I have the link to my latest merged PR

Latest Merged PR Link

https://github.com/UTSAVS26/PyVerse/pull/416

Project Description

I would like to contribute a pipeline that takes a PDF as input and extracts its metadata, content, and other relevant features. The extracted features are then processed and passed to a trained model, which predicts whether the PDF is malicious.
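
The feature extraction step of this pipeline could look roughly like the sketch below. The issue does not say which features or which model are used, so everything here is an assumption for illustration: the keyword list, the feature names, and the helper functions `extract_features` and `to_vector` are hypothetical. The sketch only scans the raw bytes with the standard library and does not parse the PDF structure, so obfuscated names (for example hex-escaped ones) would be missed.

```python
import re

# Hypothetical keyword list (an assumption, not taken from the issue):
# PDF name objects that are often associated with active content.
SUSPICIOUS_KEYWORDS = (
    b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch", b"/EmbeddedFile",
)


def extract_features(pdf_bytes: bytes) -> dict:
    """Return simple numeric features computed from the raw bytes of a PDF."""
    features = {
        "size_bytes": len(pdf_bytes),
        # 1 if the file starts with the standard "%PDF-" header, else 0.
        "has_pdf_header": int(pdf_bytes.startswith(b"%PDF-")),
        # Indirect objects look like "12 0 obj"; "endobj" is not matched.
        "num_objects": len(re.findall(rb"\d+\s+\d+\s+obj\b", pdf_bytes)),
        "num_streams": pdf_bytes.count(b"endstream"),
    }
    for keyword in SUSPICIOUS_KEYWORDS:
        # The negative lookahead keeps "/JS" from matching longer names like "/JSFoo".
        pattern = re.escape(keyword) + rb"(?![A-Za-z0-9])"
        name = "count_" + keyword.decode("ascii").lstrip("/")
        features[name] = len(re.findall(pattern, pdf_bytes))
    return features


def to_vector(features: dict) -> list:
    """Order features by name so every PDF yields a vector with the same layout."""
    return [features[name] for name in sorted(features)]
```

In practice the bytes would come from reading the file (for example `Path("sample.pdf").read_bytes()`), and the list returned by `to_vector` would be passed to whatever classifier the training notebook produces.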

Additions:

  • Model training notebook
  • PDF feature extraction notebook
  • Data processing and model prediction notebook
  • Dataset used
  • Trained Model
  • Readme.md

Full Name

Darsh Agrawal

Participant Role

GSSOC

DarshAgrawal14 · Oct 13 '24 20:10

🙌 Thank you for bringing this issue to our attention! We appreciate your input and will investigate it as soon as possible.

Feel free to join our community on Discord to discuss more!

github-actions[bot] · Oct 13 '24 20:10

✅ This issue has been closed. Thank you for your contribution! If you have any further questions or issues, feel free to join our community on Discord to discuss more!

github-actions[bot] · Oct 16 '24 02:10