PyVerse [Code Addition Request]: Pipeline for detecting whether a given PDF is malicious

Have you completed your first issue?

[X] I have completed my first issue

Guidelines

[X] I have read the guidelines
[X] I have the link to my latest merged PR

Latest Merged PR Link

https://github.com/UTSAVS26/PyVerse/pull/416

Project Description

I would like to contribute by developing a pipeline that, when provided with a PDF, extracts metadata, content, and other relevant features. These extracted elements are then processed and passed to a model, which predicts whether the PDF is malicious.
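
For illustration, the feature-extraction step could look roughly like the sketch below. It is only a minimal example under my own assumptions: the keyword list, the feature names, and the helpers `extract_features` and `to_vector` are hypothetical and not taken from the proposed notebooks. It counts a few PDF name objects in the raw bytes and turns the counts into a fixed-order vector that a classifier could consume.

```python
import re

# Hypothetical keyword list: PDF name objects that static triage tools often
# count. The proposal does not say which features the notebooks actually use.
SUSPICIOUS_KEYS = ("JavaScript", "JS", "OpenAction", "AA", "Launch", "EmbeddedFile")

# Fixed column order so every PDF maps to a vector of the same shape.
FEATURE_ORDER = ("size_bytes", "has_pdf_header", "obj_count") + SUSPICIOUS_KEYS


def extract_features(pdf_bytes: bytes) -> dict:
    """Return simple structural counts taken from the raw bytes of a PDF."""
    features = {
        "size_bytes": len(pdf_bytes),
        "has_pdf_header": pdf_bytes.startswith(b"%PDF-"),
        # Indirect objects look like "12 0 obj"; "endobj" is not counted.
        "obj_count": len(re.findall(rb"\d+\s+\d+\s+obj\b", pdf_bytes)),
    }
    for key in SUSPICIOUS_KEYS:
        # The lookahead keeps "/JS" from matching a longer name such as "/JSX".
        pattern = b"/" + re.escape(key.encode()) + rb"(?![A-Za-z0-9])"
        features[key] = len(re.findall(pattern, pdf_bytes))
    return features


def to_vector(features: dict) -> list:
    """Flatten the feature dict into integers in FEATURE_ORDER."""
    return [int(features[name]) for name in FEATURE_ORDER]
```

The resulting vector could then be passed to whatever model the training notebook produces. Matching raw bytes like this would miss names written with `#xx` hex escapes or stored inside compressed streams, so a real pipeline would probably need a proper PDF parser on top of it.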

Additions:

Model training notebook
PDF feature extraction notebook
Data processing and model prediction notebook
Dataset used
Trained model
Readme.md

Full Name

Darsh Agrawal

Participant Role

GSSOC

Oct 13 '24 20:10 DarshAgrawal14

🙌 Thank you for bringing this issue to our attention! We appreciate your input and will investigate it as soon as possible.

Feel free to join our community on Discord to discuss more!

Oct 13 '24 20:10 github-actions[bot]

✅ This issue has been closed. Thank you for your contribution! If you have any further questions or issues, feel free to join our community on Discord to discuss more!

Oct 16 '24 02:10 github-actions[bot]