ML-Crate
Gemini Generated Essays Analysis
ML-Crate Repository (Proposing new issue)
:red_circle: Project Title : Gemini Generated Essays Analysis
:red_circle: Aim : The aim of this project is to analyze the essays generated by Gemini using ML.
:red_circle: Dataset : https://www.kaggle.com/datasets/mouadberqia/gemini-generated-essays
:red_circle: Approach : Try to use 3-4 algorithms to implement the models, and compare them by their accuracy scores to find the best-fitting algorithm. Also, do not forget to do an exploratory data analysis before creating any model.
📍 Follow the Guidelines to Contribute in the Project :
- You need to create a separate folder named as the Project Title.
- Inside that folder, there will be four main components.
- Images - To store the required images.
- Dataset - To store the dataset or, information/source about the dataset.
- Model - To store the machine learning model you've created using the dataset.
- requirements.txt - This file will contain the required packages/libraries to run the project on other machines.
- Inside the Model folder, the README.md file must be filled in properly, with proper visualizations and conclusions.
:red_circle::yellow_circle: Points to Note :
- The issues will be assigned on a first come first serve basis, 1 Issue == 1 PR.
- "Issue Title" and "PR Title" should be the same. Include the issue number along with it.
- Follow the Contributing Guidelines & Code of Conduct before you start contributing.
:white_check_mark: To be Mentioned while taking the issue :
- Full name :
- GitHub Profile Link :
- Participant ID (If not, then put NA) :
- Approach for this Project :
- What is your participant role? (Mention the Open Source Program name. Eg. HRSoC, GSSoC, GSOC etc.)
Happy Contributing 🚀
All the best. Enjoy your open source journey ahead. 😎
Full name : Jaya Kedia
GitHub Profile Link : https://github.com/jayakedia10
Participant ID: NA
Approach for this Project : The algorithms I have planned to use for implementing the models are Natural Language Processing (NLP) algorithms like Naive Bayes, Support Vector Machines (SVM) or Random Forest, RNNs.
What is your participant role? JWoC 2024
@abhisheks008 Please assign me this issue.
Assigned under JWOC @jayakedia10
Full name: Milan Prajapati
GitHub Profile Link: GitHub_Profile
Participant ID (If not, then put NA): NA
Approach for this Project:
1. Exploratory Data Analysis (EDA): Analyze the dataset using visualizations and summary statistics to gain insights.
2. Data Preprocessing: Clean and prepare the data by tokenizing text, removing stopwords, and applying other text preprocessing techniques.
3. Model Implementation: Implement 3-4 machine learning algorithms, such as Logistic Regression, Random Forest, SVM, and Naive Bayes, for sentiment analysis.
4. Model Comparison: Evaluate the models using accuracy scores and other relevant metrics to identify the best-performing model.
5. Documentation: Document the entire process, including EDA, preprocessing steps, model implementations, comparisons, and conclusions in the README.md file.
What is your participant role? : SSoC (Social Summer of Code)
Sir, can you please assign this project to me?
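For readers following along, the EDA and preprocessing steps proposed above could be sketched roughly as follows. This is only a minimal illustration using the Python standard library: the `STOPWORDS` set, the `preprocess` and `summary_stats` helpers, and the sample essays are all hypothetical and are not taken from the dataset or from any contributor's code.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch);
# a real project would likely use a fuller list from an NLP library.
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "it", "that"}


def preprocess(text):
    """Lowercase the text, tokenize on letters/apostrophes, drop stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def summary_stats(essays):
    """Return simple EDA figures: essay count, mean token count, top words."""
    token_lists = [preprocess(e) for e in essays]
    counts = Counter(t for tokens in token_lists for t in tokens)
    mean_len = sum(len(tokens) for tokens in token_lists) / len(token_lists)
    return {
        "n_essays": len(essays),
        "mean_tokens": mean_len,
        "top_words": counts.most_common(3),
    }


if __name__ == "__main__":
    # Made-up sample texts, only to show the call shape.
    sample = ["The cat sat.", "A cat and a dog."]
    print(summary_stats(sample))
```

In practice the essay texts would be read from the Kaggle dataset file, and the summary figures would feed the visualizations mentioned in the approach.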
Full name: Eshu Patel
GitHub Profile Link: https://github.com/EshuPatel
Participant ID: NA
Approach for this Project:
- Performing EDA to analyze the dataset using visualizations and summary statistics to gain insights.
- Wrangling the data and removing noisy data.
- Implementing ML algorithms for NLP.
- Checking the model against accuracy scores and other metrics.
- Documentation of the entire project summary.
Participant role: SSoC (Social Summer of Code)
Sir, can you please assign this project to me?
@milanprajapati571 @EshuPatel I need a brief approach for this problem statement with a planning of implementing 7-8 models.
Name: Harsh Raj
GitHub Profile: https://github.com/HarshRaj29004
Participant ID: NA
Approach: I will preprocess the dataset and check the text for grammatical errors and readability, then predict the quality of the text using Random Forest, Gradient Boosting Machines, and Neural Networks.
What is your participant role? SSOC'24
> I need a brief approach for this problem statement with a planning of implementing 7-8 models.
Sure sir, Brief approach is given below:
- After cleaning, perform EDA on the dataset to gain more insights about data.
- After preprocessing the data, we will implement various sentiment analysis models, including Naive Bayes, Multinomial Naive Bayes, CNN, Random Forest, Logistic Regression, SVM, TextBlob, VADER and spaCy.
- Then we will perform a comparison showing which model provides the most accurate results.
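The comparison step above could look something like the sketch below. It assumes each model is exposed as a plain "predict" callable that maps a list of texts to a list of labels; the `accuracy` and `compare_models` helpers, the two toy predictors, and the `pos`/`neg` labels are hypothetical stand-ins for illustration, not the actual models or dataset labels.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def compare_models(models, X_test, y_test):
    """Score each predict callable on the test set and name the best one.

    models: dict mapping a model name to a function texts -> labels.
    Returns (scores, best_name).
    """
    scores = {name: accuracy(y_test, predict(X_test))
              for name, predict in models.items()}
    best_name = max(scores, key=scores.get)
    return scores, best_name


if __name__ == "__main__":
    # Toy data and toy "models", purely illustrative.
    X_test = ["good essay", "bad essay", "good work", "poor"]
    y_test = ["pos", "neg", "pos", "neg"]
    models = {
        "always_pos": lambda X: ["pos"] * len(X),
        "keyword": lambda X: ["pos" if "good" in x else "neg" for x in X],
    }
    scores, best = compare_models(models, X_test, y_test)
    print(scores, best)
```

Keeping the interface to a bare callable means trained models from different libraries could be compared side by side, as long as each is wrapped to accept the same test inputs. Accuracy alone can be misleading on imbalanced labels, so other metrics may be worth reporting alongside it.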
Assigned @EshuPatel