

NLP Projects Repository

Welcome to the NLP Projects repository, a collection of Natural Language Processing (NLP) projects covering text classification, spelling correction, similarity measurement, summarization, and knowledge graph construction.

Table of Contents

  1. News Classification
  2. Auto Correct
  3. Measure Similarity
  4. Text Summarization
  5. Email Spam Detection
  6. Resume Classification
  7. Knowledge Graph

1. News Classification

  • Description: This project aims to classify news articles into different categories using NLP techniques. It involves text preprocessing, feature extraction, and machine learning classification algorithms.

  • Files:

    • fars_news_v1.2.ipynb: Jupyter notebook containing the code for news classification.
    • test.txt: Sample test data for the classification.
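
The preprocessing, feature extraction, and classification steps described above can be sketched with a small multinomial Naive Bayes classifier. This is an illustrative sketch only: the notebook may use a different model, and the class name `NaiveBayesClassifier`, the `tokenize` helper, and the toy training data are assumptions, not code from the repository.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Preprocessing: lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        # Feature extraction: per-class bag-of-words counts.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Words never seen during training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, not taken from the repository.
train_texts = [
    "the team won the football match",
    "the striker scored two goals",
    "the central bank raised interest rates",
    "stock markets fell after the report",
]
train_labels = ["sports", "sports", "economy", "economy"]

clf = NaiveBayesClassifier().fit(train_texts, train_labels)
prediction = clf.predict("the bank cut interest rates")
print(prediction)
```

Working in log space avoids numeric underflow on long articles, and the smoothing term keeps a single unseen word-class pair from zeroing out a whole category.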

2. Auto Correct

  • Description: Implementation of an auto-correct system using NLP algorithms. The system corrects spelling mistakes in text input by suggesting the most probable corrections based on context and language models.

  • Files:

    • main.py: Main script for the auto-correct system.
    • model/: Directory containing modules for edit distance calculation, Jaccard similarity, and preprocessing.
    • words_en.csv: English word dataset.
    • words_fa.csv: Persian (Farsi) word dataset.
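
As a rough illustration of how edit distance and Jaccard similarity can work together, the sketch below filters candidate words by Levenshtein distance and breaks ties with Jaccard similarity over character bigrams. The function names (`edit_distance`, `jaccard`, `suggest`), the `max_distance` parameter, and the tiny vocabulary are hypothetical; the modules in `model/` may be organized differently, and this sketch does not use context or a language model.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def bigrams(word):
    """Set of character bigrams, e.g. 'cat' -> {'ca', 'at'}."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def jaccard(a, b):
    """Jaccard similarity between the bigram sets of two words."""
    sa, sb = bigrams(a), bigrams(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def suggest(word, vocabulary, max_distance=2):
    """Return vocabulary words within max_distance edits, best first."""
    scored = []
    for candidate in vocabulary:
        distance = edit_distance(word, candidate)
        if distance <= max_distance:
            # Sort by distance first, then by higher Jaccard similarity.
            scored.append((distance, -jaccard(word, candidate), candidate))
    return [candidate for _, _, candidate in sorted(scored)]


# Hypothetical vocabulary standing in for a word list such as words_en.csv.
vocabulary = ["language", "luggage", "package", "lineage"]
suggestions = suggest("langauge", vocabulary)
print(suggestions)
```

Neither function assumes a particular alphabet, so the same approach applies to both the English and the Persian word lists.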

3. Measure Similarity

  • Description: Measures the similarity between two texts using metrics such as cosine similarity, Jaccard similarity, or edit distance.

  • Files:

    • main.ipynb: Jupyter notebook containing code for measuring text similarity.
    • data set/: Directory containing sample text files for similarity measurement.
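
Two of the metrics named above can be sketched in a few lines. This example computes cosine similarity over term-frequency vectors and Jaccard similarity over word sets; the function names and sample sentences are assumptions for illustration, and the notebook may weight or tokenize text differently.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-frequency vectors of two texts."""
    va, vb = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_similarity(text_a, text_b):
    """Shared distinct words divided by all distinct words."""
    sa, sb = set(tokenize(text_a)), set(tokenize(text_b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


# Hypothetical sample texts.
doc_a = "the cat sat on the mat"
doc_b = "the cat lay on the rug"
cos = cosine_similarity(doc_a, doc_b)
jac = jaccard_similarity(doc_a, doc_b)
print(f"cosine={cos:.3f} jaccard={jac:.3f}")
```

For these two sentences the cosine similarity is 0.75 and the Jaccard similarity is 3/7. Cosine is sensitive to how often a word repeats, while Jaccard only considers whether a word is present.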

4. Text Summarization

  • Description: Implementation of text summarization techniques using NLP. The project aims to generate concise summaries of large text documents or articles while preserving the key information.

  • Files:

    • main.ipynb: Jupyter notebook with code for text summarization.
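
One simple way to produce such summaries is extractive: score each sentence by how frequent its words are in the whole document and keep the top-scoring sentences. The sketch below assumes this frequency-based approach, which may not be the technique used in the notebook; `summarize`, `STOP_WORDS`, and the sample text are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "on", "the", "to"}


def content_words(text):
    """Lowercased word tokens with stop words removed."""
    return [w for w in re.findall(r"\w+", text.lower()) if w not in STOP_WORDS]


def summarize(text, num_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        # Average document-level frequency of the sentence's content words.
        tokens = content_words(sentence)
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)


# Hypothetical sample document.
text = (
    "Neural networks learn patterns from data. "
    "Training neural networks requires large amounts of data. "
    "The weather was pleasant yesterday."
)
summary = summarize(text, num_sentences=1)
print(summary)
```

Averaging by sentence length keeps long sentences from winning simply because they contain more words.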

5. Email Spam Detection

  • Description: Detecting spam emails using various machine learning algorithms and NLP features. The project involves text preprocessing, feature extraction, model training, and evaluation.

  • Files:

    • src/: Directory containing scripts for preprocessing, feature extraction, model training, and evaluation.
    • README.md: Details about the project and its implementation.
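
For the evaluation step, spam detectors are usually judged on precision and recall rather than accuracy alone, because spam and non-spam classes are often imbalanced. The following sketch computes these metrics from a list of true and predicted labels. The `evaluate` function, the label names `spam` and `ham`, and the sample predictions are assumptions; the scripts in `src/` may report different metrics.

```python
def evaluate(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall, and F1 for the positive class.

    Assumes y_true and y_pred are non-empty and of equal length.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: one spam message missed, one ham message flagged.
y_true = ["spam", "spam", "spam", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "spam", "ham", "ham"]

metrics = evaluate(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Low precision means legitimate email is being flagged as spam, while low recall means spam is reaching the inbox, so the two are worth tracking separately.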

6. Resume Classification

  • Description: Classifying resumes into different categories based on their content. It involves extracting relevant information from resumes and using machine learning algorithms for classification.

  • Files:

    • resume_classification.ipynb: Jupyter notebook for resume classification.
    • resume_dataset.csv: Dataset containing resume samples.
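
One common way to extract relevant information from a resume before classification is TF-IDF weighting, which highlights terms that are frequent in one resume but rare across the rest. The sketch below is a minimal, assumed implementation: the notebook may use a library vectorizer or other features, and `tfidf`, `top_terms`, and the sample resumes are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def tfidf(documents):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def top_terms(vector, k=3):
    """Highest-weighted terms, ties broken alphabetically."""
    return sorted(vector, key=lambda t: (-vector[t], t))[:k]


# Hypothetical resume snippets, not taken from resume_dataset.csv.
resumes = [
    "python developer with experience in python and machine learning",
    "accountant with experience in auditing and tax reporting",
    "nurse with experience in patient care and emergency care",
]
vectors = tfidf(resumes)
for vector in vectors:
    print(top_terms(vector, k=1))
```

Words that appear in every resume, such as "experience", receive a weight of zero, so the resulting vectors emphasize the terms that distinguish one category from another. These vectors can then be passed to any standard classifier.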

7. Knowledge Graph

  • Description: Building a knowledge graph from text data. The project involves extracting entities and relationships from unstructured text and representing them in a structured graph format.

  • Files:

    • example/: Directory containing example data and scripts for building the knowledge graph.
    • src/: Directory containing scripts for extracting details, processing data, and building the knowledge graph.
    • README.md: Information about the project and how to use it.
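
The idea of turning unstructured text into a graph can be sketched with pattern-based triple extraction. This example is deliberately simplified: it assumes a few fixed sentence patterns instead of a trained entity or relation extractor, and `RELATION_PATTERNS`, `extract_triples`, `build_graph`, and the sample text are hypothetical rather than taken from `src/`.

```python
import re
from collections import defaultdict

# Hypothetical patterns of the form "<subject> <relation phrase> <object>."
RELATION_PATTERNS = {
    "born_in": re.compile(r"^(?P<subj>.+?) was born in (?P<obj>.+?)\.?$"),
    "works_at": re.compile(r"^(?P<subj>.+?) works at (?P<obj>.+?)\.?$"),
    "located_in": re.compile(r"^(?P<subj>.+?) is located in (?P<obj>.+?)\.?$"),
}


def extract_triples(text):
    """Return (subject, relation, object) triples found in the text."""
    triples = []
    for sentence in re.split(r"(?<=\.)\s+", text.strip()):
        for relation, pattern in RELATION_PATTERNS.items():
            match = pattern.match(sentence)
            if match:
                triples.append((match["subj"], relation, match["obj"]))
                break
    return triples


def build_graph(triples):
    """Adjacency list: subject -> list of (relation, object) edges."""
    graph = defaultdict(list)
    for subj, relation, obj in triples:
        graph[subj].append((relation, obj))
    return dict(graph)


# Hypothetical sample text.
text = "Ada works at Acme. Acme is located in London. London is located in England."
triples = extract_triples(text)
graph = build_graph(triples)
for subject, edges in graph.items():
    for relation, obj in edges:
        print(f"{subject} --{relation}--> {obj}")
```

Storing edges as labeled triples keeps the structure easy to export to a graph database or a visualization library later.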

Feel free to explore each project folder for more details and instructions on how to run the code.