Master Machine learning

Description

Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.

Importance of Machine Learning

Machine learning is important because it gives enterprises a view of trends in customer behavior and business operational patterns, and supports the development of new products. Many of today's leading companies, such as Facebook, Google and Uber, make machine learning a central part of their operations. Machine learning has become a significant competitive differentiator for many companies.

🌱Pre-requisites

Python IDE: Install it by using this link: python.org
If you are new to Python programming and want a fair knowledge of it before you start, you can learn it in a simplified way through this website

Topics

Extracting Data

Extraction is a general term for methods of constructing combinations of the variables to get around the problems of analyzing a large number of variables, while still describing the data with sufficient accuracy.

Web scraping - Library used :->> Beautiful Soup, which extracts data from web pages.
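As a rough illustration of what extraction with Beautiful Soup looks like, here is a minimal sketch. It assumes the `beautifulsoup4` package is installed, and it parses a small inline HTML string (a made-up stand-in for a downloaded page) so that it runs without network access. The `extract_links` helper is a hypothetical name used only for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet standing in for a downloaded web page.
HTML = """
<html><body>
  <h1>Projects</h1>
  <ul>
    <li><a href="/spam-mail-detection">Spam Mail Detection</a></li>
    <li><a href="/weather-prediction">Weather Prediction</a></li>
  </ul>
</body></html>
"""


def extract_links(html):
    """Return (text, href) pairs for every anchor tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")  # parser from the standard library
    return [(a.get_text(strip=True), a["href"]) for a in soup.find_all("a", href=True)]


links = extract_links(HTML)
for text, href in links:
    print(f"{text} -> {href}")
```

For a real page you would first download the HTML (for example with `urllib.request` or `requests`) and pass the response text to the same function.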

Visualization

Data visualization is the discipline of trying to understand data by placing it in a visual context so that patterns, trends and correlations that might not otherwise be detected can be exposed. Python offers multiple great graphing libraries that come packed with lots of different features.

Libraries used to present data as graphs and other graphical representations :->> Seaborn, pandas, matplotlib, etc.

Feature selection (Variable Selection)

Feature selection is the process of selecting a subset of relevant features for use in a model. Having irrelevant features in your data can decrease the accuracy of the model, because the model may learn from those irrelevant features.

Library commonly used for feature selection :->> scikit-learn
Link - https://machinelearningmastery.com/feature-selection-with-real-and-categorical-data/
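To make the idea concrete, here is a small pure-Python sketch of one simple filter-style approach: score each feature by the absolute value of its correlation with the target and keep the top k. This is not the scikit-learn API; the function names (`pearson`, `select_k_best`) and the toy data are assumptions made for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / math.sqrt(var_x * var_y)


def select_k_best(features, target, k):
    """Keep the k feature names with the largest |correlation| to the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: two informative columns and one irrelevant one.
features = {
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "practice_tests": [0, 1, 1, 2, 3, 3],
    "shoe_size": [9, 7, 10, 8, 9, 8],
}
exam_score = [52, 55, 61, 68, 74, 79]

print(select_k_best(features, exam_score, k=2))
```

Correlation only captures linear relationships with a numeric target, so treat this as a starting point. scikit-learn's `sklearn.feature_selection` module offers more complete tools.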

Basic concepts of statistics

A). Understand the Types of Analytics

Descriptive Analytics tells us what happened in the past and helps a business understand how it is performing by providing context to help stakeholders interpret information.
Diagnostic Analytics takes descriptive data a step further and helps you understand why something happened in the past.
Predictive Analytics predicts what is most likely to happen in the future and provides companies with actionable insights based on the information.
Prescriptive Analytics provides recommendations regarding actions that will take advantage of the predictions and guide the possible actions toward a solution.

B). Probability

Conditional Probability
Independent Events
Mutually Exclusive Events
Bayes’ Theorem
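Bayes' Theorem ties the other probability topics together: it gives P(A | B) from P(A), P(B | A) and P(B | not A). The sketch below works through it with made-up spam filter numbers; both the numbers and the function name `bayes_posterior` are assumptions for illustration.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A | B) given P(A), P(B | A) and P(B | not A)."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 20% of mail is spam; the word "offer" appears
# in 60% of spam messages and in 5% of non-spam messages.
p = bayes_posterior(prior=0.20, likelihood=0.60, false_positive_rate=0.05)
print(f"P(spam | 'offer') = {p:.3f}")
```

With these numbers the evidence term is 0.12 + 0.04 = 0.16, so the posterior is 0.12 / 0.16 = 0.75: seeing the word raises the spam probability from 20% to 75%.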

C). Central Tendency

Mean
Mode
Variance
Skewness
Kurtosis
Standard Deviation
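The measures listed above can be computed with Python's built-in `statistics` module. Skewness and kurtosis are not part of that module, so the sketch below defines them by hand as the third and fourth standardized moments (population versions). The sample data is made up.

```python
import statistics


def skewness(data):
    """Population skewness: the third standardized moment."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return sum(((x - mu) / sigma) ** 3 for x in data) / len(data)


def kurtosis(data):
    """Population kurtosis: the fourth standardized moment (3.0 for a normal distribution)."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return sum(((x - mu) / sigma) ** 4 for x in data) / len(data)


data = [2, 3, 3, 4, 5, 5, 5, 7, 11]  # hypothetical sample

print("mean:    ", statistics.mean(data))
print("mode:    ", statistics.mode(data))
print("variance:", statistics.variance(data))  # sample variance (n - 1)
print("std dev: ", statistics.stdev(data))     # sample standard deviation
print("skewness:", skewness(data))             # > 0 means a longer right tail
print("kurtosis:", kurtosis(data))
```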

D). Variability

Range: The difference between the highest and lowest value in the dataset.
Percentiles: A measure that indicates the value below which a given percentage of observations in a group of observations falls.
Quartiles: Values that divide the data points into four more or less equal parts, or quarters.
Interquartile Range (IQR): A measure of statistical dispersion and variability based on dividing a data set into quartiles. IQR = Q3 − Q1
Variance: The average squared difference of the values from the mean. It measures how spread out a data set is relative to its mean.
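Here is a short sketch of these measures using only the standard library. It relies on `statistics.quantiles`, whose default "exclusive" method is one of several common conventions, so other tools may give slightly different quartiles. The data set is hypothetical.

```python
import statistics

data = [4, 7, 8, 10, 12, 15, 18, 21, 25]  # hypothetical observations

# Range: highest value minus lowest value.
value_range = max(data) - min(data)

# Quartiles: the three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # IQR = Q3 - Q1

# Percentiles: with n=100 there are 99 cut points; index 89 is the 90th.
p90 = statistics.quantiles(data, n=100)[89]

# Population variance: the average squared difference from the mean.
variance = statistics.pvariance(data)

print("range:", value_range)
print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", iqr)
print("90th percentile:", p90)
print("variance:", variance)
```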

E). Relationship Between Variables

Causality: Relationship between two events where one event is affected by the other.
Covariance: A quantitative measure of the joint variability between two or more variables.
Correlation: Measures the strength of the linear relationship between two variables. It ranges from -1 to 1 and is the normalized version of covariance.
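The link between covariance and correlation can be seen in a few lines of Python. This sketch writes both by hand (sample versions, dividing by n - 1) so the normalization step is visible. The data is made up, and a high correlation here says nothing about causality.

```python
import math


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    std_x = math.sqrt(covariance(xs, xs))  # covariance of x with itself is its variance
    std_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (std_x * std_y)


# Hypothetical daily observations.
temperature = [20, 22, 25, 27, 30]
ice_cream_sales = [110, 125, 150, 160, 190]
heater_sales = [90, 80, 65, 55, 40]

print("cov(temp, ice cream): ", covariance(temperature, ice_cream_sales))
print("corr(temp, ice cream):", correlation(temperature, ice_cream_sales))
print("corr(temp, heaters):  ", correlation(temperature, heater_sales))
```

Covariance depends on the units of the data, while correlation is unit-free, which is why correlation is easier to compare across variable pairs.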

F). Probability Distribution

Probability Mass Function (PMF): A function that gives the probability that a discrete random variable is exactly equal to some value.
Probability Density Function (PDF): A function for continuous data where the value at any given sample can be interpreted as providing a relative likelihood that the value of the random variable would equal that sample.
Cumulative Distribution Function (CDF): A function that gives the probability that a random variable is less than or equal to a certain value.
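A small sketch of the three functions, using a binomial distribution for the discrete case and `statistics.NormalDist` for the continuous case. The coin-flip and height parameters are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """PMF: probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """CDF: probability of at most k successes, i.e. the PMF summed up to k."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


# Discrete case: number of heads in 4 fair coin flips.
print("P(X = 2): ", binomial_pmf(2, 4, 0.5))
print("P(X <= 2):", binomial_cdf(2, 4, 0.5))

# Continuous case: hypothetical heights, mean 170 cm, standard deviation 10 cm.
heights = NormalDist(mu=170, sigma=10)
print("density at 170:", heights.pdf(170))  # a relative likelihood, not a probability
print("P(height <= 180):", heights.cdf(180))
```

Note that the PDF value is a density: probabilities for continuous variables come from areas under the curve, which is what the CDF returns.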

G). Hypothesis Testing and Statistical Significance

Null and Alternative Hypothesis
Interpretation
Z-Test
T-Test
ANOVA (Analysis of Variance)
Chi-Square Test
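As one example of how these pieces fit together, here is a minimal sketch of a two-sided one-sample Z-test using only the standard library. It assumes the population standard deviation is known; the sample values, the population parameters and the 0.05 significance level are all hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample, pop_mean, pop_std):
    """Return (z statistic, two-sided p-value) for H0: the true mean equals pop_mean."""
    n = len(sample)
    sample_mean = sum(sample) / n
    standard_error = pop_std / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-sided p-value: probability of a result at least this extreme under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical measurements; H0 says the population mean is 50.
sample = [51, 53, 52, 50, 54, 52, 51, 53, 52]
z, p_value = one_sample_z_test(sample, pop_mean=50, pop_std=3)

alpha = 0.05  # assumed significance level
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

When the population standard deviation is unknown and the sample is small, a T-Test is the usual choice instead. Libraries such as SciPy provide ready-made implementations of the tests listed above.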

H). Regression

Linear Regression

Assumptions of Linear Regression

    - Linear Relationship
    - Multivariate Normality
    - No or Little Multicollinearity
    - No or Little Autocorrelation
    - Homoscedasticity

Multiple Linear Regression
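To show what fitting a linear regression involves, here is a sketch of simple (one-variable) linear regression by ordinary least squares in plain Python. The helper names `fit_line` and `r_squared` and the data are assumptions for illustration. Multiple linear regression extends the same idea to several input variables and is usually done with a library such as scikit-learn or statsmodels.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: years of experience vs. salary (in thousands).
experience = [1, 2, 3, 4, 5, 6]
salary = [32, 35, 41, 44, 50, 52]

slope, intercept = fit_line(experience, salary)
print(f"salary = {slope:.2f} * experience + {intercept:.2f}")
print(f"R^2 = {r_squared(experience, salary, slope, intercept):.3f}")
```

A good R^2 alone does not confirm the assumptions listed above. Checking the residuals for patterns, unequal spread or autocorrelation is still needed.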

Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains.

Why is data science important?

In business, the goal of data science is to provide intelligence about consumers and campaigns and help companies create strong plans to engage their audience and sell their products.

Data scientists must rely on creative insights using big data, the large amounts of information collected through various collection processes, like data mining. On an even more fundamental level, big data analytics can help brands understand the customers who ultimately help determine the long-term success of a business or initiative. In addition to targeting the right audience, data science can be used to help companies control the stories of their brands. Because big data is a rapidly growing field, there are constantly new tools available, and those tools need experts who can quickly learn their applications. Data scientists can help companies create a business plan to achieve goals based on research and not just intuition.
Data science plays a very important role in security and fraud detection, because the massive amounts of information allow for drilling down to find slight irregularities in data that can expose weaknesses in security systems. It is also a driving force behind highly specialized user experiences created through personalization and customization. The analysis can be used to make customers feel seen and understood by a company.

What are the six major areas of data science?

The six major areas of data science include the following:

Multidisciplinary investigations. Considering large, complex systems with interconnected pieces, data scientists use varying methods to collect large amounts of data.
Models and methods for data. Data scientists need to rely on experience and intuition to decide which methods will work best for modeling their data, and they need to adjust those methods continuously to home in on the insights they seek.
Pedagogy. It is up to data scientists to work with companies and clients to determine the best ideologies to apply while collecting and analyzing information about their customers and products.
Computing with data. The biggest thing that all data science projects have in common is the necessity to use tools and software to analyze the involved algorithms and statistics, because the size of the pool of information they are working with is so massive.
Theory. Data science theory is an evolving and sophisticated professional arena with countless applications.
Tool evaluation. There are many tools available for data scientists to use to manipulate and study huge quantities of data, and it's important to always evaluate their effectiveness and keep trying new ones as they become available.

Useful URLs

https://www.kdnuggets.com/2020/06/8-basic-statistics-concepts.html
https://www.coursera.org/learn/machine-learning-with-python
https://www.w3schools.com/python/python_ml_getting_started.asp
https://www.freecodecamp.org/learn/machine-learning-with-python/
https://www.greatlearning.in/great-lakes-pgpdsba?&utm_source=Google&utm_medium=Search&utm_campaign=6Cities_Exact_Data_Science_Search_New_DS&adgroup_id=101317851589&campaign_id=10174480218&Keyword=data%20scientist&placement=&utm_content=c&gclid=CjwKCAjwn6GGBhADEiwAruUcKqPCvPIk1X_5mVRXj5prdpSIULnd40QgTB4kChfiFgAL1kDErGeLHRoCapUQAvD_BwE

Get Started

This repo is a collection of machine learning with Python and data science material: algorithms, projects and explanations, from basic to advanced level.
It covers topics such as machine learning, deep learning, SQL, natural language processing, object detection, classification, recommendation systems, chatbots and much more.

Take a look at existing projects

Content List
Advanced Visualizations
Alzheimer's Disease Predictor
Analysis And predict_Black_friday_sale
Audio Classification
Automatic Summarization of Scientific Papers
Basics of ML and DL
Basics of Power Bi
Basics of the Python
Bidirectional LSTM
Bird Species Classification Web App
Bitcoin Price Prediction Web App
Bitcoin Price Predictor
CBT_ChatBot
COVID_19-DATA-ANALYSIS
Cheat Sheets
Class Imbalance problem
Classification Algorithms
Cloud Details
Covid19 forecasting with prophet
Covid_Third_Wave_Forecasting
CrowdAI Plant Disease
Customer Segmentation using Machine Learning
Data Cleaning Techniques
Data Filling and Cleaning Techniques
Different types of Clustering
Different types of feature selection techniques
Different_types_of_scaling_Method
Driver_Drowsiness_Detection
EDA-and-Perform-Modelling-on-Ionosphere-Dataset-main
Email Classifier
Emotion Recognition Based on NLP
Ensemble Methds in ML
Explaination and Example for P value with code
Exploratory-data-analysis
Extract_Text_from_PDF_using_Python
Fake_News_Detection
File of SQL Commands
Fish-Weight-Estimation
Flight_delay_prediction_project
GDP Prediction
GUI-JARVIS
Gender Pay Gap Analysis
Google Teachable Machine
Handwritten Equation Solver using CNN
Handwritten character recognition
Heart_Predection
HollywoodMarketSynopsis
IMDB Box Office Prediction
LanguageDetection
Medical Charges for Smokers and Non-smoker
Medical_Help_Chatbot
Meteorite Landing Data Analysis
Movie-Recommendation-System
Movie-Recommender-System using python
Nasa-Asteroids-Dataset-Analysis
NumPy - Basics
Number_of_people_counter
OCR-Medicine-Reader
Object Detection
Ola Bike Ride Request Demand Forecast
Optical character recognition (OCR)
Plant Seedlings Classification
R language
Random forest from scratch
Random forest test
Rock Paper Scissors Python Game
Sentiment analysis for depression based on social media posts
Sentiment-Analysis
Skin Disease Predictor
Spam Mail Detection
Speech_Emotion_Recognition
Spelling Corrector
Sports Analytics Project
Startup_Profit_Prediction
Stock Price Analysis
Sudoku Solver using CNN
Tensorflow.js Demo
Time Series Forecasting with Python
Time-Series LSTM Model
Unique Chatbot
Various Plots using Matplot,Seaborn,Pandas
Vehicles and Pedestrian Detection
Weather Prediction
Web-Scraping-with-Beautiful-Soup-master
XgBoost_Algorithm
ensemble-methods-notebooks-master
heart failure
job_Advertisement_detection
logistic_regression_scratch
recommendation_system
.DS_Store
Analysis_of_Temperature_Rise_in_PMSM.ipynb
Beautiful Soup.ipynb
Ensemble learning.docx
Ensemble-Learning (Stacking)
Machine Hack -1.ipynb
README.md updated file
Role_from_Resume.ipynb
Sql
Statistics- Basics.ipynb
Test Task_NIket.ipynb
UBER_DATA_ANALYSIS.ipynb
Various_Plots_in_Matplotlib.ipynb
Visualization with Seaborn & Matplotlib.ipynb
buyer_s_time234.ipynb
random_forest.py

Note:

The project list above is updated automatically: whenever a new project is added to the repo, it is added to the list.

📖 Code Of Conduct:

You can find our Code of Conduct here.

📝 License

This project follows the MIT License.

Have a look

Give it a 🌟 if you ❤ this project.
Take a look at the Existing Issues.
Create your own Issue if you have a new idea not listed in the project.
Wait for the Issue to be assigned to you.
Fork the repository.

Clone the repository using:

git clone https://github.com/Niketkumardheeryan/Hands-on-ML-Basic-to-Advance-

⚙️ Contribution Guidelines

Have a look at the Contributing Guidelines.

Some awesome Contributors ✨

Niket kumar Dheeryan (Author) 💻
Abhishek Sharma 💻	Sakalya100 💻	Kaustav Roy 💻	Soumayan Pal 💻	Komal Gupta 💻	Manu Varghese 💻
Abhishek Panigrahi 💻	Padmini Rai 💻	psyduck1203 💻	Rutik Bhoyar 💻	Ayushi Shrivastava 💻	Anshul Srivastava 💻
RISHAV KUMAR 💻	Megha0606 💻	Jagannath8 💻	Harshita Nayak 💻	ayushgoyal9991 💻	SurajPawarstar 💻
Sumit11081996 💻	Tanvi Bugdani 💻	Suyash Singh 💻	Abhinav Dubey 💻	Nisha Yadav 💻	Neeraj Ap 💻
Nishi 💻	shivani rana 💻
