ud120-projects-py3-jupyter
Udacity ud120 Mini-Projects: Jupyter Notebooks, Python 3.8, Conda
Udacity ud120 Projects v2: python3 + jupyter + conda env
The aim of this fork is to improve the original starter project code for students taking Intro to Machine Learning on Udacity, using Python 3.8, conda for environment management, and Jupyter notebooks.
Mini-Projects
- Lesson 2: Naive Bayes
- Lesson 3: SVM
- Lesson 4: Decision Trees
- Lesson 5: Choose Your Own Algorithm
- Lesson 6: Datasets and Questions
- Lesson 7: Regressions
- Lesson 8: Outliers
- Lesson 9: Clustering
- Lesson 10: Feature Scaling
- Lesson 11: Text Learning
- Lesson 12: Feature Selection
- Lesson 13: PCA
- Lesson 14: Validation
- Lesson 15: Evaluation Metrics
- Lesson 17: Final Project
Important Notes
Lesson 3: SVM
This repo uses a newer version of scikit-learn. To get the results expected by the course grader,
you need to use SVC with gamma='auto', because the default value of gamma has changed. Note that gamma is only used by the 'rbf', 'poly' and 'sigmoid' kernels. From the sklearn.svm.SVC docs:
Changed in version 0.22: The default value of gamma changed from 'auto' to 'scale'.
For example:
clf = SVC(kernel='linear', gamma='auto')
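To see why the changed default matters, the two settings can be compared by hand. The sketch below reimplements, in plain Python, the formulas given in the sklearn.svm.SVC docs (gamma='auto' uses 1 / n_features, gamma='scale' uses 1 / (n_features * X.var())). The helper names and the toy data are illustrative assumptions, not part of the course code.

```python
# Illustrative helpers (hypothetical names) that mirror the formulas
# documented for sklearn.svm.SVC; they do not call scikit-learn.

def gamma_auto(X):
    """gamma='auto': 1 / n_features."""
    n_features = len(X[0])
    return 1.0 / n_features


def gamma_scale(X):
    """gamma='scale': 1 / (n_features * X.var()), variance over all entries."""
    n_features = len(X[0])
    values = [v for row in X for v in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return 1.0 / (n_features * variance)


# Toy data: 2 samples, 2 features (made up for this example).
X = [[0.0, 4.0], [4.0, 0.0]]
auto_value = gamma_auto(X)
scale_value = gamma_scale(X)
print(auto_value, scale_value)
```

Unless the features happen to have unit variance, the two values differ, which is why an SVC with a non-linear kernel trained under the new default can score differently from what the grader expects.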
Lesson 7: Regressions
To get correct results (those accepted by the grader), pass sort_keys='../utils/python2_lesson06_keys.pkl' to the
feature_format function:
...
data = feature_format(dictionary, features_list, remove_any_zeroes=True, sort_keys='../utils/python2_lesson06_keys.pkl')
...
[...] This will open up a file in the tools folder with the Python 2 key order.
See this for a detailed explanation.
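A key file helps because the order in which the data dictionary is traversed decides the row order of the resulting array, and that order can differ between Python 2 and Python 3. The following is a minimal sketch of the idea only, not the repo's feature_format implementation: the function name, the toy dictionary, and the temporary pickle file are all assumptions made for illustration.

```python
import os
import pickle
import tempfile


def rows_in_saved_order(dictionary, feature, keys_path):
    """Return dictionary[key][feature] for each key, in the pickled key order."""
    with open(keys_path, "rb") as f:
        keys = pickle.load(f)  # assumed to be a plain list of dictionary keys
    return [dictionary[key][feature] for key in keys]


# Toy data standing in for the course dataset (values are made up).
data = {
    "ALICE": {"salary": 10},
    "BOB": {"salary": 20},
    "CAROL": {"salary": 30},
}

# Pretend this is the key order an older interpreter produced.
saved_order = ["CAROL", "ALICE", "BOB"]
with tempfile.TemporaryDirectory() as tmp:
    keys_path = os.path.join(tmp, "keys.pkl")
    with open(keys_path, "wb") as f:
        pickle.dump(saved_order, f)
    ordered = rows_in_saved_order(data, "salary", keys_path)

# Row order when simply iterating the dictionary (insertion order in Python 3.7+).
default = [person["salary"] for person in data.values()]
print(ordered)
print(default)
```

The same values come out in a different order, so anything that depends on row position, such as a train/test split, can produce different numbers.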
Initial Setup
1. Clone the repo
$ git clone https://github.com/trsvchn/ud120-projects-py3-jupyter.git
$ cd ud120-projects-py3-jupyter
2. Set up conda environment
2.1. Download and install Anaconda
2.2. Create environment
$ conda env create -f environment.yml
2.3. Activate environment via
$ conda activate ud120
3. Run the starter script to check the environment and download the required data
$ python ./utils/starter.py
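If the starter script fails, it can help to confirm that the activated environment is the one you expect. The snippet below is a hypothetical check, not the contents of utils/starter.py, and the package names in it are assumptions about what environment.yml installs.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed top-level import names; adjust to match environment.yml.
expected = ["numpy", "sklearn", "notebook"]

print("Python", ".".join(str(part) for part in sys.version_info[:3]))
print("Missing packages:", missing_packages(expected))
```

An empty list suggests the environment was created and activated correctly; any names listed point to packages that still need installing.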