datamining290
datamining290 copied to clipboard
Data Mining and Analytics in Intelligent Business Services, UC Berkeley School of Information
-
Data Mining 290 :slide:
- Description :: Learn how to obtain, clean, visualize, understand, model, and predict the world around you using data. Grading will consist of homework (30%), midterm (30%), project (40%).
- Instructor :: Jim Blomo jblomo@ischool
- GSI :: Shreyas shreyas@ischool
- Textbook :: Han, J., Kamber, M., & Pei, J. (2011). Data Mining: Concepts and Techniques, Third Edition (3rd ed.). Morgan Kaufmann.
-
Syllabus :slide: DM[0-9]+ indicates chapters from the text, Data Mining.
| Date | Readings | Slides | Homework / Project | |------+----------+--------+--------------------| | Jan 25 | [[http://try.github.com][Try Github]] ; [[http://www.dataists.com/2010/09/a-taxonomy-of-data-science/][A Taxonomy of Data Science]] | [[file:slides/2013-01-25-Intro.html][Class Intro]] ; Tools Intro by /GUEST: Shreyas/ | [[ https://github.com/seekshreyas/Introduction-to-Git-Github][Git Intro]] | | Feb 1 | DM1 ; [[http://hbswk.hbs.edu/item/6836.html][The Yelp Factor: Are Consumer Reviews Good for Business?]] | [[file:slides/2013-02-01-CaseStudies.html][Case Studies]] ; [[file:slides/2013-02-01-Obtaining-Data.html][Obtaining Data]] | [[file:slides/2013-02-01-Lab.html][Obtain & Explore Data]] | | Feb 8 | DM2, DM3 | [[file:slides/2013-02-08-Probability.html][Probability]] ; [[file:slides/2013-02-08-Preprocessing.html][Preprocessing]] | [[file:slides/2013-02-08-Lab.html][Data Stats]] | | Feb 15 | DM4, [[http://www.youtube.com/watch?v=SS27F-hYWfU][Apache Hadoop: Petabytes and Terawatts]] ([[http://prezi.com/u0ukvqzpyh5p/apache-hadoop-petabytes-and-terawatts/][slides]]); [[http://packages.python.org/mrjob/][mrjob docs]] (for homework) | [[file:slides/2013-02-15-Data-Warehouse.html][Data Warehouse]] ; [[file:slides/2013-02-15-MapReduce.html][MapReduce]] | [[file:slides/2013-02-15-Project.html][Project Details]] ; [[file:slides/2013-02-15-mrjob.html][mrjob]] | | Feb 22 | DM8 | [[file:slides/2013-02-22-Decision-Trees.html][Decision Trees]]; [[file:slides/2013-02-22-Bayes.html][Naive Bayes]] | [[file:slides/2013-02-22-Gini.html][Gini Index]] | | Mar 1 | DM[9.1-9.3], 9.5 ; [[http://scott.fortmann-roe.com/docs/BiasVariance.html][Understanding the Bias-Variance Tradeoff]] | [[file:slides/2013-03-01-SVM.html][SVM]] ; [[file:slides/2013-03-01-Neural-Network.html][Neural Networks]] | [[file:slides/2013-03-01-Lab-NN.html][Neural Network Back Propagation]] | | Mar 8 | DM10 | [[file:slides/2013-03-07-Clustering.html][Agglomerative - Clustering]] ; [[file:slides/2013-03-07-Hierarchical.html][Hierarchical, Density - Clustering]] | [[file:slides/2013-03-07-k-means.html][K-Means]] | | Mar 15 | DM11.1 | [[file:slides/2013-03-15-Review.html][Review]] | prepare 1 cheat sheet | | Mar 22 | 1 cheat sheet | Midterm | - | | Mar 29 | HOLIDAY | Apr 5 | DM6 | [[file:slides/2013-03-15-Advanced-Cluster.html][Advanced Clustering]] ; [[file:slides/2013-04-05-Frequent-Pattern.html][Frequent Pattern]] | [[file:slides/2013-04-05-AWS.html][AWS]] ; Project Proposal Due | | Apr 12 | DM11.3; [[http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf][PageRank]]; [[http://arxiv.org/pdf/1106.5321][Uncovering Social Network Sybils in the Wild]] | [[file:slides/2013-04-12-Graphs.html][Graphs]]; [[file:slides/2013-04-12-PageRank.html][PageRank]] | [[file:slides/2013-04-12-AdjacencyRepresentations.html][Adjacency Representations]] | | Apr 19 | [[file:slides/2013-04-19-Nonlinear.pdf][Non-linear regression]] | GUEST: Gene Lee Ceaser's [[file:slides/RM Pricing Strategy.ppt][Pricing Strategy]]; [[file:slides/Campus Recruiting Deck_2012_UC Berkeley.ppt][Ceaser's Recruiting]]| [[file:slides/2013-04-19-Elasticity.html][Price Elasticity]] | | Apr 26 | DM12; [[http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf][Shazam Audio Search]] | [[file:slides/2013-04-26-Outliers.html][Outliers]]; [[file:slides/2013-04-26-Multimedia.html][Images & Audio]] | [[file:slides/2013-04-26-Midterm-HW.html][Midterm Review]] | | May 3 | [[https://groups.google.com/group/gsofgs/attach/2f1cdd7a999c3ad8/embedded-plots.pdf?part=2&authuser=0][Embedded Plots]] ; [[http://vis.stanford.edu/files/2011-D3-InfoVis.pdf][Data-Driven Documents]]| [[file:slides/2013-05-03-Visualization.html][Visualization]] ; [[file:slides/2013-05-03-Yelp-Visualization.html][Yelp's Visualizations]] | [[http://vogievetsky.github.io/IntroD3/][D3 Intro]] [[file:slides/2013-05-03-D3.html][D3 Lab]] | | May 10 | [[http://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf][A Few Useful Things to Know about Machine Learning]] ; [[http://www.cs.uvm.edu/~icdm/algorithms/10Algorithms-08.pdf][Top 10 Algorithms in Data Mining]] | [[file:slides/2013-05-10-Real-World.html][In Real Life]] ; Presentations | May 16th: Project Papers Due | | May 17 | - | Final Presentation | Bye! |
#+STYLE: #+STYLE: #+STYLE: #+STYLE:
#+BEGIN_HTML