Machine Learning (CSCI-567)
Fall 2007
General Information
Location:
Where: GFS 118
When: T-Th, 5:00-6:20pm
Instructor: Sofus A. Macskassy
Office: SAL 216
Office Hours: By appointment. Send me an email and I will be in the office before class.
Phone: 310-414-9849 x247
E-mail: csci567@usc.edu
NOTE: E-mail is the best way to reach me.
Teaching Assistant: Cheol Han
Office: TBA
Office Hours: TBA
E-mail: cheolhan at usc dot edu
News
- 11/15/07: Added link to adult-train-small.arff on schedule page.
- 11/15/07: Unsupervised learning slides (lectures 23+24) available on schedule page.
- 11/13/07: Updated grading policies on projects and syllabus pages.
- 11/9/07: Penalty methods and evaluation slides (lectures 21+22) available on schedule page.
- 11/8/07: Homework 4 is ready on schedule page.
- 11/6/07: Overfitting slides (lectures 19-20) available on schedule page.
- 10/29/07: Bias-Variance slides (lectures 17+18) and hw4 available on schedule page. (updated 10pm)
- 10/29/07: New projects page added. (updated at 10pm).
- 10/17/07: Learning theory slides (lectures 15+16) available on schedule page.
- 10/9/07: Lecture 14 slides available on schedule page.
- 10/8/07: Lecture 13 slides available on schedule page.
- 10/8/07: midterm study guide available.
- 10/3/07: Lecture 12 slides available on schedule page.
- 10/1/07: Lecture 11 slides available on schedule page.
- 9/27/07: Lecture 10 slides available on schedule page.
- 9/25/07: schedule page is updated.
- 9/25/07: Lecture 9 slides updated (typos fixed) on schedule page.
- 9/25/07: Lecture 9 slides and HW3 now on schedule page.
- 9/20/07: schedule page is updated.
- 9/20/07: Lecture 8 slides now on schedule page.
- 9/18/07: Homework 2 is ready on schedule page.
- 9/18/07: Lecture 7 slides now on schedule page.
- 9/14/07: schedule page is updated.
- 9/13/07: Lecture 6 slides now on schedule page.
- 9/12/07: Homework 1 Clarifications is updated regarding gradients.
- 9/11/07: Lecture 5 slides now on schedule page.
- 9/11/07: Homework 1 Clarifications.
- Homework 1 is ready on schedule page. I have extended due date to September 18.
- Lecture 4 slides now on schedule page.
- Lecture 3 slides now on schedule page.
- Schedule has been updated (and will be updated based on how things progress).
- Lecture slides should now be accessible.
- Lecture 2 slides now on schedule page.
- Lecture 1 slides now on schedule page.
- Lecture slides will be put on the schedule page following the lecture.
Course Description
This course will present an introduction to algorithms for machine
learning and data mining. These algorithms lie at the heart of many
leading-edge computer applications including optical character
recognition, speech recognition, text mining, document classification,
pattern recognition, computer intrusion detection, and information
extraction from web pages. Every machine learning algorithm has both a
computational aspect (how to compute the answer) and a statistical
aspect (how to ensure that future predictions are accurate). Algorithms
covered include linear classifiers (Gaussian maximum likelihood, Naive
Bayes, and logistic regression) and non-linear classifiers (neural
networks, decision trees, support-vector machines, nearest neighbor
methods). The class will also introduce techniques for learning from
sequential data and advanced ensemble methods such as bagging and
boosting.
Prerequisites: basic knowledge of probability, statistics, calculus,
linear algebra, data structures, and search algorithms (gradient
descent, depth-first search, greedy algorithms).
Some AI background is recommended, but not required.
Textbook:
- The main textbook that we will be using is Ethem Alpaydin, "Introduction to Machine Learning", MIT Press, 2004.
Errata: http://www.cmpe.boun.edu.tr/~ethem/i2ml/
Additional recommended readings are:
- Tom Mitchell, "Machine Learning", McGraw-Hill, 1997.
- Richard O. Duda, Peter E. Hart & David G. Stork, "Pattern Classification. Second Edition", Wiley & Sons, 2001. (Make sure your copy is not the first printing, or go to David Stork's web page and download the bug fixes.)
Errata files: http://rii.ricoh.com/~stork/DHS.html
- Trevor Hastie, Robert Tibshirani and Jerome Friedman, "The Elements of Statistical Learning", Springer, 2001.
- Lecture notes and other relevant materials are available on this web page.
Course Handouts
Software
In this class, we will be using the WEKA package from the University
of Waikato (Hamilton, New Zealand). This is a package of machine
learning algorithms and data sets that is very easy to use and easy to
extend.
Homework Assignments
- Homework assignments will be listed here and on the schedule.
Please turn in all homework in two forms: (i) as hardcopy at the start
of class and, if applicable, (ii) electronically to the TA and instructor.
Written Homework and Programs are due at the beginning of class.
Some guidelines:
- If you deliver after class but before 8am the next day, then you get 25% off
- If you deliver the next day after 8am, then you get 50% off
- If you do not deliver by the next day, then you get no points
- If you have a valid excuse for not turning in your homework, then you need to let me know ASAP with proper documentation. No exceptions.
- If you have problems with grading, see the TA. If you request a re-grading, then the whole assignment will be regraded (so you run the small risk of possibly losing points as well).
Each student is responsible for his/her own work. The standard
departmental rules for academic dishonesty apply to all assignments in
this course. Collaboration on homeworks and programs should be
limited only to answering questions that can be asked and answered
without using any written medium (e.g., no pencils, instant messages,
or email).