Miguel Á. Carreira-Perpiñán
Professor
Electrical Engineering and Computer Science
School of Engineering
University of California, Merced
mcarreira-perpinan-[at]-ucmerced.edu; 209-2284545
Office: 217, Science & Engineering Building 2
Office hours: Wednesdays 3:15-4:15pm (SE2-217).
TA: Yerlan Idelbayev, yidelbayev-[at]-ucmerced.edu. TA hours: Fridays 1:30-3:20pm, COB2-265.
Lectures: Tuesdays/Thursdays 4:30-5:45pm (COB113).
Lab class: Wednesdays 10:30am-1:20pm (KOL202).
Course web page: http://faculty.ucmerced.edu/mcarreira-perpinan/teaching/CSE176
Survey of techniques for the development and analysis of software that learns from experience. Specific topics include: supervised learning (classification, regression); unsupervised learning (clustering, dimensionality reduction); reinforcement learning; computational learning theory. Specific techniques include: Bayesian methods, mixture models, decision trees, instance-based methods, neural networks, kernel machines, ensembles, and others.
Prerequisites:
Essentially, you need to know the fundamentals of linear algebra, multivariate calculus and probability, and have good programming skills. More specifically, these are the most important concepts you need to know:
I emphasize that a solid knowledge of these concepts is strictly necessary to do well in this course. Below I give some resources you can use to help you refresh this knowledge if necessary; make sure to do so from day one. You also need to know (or be able to learn quickly) Matlab programming at a reasonably proficient level.
We will use the following lecture notes:
Carreira-Perpiñán, M. Á. (2016): CSE176 Introduction to Machine Learning: Lecture notes. University of California, Merced, 2015-2016.
There is no required textbook (we will use the lecture notes above), but these are some books recommended as additional reading at the undergraduate level:
If you want to refresh your knowledge of linear algebra, multivariate calculus or probability, the following are helpful (any edition or similar book is fine):
Some of the machine learning books above have appendices on linear algebra, multivariate calculus or probability. Also, the following are useful:
You can also ask math questions at The Math Center.
Before each class, you should have read the corresponding part of the notes. I will teach the material roughly in the order below:
The labs are intended 1) to do some pen-and-paper exercises and 2) to explore machine learning in Matlab. Each lab explores a different machine learning problem or algorithm. We will provide you with sample code implementing an algorithm. The TA will demonstrate it with an example and visualize its results; then you will explore it on other examples, and modify the code to do other things. There is nothing to submit, but attendance is mandatory and your participation will count toward the grade. The code we provide is very instructive and you will get considerable insights into machine learning if you spend time exploring it. You are encouraged to continue to explore the algorithms on your own once the lab is over. Ask the TA any questions during the lab or TA office hours, or by email.
The labs will use Matlab, so start learning it right away if you have never used it. The first lab sessions are an introduction to machine learning and Matlab.
The following directories contain supporting material for the labs:
Homework (graded, to do individually or in groups of up to 2 students):
You must submit your solutions both as a hardcopy (legibly handwritten or printout) and as a PDF file (scanned or created by a word processor, no larger than 1 MB). We'll give the hardcopy back to you with the grades and any corrections. In the week of the deadline (or before), submit it during your lab section. Each homework counts equally for grading purposes. Late homeworks will receive a grade of zero.
Project, in groups of up to 3 students: it will consist of running some machine learning algorithm on real-world datasets, evaluating its performance, and presenting the results in a written report and an oral presentation at the end of the course. The project is due Dec. 3 (by email to the TA), and the oral presentations will take place on Dec. 10 during the lecture.
This year's project: decision forests.
Note: to pass the course, your grades cannot be too low in any of these parts (that is, a very low final exam grade cannot be compensated by higher grades in, say, the homeworks or the project).
While I encourage you to discuss your work with other students, the exams must be the result of your own work, and the homeworks and project must be the result of your group's work, without collaboration. See the Academic Dishonesty Statement and the UC Merced Academic Honesty Policy.
Grade curves (exams): midterm, final.
While I encourage you to discuss your work with other students, individual assignments (the exams) must be the result of your own work, and group assignments (homeworks and project) must be the result of your group's work, without collaboration. Cheating causes two problems: you learn less well, and it is unfair to students who put honest effort into their work. See the Academic Dishonesty Statement in the syllabus, the CSE Department Policy on Academic Honesty and the UC Merced Academic Honesty Policy. Importantly: should copying occur, the student who copied work from another student and the student who gave material to be copied will both automatically receive a zero for the assignment. A repeated violation will directly lead to an F. A single violation will directly lead to an F if the School of Engineering determines that the student has had a prior violation in any other course.
For both homeworks and project: you must disclose whatever sources you used to complete your work or to help others complete theirs. For homeworks, write it at the beginning of your submission. For the project, write it at the beginning of your report. Examples of possible sources:
If you have never used Matlab, there are many online tutorials, for example:
As well as books:
Engineering Service Learning at UC Merced offers courses on Matlab, Python, R and other topics.