# CSE176 Introduction to Machine Learning (Fall semester 2016)

## Instructor

Miguel Á. Carreira-Perpiñán
Professor
Electrical Engineering and Computer Science
School of Engineering
University of California, Merced
mcarreira-perpinan-[at]-ucmerced.edu; 209-2284545
Office: 217, Science & Engineering Building 2

Office hours: Tuesdays/Wednesdays 2:45-3:45pm (SE2-217).

TA: Yerlan Idelbayev, yidelbayev-[at]-ucmerced.edu. TA hours: Mondays 3:30-4:30pm (AOA142).

Lectures: Mondays/Wednesdays 1:30-2:45pm (COB265).

Lab class: Fridays 1:30-4:20pm (Linux Lab, SE1-138).

Course web page: http://faculty.ucmerced.edu/mcarreira-perpinan/teaching/CSE176

## Course description

Survey of techniques for the development and analysis of software that learns from experience. Specific topics include: supervised learning (classification, regression); unsupervised learning (clustering, dimensionality reduction); reinforcement learning; computational learning theory. Specific techniques include: Bayesian methods, mixture models, decision trees, instance-based methods, neural networks, kernel machines, ensembles, and others.

Prerequisites: CSE31 Computer Organization and Assembly Language, MATH24 Linear Algebra and Differential Equations, MATH32 Probability and Statistics. Essentially, you need to know the fundamentals of linear algebra, multivariate calculus and probability, and have good programming skills. It is also recommended to have taken CSE100 Algorithm Design and Analysis. You also need to know (or be able to learn quickly) Matlab programming at a reasonably proficient level.

More specifically, these are the most important concepts you need to know:

• Calculus: partial derivatives of functions of several variables, gradients.
• Linear algebra: subspaces, linear independence, rank and dimension; orthogonal projections; eigenvalues and eigenvectors of real symmetric matrices and their properties; positive definite matrices and their properties.
• Basic probability: marginal and conditional distributions; Bayes theorem.
• Multivariate statistics: moments, distributions (Gaussian, Bernoulli, multinomial).

I emphasize that a solid knowledge of these concepts is strictly necessary to do well in this course. Below I give some resources you can use to help you refresh this knowledge if necessary; make sure to do so from day one.

## Textbook

Required textbook (get the errata and additional errata):

The companion site for the book has additional materials (lecture slides, errata, etc.).

Other books recommended as additional reading at undergraduate level:

If you want to refresh your knowledge of linear algebra, multivariate calculus or probability, the following are helpful (any edition or similar book is fine):

• Gilbert Strang: Introduction to Linear Algebra. Wellesley-Cambridge Press.
• Murray Spiegel: Schaum's Outline of Vector Analysis. McGraw-Hill.
• Murray Spiegel: Schaum's Outline of Theory and Problems of Probability and Statistics. McGraw-Hill.

The textbook has an appendix on probability. Some of the other books have similar appendices on linear algebra, multivariate calculus or probability. Also, Stanford's machine learning class provides nice reviews of linear algebra and probability theory.

## Syllabus and required textbook reading

Before each class, you should have read the corresponding part of the textbook and the notes. I will teach the material in the order below (which is more or less the order in the book).

Textbook reading (table of contents):

• Ch. 1 Introduction: all.
Exercises: 1-3, 5, 7.
• Ch. 2 Supervised learning: all except sections 2.2-2.3.
Exercises: 1-7, 10-11.
• Ch. 3 Bayesian decision theory: all.
Exercises: 1-4, 7-9, 11.
• Ch. 4 Parametric methods: all.
Exercises: 1-10.
• Ch. 5 Multivariate methods: all.
Exercises: 1-9.
• Ch. 6 Dimensionality reduction: sections 6.1-6.3, 6.7-6.8.
Exercises: 1-3, 5, 8.
• Ch. 7 Clustering: all except sections 7.5-7.7 and the derivation between eqs. (7.7) and (7.11).
Exercises: 2, 6-11.
• Ch. 8 Nonparametric methods: all.
Exercises: 2-4, 8, 10-11.
• Ch. 9 Decision trees: sections 9.1-9.4.
Exercises: 1-2.
• Ch. 10 Linear discrimination: all except section 10.9.
Exercises: 2-5.
• Ch. 11 Multilayer perceptrons: all except sections 11.10, 11.12.
Exercises: 1-5, 7, 12.
• Ch. 12 Local models: sections 12.3, 12.5.
Exercises: 1-4.
• Ch. 13 Kernel machines: sections 13.1-13.3, 13.5-13.6, 13.9-13.10.
Exercises: 6.
• Ch. 14 Graphical models: sections 14.1-14.3, 14.6-14.7.
Exercises: 1-5.
• Ch. 15 Hidden Markov models: sections 15.1-15.4.
Exercises: 1-3.
• Ch. 16 Bayesian estimation: nothing.
• Ch. 17 Combining multiple learners: all except sections 17.8, 17.10.2 and 17.11.
Exercises: 1.
• Ch. 18 Reinforcement learning: sections 18.1, 18.3-18.4.
Exercises: 1, 3, 5.
• Ch. 19 Design and analysis of machine learning experiments: sections 19.1, 19.7.
Exercises: 1.

## Handouts

The notes to accompany the textbook (bring the corresponding part to each class):
Carreira-Perpiñán, M. Á. (2016): CSE176 Introduction to Machine Learning: Lecture notes. University of California, Merced, 2015-2016. See also last year's notes.

The lecture notes are intended to summarize and sometimes expand or explain differently the main concepts, but they are not a substitute for the book.

## Labs

Each lab consists of an assignment to be done in groups of 3 students. Your solution to the assignment is due by Thursday 11:59pm of the week following the lab, by email to the TA as a single compressed file (e.g. lab10.tar.gz). Late work will receive a grade of zero. During the lab session, the TA will explain the assignment and demonstrate how the solution should work, so it is important that you attend the lab session. You can work on the assignment during the rest of the lab session and throughout the week. If you have questions, ask the TA during the lab or TA office hours, or by email.

The following directories contain supporting material for the labs:

• assignments, including representative figures that show the output of an algorithm
• datasets, including code to create and visualize them
• code: various machine learning algorithms that you can use

Some practical advice for the labs, including Matlab tips.

## Homeworks

Homework (to do on your own, graded):

• Homework #1 (covering chapters 1-5 from the book): due Oct. 7 during the lab session.
• Homework #2 (covering chapters 6-9 from the book): due Oct. 14 during the lab session.
• Homework #3 (covering chapters 10-18 from the book): due Dec. 7 at the beginning of the lecture.

Late homeworks will receive a grade of zero. You can submit your homework solutions on paper, (legibly) handwritten; there is no need to type or scan them. We'll return them to you with the grades and any corrections.

## Project

Optional project (to do on your own, for extra credit), in groups of 4 students, due Dec. 14 11:59pm PST by email to the TA.

## Course grading

• Midterm exam (20%): in-class, closed-book, consisting of problems and conceptual questions. It will cover ch. 1-9 inclusive.
• Final exam (20%): same format as the midterm. It will cover the entire course, focusing mostly on the material after the midterm.
• Lab assignments (40%): these consist of programming selected algorithms in Matlab, exploring their behavior with different datasets and writing a report analyzing the results. You will have one week to submit each lab.
• Homeworks (20%): exercises and problems similar to those in the textbook to be submitted approximately biweekly.
• Optional project for up to 20% extra credit.

Note: to pass the course, your grades in both the labs and the exams cannot be too low (that is, they cannot be compensated by higher grades in, say, the homeworks or the project).

While I encourage you to discuss your work with other students, the homeworks, lab assignments, project and exams must be the result of your own work without collaboration. See the Academic Dishonesty Statement and the UC Merced Academic Honesty Policy.

Grade curves (exams): midterm, final.

## Matlab tutorials

Matlab tutorials: if you have never used Matlab, there are many online tutorials, for example:

Also see Matlab courses from Engineering Service Learning at UC Merced.

Miguel A. Carreira-Perpinan
Last modified: Wed Nov 30 19:44:58 PST 2016
