# DATA 1010: Introduction to Probability, Statistics, and Machine Learning

## Resources

- Syllabus
- Learning resources
- Course text volume I
- Course text volume II [code]
- Cheatsheet of course concepts
- A Julia-Python-R cheatsheet
- Learning standards
- Calendar
- Additional reading list

## Class

- September 5 (linear algebra and programming) [solutions video]
- September 7 (more linear algebra and programming) [solutions video]
- September 10 (linear algebra and SVD) [solutions video]
- September 12 (determinants and matrix differentiation) [solutions video] [jupyter]
- September 14 (machine arithmetic) [solution]
- September 17 (numerical error) [solution]
- September 19 (pseudorandom number generators, automatic differentiation) [solution]
- September 21 (gradient descent, review) [solution]
- September 24 (probability spaces) [solution]
- September 26 (counting and random variables) [solution]
- September 28 (conditional probability and independence) [solution]
- October 1 (conditional probability) [solution]
- October 3 (review) [solution]
- October 5 (expectation) [solution]
- October 10 (linearity of expectation) [solution]
- October 12 (continuous distributions) [solution]
- October 15 (conditional expectation) [solution]
- October 17 (more continuous distributions, Bernoulli and binomial distributions) [solution]
- October 19 (geometric, Poisson, exponential distributions) [solution]
- October 22 (multivariate normal distribution) [solution]
- October 24 (law of large numbers and CLT) [solution]
- October 26 (CLT and multivariate CLT) [solution]
- October 29 (kernel density estimation) [solution]
- October 31 (kernel density estimation, review) [solution]
- November 2 (nonparametric regression) [solution]
- November 5 (intro to classification, QDA) [solution]
- November 7 (classification, LDA, Naive Bayes) [solution]
- November 9 (logistic regression) [solution]
- November 12 (support vector classification) [solution]
- November 14 (kernelization for SVM, neural nets)
- [November 17 to December 10] (neural nets, dimension reduction, likelihood ratio classification, intro to R, `ggplot2`, `dplyr`, point estimation, confidence intervals, empirical CDF convergence, maximum likelihood estimation, hypothesis testing)
- December 12 (geographic maps in `ggplot2`, and logistic regression using `caret`)

## Homework

- PSet 1 - September 14 (linear algebra, SVD) [solution]
- PSet 2 - September 21 (matrix differentiation, machine arithmetic, PRNGs) [solution]
- PSet 3 - September 28 (automatic differentiation, gradient descent, probability, review problems) [solution]
- PSet 4 - October 5 (review problems, probability) [solution]
- PSet 5 - October 12 (expectation) [solution]
- PSet 6 - October 19 (continuous distributions, conditional expectation) [solution]
- PSet 7 - October 26 [files] (common distributions, central limit theorem) [solution]
- PSet 8 - November 2 (probability review) [solution]
- PSet 9 - November 9 (kernel density estimation, nonparametric regression, classification) [solution]
- PSet 10 - November 16 (logistic regression, support vector machines, neural nets) [solution]
- PSet 11 - November 30 (dimension reduction, likelihood ratio classification, data visualization and manipulation) [solution]
- PSet 12 - December 17 (point estimation, confidence intervals, bootstrap, and maximum likelihood estimation) [solution]

## Videos

- Linear algebra overview
- Eigenvectors and SVD
- Determinants and matrix differentiation
- Machine arithmetic
- Numerical error
- PRNGs, autodiff, and gradient descent
- Probability models
- Conditional probability

## Exams

- Practice Midterm I [solution]
- Practice Midterm II [solution]
- Practice Midterm III [solution]
- Practice Final [solution]
- Final Exam [solution]

## Animations

- Neural net forward propagation
- Neural net convergence
- Singular value decomposition
- Hard-margin SVM
- Logistic regression convergence