John M. Noble
Institute of Applied Mathematics
University of Warsaw
October 2022 - January 2023
Type of course:
Place and Time
There will be 14 lectures and 14 tutorials. These take place on Mondays, with one additional Wednesday (2nd November). Lectures run 08.30 - 10.00 (room 5060) and tutorials 10.15 - 11.45 (room 2044, the computer lab). The dates are:
October: 10th, 17th, 24th
November: 2nd, 7th, 14th, 21st, 28th
December: 5th, 12th, 19th
January: 9th, 16th, 23rd
The course ‘Multivariate Statistics’ is a Master's level course covering statistical theory, with applications in R.
The topics covered are:
- The data matrix, geometrical representations and distances.
- Principal Component Analysis.
- Canonical Correlation Analysis.
- Non-parametric Density Estimation: histograms, kernel density estimation methods, optimal bin width, projection pursuit methods for multivariate densities.
- Discriminant Function Analysis.
- Clustering techniques, including logistic regression, self-organising maps (SOM) and the EM algorithm as a tool for clustering and semi-supervised learning.
- Asymptotic log likelihood ratio tests; Wald, Rao, Pearson; logistic regression.
- Generalised Linear Models.
- Model selection criteria (for example: AIC, BIC).
- Shrinkage methods for linear regression.
- The multivariate Gaussian distribution, parameter estimation, the Wishart distribution.
- Statistical tests for multivariate Gaussian data.
Introduction to R
You should learn some R programming throughout the course. A reasonable introduction may be found here.
Assessment is based on two data analysis assignments. Tutorial participation will also be taken into account.
Lecture and Tutorial Notes
Click here for the data directory.
(Last updated: 7th December 2022 by John M. Noble)