John M. Noble
Institute of Applied Mathematics
University of Warsaw
October 2021 - January 2022
Type of course:
The course ‘Multivariate Statistics’ is a Master's-level course that builds on the foundations laid in the course Statistics
and draws heavily on its concepts.
The topics covered are:
- Asymptotic log likelihood ratio tests; Wald, Rao, Pearson; logistic regression.
- Generalised Linear Models.
- Model selection criteria (for example: AIC, BIC).
- Shrinkage methods for linear regression.
- The multivariate Gaussian distribution, parameter estimation, the Wishart distribution.
- Statistical tests for multivariate Gaussian data.
- The data matrix, geometrical representations and distances.
- Principal Component Analysis.
- Canonical Correlation Analysis.
- Non-parametric Density Estimation: histograms, kernel density estimation methods, optimal bin width, projection pursuit methods for multivariate densities.
- Discriminant Function Analysis.
- Clustering techniques, including logistic regression, self-organising maps (SOM) and the EM algorithm as a tool for clustering and semi-supervised learning.
Introduction to R
You should learn some R programming throughout the course. A reasonable introduction may be found here.
Assessment is based on:
- A written examination (60 %)
- Two data analysis assignments (40 %)
Tutorial participation will also be taken into account.
To pass the course, it is necessary to pass the written examination on multivariate statistical theory and also to submit satisfactory computer assignments.
Lecture Notes, Tutorial Exercises and Solutions, Examination
Click here for a PDF of the lecture notes, the tutorial exercises and solutions, and the examinations (theoretical and practical).
Click here for the data directory.
(Last updated: 10th February 2022 by John M. Noble)