John M. Noble

Mathematical Statistics

Institute of Applied Mathematics

University of Warsaw

October 2022 - January 2023

## Multivariate Statistics

## Course Information

**Language:** English

**Type of course:** elective

## Place and Time

There will be 14 lectures and 14 tutorials. These take place on Mondays, except for one session held on a Wednesday (2nd November). Lectures run 08.30 - 10.00 (room 5060) and tutorials 10.15 - 11.45 (room 2044, the computer lab). The dates are:

**October 2022** 10th, 17th, 24th

**November 2022** 2nd, 7th, 14th, 21st, 28th

**December 2022** 5th, 12th, 19th

**January 2023** 9th, 16th, 23rd

## Description

‘Multivariate Statistics’ is a Master's-level course that presents some statistical theory together with applications in R.

The topics covered are listed further down this page.

## Introduction to R

You should learn some R programming throughout the course. A reasonable introduction may be found here.

## Assessment

Assessment is based on two data analysis assignments. Tutorial participation will also be taken into account.

## Lecture and Tutorial Notes

## Data Files

Click here for the data directory.

*(Last updated: 7th December 2022 by John M. Noble)*

The topics covered are:

- The data matrix, geometrical representations and distances.
- Principal Component Analysis.
- Canonical Correlation Analysis.
- Non-parametric Density Estimation: histograms, kernel density estimation methods, optimal bin width, projection pursuit methods for multivariate densities.
- Discriminant Function Analysis.
- Clustering techniques, including logistic regression, self-organising maps (SOM) and the EM algorithm as a tool for clustering and semi-supervised learning.
- Asymptotic log-likelihood ratio tests; Wald, Rao, Pearson; logistic regression.
- Generalised Linear Models.
- Model selection criteria (for example: AIC, BIC).
- Shrinkage methods for linear regression.
- The multivariate Gaussian distribution, parameter estimation, the Wishart distribution.
- Statistical tests for multivariate Gaussian data.

The lecture and tutorial notes are:

- 2022-10-10 08.30 - 10.00 Lecture 1: Geometrical Representation of Data
- 2022-10-10 10.15 - 11.45 Tutorial 1
- 2022-10-10 10.15 - 11.45 R file for Tutorial 1
- 2022-10-17 08.30 - 10.00 Lecture 2: Clustering
- 2022-10-17 10.15 - 11.45 Tutorial 2
- 2022-10-17 10.15 - 11.45 R file for Tutorial 2
- 2022-10-24 08.30 - 10.00 Lecture 3: Kernel Density Estimation
- 2022-10-24 10.15 - 11.45 Tutorial 3
- 2022-10-24 10.15 - 11.45 R file for Tutorial 3
- 2022-11-02 08.30 - 10.00 Lecture 4: Generalised Linear Models I
- 2022-11-02 10.15 - 11.45 Tutorial 4
- 2022-11-02 10.15 - 11.45 R file for Tutorial 4
- 2022-11-07 08.30 - 10.00 Lecture 5: Generalised Linear Models II
- 2022-11-07 10.15 - 11.45 Tutorial 5
- 2022-11-07 10.15 - 11.45 R file for Tutorial 5
- 2022-11-14 08.30 - 10.00 Lecture 6: Model Selection Criteria
- 2022-11-14 10.15 - 11.45 Tutorial 6
- 2022-11-14 10.15 - 11.45 R file for Tutorial 6
- 2022-11-21 08.30 - 10.00 Lecture 7: Regression: Shrinkage Methods (I)
- 2022-11-21 10.15 - 11.45 Tutorial 7
- 2022-11-21 10.15 - 11.45 R file for Tutorial 7
- 2022-12-05 Assignment 1 Exercises (Due: 2022-12-05 at 8.00 am)
- 2022-11-28 08.30 - 10.00 Lecture 8: Regression: Shrinkage Methods (II)
- 2022-11-28 10.15 - 11.45 Tutorial 8
- 2022-11-28 10.15 - 11.45 R file for Tutorial 8
- 2022-12-05 08.30 - 10.00 Lecture 9: Canonical Correlation Analysis
- 2022-12-05 10.15 - 11.45 Tutorial 9
- 2022-12-05 10.15 - 11.45 R file for Tutorial 9
- 2022-12-12 08.30 - 10.00 Lecture 10: Wishart Distribution and Hotelling Test
- 2022-12-12 10.15 - 11.45 Tutorial 10
- 2022-12-12 10.15 - 11.45 R file for Tutorial 10
- 2022-12-19 08.30 - 10.00 Lecture 11: Discriminant Function Analysis
- 2022-12-19 10.15 - 11.45 Tutorial 11
- 2022-12-19 10.15 - 11.45 R file for Tutorial 11
- 2023-02-06 Examination Exercises (Due: 2023-02-06 at 13:00)
- 2023-01-09 08.30 - 10.00 Lecture 12: Classification and Regression Trees
- 2023-01-09 10.15 - 11.45 Tutorial 12
- 2023-01-09 10.15 - 11.45 R file for Tutorial 12
- 2023-01-16 08.30 - 10.00 Lecture 13: Support Vector Machines
- 2023-01-16 10.15 - 11.45 Tutorial 13
- 2023-01-16 10.15 - 11.45 R file for Tutorial 13
- 2023-01-23 08.30 - 10.00 Lecture 14: Bagging, Boosting and Random Forests
- 2023-01-23 10.15 - 11.45 Tutorial 14
- 2023-01-23 10.15 - 11.45 R file for Tutorial 14