Lab 6: The importance of plotting

Task 1) Analyze the dataset: ans.csv

Calculate the missing values (fill in the table).

Property                                                  Value   Accuracy
Mean of x                                                 ?       ?
Sample variance of x                                      ?       ?
Mean of y                                                 ?       ?
Sample variance of y                                      ?       ?
Correlation between x and y                               ?       ?
Linear regression line (y = a+bx)                         ?       ?
Coefficient of determination of the linear regression     ?       ?
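
The quantities in the table can be computed with NumPy. The sketch below is one possible approach, not a reference solution: the helper name summarize is hypothetical, the column names "x" and "y" in ans.csv are an assumption, and the numbers passed in at the end are toy values used only to show the call.

```python
import numpy as np


def summarize(x, y):
    """Return the Task 1 summary statistics for paired samples x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]        # Pearson correlation coefficient
    b, a = np.polyfit(x, y, deg=1)     # least-squares line y = a + b*x
    return {
        "mean_x": x.mean(),
        "var_x": x.var(ddof=1),        # ddof=1 gives the sample variance
        "mean_y": y.mean(),
        "var_y": y.var(ddof=1),
        "corr": r,
        "intercept_a": a,
        "slope_b": b,
        "r_squared": r ** 2,           # for simple linear regression, R^2 = r^2
    }


# Toy values only. For the lab, pass the columns loaded from ans.csv instead,
# e.g. df = pandas.read_csv("ans.csv") (column names "x" and "y" are assumed).
stats = summarize([1, 2, 3, 4], [2, 4, 6, 8])
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Note that ddof=1 matters here: NumPy's default (ddof=0) returns the population variance, not the sample variance the table asks for.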

Scatter plot for Task 1

                            [scatter plot here]



Task 2) Analyze the dataset: ans2.tsv

Calculate the missing values (fill in the table).

Property      Value   Accuracy (up to 3 places)
Mean of x     ?       ?
Mean of y     ?       ?
SD of x       ?       ?
SD of y       ?       ?
Corr          ?       ?
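
If ans2.tsv holds several datasets in one file, the same statistics can be computed per dataset with a pandas groupby. This is a minimal sketch under stated assumptions: the grouping column name "dataset", the column names "x" and "y", and the helper name per_dataset_stats are all hypothetical, and the data frame below contains toy values only.

```python
import pandas as pd


def per_dataset_stats(df, group_col="dataset"):
    """Mean, SD and correlation of x and y per group, rounded to 3 places."""
    grouped = df.groupby(group_col)
    table = grouped.agg(
        mean_x=("x", "mean"),
        mean_y=("y", "mean"),
        sd_x=("x", "std"),   # pandas std uses ddof=1 by default (sample SD)
        sd_y=("y", "std"),
    )
    # Pearson correlation between x and y within each group
    table["corr"] = grouped[["x", "y"]].apply(lambda g: g["x"].corr(g["y"]))
    return table.round(3)


# Toy values only. For the lab, load the real file instead, e.g.
# df = pd.read_csv("ans2.tsv", sep="\t") (the column names are assumed).
df = pd.DataFrame({
    "dataset": ["a", "a", "a", "b", "b", "b"],
    "x": [1, 2, 3, 1, 2, 3],
    "y": [1, 2, 3, 3, 2, 1],
})
table = per_dataset_stats(df)
print(table)
```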

Scatter plot for Task 2 (multiple subplots in single panel)

                            [scatter plot here]
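
One way to put multiple subplots in a single panel is a grid created with plt.subplots, one axes per dataset. The sketch below assumes the same hypothetical layout as above (a grouping column named "dataset" plus "x" and "y"); the helper name plot_panels, the figure size and the marker size are illustrative choices, and the data are toy values.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd


def plot_panels(df, group_col="dataset", ncols=3):
    """Draw one scatter subplot per group on a shared-axis grid."""
    names = sorted(df[group_col].unique())
    nrows = math.ceil(len(names) / ncols)
    fig, axes = plt.subplots(
        nrows, ncols,
        figsize=(4 * ncols, 4 * nrows),
        sharex=True, sharey=True,   # same scales make the panels comparable
        squeeze=False,              # always return a 2-D array of axes
    )
    for ax, name in zip(axes.flat, names):
        sub = df[df[group_col] == name]
        ax.scatter(sub["x"], sub["y"], s=10)
        ax.set_title(str(name))
        ax.set_xlabel("x")
        ax.set_ylabel("y")
    # Hide grid cells that have no dataset to show
    for ax in axes.flat[len(names):]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig, axes


# Toy values only; replace with the frame loaded from ans2.tsv.
toy = pd.DataFrame({
    "dataset": ["a"] * 3 + ["b"] * 3 + ["c"] * 3,
    "x": [1, 2, 3] * 3,
    "y": [1, 2, 3, 3, 2, 1, 2, 2, 2],
})
fig, axes = plot_panels(toy, ncols=2)
```

Sharing the x and y axes across panels is a deliberate choice here: it lets the shapes be compared directly, which is the point of plotting datasets whose summary statistics look alike.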



Homework:

Make the report (both ipynb and rendered HTML) with the filled tables and scatter plots for:

  • point 1, one single plot similar to the one from Wikipedia
  • point 2, in this case make separate subplots for each dataset (the structure of the plot should look similar to the plots from Task5 in Lab5&6: multiple subplots in a single panel)

Can you guess what "d" and "s" stand for in the given datasets of ans2.tsv?

The report should contain:

  • the main report file in html (with all the plots embedded)
  • the jupyter notebook*

* This time, do not send .py scripts, as the Python code should be included in the Jupyter notebook and the HTML report.


The homework should be sent by 05.04.25 via email, with a 'DAV25_lab6_hw_Surname_Name.7z' (ASCII letters only) attachment.

Using non-English labels, legends, descriptions, etc. will be penalized by 10%.

Additionally, any problems with the structure of the plot (e.g. the plot size, label font size, etc.) will also affect the grading. You need to follow the advice given in the lectures.

Epilog: Read the article Same Stats, Different Graphs