Lab11

Graphical Models - part 2

Material based on:

Book: Graphical Models with R. Højsgaard, Søren, Edwards, David, Lauritzen, Steffen
Lecture Graphical Models - Graphs and Markov properties with R

The R–packages gRbase, gRain and gRim.

#gRbase dependencies
source("http://bioconductor.org/biocLite.R")
biocLite(c("graph", "RBGL", "Rgraphviz"))

install.packages('gRbase')
install.packages('gRain')
install.packages('gRim')

library(graph); library(RBGL); library(Rgraphviz)
library(gRbase); library(gRain); library(gRim)

In previous part we learned basic concepts behind graphical models (including DAGs), their construction and manipulation. In this part, we will learn how to use it in practice.

Bayesian Network (BN) basics

A Bayesian network is a often understood to be graphical model based on a directed acyclic graph (a DAG).
A BN typically will typically satisfy conditional independence restrictions which enables computations of updated probabilities for states of unobserved variables to be made very efficiently.
The DAG only is used to give a simple and transparent way of specifying a probability model.
The computations are based on exploiting conditional independencies in an undi- rected graph.
Therefore, methods for building undirected graphical models can just as easily be used for building BNs.

Exercise 1

Consider the following narrative: Having flu (F) may cause elevated temperature (T). Elevated tempearture may cause a headache (H).

Your task: Illustrate this narrative as a DAG:

The output should look like:

Specification of conditional probability tables

We specify p(F), p(T|F) and p(H|T) as tables (1 =yes, 2 =no):

(p.F <- parray("F", levels=2, values=c(.01,.99)))

## F
##   F1   F2 
## 0.01 0.99 
## attr(,"class")
## [1] "parray" "array"

(p.TgF <- parray(c("T","F"), levels=c(2,2), values=c(.95,.05, .001,.999)))

##     F
## T      F1    F2
##   T1 0.95 0.001
##   T2 0.05 0.999
## attr(,"class")
## [1] "parray" "array"

(p.HgT <- parray(c("H","T"), levels=c(2,2), values=c(.80,.20, .010,.990)))

##     T
## H     T1   T2
##   H1 0.8 0.01
##   H2 0.2 0.99
## attr(,"class")
## [1] "parray" "array"

Brute force computations

1) Calculate joint distribution p(FTH)

p.FT  <- tableMult(p.F, p.TgF)
p.FTH <- tableMult(p.FT, p.HgT)
as.data.frame.table(p.FTH)

##    H  T  F      Freq
## 1 H1 T1 F1 0.0076000
## 2 H2 T1 F1 0.0019000
## 3 H1 T2 F1 0.0000050
## 4 H2 T2 F1 0.0004950
## 5 H1 T1 F2 0.0007920
## 6 H2 T1 F2 0.0001980
## 7 H1 T2 F2 0.0098901
## 8 H2 T2 F2 0.9791199

2) Calculate the marginal distribution p(FH)

p.FH <- tableMargin(p.FTH, margin=c('F','H'))
as.data.frame.table(p.FH)

##    F  H      Freq
## 1 F1 H1 0.0076050
## 2 F2 H1 0.0106821
## 3 F1 H2 0.0023950
## 4 F2 H2 0.9793179

3) calculate conditional distribution p(F| H)

p.H <- tableMargin(p.FH, margin='H')
(p.FgH <- tableDiv(p.FH, p.H))

##     F
## H             F1        F2
##   H1 0.415866923 0.5841331
##   H2 0.002439613 0.9975604

So p(F = 1(yes)|H = 1(yes)) = 0.42 while p(F = 1(yes)) = 0.01.

Brute force computations will fail

However, this scheme is computationally prohibitive for large models:

If the model has 80 variables each with 10 levels, the joint distribution will have \(10^{80}\) states = the estimated number of atoms in the universe!
In practice we a never interested in the joint distribution itself. We typically want the conditional distribution of one (or a few) variables given some of the other variables.
We want to obtain this without calculating the joint distribution…

Decomposable graphs and junction trees

Decomposable graphs

par(mfrow=c(1,3))
plot(ug(~1:2+2:3:4+3:4:5:6+6:7), "circo")  # decomposable
plot(ug(~1:2+2:3+3:4:5+4:1),"circo")       # not decomposable
plot(ug(~1:2:5+2:3:5+3:4:5+4:1:5),"circo") # not decomposable

Perfect ordering and maximum cardinality search

An undirected graph is triangulated (or chordal) if it has no cycles of length ≥ 4 without a chord. This is equivalent to that the vertices can be given a perfe ct ordering. A perfect ordering (if it exists) can be obtained with Maximum Cardinality Search. If character(0) is returned the graph is not triangulated. Otherwise a perfect ordering of the nodes is returned.

uG11<-ug(~a:b+b:c:d)
uG11m<-ug(~a:b+b:c:d,result="matrix") #matrix represeantation

uG12<-ug(c("a","b"),c("b","c","d") )
uG13<-ug(list(c("a","b"),c("b","c","d") ) )
mcs(uG11)

## [1] "a" "b" "c" "d"

mcs(uG11m)

## [1] "a" "b" "c" "d"

In some applications it is convenient to retain control over the orderi ng (if it exists). For example:

mcs(uG11,root=c("a","c"))

## [1] "a" "b" "c" "d"

The desired ordering (specified by root) is followed as far as possible (here only to the first variable “a”). Notice the output when applying mcs() to a directed graph:

daG11<-dag(~a:b+b:c:d)
daG12<-dag(c("a","b"),c("b","c","d") )
daG13<-dag(list(c("a","b"),c("b","c","d") ) )
mcs(daG11)

## character(0)

Task: play with uG12, uG13, daG12, daG13

Back to triangulation: Any undirected graph can be triangulated by adding edges to the graph, so called fill-ins:

uG<-ug(~a:b:c+c:d+d:e+a:e+f:g )
mcs(uG)

## character(0)

(tuG<-triangulate(uG))

## A graphNEL graph with undirected edges
## Number of Nodes = 7 
## Number of Edges = 8

mcs(tuG)

## [1] "a" "b" "c" "e" "d" "f" "g"

par(mfrow=c(1,2));plot(uG);plot(tuG)

Message passing

First, change the DAG into UG

ugFTH = moralize(FTH) 

par(mfrow=c(1,2), oma=c(2,1,2,1))
plot(FTH)
plot(ugFTH)

(qFT <- tableMult(p.F, p.TgF))

##     F
## T        F1      F2
##   T1 0.0095 0.00099
##   T2 0.0005 0.98901

(qTH <- p.HgT)

##     T
## H     T1   T2
##   H1 0.8 0.01
##   H2 0.2 0.99
## attr(,"class")
## [1] "parray" "array"

(qT <- parray("T",levels=2, values=1))

## T
## T1 T2 
##  1  1 
## attr(,"class")
## [1] "parray" "array"

(qTs <- tableMargin(qFT, "T"))

## T
##      T1      T2 
## 0.01049 0.98951

(qTHs <- tableMult(qTH, tableDiv(qTs, qT)))

##     H
## T           H1        H2
##   T1 0.0083920 0.0020980
##   T2 0.0098951 0.9796149

(qTss <- tableMargin(qTHs, "T"))

## T
##      T1      T2 
## 0.01049 0.98951

(qFTs <- tableMult(qFT, tableDiv(qTss, qTs)))

##     F
## T        F1      F2
##   T1 0.0095 0.00099
##   T2 0.0005 0.98901

This leaves us with marginal distributions on all cliques and separators

qFTs

##     F
## T        F1      F2
##   T1 0.0095 0.00099
##   T2 0.0005 0.98901

qTHs

##     H
## T           H1        H2
##   T1 0.0083920 0.0020980
##   T2 0.0098951 0.9796149

qTs

## T
##      T1      T2 
## 0.01049 0.98951

From this we can get:

qTs # probability of temperature

## T
##      T1      T2 
## 0.01049 0.98951

tableMargin(qFT, "F") # probability of fever

## F
##   F1   F2 
## 0.01 0.99

tableMargin(qTH, "H") # probability of headache

## H
##   H1   H2 
## 0.81 1.19

– and we never need to calculate the joint distribution!

Propagating findings - an example

qTH

##     T
## H     T1   T2
##   H1 0.8 0.01
##   H2 0.2 0.99
## attr(,"class")
## [1] "parray" "array"

## Now important part: set finding H=H1
qTH[c(2,4)] <- 0
qTH

##     T
## H     T1   T2
##   H1 0.8 0.01
##   H2 0.0 0.00
## attr(,"class")
## [1] "parray" "array"

## Repeat everything
(qTs <- tableMargin(qFT, "T"))

## T
##      T1      T2 
## 0.01049 0.98951

(qTHs <- tableMult(qTH, tableDiv(qTs, qT)))

##     H
## T           H1 H2
##   T1 0.0083920  0
##   T2 0.0098951  0

(qTss <- tableMargin(qTHs, "T"))

## T
##        T1        T2 
## 0.0083920 0.0098951

(qFTs <- tableMult(qFT, tableDiv(qTss, qTs)))

##     F
## T         F1        F2
##   T1 7.6e-03 0.0007920
##   T2 5.0e-06 0.0098901

sum(qFTs)

## [1] 0.0182871

To get probability of fever we must normalize:

tableMargin(qFTs, "F")/sum(qFTs)

## F
##        F1        F2 
## 0.4158669 0.5841331

The important point of the computations: After working inwards and outwards in the junction tree, the clique potentials are consistent: They match on their separators:

tableMargin(qFTs, "T")

## T
##        T1        T2 
## 0.0083920 0.0098951

tableMargin(qTHs, "T")

## T
##        T1        T2 
## 0.0083920 0.0098951

qTss

## T
##        T1        T2 
## 0.0083920 0.0098951

The chest clinic narrative

Let’s consider (again) the following model:

This is example based on Lauritzen and Spiegehalter (1988) where:

“Shortness–of–breath (dyspnoea) may be due to tuberculosis, lung cancer or bronchitis, or none of them, or more than one of them.
A recent visit to Asia increases the chances of tuberculosis, while smoking is known to be a risk factor for both lung cancer and bronchitis.
The results of a single chest X–ray do not discriminate between lung cancer and tuberculosis, as neither does the presence or absence of dyspnoea.”

Specify chest clinic network (but this times with some probabilities)

yn <- c("yes","no")
a    <- cptable(~asia, values=c(1,99),levels=yn)
t.a  <- cptable(~tub+asia, values=c(5,95,1,99),levels=yn)
s    <- cptable(~smoke, values=c(5,5), levels=yn)
l.s  <- cptable(~lung+smoke, values=c(1,9,1,99), levels=yn)
b.s  <- cptable(~bronc+smoke, values=c(6,4,3,7), levels=yn)
e.lt <- cptable(~either+lung+tub,values=c(1,0,1,0,1,0,0,1),levels=yn)
x.e  <- cptable(~xray+either, values=c(98,2,5,95), levels=yn)
d.be <- cptable(~dysp+bronc+either, values=c(9,1,7,3,8,2,1,9), levels=yn)
plist <- compileCPT(list(a, t.a, s, l.s, b.s, e.lt, x.e, d.be))
bnet    <- grain(plist)
bnet

## Independence network: Compiled: FALSE Propagated: FALSE 
##   Nodes: chr [1:8] "asia" "tub" "smoke" "lung" "bronc" "either" "xray" "dysp"

plist  #list all

## CPTspec with probabilities:
##  P( asia )
##  P( tub | asia )
##  P( smoke )
##  P( lung | smoke )
##  P( bronc | smoke )
##  P( either | lung tub )
##  P( xray | either )
##  P( dysp | bronc either )

plist$tub   #list specific to 'tub'

##      asia
## tub    yes   no
##   yes 0.05 0.01
##   no  0.95 0.99
## attr(,"class")
## [1] "parray" "array"

plot(bnet)

Queries, setting findings and probability of findings

Now when we have the basic knowlage about graphs and some (usefull) DAG we can ask questions:

querygrain(bnet, nodes=c('lung','tub','bronc'))

## $tub
## tub
##    yes     no 
## 0.0104 0.9896 
## 
## $lung
## lung
##   yes    no 
## 0.055 0.945 
## 
## $bronc
## bronc
##  yes   no 
## 0.45 0.55

bnet.f <- setFinding(bnet, nodes=c('asia','dysp'), state=c('yes','yes'))
bnet.f

## Independence network: Compiled: TRUE Propagated: TRUE 
##   Nodes: chr [1:8] "asia" "tub" "smoke" "lung" "bronc" "either" "xray" "dysp"
##   Evidence:
##   nodes is.hard.evidence hard.state
## 1  asia             TRUE        yes
## 2  dysp             TRUE        yes
##   pEvidence: 0.004501

pFinding(bnet.f)

## [1] 0.004501375

querygrain(bnet.f, nodes=c('lung','tub','bronc'))

## $tub
## tub
##        yes         no 
## 0.08775096 0.91224904 
## 
## $lung
## lung
##        yes         no 
## 0.09952515 0.90047485 
## 
## $bronc
## bronc
##       yes        no 
## 0.8114021 0.1885979

querygrain(bnet.f, nodes=c('lung','tub','bronc'), type='joint')

## , , bronc = yes
## 
##      lung
## tub           yes         no
##   yes 0.003149038 0.04183722
##   no  0.059831718 0.70658410
## 
## , , bronc = no
## 
##      lung
## tub           yes         no
##   yes 0.001827219 0.04093749
##   no  0.034717170 0.11111605
## 
## attr(,"class")
## [1] "parray" "array"

Dependence graph, moralization and triangulation

The computational scheme outlined above does not apply directly to the chest clinic example.

An extra step is needed: Triangulation of the moral graph (but we aready know how to do it).

par(mfrow=c(1,3))
plot(bnet$dag, main = 'DAG')
plot(moralize(bnet$dag), main = 'Moralized')
plot(triangulate(moralize(bnet$dag)), main = 'Triangulated')

Notice: We have not changed the fundamental model by these steps, but some conditional independencies are concealed in the triangulated graph. But the triangulated graph factorization allows efficient calculations.

Fundamental operations in `gRain`

Network specification: grain() Create a network from list of conditional probability tables; and do a few other things.
Set findings: setFinding(): Set the values of some variables.
Ask queries: querygrain(): Get updated beliefs (conditional probabilities given findings) of some variables

Under the hood there are two additional operations:

Compilation: compile() Create a clique potential representation (and a few other steps)
Propagation: propagate() Turn clique potentials into clique marginals.

Final exercise

Task: Make some new queries, set some findings and probability of findings for chest clicic DAG.

Lab11

Lukasz Kozlowski Lab11

10th Jan 2019

Graphical Models - part 2

Bayesian Network (BN) basics

Exercise 1

Specification of conditional probability tables

Brute force computations

1) Calculate joint distribution p(FTH)

2) Calculate the marginal distribution p(FH)

3) calculate conditional distribution p(F| H)

Brute force computations will fail

Decomposable graphs and junction trees

Message passing

Propagating findings - an example

The chest clinic narrative

Queries, setting findings and probability of findings

Dependence graph, moralization and triangulation

Fundamental operations in `gRain`

Final exercise

Lab11

Lukasz Kozlowski Lab11

10th Jan 2019

Graphical Models - part 2

Bayesian Network (BN) basics

Exercise 1

Specification of conditional probability tables

Brute force computations

1) Calculate joint distribution p(FTH)

2) Calculate the marginal distribution p(FH)

3) calculate conditional distribution p(F| H)

Brute force computations will fail

Decomposable graphs and junction trees

Message passing

Propagating findings - an example

The chest clinic narrative

Queries, setting findings and probability of findings

Dependence graph, moralization and triangulation

Fundamental operations in gRain

Final exercise

Fundamental operations in `gRain`