# Support for Graphical Modelling in Bayesian Network Knowledge Engineering: a visual Tool for Domain Experts

 Название Support for Graphical Modelling in Bayesian Network Knowledge Engineering: a visual Tool for Domain Experts страница 7/35 Дата 05.10.2012 Размер 0.88 Mb. Тип Документы

## 2.2Bayesian Networks - Directed graphs

There are different ways of viewing a BN. From a syntax perspective, a BN is comprised of nodes, arcs, and parameters. From a semantics perspective, there are two equivalent ways of understanding a BN. One way is that it is a compact representation of a JPD. The other is that it is a representation of dependency-independency assumptions of a JPD. In this section I focus on the latter because this thesis is about the structural aspect of the model. I also present the theorem that ties the different ways of understanding BNs for the purpose of completeness.

### 2.2.1BN as a representation of dependency independency assumptions

If the nodes in a DAG correspond with a set of random variables U then the DAG is called a Bayesian Network (BN) Structure.

The criterion

Definition: Given a graph D=(U,E), X,Y nodes in U and a set of nodes Z in U, a trail from node X to Y is blocked by Z, if one of the following conditions is true:

1. There is a node Z that does not have head-to-head structure on the trail and Z in Z.

2. There is a node W that has a head-to-head structure on the trail and W is not in Z and neither are any of W’s descendents.

Otherwise it is said to be activated by Z.

The separation criterion for DAGs is a direction-dependent criterion called d-separation. X and Y are d-separated by Z in a DAG D if every trail from X to Y in D is blocked. (This is denoted as d-sepD(X,Y|Z)). The same is defined for a disjoint sets of nodes X, Y, Z. If for every node X in X and every node Y in Y all trails between the two nodes are blocked by Z, then X and Y are d-separated by Z (and is denoted as d-sepD(X,Y|Z)) [65] (p.203), see figure 2.2.

Figure 2.3 d-separation.

The trails in a, b and c are blocked because condition 1 is true i.e. there is a node Z on the trail from X to Y such that Z does not have head-to-head structure and Z in Z. The trail in d is blocked because condition 2 is true i.e. there is a node W on the trail from X to Y such that W has a head-to-head structure, W is not in Z and neither are any of W’s descendents. Note that X and Y can be sets of nodes [82].

Markov Condition

Given any DAG D, according to the d-separation criterion, every node X in D and its non-descendents nodes are d-separated by the parents of X. (that is, for every X d-sepD(X,NonDescendent(X)|Parents(X)) is true). This is called the Markov Condition and denoted Markov(D).

d-separation and conditional independency assumptions

Definition: Given a JPD F of U and a DAG D=(U,E) then D is a Bayesian Network of F if every d-separation condition displayed in D corresponds to a valid conditional independence relationship in F. In other words, for every disjoint sets of nodes X, Y, Z, if X is d-separated from Y by Z in D then X is conditionally independent of Y given Z in F. (d-sepD(X,Y|Z) IndF(X,Y|Z))

The d-separation criterion for reading conditional independency assumptions from the graph topology enables the representation of the following relationships:

• Independent: A and B are not connected by any trail in the graph, A and B do not influence each other.

• Dependent: A → B, A and B can directly influence each other (A influences B and according to Bayes’ Rule B influences A .

• Conditionally independent: A → C → B, A and B can influence each other through C. They become uninfluential on each other if C is known

• Marginally independent A → C ← B, A and B do not influence each other if nothing is known. They can become influential on each other if C or one of its descendents is known. This is a dependency relationship among variables, which is induced by information.

A special d-separating set is the Markov Blanket. The Markov blanket of each node in the graph is comprised of all the parents of the node, all the children of the node and all the other parents of all the children. Once the Markov blanket of a node is known, this node becomes d-separated from the rest of the nodes in the graph [50](p.14).

Initial assumptions

Theorem: Given a JPD F of U and a DAG D=(U,E) then D is a Bayesian Network of F if Markov(D) holds in F, that is X is conditionally independent of all its non-descendents if the parents of X are known.

In other words, given a BN structure D that holds the Markov(D) with respect to some JPD F, for every disjoint sets X, Y and Z, if X and Y are d-separated by Z in D then X and Y are conditionally independent given Z in F. (d-sepD(X,Y|Z) IndF(X,Y|Z)).

As in the undirected graphs representation, this theorem provides an algorithm for constructing a BN Structure for a given JPD F considering only an initial set of assumptions. It also ensures that all d-separation relationships that can be read by the d-separation criterion represent valid conditional independency assumptions in F.

The DAG structure provides an elegant way of representing a set of initial independence-assumptions. The d-separation criterion, together with some efficient algorithms, provided the means to find all independencies that can evolve from the graph topology or to check if a given independency exists. However, it is important to note that in some distributions (even simple ones) not all independencies can be represented by a DAG [72](p.126). Nevertheless, the main use of the DAG, from a computational perspective, is to enable exploiting as many independencies as possible in order to make the inference task more efficient. The fact that some independencies are missed only means that in these cases the computation process might be less efficient. However, since all independencies that are in the DAG are also in F the computation process will always be correct.

### 2.2.2Parameters Considerations (The Quantitative Part)

The quantitative part is comprised of a conditional probability table (CPT) attached to each node in the DAG. This table specifies the conditional probabilities of the states of the node given all possible combinations of the states of its parents. These tables encode the strength of the connection between the variable represented by the node and the variables that are represented by its parents in the graph. If the node has no parents, the CPT contains the unconditioned probabilities of the states of the node.

### 2.2.3BN as a compact representation of a JPD

Theorem: Let U be a set of variables, a JPD F on U and a DAG D =(U,E) then D is a Bayesian network of F iff

For



Equation (7) provides a factorised and compact way for representing the JPD, using the attached CPTs.

For the purpose of this thesis BNs are described herein as a representation of dependency -independency assumptions. However, for other purposes it is more intuitive to describe it as a compact representation of a JPD. For this reason BNs are defined in most of the literature as follows:

A BN is a DAG in which the following statements are fulfilled:

[50, 74, 82] ((p.18), (p.364), (p.436))

• The nodes of the graph represent random variables.

• Each variable has a finite set of mutually exclusive states.

• A set of directed arcs connects pairs of nodes.

• Each node has a CPT attached to it.

Under these conditions every entry in the JPD can be calculated, i.e. any probabilistic reasoning can be answered using equation (7).

### 2.2.4Reasoning

The nodes in the model can be logically divided into query nodes and observation nodes. We know nothing about the state of the query nodes but we want to reason about them. The nodes we can observe are known after being observed. The inference then updates the probabilities of the query nodes in light of the information about the known nodes. Thus, together with inference algorithms, a BN enables the following patterns of reasoning that can be explained in causal terms or graphical terms:

Diagnostic reasoning – Information is flowing from effects to causes or from children to parents.

Causal reasoning (or prediction) – Information is flowing from causes to effects or from ‘parents’ to ‘children’.

Intercausal reasoning (also called explaining away) – information is flowing between causes of a common effect or between ‘parents’ of the same ‘child’.

Mixed reasoning – Information is flowing between nodes combining the above structures.

Figure 2.4 Different reasoning patterns.

Simple structures that demonstrate the different patterns of reasoning that are possible in a BN [82].

### 2.2.5Node elimination algorithm

The definition of Markov(D) assumptions to hold in F provides a simple procedure to construct a BN structure, given a JPD F over a set of variables U. The variables in U are set in the order X1,X2,…,Xn. The first node X1 is chosen. The second node X2 is then chosen. If X2 depends on X1 according to the JPD, then an arc from X1 to X2 is drawn. The process continues this way; the Xi node is chosen and each of the nodes X1,…,Xi-1 is examined to see if Xi depends on it. An arc is drawn from each node on which Xi is found to be dependent, to the Xi node. The result is a BN for the given JPD. This process is relevant if the JPD is given.

In real situations the JPD is unknown and dependencies and independencies are subject to human judgment. In such situations, the procedure described above is still valid, but the answer to the question whether Xi depends on any of the nodes that had been previously added to the network depends on an expert’s judgment and on the added nodes. In any case, because arcs are drawn from existing nodes to the one added, the procedure for constructing a BN as described above is heavily dependent on the ordering of the variables and the resultant structure can be sparse for a good ordering and highly connected for a different ordering, e.g. [82](p.442).

Since, for an ‘n’ number of variables there are n! possible orderings, one would need to explore all of them in order to determine which structure is the best. Naturally, this is not feasible. Therefore, other approaches to the construction of BNs had to be sought.

## Похожие:

Разместите кнопку на своём сайте:
Библиотека

База данных защищена авторским правом ©lib.znate.ru 2014
обратиться к администрации
Библиотека