Following the same logic depicted in the formula, if there are only friendly starships nearby with the ability to harm the *Enterprise*, then the distribution becomes [0, 0, 0.01, 0.99]. The last line indicates that if no starship can harm the *Enterprise*, then the danger level will be *Low* for sure. As noted previously, a powerful formalism is needed to represent complex scenarios at a reasonable level of fidelity. In the probability distribution shown in this example, additional detail could have been added and many nuances might have been explored. For example, a large number of nearby Romulan ships might have been considered a fair indication of a coordinated attack, and therefore implied greater danger than an isolated Cardassian ship. Nonetheless, this example was purposely kept simple in order to clarify the basic capabilities of the logic. It is clear that more complex knowledge patterns could be accommodated as needed to suit the requirements of the application. MEBN logic has built-in logical MFrags that provide the ability to express any sentence that can be expressed in first-order logic. Laskey (2005) proves that MEBN logic can implicitly express a probability distribution over interpretations of any consistent, finitely axiomatizable first-order theory. This provides MEBN with sufficient expressive power to represent virtually any scientific hypothesis.

1.12 Representing Recursion in MEBN Logic

One of the main limitations of BNs is their lack of support for recursion. Extensions such as dynamic Bayesian networks provide the ability to define certain kinds of recursive relationships. MEBN provides theoretically grounded support for very general recursive definitions of local distributions. Figure 10 depicts an example of how an MFrag can represent temporal recursion.
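As a concrete illustration of this kind of temporal recursion, the local distribution of a node such as *ZoneMD*(*z*, *t*) can be sketched as a function of its value at the previous time step, grounded at the initial step !*T*0. The states and probabilities below are hypothetical and not taken from the actual model:

```python
# Hypothetical sketch of the temporal recursion in the Zone MFrag:
# ZoneMD(z, t) depends on ZoneMD(z, tprev), grounded at the initial step T0.
# The state set and the numbers are illustrative only.

MD_STATES = ["Low", "Medium", "High"]

def zone_md_t0():
    """Initial distribution at T0: no dependence on a previous step."""
    return {"Low": 0.8, "Medium": 0.15, "High": 0.05}

def zone_md(prev_state, cloaked_nearby):
    """Distribution of ZoneMD(z, t) given ZoneMD(z, tprev)."""
    if not cloaked_nearby:
        # Without cloaked starships the disturbance tends to persist.
        return {s: (0.9 if s == prev_state else 0.05) for s in MD_STATES}
    # With a cloaked starship nearby the disturbance fluctuates strongly,
    # moving away from its previous value at each step.
    return {s: (0.5 if s != prev_state else 0.0) for s in MD_STATES}
```

Each call returns a properly normalized distribution, and the `zone_md_t0` case plays the role of the initial-step branch that grounds the recursion.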
The Zone MFrag

In that MFrag, a careful reading of the context nodes makes it clear that in order for the local distribution to apply, *z* must be a zone and *st* must be a starship that has *z* as its current position. In addition, *tprev* and *t* must be *TimeStep* entities, with *tprev* the step immediately preceding *t*. Other varieties of recursion can also be represented in MEBN logic by means of MFrags that allow influences between instances of the same random variable. Allowable recursive definitions must ensure that no random variable instance can influence its own probability distribution. General conditions that both recursive and non-recursive MFrags and MTheories must satisfy are given in Laskey (2005). As in non-recursive MFrags, the input nodes in a recursive MFrag include nodes whose local distributions are defined in another MFrag (e.g., *CloakMode*(*st*)). In addition, the input nodes may include instances of recursively defined nodes in the MFrag itself. For example, the input node *ZoneMD*(*z*, *tprev*) represents the magnetic disturbance in zone *z* at the previous time step, which influences the current magnetic disturbance *ZoneMD*(*z*, *t*). The recursion is grounded by specifying an initial distribution at time !*T*0 that does not depend on a previous magnetic disturbance. Figure 11 illustrates how recursive definitions can be applied to construct a *situation-specific Bayesian network* (SSBN) to answer a query. In this specific case, the query concerns the magnetic disturbance at time !*T*3 in zone !*Z*0, where !*Z*0 is known to contain the uncloaked starship !*ST*0 (*Enterprise*) and exactly one other starship, !*ST*1, which is known to be cloaked.

SSBN Constructed from Zone MFrag

The process of building the graph shown in this figure begins by creating an instance of the home MFrag of the query node *ZoneMD*(!*Z*0, !*T*3).
That is, !*Z*0 is substituted for *z* and !*T*3 for *t*, and then all instances of the remaining random variables that meet the context constraints are created. The next step is to build any conditional probability tables (CPTs) that can already be built on the basis of the available data. CPTs for *ZoneMD*(!*Z*0, !*T*3), *ZoneNature*(!*Z*0), *ZoneEShips*(!*Z*0), and *ZoneFShips*(!*Z*0) can be constructed because these nodes are resident in the retrieved MFrag. Single-valued CPTs for *CloakMode*(!*ST*0), *CloakMode*(!*ST*1), and !*T*3=!*T*0 can be specified because the values of these random variables are known. At the end of this process, only one node, *ZoneMD*(!*Z*0, !*T*2), remains without a CPT. To construct its CPT, its home MFrag must be retrieved, and any random variables that meet its context constraints and have not already been instantiated must be instantiated. The new random variables created in this step are *ZoneMD*(!*Z*0, !*T*1) and !*T*2=!*T*0. The value of the latter is already known, while the home MFrag of the former has to be retrieved. This process continues until all the nodes of Figure 11 have been added. At this point, the CPTs for all random variables can be constructed, and thus the SSBN is complete.^{20} The MFrag depicted in Figure 10 defines the local distribution that applies to all these instances, even though for brevity only the probability distributions (local and default) for node *ZoneMD*(*z*, *t*) were displayed. The remaining distributions can be found in Appendix A. Note that when there is no starship with cloak mode activated, the probability distribution for magnetic disturbance given the zone nature does not change with time. When there is at least one starship with cloak mode activated, the magnetic disturbance tends to fluctuate regularly with time in the manner described by the local expression.
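The construction process just described can be sketched as a simple worklist algorithm. The `home_mfrag` lookup and the node names below are hypothetical simplifications; a real implementation must also check context constraints and combine influences from multiple MFrag instances:

```python
def build_ssbn(query_nodes, finding_nodes, home_mfrag):
    """Sketch of SSBN construction: starting from the query and finding
    nodes, repeatedly retrieve home MFrags and instantiate parents until
    every node in the network has a CPT. `home_mfrag(node)` is assumed to
    return the node's instantiated parents and its local distribution."""
    ssbn = {}                       # node -> (parents, local distribution)
    pending = list(query_nodes) + list(finding_nodes)
    while pending:
        node = pending.pop()
        if node in ssbn:
            continue                # already instantiated
        parents, cpt = home_mfrag(node)
        ssbn[node] = (parents, cpt)
        pending.extend(parents)     # parents may need their own MFrags
    return ssbn
```

Applied to the query *ZoneMD*(!*Z*0, !*T*3), this loop would walk the temporal recursion back to the grounded initial step, mirroring the retrieval of *ZoneMD*(!*Z*0, !*T*2) and *ZoneMD*(!*Z*0, !*T*1) described above.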
For the sake of simplicity, the model adopts the assumption that the local distribution depends only on whether there is a cloaked starship nearby, although in a more “realistic” model the disturbance might increase with the number of cloaked starships and/or the power of the cloaking device. Another implicit assumption in this example concerns the initial distribution for the magnetic disturbance when there are cloaked starships, which was assumed to be equal to the stationary distribution given the zone nature and the number of cloaked starships present initially. Of course, it would be possible to write different local expressions expressing a dependence on the number of starships, their size, their distance from the *Enterprise*, etc. MFrags provide a flexible means to represent knowledge about specific subjects within the domain of discourse, but the true gain in expressive power is revealed when these “knowledge patterns” are aggregated to form a coherent model of the domain of discourse that can be instantiated to reason about specific situations and refined through learning. It is important to note that merely collecting a set of MFrags that represent specific parts of a domain is not enough to ensure a coherent representation of that domain. For example, it would be easy to specify a set of MFrags with cyclic influences, or one having multiple conflicting distributions for a random variable in different MFrags. The following section describes how to define complete and coherent domain models as collections of MFrags.

1.13 Building MEBN Models with MTheories

In order to build a coherent model, the set of MFrags must collectively satisfy consistency constraints ensuring the existence of a unique joint probability distribution over instances of the random variables mentioned in the MFrags. Such a coherent collection of MFrags is called an MTheory.
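Two of these consistency constraints, that each random variable has a unique home MFrag and that there are no cyclic influences, can be sketched as follows. The MFrag encoding used here is a hypothetical simplification, not the logic's actual formal machinery:

```python
def check_mtheory(mfrags):
    """Sketch of two consistency checks for a set of MFrags:
    (1) every resident random variable has a unique home MFrag, and
    (2) the combined influence graph has no directed cycles.
    Each MFrag is encoded as {'resident': {node: [parents]}}."""
    home, edges = {}, {}
    for name, frag in mfrags.items():
        for node, parents in frag["resident"].items():
            if node in home:
                raise ValueError(f"{node} is resident in both {home[node]} and {name}")
            home[node] = name
            edges.setdefault(node, []).extend(parents)

    # Detect cycles with a depth-first search over the influence graph.
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in edges}

    def visit(n):
        color[n] = GRAY
        for p in edges.get(n, []):
            if color.get(p, WHITE) == GRAY:
                raise ValueError(f"cyclic influence through {p}")
            if color.get(p, WHITE) == WHITE and p in edges:
                visit(p)
        color[n] = BLACK

    for n in edges:
        if color[n] == WHITE:
            visit(n)
    return True
```

A set of MFrags that assigns the same node two home MFrags, or whose combined influences form a cycle, would be rejected here, just as such a set would fail to define a valid MTheory.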
An MTheory represents a joint probability distribution for an unbounded, possibly infinite number of instances of its random variables. This joint distribution is specified by the local and default distributions within each MFrag together with the conditional independence relationships implied by the fragment graphs. The MFrags described above are part of a *generative MTheory* for the intergalactic conflict domain. A generative MTheory summarizes statistical regularities that characterize a domain. These regularities are captured and encoded in a knowledge base using some combination of expert judgment and learning from observation. To apply a generative MTheory to reason about particular scenarios, it is necessary to provide the system with specific information about the individual entity instances involved in the scenario. On receipt of this information, Bayesian inference can be used both to answer specific questions of interest (e.g., how high is the current level of danger to the *Enterprise*?) and to refine the MTheory (e.g., each encounter with a new species gives additional statistical data about the level of danger to the *Enterprise* from a starship operated by an unknown species). Bayesian inference is used to perform both problem-specific inference and learning in a sound, logically coherent manner.
*Findings* are the basic mechanism for incorporating observations into MTheories. A finding is represented as a special two-node MFrag containing a node from the generative MTheory and a node asserting that it takes a given value. From a logical point of view, inserting a finding into an MTheory corresponds to asserting a new axiom in a first-order theory. In other words, MEBN logic is inherently open, having the ability to incorporate new axioms as evidence and update the probabilities of all random variables in a logically consistent way. In addition to the requirement that each random variable have a unique home MFrag, a valid MTheory must ensure that all recursive definitions terminate in finitely many steps and contain no circular influences. Finally, as demonstrated above, random variable instances may have a large, possibly unbounded, number of parents. A valid MTheory must satisfy an additional condition to ensure that the local distributions have reasonable limiting behavior as more and more parents are added. Laskey (2005) proved that when an MTheory satisfies these conditions (as well as other technical conditions that are unimportant to our example), there exists a joint probability distribution on the set of instances of its random variables that is consistent with the local distributions assigned within its MFrags. Furthermore, any consistent, finitely axiomatizable FOL theory can be translated to infinitely many MTheories, all having the same purely logical consequences, that assign different probabilities to statements whose truth-value is not determined by the axioms of the FOL theory. MEBN logic contains a set of built-in logical MFrags (including quantifier, indirect reference, and Boolean connective MFrags) that provide the ability to represent any sentence in first-order logic.
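The effect of asserting a finding, namely conditioning on the observed value and renormalizing, can be sketched on a toy joint distribution. The variables and numbers below are illustrative and not part of the actual model:

```python
def condition(joint, finding):
    """Sketch of incorporating a finding: keep the joint entries consistent
    with the observed value and renormalize, as Bayesian conditioning does.
    `joint` maps tuples of (variable, value) assignments to probabilities;
    `finding` is one (variable, value) pair asserted as evidence."""
    consistent = {k: p for k, p in joint.items() if finding in k}
    total = sum(consistent.values())
    return {k: p / total for k, p in consistent.items()}

# Hypothetical joint over a danger level and a sensor report.
joint = {
    (("Danger", "High"), ("Report", "Hostile")): 0.27,
    (("Danger", "High"), ("Report", "Friendly")): 0.03,
    (("Danger", "Low"), ("Report", "Hostile")): 0.07,
    (("Danger", "Low"), ("Report", "Friendly")): 0.63,
}
posterior = condition(joint, ("Report", "Hostile"))
# P(Danger=High | Report=Hostile) = 0.27 / (0.27 + 0.07) ≈ 0.794
```

Asserting the finding prunes the possibility space, exactly as adding an axiom does in a first-order theory, while Bayesian renormalization keeps the remaining beliefs coherent.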
If the MTheory satisfies additional conditions, then a conditional distribution exists given any finite sequence of findings that does not contradict the logical constraints of the generative MTheory. MEBN logic thus provides a logical foundation for systems that reason in an open world and incorporate observed evidence in a mathematically sound, logically coherent manner. Figure 12 shows an example of a generative MTheory for the *Star Trek* domain. For the sake of conciseness, the local distribution formulas and the default distributions are not shown here.

The Star Trek Generative MTheory

The Entity Type MFrag, at the right side of Figure 12, formally declares the possible types of entity that can be found in the model. This is a generic MFrag that allows the creation of domain-oriented types (which are represented by TypeLabel entities), and it forms the basis for a type system. The simple model depicted here does not address the creation of, or explicit support for, entity types. Standard MEBN logic as defined in Laskey (2005) is untyped, meaning that a knowledge engineer who wishes to represent types must explicitly define the necessary logical machinery. Typing is an important feature in ontology languages such as OWL, so as part of this research effort we have developed an extended version of MEBN logic that includes built-in support for typing. This extension is explained in Chapter 4. The Entity Type MFrag of Figure 12 defines an extremely simple kind of type structure. MEBN can be extended with MFrags to accommodate any flavor of type system, including more complex capabilities such as sub-typing, polymorphism, multiple inheritance, etc. It is important to understand the power and flexibility that MEBN logic gives to knowledge base designers by allowing multiple, equivalent ways of portraying the same knowledge.
Indeed, the generative MTheory of Figure 12 is just one of the many possible (consistent) sets of MFrags that can be used to represent a given joint distribution. There, the random variables were clustered in a way that attempts to reflect the natural structure of the objects in the scenario (i.e., an object-oriented approach to modeling was taken), but this was only one design option among the many allowed by the logic. As an example of such flexibility, Figure 13 depicts the same knowledge contained in the Starship MFrag of Figure 12 (right side) using three different MFrags. In this case, the modeler might have opted to decompose an MFrag in order to gain the extra flexibility of smaller, more specific MFrags that can be combined in different ways. Another knowledge engineer might prefer the more concise approach of having all knowledge in just one MFrag. Ultimately, the approach taken when building an MTheory will depend on many factors, including the model’s purpose, the background and preferences of the model’s stakeholders, the need to interface with external systems, etc.

Equivalent MFrag Representations of Knowledge

First-order logic (or one of its subsets) provides the theoretical foundation for the type systems used in popular object-oriented and relational languages. MEBN logic provides the basis for extending the capability of these systems by introducing a sound mathematical basis for representing and reasoning under uncertainty, which is precisely the idea being explored in the extensions proposed in the next Chapter. The advantages of a MEBN-based type system are also explored in that Chapter. Another powerful aspect of MEBN, the ability to support *finite or countably infinite recursion*, is illustrated in the Sensor Report and Zone MFrags, both of which involve temporal recursion. The Time Step MFrag includes a formal specification of the local distribution for the initial step of the time recursion (i.e.,
when *t*=!*T*0) and of its recursive steps (i.e., when *t* does not refer to the initial step). Other kinds of recursion can be represented in a similar manner. MEBN logic also has the ability to represent and reason about hypothetical entities. Uncertainty about whether a hypothesized entity actually exists is called *existence uncertainty*. In the example model presented here, the random variable *Exists*(*st*) is used to reason about whether its argument is an actual starship. For example, it might be uncertain whether a sensor report corresponds to one of the starships already known to the system, a starship of which the system was not previously aware, or a spurious sensor report. To allow for hypothetical starships, the local distribution for *Exists*(*st*) assigns non-zero probability to *False*. Suppose the unique identifier !*ST*4 refers to a hypothetical starship nominated to explain the report. In this case, *IsA*(*Starship*, !*ST*4) has value *True*, but the value of *Exists*(!*ST*4) is uncertain. A value of *False* would mean !*ST*4 is a spurious starship or false alarm. Queries involving the unique identifier of a hypothetical starship return results weighted by our belief that it is an actual or a spurious starship. Belief in *Exists*(!*ST*4) is updated by Bayesian conditioning as relevant evidence accrues. Representing existence uncertainty is particularly useful for counterfactual reasoning and reasoning about causality (Druzdzel & Simon, 1993; Pearl, 2000). Because the *Star Trek* model was designed to demonstrate the capabilities of MEBN logic, the approach taken was to avoid issues that could be handled by the logic but would make the model too complex. As an example, one aspect that this model does not consider is *association uncertainty*, a very common problem in multi-sensor data fusion systems. Association uncertainty means we are not sure about the source of a given report.
For example, we may receive a report, !*SR*4, indicating a starship near a given location. Suppose we cannot tell whether the report was generated by !*ST*1 or !*ST*3, two starships known to be near the reported location, or by a previously unreported starship !*ST*4. In this case, we would enumerate these three unique identifiers as possible values for *Subject*(!*SR*4), and specify that *Exists*(!*ST*4) has value *False* if *Subject*(!*SR*4) has any value other than !*ST*4. Many weakly discriminatory reports coming from possibly many starships produce an exponential set of combinations that require special *hypothesis management* methods (Stone *et al.*, 1999). Closely related to association uncertainty is *identity uncertainty*, or uncertainty about whether two expressions refer to the same entity. Association uncertainty can be regarded as a special case of identity uncertainty – that is, uncertainty about the identity of *Subject*(!*SR*4). The *Star Trek* model avoids these problems by assuming that the *Enterprise*’s sensor suite can achieve perfect discrimination. However, the underlying logic can represent and reason with association, existence, and type uncertainty, and thus provides a sound logical foundation for hypothesis management in multi-source fusion.

1.14 Making Decisions with Multi-Entity Decision Graphs

Captain Picard has more than an academic interest in the danger from nearby starships. He must make decisions with life and death consequences. Multi-Entity Decision Graphs (MEDGs, or “medges”) extend MEBN logic to support decision making under uncertainty. MEDGs are related to MEBNs in the same way influence diagrams are related to Bayesian networks. A MEDG can be applied to any problem that involves optimal choice from a set of alternatives subject to given constraints.
When a decision MFrag (i.e., one that has decision and utility nodes) is added to a generative MTheory such as the one portrayed in Figure 12, the result is a MEDG. As an example, Figure 14 depicts a decision MFrag representing Captain Picard’s choice of which defensive action to take. The decision node *DefenseAction*(*s*) represents the set of defensive actions available to the Captain (in this case, to fire the ship’s weapons, to retreat, or to do nothing). The value nodes capture Picard’s objectives, which in this case are to protect the *Enterprise* while also avoiding harm to innocent people as a consequence of his defensive actions. Both objectives depend upon Picard’s decision, while *ProtectSelf*(*s*) is influenced by the perceived danger to the *Enterprise* and *ProtectOthers*(*s*) depends on the level of danger to other starships in the vicinity.

The Star Trek Decision MFrag

The model described here is clearly an oversimplification of any “real” scenario a Captain would face. Its purpose is to convey the core idea of extending MEBN logic to support decision-making. Indeed, a more common situation is to have multiple, mutually influencing, often conflicting factors that together form a very complex decision problem and require trading off different attributes of value. For example, a decision to attack would mean that little power would be left for the defense shields; a retreat would require aborting a very important mission. MEDGs provide the necessary foundation to address all of these issues. Readers familiar with influence diagrams will appreciate that the main concepts required for a first-order extension of decision theory are all present in Figure 14. In other words, MEDGs have the same core functionality and characteristics as common MFrags.
Thus, the utility table in *Survivability*(*s*) refers to the entity whose unique identifier substitutes for the variable *s*, which according to the context nodes should be our own starship (the *Enterprise* in this case). Likewise, the states of input node *DangerToSelf*(*s*, *t*) and the decision options listed in *DefenseAction*(*s*) should also refer to the same entity. Of course, this confers on MEDGs the expressive power of MEBN models, which includes the ability to use this same decision MFrag to model the decision process of the Captain of another starship. Notice that a MEDG Theory must also comply with the same consistency rules as standard MTheories, along with additional rules required for influence diagrams (e.g., value nodes are deterministic and must be leaf nodes or have only value nodes as children). In the present example, adding the Star Trek Decision MFrag of Figure 14 to the generative MTheory of Figure 12 maintains the consistency of the latter, and therefore the result is a valid generative MEDG Theory. This simple illustration can be extended to more elaborate decision constructions, providing the flexibility to model decision problems in many different applications spanning diverse domains.

1.15 Inference in MEBN Logic

A generative MTheory provides prior knowledge that can be updated upon receipt of evidence represented as finding MFrags. We now describe the process used to obtain posterior knowledge from a generative MTheory and a set of findings. In a BN model such as the ones shown in Figures 5 through 7, assessing the impact of new evidence involves conditioning on the values of evidence nodes and applying a belief propagation algorithm. When the algorithm terminates, beliefs of all nodes, including the node(s) of interest, reflect the impact of all evidence entered thus far. This process of entering evidence, propagating beliefs, and inspecting the posterior beliefs of one or more nodes of interest is called a query.
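For a minimal two-node network, such a query amounts to Bayes' rule by enumeration. The prior and likelihoods below are illustrative numbers, not those of the *Star Trek* model:

```python
def query(prior, likelihood, evidence):
    """Sketch of a query on a two-node BN (Hypothesis -> Observation):
    condition on the evidence and return the posterior over the hypothesis
    by enumeration (Bayes' rule)."""
    unnorm = {h: prior[h] * likelihood[h][evidence] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

prior = {"Cardassian": 0.2, "Friend": 0.8}          # P(StarshipType), illustrative
likelihood = {                                      # P(SensorReport | type), illustrative
    "Cardassian": {"hostile_profile": 0.7, "benign_profile": 0.3},
    "Friend": {"hostile_profile": 0.1, "benign_profile": 0.9},
}
posterior = query(prior, likelihood, "hostile_profile")
# P(Cardassian | hostile_profile) = 0.14 / (0.14 + 0.08) ≈ 0.636
```

Belief propagation algorithms perform the same computation efficiently on large networks by exploiting the factorization of the joint distribution instead of enumerating it.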
MEBN inference works in a similar way (after all, MEBN is a Bayesian logic), but follows a more complex yet more flexible process. Whereas BNs are static models that must be changed whenever the situation changes (e.g., number of starships, time recursion, etc.), an MTheory implicitly represents an infinite set of possible scenarios. In other words, the MTheory represented in Figure 12 (as well as the MEDG obtained by aggregating the MFrag in Figure 14) is a model that can be used for as many starships as desired, and for as many time steps as are necessary to reach the conclusions needed. That said, the obvious question is how to perform queries within such a model. A simple example of query processing was given above in Section 3.3. Here, the algorithm for constructing a situation-specific Bayesian network (SSBN) is described in a general way. In order to execute this algorithm, it is necessary to have an initial generative MTheory (or MEDG Theory), a Finding set (which conveys particular information about the situation), and a Target set (which indicates the nodes of interest to the query being made). For comparison, let’s suppose there is a situation similar to the one in Figure 3, where four starships are within the *Enterprise*’s range. In that particular case, a BN was used to represent the situation at hand, which means the model is “hardwired” to a known number (four) of starships, and any other number would require a different model. A standard Bayesian inference algorithm applied to that model would involve entering the available information about these four starships (i.e., the four sensor reports), propagating the beliefs, and obtaining posterior probabilities for the hypotheses of interest (e.g., the four *Starship Type* nodes). Similarly, MEBN inference begins when a query is posed to assess the degree of belief in a target random variable given a set of evidence random variables.
We start with a generative MTheory, add a set of finding MFrags representing problem-specific information, and specify the target nodes for our query. The first step in MEBN inference is to construct the SSBN, an ordinary Bayesian network built by creating and combining instances of the MFrags in the generative MTheory. Next, a standard Bayesian network inference algorithm is applied. Finally, the answer to the query is obtained by inspecting the posterior probabilities of the target nodes. A MEBN inference algorithm is provided in Laskey (2005). The algorithm presented there does not handle decision graphs, so the illustration below extends the algorithm to demonstrate how the MEDG Theory portrayed in Figure 12 and Figure 14 can be used to support the Captain’s decision. In this example, the finding MFrags convey information that there are five starships (!*ST*0 through !*ST*4) and that the first is the *Enterprise* itself. For the sake of illustration, let’s assume that the Finding set also includes data regarding the nature of the space zone the *Enterprise* is currently located in (!*Z*0), its magnetic disturbance for the first time step (!*T*0), and sensor reports (!*SR*1 to !*SR*4) for the first two time steps. Let’s also assume that the Target set for this illustrative query includes an assessment of the level of danger experienced by the *Enterprise* and the best decision to take given this level of danger. Figure 15 shows the situation-specific Bayesian network for this query^{21}. To construct that SSBN, the initial step is to create instances of the random variables in the Target set and the random variables for which there are findings. The target random variables are *DangerLevel*(!*ST*0) and *DefenseAction*(!*ST*0). The finding random variables are the eight *SRDistance* nodes (two time steps for each of the four starships) and the two *ZoneMD* reports (one for each time step).
Although each finding MFrag contains two nodes, the random variable on which there is a finding and a node indicating the value to which it is set, only the first of these is included in our situation-specific Bayesian network, and it is declared as evidence that its value is equal to the observed value indicated in the finding MFrag. Evidence nodes are shown with bold borders.

SSBN for the Star Trek MTheory with Four Starships within Range

The next step is to retrieve and instantiate the home MFrags of the finding and target random variables. When each MFrag is instantiated, instances of its random variables are created to represent known background information, observed evidence, and queries of interest to the decision maker. If there are any random variables with undefined distributions, the algorithm proceeds by instantiating their respective home MFrags. The process of retrieving and instantiating MFrags continues until there are no remaining random variables having either undefined distributions or unknown values. The result, if this process terminates, is the *SSBN* or, in this example, a *situation-specific decision graph* (SSDG). In some cases the SSBN can be infinite, but under conditions given in Laskey (2005), the algorithm produces a sequence of approximate SSBNs for which the posterior distribution of the target nodes converges to their posterior distribution given the findings. Mahoney and Laskey (1998) define an SSBN as a minimal Bayesian network sufficient to compute the response to a query. An SSBN may contain any number of instances of each MFrag, depending on the number of entities and their interrelationships. The SSDG in Figure 15 is the result of applying this process to the MEDG Theory obtained by aggregating the MFrags of Figure 12 and Figure 14, with the Finding and Target sets defined above.
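Once the SSDG has been built, the recommended action is the one maximizing expected utility under the posterior over danger. The following sketch uses hypothetical utilities and posterior probabilities, chosen purely for illustration:

```python
def best_action(actions, danger_posterior, utility):
    """Sketch of solving the decision at the root of the SSDG: pick the
    DefenseAction with the highest expected utility under the posterior
    over DangerLevel computed from the SSBN."""
    def eu(a):
        return sum(p * utility[(a, d)] for d, p in danger_posterior.items())
    return max(actions, key=eu)

# Hypothetical posterior over DangerToSelf and utility table.
danger = {"High": 0.6, "Medium": 0.25, "Low": 0.15}
utility = {
    ("Fire", "High"): 40, ("Fire", "Medium"): 10, ("Fire", "Low"): -30,
    ("Retreat", "High"): 30, ("Retreat", "Medium"): 20, ("Retreat", "Low"): 0,
    ("Nothing", "High"): -100, ("Nothing", "Medium"): -10, ("Nothing", "Low"): 20,
}
choice = best_action(["Fire", "Retreat", "Nothing"], danger, utility)
# → "Retreat" (expected utility 23.0, versus 22.0 for "Fire")
```

In a full MEDG solver the utility node is deterministic and sits at a leaf of the graph, exactly as the influence-diagram consistency rules described above require.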
Another important use for the SSBN algorithm is to help in the task of performing Bayesian learning, which is treated in MEBN logic as a sequence of MTheories.

1.16 Learning from Data

Learning graphical models from observations is usually divided into two different categories: inferring the parameters of the local distributions when the structure is known, and inferring the structure itself. In MEBN, by structure we mean the possible values of the random variables, their organization into MFrags, the fragment graphs, and the functional forms of the local distributions. Figure 16 shows an example of parameter learning in MEBN logic in which we adopt the assumption that one can infer the length of a starship on the basis of the average length of all starships. This generic domain knowledge is captured by the generative MFrag, which specifies a prior distribution based on what we know about starship lengths.

Parameter Learning in MEBN

One strong point of using Bayesian models in general, and MEBN logic in particular, is the ability to refine prior knowledge as new information becomes available. In our example, let’s suppose that the Enterprise system receives precise information on the length of starships !*ST*2, !*ST*3, and !*ST*5, but has no information regarding the incoming starship !*ST*8. The first step of this simple parameter learning example is to enter the available information into the model in the form of findings (see box StarshipLengthInd Findings). Then, a query is posed on the length of !*ST*8. The SSBN algorithm will instantiate all the random variables that are related to the query at hand until it finishes with the SSBN depicted in Figure 16 (box SSBN with Findings). In this example, the MFrags satisfy graph-theoretic conditions under which a restructuring operation called *finding absorption* (Buntine, 1994b) can be applied.
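In the simplest conjugate setting, assuming for illustration a normal prior over *GlobalAvgLength* and normally distributed length observations with known variance (an assumption made here for the sketch, not stated in the model), the posterior that replaces the prior can be computed in closed form. All numbers are hypothetical:

```python
def update_global_avg(prior_mean, prior_var, observations, obs_var):
    """Sketch of refining GlobalAvgLength from observed starship lengths,
    using the standard normal-normal conjugate update applied one
    observation at a time."""
    post_mean, post_var = prior_mean, prior_var
    for x in observations:
        k = post_var / (post_var + obs_var)       # gain toward the new datum
        post_mean = post_mean + k * (x - post_mean)
        post_var = (1 - k) * post_var             # uncertainty shrinks
    return post_mean, post_var

# Lengths reported for !ST2, !ST3, and !ST5 (illustrative values, in meters).
mean, var = update_global_avg(300.0, 100.0**2, [642.0, 310.0, 465.0], 50.0**2)
```

The resulting posterior is exactly what finding absorption substitutes for the prior: a sharper distribution over the global average that then informs the prediction for !*ST*8 and all future queries.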
Therefore, the prior distribution of the random variable *GlobalAvgLength* can be replaced in the SSBN by the posterior distribution obtained after adding evidence in the form of findings^{22}. As a result of this learning process, the probability distribution for *GlobalAvgLength* has been refined in light of the new information conveyed by the findings. The resulting, more precise distribution can now be used not only to predict the length of !*ST*8 but for future queries as well. In this specific example, the same query would retrieve the SSBN in the lower right corner of Figure 16 (box SSBN with Findings Absorbed). One of the major advantages of the finding absorption operation is that it greatly improves the tractability of both learning and SSBN inference. Finding absorption can also be applied to modify the generative MFrags themselves, thus creating a new generative MTheory that has the same conditional distribution given its findings as the original MTheory. In this new MTheory, the distribution of *GlobalAvgLength* has been modified to incorporate the observations, and the finding random variables are set with probability 1 to their observed values. Restructuring MTheories via finding absorption can increase the efficiency of SSBN construction and inference. Structure learning in MEBN works in a similar fashion. As an example, let’s suppose that when analyzing the data acquired in the parameter learning process above, a domain expert raises the hypothesis that the length of a given starship might depend on its class. To put this into a “real-life” perspective, let’s consider two classes: Explorers and Warbirds. The first are usually vessels crafted for long-distance journeys with a relatively small crew and payload. Warbirds, on the other hand, are heavily armed vessels designed to be flagships of a combatant fleet, usually carrying large stores of ammunition, equipped with many advanced technology systems, and staffed by a large crew.
Therefore, our expert thinks it likely that the average length of Warbirds is greater than the average length of Explorers. In short, the general idea of this simple example is to mimic the more general situation in which we have a potential link between two attributes (i.e., starship length and class) but at best weak evidence to support the hypothesized correlation. This is a typical situation in which Bayesian models can use incoming data to learn both the structure and the parameters of a domain model. Generally speaking, the solution for this class of situations is to build two different structures and apply Bayesian inference to evaluate which structure is more consistent with the data as it becomes available. The initial setup of the structure learning process for this specific problem is depicted in Figure 17. Each of the two possible structures is represented by its own generative MFrag. The first MFrag is the same as before: the length of a starship depends only on a global average length that applies to starships of all classes. The upper left MFrag of Figure 17, the StarshipLengthInd MFrag, conveys this hypothesis. The second possible structure, represented by the ClassAvgLength and StarshipLengthDep MFrags, covers the case in which a starship’s class influences its length. The two structures are then connected by the Starship Length MFrag, which has the format of a *multiplexor* MFrag. The distribution of a multiplexor node such as *StarshipLength*(*st*) always has one parent *selector* node defining which of the other parents is influencing the distribution in a given situation.

Structure Learning in MEBN

In this example, where there are only two possible structures, the selector parent is a two-state node. Here, the selector parent is the Boolean *LengthDependsOnClass*(!*Starship*). When this node has value *False*, *StarshipLength*(*st*) will be equal to *StarshipLengthInd*(*st*), the distribution of which does not depend on the starship’s class.
Conversely, if the selector parent has value *True*, then *StarshipLength*(*st*) will be equal to *StarshipLengthDep*(*st*), which is directly influenced by *ClassAvgLength*(*StarshipClass*(*st*)). Figure 18 shows the result of applying the SSBN algorithm to the generative MFrags in Figure 17. The SSBN on the left does not include the findings, only information about the existence of four starships. Note that the prior chosen for the selector parent (the Boolean node at the top of the SSBN) was the uniform distribution, which means that both structures (i.e. class affecting length or not) have the same prior probability. The SSBN on the right incorporates the known facts that !*ST*2 and !*ST*3 belong to the starship class !*Explorer*, and that !*ST*5 and !*ST*8 are Warbird vessels. Further, the lengths of the three ships for which there are reliable reports were also considered. The result of the inference process was not only an estimate of the length of !*ST*8 but also a clear confirmation that the available data strongly supports the hypothesis that the class of a starship influences its length. It may seem cumbersome to define different random variables, *StarshipLengthInd* and *StarshipLengthDep*, for each hypothesis about the influences on a starship’s length. As the number of structural hypotheses becomes large, this can become quite unwieldy. Fortunately, this difficulty can be circumvented by introducing a typed version of MEBN and allowing the distributions of random variables to depend on the type of their argument. A detailed presentation of typed MEBN, which also extends the standard specification to allow polymorphism, is the subject of the next Chapter. SSBNs for the Parameter Learning Example This basic construction is compatible with the standard approaches to Bayesian structure learning in graphical models (e.g.
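The multiplexor pattern described above can be sketched as a deterministic pass-through. This is a minimal illustration, not PR-OWL or MEBN syntax: the selector parent picks which of the other parents drives *StarshipLength*(*st*), and the numeric lengths are invented.

```python
# Multiplexor node sketch: the selector parent LengthDependsOnClass chooses
# which parent value StarshipLength(st) copies in a given situation.
def starship_length(length_depends_on_class, length_ind, length_dep):
    """Pass through the parent selected by the Boolean selector node."""
    return length_dep if length_depends_on_class else length_ind

# Selector False: the class-independent hypothesis drives the length.
len_ind = starship_length(False, length_ind=300.0, length_dep=450.0)  # 300.0
# Selector True: the class-dependent hypothesis drives the length.
len_dep = starship_length(True, length_ind=300.0, length_dep=450.0)   # 450.0
```

Because the selector is itself a random variable with a prior (uniform in Figure 18), Bayesian inference over this construction effectively scores the two competing structures against the data.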
Cooper & Herskovits, 1992; Heckerman *et al.*, 1995a; Jordan, 1999; Friedman & Koller, 2000). For a detailed account of the SSBN construction algorithm and of Bayesian learning with MEBN logic, the interested reader should refer to Laskey (2005). There, it is possible to find the mathematical explanation and the respective logical proofs for the many intricate possibilities that arise when instantiating MFrags, such as nodes with an infinite number of states, situations involving large finite or countably infinite recursions, what happens when the algorithm is started with an inconsistent MTheory, etc. The text also provides a detailed account of how to represent any first-order logic sentence as an MFrag using Skolem variables and quantifiers. These issues go beyond the scope of this work, since the material covered up to this point is enough for the purposes of understanding and using Multi-Entity Bayesian Networks as the framework for extending a web language to Bayesian first-order logic expressivity. Yet, before entering the next Chapter, it is necessary to make a brief visit to the semantics of MEBN logic, to understand why it is a Bayesian first-order logic, and to address its relationship with classical logic and other formalisms. 1.17 MEBN Semantics In classical logic, the most that can be said about a hypothesis that can be neither proven nor disproven is that its truth-value is unknown. Practical reasoning demands more. Captain Picard’s life depends on assessing the plausibility of many hypotheses he can neither prove nor disprove. Yet, he also needs first-order logic’s ability to express generalizations about properties of and relationships among entities. In short, he needs a probabilistic logic with first-order expressive power. Although there have been many attempts to integrate classical first-order logic with probability (see the discussion in Section 2.5), MEBN is the first fully first-order Bayesian logic (Laskey, 2005).
MEBN logic can assign probabilities in a logically coherent manner to any set of sentences in first-order logic, and can assign a conditional probability distribution given any consistent set of finitely many first-order sentences. That is, anything that can be expressed in first-order logic can be assigned a probability by MEBN logic. The probability distribution represented by an MTheory can be updated via Bayesian conditioning to incorporate any finite sequence of findings that are consistent with the MTheory and can be expressed as sentences in first-order logic. If findings contradict the logical content of the MTheory, this can be discovered in finitely many steps. Although exact inference may not be possible for some queries, SSBN construction will converge to the correct result if one exists. Semantics in classical logic is typically defined in terms of possible worlds. Each possible world assigns values to random variables^{23} in a manner consistent with the theory’s axioms. For example, in the scenario illustrated in Figure 11, every possible world must assign value *True* to *CloakMode*(!*ST*1) and !*Z*0 to *StarshipZone*(!*ST*0) (the latter is not explicitly represented in the figure). The value of the random variable *ZoneNature*(!*Z*0) must be one of *DeepSpace*, *PlanetarySystems*, or *BlackHoleBoundary*, but subject to that constraint, it may have different values in different possible worlds. In classical logic, inferences are valid if the conclusion is true in all possible worlds in which the premises are true. For example, classical logic allows us to infer that *Prev*(*Prev*(!*ST*4)) has value !*ST*2 from the information that *Prev*(!*ST*4) has value !*ST*3 and *Prev*(!*ST*3) has value !*ST*2, because the first statement is true in all possible worlds in which the latter two statements are true.
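The *Prev* inference above is simply function composition, which can be made concrete in a few lines (the dictionary below encodes only the two given premises):

```python
# Premises: Prev(ST4) = ST3 and Prev(ST3) = ST2. In every possible world
# consistent with these, Prev(Prev(ST4)) must therefore equal ST2.
prev = {"ST4": "ST3", "ST3": "ST2"}
result = prev[prev["ST4"]]  # "ST2"
```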
But in the scenario above, classical logic permits us to draw no conclusions about the value of *ZoneNature*(!*Z*0) except that it is one of the three values *DeepSpace*, *PlanetarySystems*, or *BlackHoleBoundary*. An MTheory assigns probabilities to sets of worlds. This is done in a way that ensures that the set of worlds consistent with the logical content of the MTheory has probability 100%. Each random variable instance maps a possible world to the value of the random variable in that world. In statistics, random variables are defined as functions mapping a sample space to an outcome set. For MEBN random variable instances, the sample space is the set of possible worlds. For example, *ZoneNature*(!*Z*0) maps a possible world to the nature of the zone labeled !*Z*0 in that world. The probability that !*Z*0 is a deep space zone is the total probability of the set of possible worlds for which *ZoneNature*(!*Z*0) has value *DeepSpace*. In any given possible world, the generic random variable class *ZoneNature*(*z*) maps its argument to the nature of the zone whose identifier was substituted for the argument *z*. Thus, the sample space for the random variable class *ZoneNature*(*z*) is the set of unique identifiers that can be substituted for the argument *z*. Information about statistical regularities among zones is represented by the local distributions of the MFrags whose arguments are zones. As stated in section 3.7, MFrags for parameter and structure learning provide a means for using observed information about zones to make better predictions about zones that have not yet been seen. As more information is obtained about which possible world might be the actual world, the probabilities of all related properties of the world must be adjusted in a logically coherent manner. This is accomplished by adding findings to an MTheory to represent the new information, and then using Bayesian conditioning to update the probability distribution represented by the revised MTheory.
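The possible-worlds view of random variables can be rendered as a toy computation. In the sketch below the world probabilities are invented; the point is only that an outcome's probability is the total probability of the worlds that map to it.

```python
# Each world assigns a value to ZoneNature(Z0) and carries a probability.
worlds = [
    {"ZoneNature(Z0)": "DeepSpace",         "p": 0.5},
    {"ZoneNature(Z0)": "PlanetarySystems",  "p": 0.3},
    {"ZoneNature(Z0)": "BlackHoleBoundary", "p": 0.2},
]

def prob(rv, value):
    """Total probability of the worlds in which rv takes the given value."""
    return sum(w["p"] for w in worlds if w[rv] == value)

p_deep = prob("ZoneNature(Z0)", "DeepSpace")  # 0.5 under this toy prior
```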
For example, suppose the system receives confirmed information that at least one enemy ship is navigating in !*Z*0. This information means that worlds in which *ZoneEShips*(!*Z*0) has value *Zero* are no longer possible. In classical logic, this new information makes no difference to the inferences one can draw about *ZoneNature*(!*Z*0). All three values were possible before the new information (i.e. that there is at least one enemy starship in !*Z*0 for sure) arrived, and all three values remain possible. The situation is different in a probabilistic logic. To revise the current probabilities, it is necessary to first assign probability zero to the set of worlds in which !*Z*0 contains no enemy ships. Then, the probabilities of the remaining worlds should be divided by the prior probability that *ZoneEShips*(!*Z*0) had a value other than *Zero*. This ensures that the set of worlds consistent with the new knowledge has probability 100%. These operations can be accomplished in a computationally efficient manner using SSBN construction. Just as in classical logic, all three values of *ZoneNature*(!*Z*0) remain possible. However, their probabilities differ from their previous values. Because deep space zones are more likely than other zones to contain no ships, more of the probability in the discarded worlds was assigned to worlds in which !*Z*0 was a deep space zone than to worlds in which !*Z*0 was not in deep space. The worlds that remain possible tend to put more probability on planetary systems and black hole boundaries than on deep space. The result is a substantial reduction in the probability that !*Z*0 is in deep space. Achieving full first-order expressive power in a Bayesian logic is non-trivial. It requires the ability to represent an unbounded or possibly infinite number of random variables, some of which may have an unbounded or possibly infinite number of possible values.
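The conditioning step just described can be sketched directly. The joint prior below is invented; it merely encodes the qualitative fact from the text that deep space zones are more likely than other zones to contain no ships.

```python
# Joint prior over (ZoneNature(Z0), ZoneEShips(Z0)) -- illustrative numbers.
prior = {
    ("DeepSpace",         "Zero"): 0.40,
    ("DeepSpace",         "Some"): 0.10,
    ("PlanetarySystems",  "Zero"): 0.10,
    ("PlanetarySystems",  "Some"): 0.20,
    ("BlackHoleBoundary", "Zero"): 0.05,
    ("BlackHoleBoundary", "Some"): 0.15,
}

def condition_on_enemy_ships(dist):
    """Assign probability zero to Zero-ship worlds, then divide the
    survivors by the prior probability of the evidence."""
    evidence = sum(p for (_, e), p in dist.items() if e != "Zero")
    return {w: p / evidence for w, p in dist.items() if w[1] != "Zero"}

posterior = condition_on_enemy_ships(prior)
p_deep = sum(p for (n, _), p in posterior.items() if n == "DeepSpace")
# Under this toy prior, DeepSpace falls from a marginal of 0.50 to about 0.22:
# the discarded worlds carried most of the deep-space probability mass.
```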
We also need to be able to represent recursive definitions and random variables that may have an unbounded or possibly infinite number of parents. Random variables taking values in uncountable sets such as the real numbers present additional difficulties. Details on how MEBN handles these subtle issues are provided by Laskey (2005). To our knowledge, the formulation of MEBN logic provided in Laskey (2005) is the first probabilistic logic to possess all of the following properties: (1) the ability to express a globally consistent joint distribution over models of any consistent, finitely axiomatizable FOL theory; (2) a proof theory capable of identifying inconsistent theories in finitely many steps and converging to correct responses to probabilistic queries; and (3) built-in mechanisms for refining theories in the light of observations in a mathematically sound, logically coherent manner. As such, MEBN should be seen not as a competitor, but as a logical foundation for the many emerging languages that extend the expressive power of standard Bayesian networks and/or extend a subset of first-order logic to incorporate probability. MEBN logic brings together two different areas of research: probabilistic reasoning and classical logic. The ability to perform plausible reasoning with the expressiveness of first-order logic opens the possibility of addressing problems of greater complexity than heretofore possible in a wide variety of application domains. XML-based languages such as RDF and OWL are currently being developed using subsets of FOL. MEBN logic can provide a logical foundation for extensions that support plausible reasoning. This work is geared towards that end, and the language proposed here, PR-OWL, is a MEBN-based extension to the SW language OWL. The main objective of such an extension is to create a language capable of representing and reasoning with probabilistic ontologies.
This technology would facilitate the development of “probability-friendly” applications for the Semantic Web. The ability to handle uncertainty is clearly needed, because the SW is an open environment where uncertainty is the rule. Probabilistic ontologies are also a very promising technique for addressing the semantic mapping problem, a difficult task whose applications range from automatic Semantic Web agents, which must be able to deal with multiple, diverse ontologies, to automated decision systems, which usually have to interact and reason with many legacy systems, each having its own distinct rules, assumptions, and terminologies. MEBN is still in its infancy as a logic, but it has already shown the potential to provide the necessary mathematical foundation for plausible reasoning in an open world characterized by many interacting entities related to each other in diverse ways and having many uncertain features and relationships. In order to realize that potential, the first step is to extend the logic so it can handle complex features that are required in expressive languages such as OWL. This is the core objective of the next Chapter. The Path to Probabilistic Ontologies Representing and reasoning under uncertainty is a necessary step for realizing the W3C’s vision for the Semantic Web. The title of this Dissertation leaves no question about our understanding that such a step has to be taken via Bayesian probability theory, which not only allows for a principled representation of uncertainty but also provides both a proof theory for combining prior knowledge with observations and a learning theory for refining the ontology as evidence accrues. A key concept for achieving that goal is that of probabilistic ontologies, so we begin by defining what we mean when using this term.
Intuitively, an ontology that has probabilities attached to some of its elements would qualify for this label, but such a distinction would add little to the objective of providing a probabilistic framework for the Semantic Web. In other words, merely adding probabilities to concepts does not guarantee interoperability with other ontologies that also carry probabilities. Clearly, more is needed to justify a new category of ontologies, and such extra justification does not come from the syntax used for including probabilities.