NeOn: Lifecycle Support for Networked Ontologies

Desired WP2 functionalities and their minimal annotations with ODO elements

4.x) Expected *early* functionalities to be specified and/or implemented in T2.2, T2.3, T2.4, and T2.5, with examples from use case requirements

4.1 Evaluation

As any other resource that is used in software applications, ontologies' content should be evaluated before (re)using it in other ontologies or applications. ''Evaluation'' of the content is critical before integrating ontologies in final applications.

''Ontology evaluation'' [8-A] is defined as a technical judgment of the content of the ontology with respect to a frame of reference during each phase and between phases of its lifecycle. Ontology evaluation includes ontology verification and ontology validation.

Ontology evaluation should be carried out on the following elements:

Each individual definition and axiom.

Collections of definitions and axioms that are stated explicitly in the ontology.

Definitions that are imported from other ontologies.

Definitions that can be inferred from other definitions and axioms.

''Ontology Verification'' [8-A] refers to building the ontology correctly, that is, ensuring that its definitions implement correctly the ontology requirements and competency questions, or function correctly in the real world.

''Ontology Validation'' [8-A] refers to whether the meaning of the ontology definitions really models the real world for which the ontology was created. The goal is to prove that the world model (if it exists and is known) is compliant with the world modeled formally.

Finally, ''ontology assessment'' [8-A] is focused on judging the understanding, usability, usefulness, abstraction, quality and portability of the definitions from the user's point of view. Different kinds of users and different kinds of applications require different means of assessing an ontology.

Evaluation is a functionality which allows NeOn users to perform a diagnostic task over the elements, processes and attributes of any given ontology. In NeOn, this functionality is supported by providing the users with a model for ontology evaluation and validation called oQual. oQual defines a set of quality-oriented descriptions (qoods) and parameters that range over the attributes obtained from three different sets of measures: structural, functional, and usability-profiling measures. Structural measures are typical of ontologies represented as graphs. Functional measures are related to the intended use of an ontology and of its components. Usability-profiling measures, finally, depend on the level of annotation of the considered ontology. oQual makes it possible to devise the set of criteria for choosing an ontology over others in the context of a given project.

Evaluation is the basis for the selection task. A detailed state-of-the-art review and an extended presentation of measure types are given in [GAN05b]. An introduction to the oQual model and a discussion of its relation with selection are given in [GAN06a] and [GAN06b].

Available Models and Tools:

Evaluation by structure measuring

The structural dimension of ontologies focuses on syntax and formal semantics, i.e. on ontologies represented as graphs. In this form, the topological, logical and meta-logical properties of an ontology can be measured by means of a context-free metric.

Cohesion Metrics

Ontology cohesion refers to the degree of the relatedness of OWL classes, which are semantically/conceptually related by the properties. An ontology has a high cohesion value if its entities are strongly related. The idea behind this proposal is that the concepts grouped in an ontology should be conceptually related for a particular domain or a sub-domain in order to achieve common goals. A number of cohesion metrics are defined in [YAO05].
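As a minimal sketch of structure measuring, the following Python code treats a class hierarchy as a graph and computes three simple counts: root classes, leaf classes, and the average depth of leaf classes. The toy hierarchy, the function name `hierarchy_metrics` and the choice of exactly these three counts are assumptions made for illustration; they are not definitions taken from [YAO05].

```python
from collections import defaultdict


def hierarchy_metrics(subclass_of):
    """Compute simple structural counts over a class hierarchy.

    subclass_of: iterable of (child, parent) pairs, assumed to form
    a tree or forest (no multiple inheritance, no cycles).
    """
    children = defaultdict(set)
    parents = defaultdict(set)
    classes = set()
    for child, parent in subclass_of:
        children[parent].add(child)
        parents[child].add(parent)
        classes.update((child, parent))

    roots = {c for c in classes if not parents[c]}
    leaves = {c for c in classes if not children[c]}

    # Depth of a class = number of subclass edges from its root (roots are 0).
    depth = {r: 0 for r in roots}
    stack = list(roots)
    while stack:
        node = stack.pop()
        for child in children[node]:
            depth[child] = depth[node] + 1
            stack.append(child)

    avg_leaf_depth = sum(depth[leaf] for leaf in leaves) / len(leaves)
    return {
        "roots": len(roots),
        "leaves": len(leaves),
        "avg_leaf_depth": avg_leaf_depth,
    }


# Hypothetical toy hierarchy with two unrelated root classes.
toy = [("Fish", "Animal"), ("Bird", "Animal"), ("Tuna", "Fish"), ("Boat", "Vehicle")]
metrics = hierarchy_metrics(toy)
print(metrics)
```

In this toy case the two root classes indicate two unrelated groups of concepts, which is the kind of signal a cohesion measure is meant to expose.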

Evaluation by function measuring

Most of the literature on ontology evaluation focuses on functionality-related issues, i.e. issues that are related to the intended use of an ontology and of its components in given contexts. The following are some of the proposed methods for evaluating the functionality of an ontology:


OntoMetric [LOZ04] is a mathematical method for scaling priorities in hierarchical structures with the goal of helping users in the choice of the appropriate ontology for a new project. The functions supported by OntoMetric are the ordering by importance of project objectives, the qualitative analysis of candidate ontologies, and the quantitative measure of the suitability of each candidate. The main drawback of OntoMetric is related to its usability: specifying the characteristics of an ontology is complicated and time-consuming, and assessing its characteristics is quite subjective. On top of this, the number of use cases is limited, which is an important obstacle to defining (inter-subjective or objective) parameters based on a large enough number of comparable cases.
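The idea of weighting project objectives and deriving a suitability figure per candidate can be illustrated with a plain weighted sum. This is only a sketch under stated assumptions: the criteria names, the 1 to 5 scales and the function `rank_candidates` are hypothetical, and the actual OntoMetric procedure in [LOZ04] is more elaborate than this.

```python
def rank_candidates(weights, scores):
    """Rank candidate ontologies by a weighted sum of per-criterion scores.

    weights: dict criterion -> importance (any positive scale).
    scores:  dict candidate -> dict criterion -> score.
    Returns a list of (candidate, suitability), best first.
    """
    total = sum(weights.values())
    normalised = {criterion: w / total for criterion, w in weights.items()}
    suitability = {
        name: sum(normalised[c] * per_criterion[c] for c in normalised)
        for name, per_criterion in scores.items()
    }
    return sorted(suitability.items(), key=lambda item: item[1], reverse=True)


# Hypothetical project objectives and candidate assessments.
weights = {"content": 5, "language": 2, "tools": 3}
scores = {
    "ontology_a": {"content": 4, "language": 3, "tools": 1},
    "ontology_b": {"content": 2, "language": 5, "tools": 5},
}
ranking = rank_candidates(weights, scores)
print(ranking)
```

The subjectivity mentioned above shows up directly in the inputs: both the weights and the per-criterion scores have to be supplied by a person.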


OntoClean [WEL01] aims to detect both formal and semantic inconsistencies in the properties defined by an ontology during the pre-modelling and modelling stages, i.e. during ontology development. The main function of OntoClean is the formal evaluation of the properties defined in the ontology by means of a predefined ideal taxonomical structure of metaproperties.


ODEClean [FER02] is a plug-in which has been created to support the OntoClean method in WebODE. The criteria used to evaluate the ontology are expressed declaratively in the WebODE conceptualisation module. It is based on Guarino and colleagues’ top-level ontology of universals, enriched with metaproperties (rigidity, identity, unity, dependency) and with the evaluation rules proposed by OntoClean. The main functions provided by ODEClean are: to establish the evaluation mode, to assign meta-properties to concepts, to focus on rigid properties, and to evaluate according to the taxonomic constraints.


The main goal of EvaLexon is to evaluate at development time ontologies that are manually created from text [SPY05]. In sharp contrast with OntoClean, EvaLexon is meant for linguistic rather than conceptual evaluation. Its main function is the measurement of how appropriate the terms (to be) used in an ontology are. A term is judged more or less appropriate depending on its frequency both in [...] Regression allows for direct and indirect measurement of the ontology’s recall, precision, coverage and accuracy.


Methontology [FER04] is a framework meant to be used by domain experts and ontology makers who are not familiar with implementation environments, with the goal of letting them build ontologies from scratch. To this end, a number of functions are provided that enable easier intermediate representations of ontologies. Such representations are meant to bridge the gap between how people think about a domain and the languages usually used to define ontologies at the formal level. In other words, Methontology makes it possible to work on ontologies at the knowledge level only, and it does so by supporting functions like the specification of the ontology development process as well as of its life-cycle (based on evolving prototypes); the specification of ontologies at the knowledge level; and the multilingual translation that automatically transforms the specification into several target codes.

NLP-driven techniques for content evaluation

When an ontology is lexicalized (i.e., it defines, at least to some extent, what instances of classes and relations are called in natural language) and there exists a substantial amount of textual documents that contain information about the content of the ontology, both automatic and (semi-)automatised NLP-driven techniques can be applied to estimate empirically, either directly or indirectly, the accuracy and the coverage of the ontology. Cf. e.g. [BER99], [BRE04], [CIA05], and [DAE04].

Task-based evaluation

The goal of this approach is to evaluate ontologies with respect to three basic levels: vocabulary, taxonomy and (non-taxonomic) semantic relations [POR04b]. The functions proposed by task-based evaluation rest on two key arguments: the task and the gold standard. The task needs to be sufficiently complex to constitute a suitable benchmark for examining a given ontology. The gold standard is a perfectly annotated corpus of part-of-speech tags, word senses, tag ontological relations, given sets of answers (so-called keys) used to evaluate the performance of algorithms that are run on the ontology to perform the task.

When changes are made to an ontology and different versions are produced, e.g. as a result of collaborative editing and annotation, tools for evaluation and visualisation are of paramount importance in order to be able to assess the different versions and to visualise changes. Evaluation metrics (and tools to compute these metrics) enable the user to evaluate generated and populated ontologies and compare changes over time. However, since evaluation cannot always be carried out automatically, we may also require visualisation tools to enable the expert user to view changes between versions of generated ontologies and to perform his own evaluation on them. Most evaluation is carried out in terms of Precision and Recall, since these are standard metrics that have traditionally been used for evaluating information extraction and related tasks. However, a better means of evaluating the ontology population task is to take into account distance in the ontology between the two versions, rather than having a binary metric which simply considers two versions as identical or not.

In this context, GATE provides some useful tools for automatic evaluation: in particular, the AnnotationDiff tool and the Benchmarking Tool. These are particularly useful not just as a final measure of performance, but as a tool to aid system development by tracking progress and evaluating the impact of changes as they are made. The evaluation tool (AnnotationDiff) enables automated performance measurement and visualisation of the results, while the benchmarking tool enables the tracking of a system’s progress and regression testing.

The AnnotationDiff tool enables two sets of annotations on a document to be compared, in order either to compare a system-annotated text with a reference (hand-annotated) text, or to compare the output of two different versions of the system (or two different systems). For each annotation type, figures are generated for precision, recall, F-measure and false positives. Display of the results is similar to other visual diff tools such as tkdiff, where the results from each version are displayed on the same line and colour coded.
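The per-type figures mentioned above can be sketched in a few lines of Python. This is not GATE code and does not use any GATE API: the tuple representation `(type, start, end)`, the function name `score_annotations` and the restriction to exact span matches are assumptions made for illustration, and the matching options of the real tool are not modelled.

```python
def score_annotations(key, response):
    """Compare two annotation sets per annotation type.

    key:      set of (type, start, end) tuples from the reference annotation.
    response: set of (type, start, end) tuples produced by the system.
    Only exact matches on type and span count as correct.
    """
    results = {}
    for ann_type in sorted({a[0] for a in key | response}):
        k = {a for a in key if a[0] == ann_type}
        r = {a for a in response if a[0] == ann_type}
        correct = len(k & r)
        precision = correct / len(r) if r else 0.0
        recall = correct / len(k) if k else 0.0
        if precision + recall:
            f_measure = 2 * precision * recall / (precision + recall)
        else:
            f_measure = 0.0
        results[ann_type] = {
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
            "false_positives": len(r - k),
        }
    return results


# Hypothetical hand-annotated key and system response for one document.
key = {("Person", 0, 5), ("Person", 10, 15), ("Location", 20, 26)}
response = {
    ("Person", 0, 5),
    ("Person", 30, 34),
    ("Location", 20, 26),
    ("Location", 40, 45),
}
results = score_annotations(key, response)
for ann_type, figures in results.items():
    print(ann_type, figures)
```

Running the same comparison between the outputs of two system versions, rather than between a system and a reference, gives the regression-style view described earlier.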

The Corpus Benchmark tool differs from the AnnotationDiff in that it enables evaluation to be carried out over a whole corpus rather than a single document. It also enables tracking of the system’s performance over time. Performance statistics will be output for each text in the set, and overall statistics for the entire set. In the default mode, information is also provided about whether the figures have increased or decreased in comparison with the annotated set. The processed set can be updated at any time by rerunning the tool in generation mode with the latest version of the system resources. The output of the tool is written to an HTML file in tabular form, for easy viewing of the results.

The CorpusAnnotationComparison tool does the same thing as the Corpus Benchmark tool, but presents the results in a different way. Given two corpora with some terms annotated, it computes the correct, missing and spurious terms, and presents the output graphically using the Cluster Map format. The advantage is that this allows the user to directly access the three main term sets, to visualise recall and precision and to visually compare the performance of different methods.

The OntologyBuilderDisplay tool enables the visualisation of the ontology once it has been generated from the corpus. This visualisation supports the evaluation of the ontology by a domain expert by enabling him to see how certain concepts were derived, in that it allows access to the documents where the concept appears. The expert can then analyse how certain concepts interrelate at the document level, which can lead to the derivation of further conceptualizations. In this situation, the evaluation is done by an expert who relies on his knowledge to decide if a concept is relevant for the domain. Therefore he does not perform a comparison with a Gold Standard ontology. The tool was built to help the expert understand the extracted ontology and to decide which concepts to keep and which ones to delete.

The OntologyBuilderSourceDisplay tool shows the overlap of concepts extracted from different sources, allowing the user to filter out those appearing in most sources. This is useful in cases where ontologies have been derived from a combination of different document sources. Naturally, concepts that are present in all (or most) sources should be the most significant for ontology building, because an ontology represents a "shared conceptualization".

The ontology comparison tool developed by researchers at the University of Karlsruhe tests how well an ontology has been generated with respect to a gold standard ontology, or simply compares two ontologies. It compares the ontologies at the structural level, i.e. in terms of the concepts and their positioning in the hierarchy, and does not take into account the instances with which they are populated (if such exist). The measures used are defined in [MAE01].

The Learning Accuracy (LA) tool is a Java implementation developed by researchers at the University of Karlsruhe, which calculates Learning Accuracy [HAH98] for one or more populated ontologies, compared with a gold standard. The Learning Accuracy metric is described in detail in [CIM05]. Essentially, this measure provides a score somewhere between 0 and 1 for any concepts identified in an incorrect position in the ontology, giving an indication of how serious the error (or difference) is, and weighting it accordingly.

The BDM tool was developed by researchers at the University of Sheffield in order to overcome some of the problems faced by traditional metrics for evaluation when dealing with ontologies. The tool applies a Balanced Distance Metric (BDM) [MA06] to compute semantic similarity between two semantic annotations of the same token in a document. It differs from Learning Accuracy in that it considers other factors such as the average chain length and the branching factor of the nodes in question, and is bi-directional where the LA metric is uni-directional.
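The distance-aware metrics above replace a binary right/wrong decision with a graded score. The sketch below illustrates that idea with a simple path-based score over a toy taxonomy: it rewards a deep common ancestor and penalises the number of steps separating the two concepts from it. The formula, the function names and the taxonomy are assumptions for illustration only; this is neither the Learning Accuracy definition of [HAH98] and [CIM05] nor the BDM of [MA06], and it ignores chain length and branching factor.

```python
def path_to_root(concept, parent):
    """Return [concept, its parent, ..., root] for a tree given as child -> parent."""
    path = [concept]
    while concept in parent:
        concept = parent[concept]
        path.append(concept)
    return path


def taxonomic_score(key, response, parent):
    """Score in (0, 1] for placing `response` where the gold standard has `key`.

    Uses (d + 1) / (d + k + r + 1), where d is the depth of the most specific
    common ancestor and k, r are the steps from key and response up to it.
    Returns 0.0 if the two concepts share no ancestor.
    """
    key_path = path_to_root(key, parent)
    response_path = path_to_root(response, parent)
    on_response_path = set(response_path)
    for steps_from_key, node in enumerate(key_path):
        if node in on_response_path:
            common = node
            break
    else:
        return 0.0
    steps_from_response = response_path.index(common)
    depth_common = len(path_to_root(common, parent)) - 1
    return (depth_common + 1) / (
        depth_common + steps_from_key + steps_from_response + 1
    )


# Hypothetical taxonomy, given as child -> parent.
parent = {
    "Animal": "Thing",
    "Vehicle": "Thing",
    "Fish": "Animal",
    "Bird": "Animal",
    "Tuna": "Fish",
}
for response in ("Tuna", "Fish", "Bird", "Vehicle"):
    print(response, taxonomic_score("Tuna", response, parent))
```

With this toy taxonomy, classifying a "Tuna" instance as "Fish" is penalised less than classifying it as "Bird", and much less than classifying it as "Vehicle", whereas a binary metric would treat all three errors alike.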

Evaluation by usability-profiling measuring

In [NOY04] it is argued that, although most structural and functional evaluation methods are necessary, none of them is really helpful to ontology users, who need to discover which ontologies exist and, more importantly, which one best suits their current task. Usability-profiling measures focus on the ontology profile, which typically addresses the communication context of an ontology. An ontology profile is a set of ontology annotations: the metadata about an ontology and its elements, containing information about structural, functional, and user-oriented properties of an ontology. Presence, amount, completeness, and reliability are the usability measures ranging on annotations.

Evaluations provided by the users - The idea of Open Rating Systems

Motivation for Open Rating Systems

In certain environments, like the WWW, due to the ever increasing amount of content, it is not feasible to have editors or dedicated authors review all content. The only possible way to economically address this problem is to allow potentially everyone to write reviews. Since this can cause reviews of bad quality, the reviews themselves have to be rated by the community. The strength of this review lies in the concept of meta-rating [NoGM05] and introducing trust relationships between users to enable the use of a web of trust [Guha03].

How do Open Rating Systems work

By stating trust relationships to other users, either implicitly by rating their reviews or explicitly by stating to trust or distrust them, users connect themselves to a Web of Trust between all users [GKRT04]. Based on metrics applied to those Webs of Trust, user reviews can be ordered according to a user's personal preferences and in a way that provides maximal helpfulness. Based on the best user reviews, content in the Open Rating System can be ordered (see for a site using this system).
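A rough sketch of how a Web of Trust can personalise the ordering of reviews is given below. It is an illustration under strong assumptions: trust values in [-1, 1], a single hop of propagation with a fixed `decay`, and the function names are all hypothetical, and the metrics discussed in [GKRT04] propagate trust through the whole network rather than one hop.

```python
def personalised_trust(viewer, trust, decay=0.5):
    """Estimate how much `viewer` trusts other users.

    trust: dict user -> dict other_user -> value in [-1, 1]
           (positive = trust, negative = distrust).
    Direct statements are kept as they are; users known only through a
    trusted user get a decayed value. Distrusted users pass on nothing.
    """
    direct = trust.get(viewer, {})
    estimate = dict(direct)
    for friend, value in direct.items():
        if value <= 0:
            continue
        for other, friend_value in trust.get(friend, {}).items():
            if other == viewer or other in direct:
                continue  # direct statements always win
            propagated = decay * value * friend_value
            estimate[other] = max(estimate.get(other, float("-inf")), propagated)
    return estimate


def order_reviews(viewer, reviews, trust):
    """Order (author, review_id) pairs by the viewer's trust in the author.

    Authors unknown to the viewer get a neutral 0.0.
    """
    estimate = personalised_trust(viewer, trust)
    ranked = sorted(reviews, key=lambda r: estimate.get(r[0], 0.0), reverse=True)
    return [review_id for _, review_id in ranked]


# Hypothetical trust statements and reviews.
trust = {
    "alice": {"bob": 1.0, "carol": -1.0},
    "bob": {"dave": 0.8, "carol": 1.0},
}
reviews = [("carol", "r1"), ("dave", "r2"), ("bob", "r3"), ("erin", "r4")]
ordering = order_reviews("alice", reviews, trust)
print(ordering)
```

Because the estimate is computed per viewer, two users with different trust statements would see the same reviews in a different order.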

Motivation for using Open Rating Systems in the context of ontologies

While ontologies have proven to be a useful tool for different application contexts, their creation is still expensive and reuse is rare. If users knew about the quality of an ontology, deciding which ontology to reuse would be easier, which would in turn lead to better ontology reuse. For that, the existing Open Rating System model has to be adapted to serve the specific needs of the ontology engineering community. First efforts were presented at the EON2006 workshop by Lewen [LSNM06]. That work presents an extended model that allows the expression of topic-specific trust and shows algorithms to compute an overall ranking for an ontology based on user-specific needs. However, there is still work to be done and a complete evaluation is still outstanding.

Existing Tools and Platforms:

Stanford Medical Informatics have launched an Ontology Repository at which allows users to enter ontologies and review them. It also features some rating system functionality that will be extended in the near future.

Onthology and Onthology Ratings ( developed at AIFB Karlsruhe also provide similar functionality. UPM offers a P2P-client called Oyster ( that uses the same metadata standard (OMV:
