Conference Trip Report
2004 Wilshire Meta-Data Conference
& DAMA International Symposium
Century Plaza Hotel, Los Angeles, California May 2-6, 2004
Compiled and edited by Tony Shaw, Program Chair, Wilshire Conferences, Inc.
This report is provided as a service of Wilshire Conferences, Inc.
The 2004 Wilshire Meta-Data Conference and DAMA International Symposium was held before an audience of approximately 900 attendees and speakers. This report contains a chronological summary of the key discussions and conclusions from almost all of the workshops, tutorials, conference sessions, and special interest groups.
A CD-ROM including all of the conference presentations is available for purchase. To order the CD-ROM, or to receive more information about this conference, and related future events, go to http://www.wilshireconferences.com
This document is © Copyright 2004 Wilshire Conferences, Inc. It may be copied, linked to, quoted and/or redistributed without fee or royalty only if all copies and excerpts include attribution to Wilshire Conferences and the appropriate speaker(s). Any questions regarding reproduction or distribution may be addressed to firstname.lastname@example.org.
Sunday, May 2
3:30 pm – 6:45 pm
Data Modeling Basics
Everything you need to know to get started
Marcie Barkin Goodwin
President & CEO
Axis Software Designs, Inc.
This workshop for novices covered the basics of data modeling, including an explanation of:
IDEF & IE Methods
Entities, Attributes & Relationships
Keys, Cardinality, Recursion & Generalization Hierarchies
Standards & Procedures
Abstraction: the Data Modeler’s Crystal Ball
Global Reference Data Expert
Steve explained abstraction as redefining entities, relationships, and data elements in more generic terms. He walked through an example illustrating how to abstract, and stressed the importance of applying the Abstraction Safety Guide. This guide contains 3 questions we must ask ourselves before abstracting: "Do we have something in common in our model?", "Is there value in abstracting?", and "Is it worth the effort?" Steve then put abstraction in context with normalization and denormalization. He explained subtyping and meta data entities, and shared with the group a sad story and a happy story where abstraction was used in a production data warehouse. The attendees then broke into smaller groups and did a short workshop on abstracting contact information. He ended the session by discussing reusable abstraction components and explaining universal models and their uses.
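As an illustration of that contact-information exercise (the class names and fields below are invented, not Steve's actual solution), abstraction turns one attribute per contact mechanism into a generic contact-method entity, so new mechanisms become data rather than schema changes:

```python
from dataclasses import dataclass

# Before abstraction: one attribute per contact mechanism.
@dataclass
class PartyBefore:
    name: str
    home_phone: str
    fax: str
    email: str

# After abstraction: the mechanism's type is itself data, so adding
# a mobile number or pager requires no change to the model.
@dataclass
class ContactMethod:
    method_type: str   # 'home phone', 'fax', 'email', ...
    value: str

@dataclass
class Party:
    name: str
    contact_methods: list[ContactMethod]

p = Party("Acme Corp", [
    ContactMethod("email", "info@example.com"),
    ContactMethod("fax", "+1-555-0100"),
])
```

The trade-off the Abstraction Safety Guide probes is visible here: the abstract form is more flexible, but the meaning of each value now lives in data rather than in the schema.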
Enterprise Architecture Principles and Values - Straight from the Source!
In the Industrial Age, it was the products (airplanes, buildings, automobiles, computers, etc.) that were increasing in complexity and changing. We had to learn how to describe them (architecture) in order to create them and maintain them (that is, change them) over time. In the Information Age, it is the Enterprise that is increasing in complexity and changing. By the same token, we will have to learn how to describe the Enterprise (architecture) in order to create it, accommodate dramatic increases in complexity, and keep it relevant as it changes over time. The Framework for Enterprise Architecture defines the set of descriptive representations that constitutes Enterprise Architecture and establishes a semantic structure for understanding the physics of Enterprise engineering.
For those who understood the value proposition, Enterprise Architecture has always been important. Yet it is only relatively recently that the concepts and benefits of Enterprise Architecture have started to be embraced by a significant number of organizations at a senior level.
John explained how and why Enterprise Architecture provides value, and the four reasons why you "do" Architecture: alignment, integration, change management, and reduced time to market. Without Architecture, there is no way you can do any of these things.
Process Orientation for Data Management Professionals:
Proven Techniques for Achieving Support and Relevance
Clariteq Systems Consulting Ltd.
Business Process has returned as a hot topic:
economic and regulatory reality (e.g., Sarbanes-Oxley) demands it
systems are far more successful when implementation is “process-aware”
Adopting process orientation is a great way for Data Management professionals to add value while promoting the importance of their discipline. Some specifics covered:
A common mistake is to identify a functional area’s contribution as a complete process, leading to the common problem of “local optimization yielding global suboptimization.”
Workflow modeling is a physical modeling technique, which causes difficulty for many analysts who are used to essential (or logical) modeling
Showing the often-ugly reality of as-is processes raises awareness of the true cost of inconsistent, disintegrated data.
Processes only perform well when all of the six enablers work well – workflow design, IT, motivation and measurement, policies and rules, HR, and facilities.
A Data Stewardship "How-To" Workshop:
De facto, Discipline & Database
President & Principal/Publisher
KIK Consulting & Educ Services/TDAN.com
New sets of rules, regulations and legislation are beginning to require formalized accountability for data assets and how they are managed across the enterprise. The business and technical value of data stewardship is viewed as insurance, driven by financial reporting, data integration requirements, accounting needs and the auditability of data and data-focused processes.
Students of the workshop learned about the proven components of effective Data Stewardship programs and how to immediately customize a strategy that will deliver results with limited financial investment. Mr. Seiner focused on best practices and “how-to” develop a pragmatic and practical solution to implement a data stewardship program. Attendees of this class learned:
- How to formalize accountability for the management of enterprise data
- How to recognize the need to implement a data stewardship program
- How to frame and sell data stewardship in the corporate context
- How to insert data stewardship into data management policy
- How to define objectives, goals, measures/metrics for stewardship programs
- How to align steward roles with data through the “de facto” approach
- How to identify, implement, enforce the “discipline” through process integration
- How to build & implement a stewardship repository or “database”
Fundamentals of Data Quality Analysis
Loma Linda University
You can’t assess the quality of data without understanding its meaning. And the meaning of data is derived, in part, by its position in a data architecture (both physical and logical). Therefore, Mr. Scofield began with a review of the nature of logical business data architecture, and how it affects the meaning of data. While data architecture is abstract, a data model is a way of describing it, and of achieving consensus on current and future reality for an enterprise.
Next, we dealt with doing a data inventory. Gathering knowledge about the data asset is both technical and political. One often encounters a lead time or critical path in getting permission from some parts of large organizations to get read access to the data. Then, building up the meta-data, one surveys databases (even topically-organized clumps of data not necessarily in a relational DBMS). Inventory the databases, the tables in each (with their meaning and behavior), and the fields in each table (with meaning, behavior, and quality).
The center part of the workshop focused upon data profiling or understanding the behavior and quality of the data asset. While there are some commercial tools available to support this effort, the crucial element is a cynical data analyst who understands the business. Central to data profiling is the domain study which surveys data behavior in an individual column, showing the anomalous values observed. Once anomalies are discovered, their causes must be determined, often through dialogue with application experts or business experts. In such a dialogue, having a formatted dump of the records in question is useful. The answers should be clearly documented. Graphic techniques were also shown for finding anomalies in the data and understanding them.
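A domain study of the kind described can be sketched in a few lines of Python; the column values and expected domain below are hypothetical:

```python
from collections import Counter

def domain_study(values, expected=None):
    """Frequency distribution of a single column, flagging values outside
    the expected domain as anomalies to take to the business experts."""
    freq = Counter(values)
    anomalies = {v: n for v, n in freq.items()
                 if expected is not None and v not in expected}
    return freq, anomalies

# Hypothetical column: a state-code field with only three valid codes.
states = ["CA", "CA", "NY", "N/A", "", "TX", "CA", "XX"]
freq, anomalies = domain_study(states, expected={"CA", "NY", "TX"})
# 'N/A', '' and 'XX' surface as anomalous values whose causes
# must then be explained, not just deleted.
```

Commercial profiling tools automate this at scale, but as the workshop stressed, interpreting the anomalies still requires an analyst who understands the business.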
Data quality has many facets, including presence, validity, reasonableness, consistency, accuracy, and precision. The method of delivery (ease of use, timeliness, etc.) should not be confused with data quality (even though it often is). So the utility of data (suitability for a particular purpose and availability) is independent of the intrinsic quality of the data.
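As a rough illustration (the record layout and rules below are invented), several of these facets can be expressed as simple per-record checks:

```python
import re

def check_record(rec):
    """Per-facet checks on one hypothetical customer record."""
    issues = []
    # Presence: is the field populated at all?
    if not rec.get("name"):
        issues.append("presence: name missing")
    # Validity: does the value conform to its declared format?
    if not re.fullmatch(r"\d{5}", rec.get("zip", "")):
        issues.append("validity: zip not 5 digits")
    # Reasonableness: is the value plausible, even if well-formed?
    if not (0 < rec.get("age", -1) < 120):
        issues.append("reasonableness: age out of range")
    return issues

rec = {"name": "Pat", "zip": "9021", "age": 130}
check_record(rec)  # -> two issues: validity (zip) and reasonableness (age)
```

Note that nothing here measures accuracy, which requires comparison against the real world rather than rules alone.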
Finally, establishing a data quality program cannot be done by one person. It requires involving the “owners” of all the processes around an enterprise. Data quality measurement can be done by an expert, although it is valuable to provide process owners the means to measure their own data quality on a recurring basis. Only they can improve their processes and correct the systemic factors (often in manual procedures and screen design, but also in business processes and incentives for the data gatherer) that cause the bad data in the first place.
New Trends in Metadata:
Metadata is not Just Repositories Anymore
My Krisalis, Inc.
This workshop explained many of the new concepts that surround meta-data movement and integration. Those who attended now have an understanding and appreciation for the following:
Meta-Data is NOT synonymous with "repository"
The industry, including repository, ETL, BI and DW vendors, is beginning to offer meta-data exchange capabilities
Standards, such as CWM, UML and IDEF, have led to a proliferation of integration, or "glue", providers that can be leveraged for metadata movement
New meta-data movement issues to contend with include:
- Meta-data warehousing
- Enterprise meta-data application integration
- Meta-data ETL
- Business meta-data intelligence
Acquiring, Storing and USING Your Meta Data
Area Leader, Allstate Data Management
Allstate Insurance Company
Effective meta data management is much more than just a matter of identification and collection. The whole point is that people in the organization...users, analysts and developers...actually USE the meta data you're making available to them. This workshop provided a case study of how they make meta data work at Allstate, including:
Overview of the meta data environment at Allstate
You must put meta data to work!
Capturing meta data
Storage of meta data
Concept of domains
Meta data for the warehouse environment
Meta data to support run-time applications
Return on investment - thoughts/examples for calculating ROI
Unstructured Data Management
Unstructured data management covers text, sound, images, HTML files, and other formats. Managing such data is based on applying category tags that can be sorted, selected and otherwise managed like structured data. Categories are called "taxonomies"; creating and maintaining a relevant taxonomy often takes more work than expected. Tags can be applied manually, but that's very expensive. Automated systems fall into two broad categories, statistical and semantic. Statistical systems analyze samples of records in each category to identify the words and patterns that distinguish them. They are relatively easy to deploy and computationally efficient, but not necessarily precise. Semantic systems analyze the actual language of a document using preexisting knowledge such as grammatical rules, dictionaries and "ontologies" (descriptions of relationships among concepts). They are better at extracting specific information from documents, as opposed to simply identifying the category a document belongs to. As with any system, defining specific requirements is essential to selecting the right tool for a particular situation.
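As a toy illustration of the statistical approach (not any particular product; the categories and sample documents are invented), a categorizer scores a new document against word frequencies learned from already-tagged samples:

```python
from collections import Counter

def train(samples):
    """samples: {category: [document, ...]} -> per-category word frequencies."""
    return {cat: Counter(w for doc in docs for w in doc.lower().split())
            for cat, docs in samples.items()}

def classify(model, doc):
    """Pick the category whose training samples best cover the document's words."""
    words = doc.lower().split()
    return max(model, key=lambda cat: sum(model[cat][w] for w in words))

model = train({
    "finance": ["quarterly revenue and profit report", "audit of accounts"],
    "hr":      ["employee benefits and vacation policy", "hiring plan"],
})
classify(model, "revenue audit report")   # -> 'finance'
```

The sketch shows why such systems are easy to deploy but imprecise: they match surface word patterns, whereas a semantic system would also apply grammar, dictionaries and ontologies to extract specific facts from the text.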
Models, Models Everywhere, Nor any Time to Think
A Fundamental Framework for Evaluating Data Management Technology and Practice
Analyst, Editor & Publisher
The speaker maintains that the majority of data management practitioners operate in “cookbook”, product-specific mode, without really knowing and understanding the fundamental concepts and methods underlying their practice. For example: what data means, what is a data model, data independence, etc. This workshop provided a fundamentally correct way to evaluate data management technologies, products and practices. It helped practitioners understand data fundamentals that are either ignored or distorted in the industry, how to apply these fundamentals in day to day business, and how to use them in the evaluation of the technologies being promoted.
Sunday, May 2
7:00 pm – 8:00 pm
Nationwide Mutual Insurance
Ron Borland's presentation on Information Assurance addressed the need for enterprises to develop a coordinated approach to improving information quality. Ron's premise is that only by actively involving all the stakeholders directly in the data quality process can we have a reasonable expectation that they will agree on the success criteria by which the effort is measured. The three key takeaways from Ron's presentation are:
Get the right people involved early and keep them actively and enthusiastically involved
Develop a culture that supports a senior escalation point for data quality issues from which there is no appeal
Ensure that every point where the process touches reality adds value and eliminates bottlenecks
Data Management Practice Maturity Survey
Institute for Data Research
Paladin Integration Engineering
Over the past two years, the Institute for Data Research has surveyed more than 40 organizations of differing sizes, from both government and industry. The results of this survey are permitting the development of a model that can help organizations assess their organizational data management practices. Good data management practices can help organizations save the 20-40% of their technology budget that is spent on non-programmatic data integration and manipulation (Zachman). This talk described the Data Management Practice Maturity Survey and presented the results to date.
Introduction to ORM
Professor and VP (Conceptual Modeling)
Effective business rules management, data warehousing, enterprise modeling, and re-engineering all depend on the quality of the underlying data model. To properly exploit database technology, a clear understanding is needed as to how to create conceptual business models, transform them to logical database models for implementation on the chosen platform, and query the populated models. Object-Role Modeling (ORM) provides a truly conceptual way to accomplish these tasks, facilitating communication between the modeler and the domain expert.
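Although ORM has its own graphical notation, its fact-based style can be hinted at in code; the fact type, population and verbalization below are invented examples, not ORM syntax:

```python
# An elementary fact type "Employee works for Department", held as a
# predicate with typed roles and populated with sample facts that the
# modeler and domain expert can validate together.
fact_type = ("Employee", "works for", "Department")

facts = [
    ("Alice", "Sales"),
    ("Bob", "Engineering"),
]

def verbalize(fact_type, fact):
    """Render one fact as a natural-language sentence for expert review."""
    subject_type, predicate, object_type = fact_type
    return f"{subject_type} '{fact[0]}' {predicate} {object_type} '{fact[1]}'."

for f in facts:
    print(verbalize(fact_type, f))
# Employee 'Alice' works for Department 'Sales'.
# Employee 'Bob' works for Department 'Engineering'.
```

Verbalizing sample populations in this way is what lets a domain expert confirm the model without reading diagrams, which is the communication benefit the session emphasized.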
Lies, Damn Lies, and Enterprise Integration: A Fair and Balanced Perspective
Evan Levy gave attendees a lively look at data integration alternatives. Variously examining Enterprise Resource Planning (ERP), Enterprise Application Integration (EAI), Enterprise Information Integration (EII), and data warehousing, Evan analyzed the comparative applications of each of the four environments, deconstructing their functional differences. He provided a perspective on the vendors who play in each of the four arenas, and discussed the relative strengths and weaknesses of the various integration approaches, focusing, often with his tongue firmly planted in his cheek, on the specific business drivers best-suited to each. Evan relayed some real-life client experiences with different integration technologies, and concluded his talk with predictions for the future of all four integration options.
Metadata and Grid Computing
Knowledge Consultants, Inc.
Lake, Vesta and Bright
Grid computing is getting a lot of press today as the ‘next big thing’ that will impact the enterprise. It is true that grid computing exists today in several variations, such as the SETI project and platform load-sharing in an enterprise that offloads processing across machines and takes advantage of unused capacity to run applications. However, making the concept useful, practical and a ‘value add’ for running applications on an external grid requires significant convergence of technologies and disciplines. To make the grid idea useful within an enterprise we need an extended grid concept. Moving the extended grid concept to the business landscape, and more specifically to the enterprise environment, requires four important capabilities to make the grid viable: clean and standardized metadata and metadata services, a bill of materials for an application, a routing list for execution, and a costing mechanism for billing.
Semantics in Business Systems
Dave McComb introduced semantics as the underlying discipline behind much of what we do as data modelers. After outlining where Business Vocabulary, Taxonomy and Ontology fit in, he introduced several new concepts, including: Fork Genera (why it is that most of the knowledge we have is at the middle level of a taxonomy), and how the Semantic Web is structured. He then described a framework for reasoning about the semantic content of our systems, which included factoring meaning into semantic primes, contexts and categories.
Monday, May 3
8:30 am – 4:45 pm
Business-Focussed Data Modeling
Unifying Conceptual, Logical, Physical Models
This tutorial started from the position that while there is general understanding about what a physical data model is, there are many different interpretations of the terms “conceptual data model” and “logical data model”. The underlying premise of the tutorial was that the process of developing a database design that accurately reflects business information requirements must start with the production of a model of business concepts that not only uses business terminology but includes structures that reflect reality rather than the dictates of normalisation or the need to fit a pre-SQL3 relational straitjacket. Graham adopted the term “conceptual data model” for such a model, which may include any of the following:
supertypes and subtypes
business attribute types, which describe an attribute’s behaviour, rather than DBMS datatypes
category attributes rather than classification entities
complex attributes, with internal structure
derived attributes & relationships
attributes of relationships
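Several of these component types map naturally onto ordinary programming constructs; the following minimal sketch uses invented names (it is not from the tutorial) to show a supertype/subtype pair, a complex attribute with internal structure, and a derived attribute:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Address:            # complex attribute: has internal structure of its own
    street: str
    city: str
    postcode: str

@dataclass
class Party:              # supertype
    name: str
    address: Address

@dataclass
class Person(Party):      # subtype, inheriting the supertype's attributes
    birth_date: date

    def age_on(self, as_of: date) -> int:   # derived attribute: computed, not stored
        had_birthday = (as_of.month, as_of.day) >= (self.birth_date.month, self.birth_date.day)
        return as_of.year - self.birth_date.year - (0 if had_birthday else 1)

p = Person("Ann", Address("1 High St", "Leeds", "LS1"), date(1970, 6, 15))
```

A conceptual model records these distinctions regardless of whether the eventual DBMS or CASE tool can represent them directly, which is exactly the conversion problem the tutorial went on to address.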
The tutorial also covered practical techniques for modeling with these component types when using a CASE tool that may not support one or more of them, as well as ways of converting them to a pre-SQL3 relational model (“logical” or “first-cut physical” depending on your preference).
A conceptual data model cannot be developed without a thorough understanding of the business information requirements, so practical tips for requirements discovery were also provided, in particular the development of an Object Class Hierarchy. Rather than being a traditional E-R or Object Class Model, an OCH consists of Business Terms (which may be Entity, Relationship or Attribute Classes or even Instances), each with Definitions and organized hierarchically with inheritance of properties. It provides a high level of business buy-in, and a genuinely reusable resource, since application projects retain confidence in the validity of the content. Some practical tips for business rule discovery and documentation were also imparted, as well as a dissertation on the importance of unambiguous names and definitions.
Review of the conceptual data model is essential so Graham then described how to generate a set of assertions (or verbalizations) about the model which can be reviewed individually, each reviewer disagreeing, agreeing or seeking clarification.
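The assertion-generation idea can be sketched as follows; the relationship metadata and verb phrases below are hypothetical, not Graham's actual templates:

```python
# Each relationship carries an active and a passive verb phrase plus the
# cardinality text for each direction, so generated assertions read naturally.
relationships = [
    {"from": "Customer", "verb": "places", "passive": "is placed by",
     "to": "Order", "from_card": "one or more", "to_card": "exactly one"},
]

def assertions(rel):
    """Yield one reviewable natural-language assertion per direction."""
    yield f"Each {rel['from']} {rel['verb']} {rel['from_card']} {rel['to']}(s)."
    yield f"Each {rel['to']} {rel['passive']} {rel['to_card']} {rel['from']}."

for rel in relationships:
    for a in assertions(rel):
        print(a)
# Each Customer places one or more Order(s).
# Each Order is placed by exactly one Customer.
```

Reviewers then agree, disagree, or seek clarification on each sentence individually, which is far easier for business staff than reviewing the diagram itself.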
Developing Better Requirements and Models Using Business Rules
Ronald G. Ross
Business Rule Solutions, LLC
Gladys S.W. Lam
Business Rule Solutions, LLC
This tutorial explained how the business rule approach can improve your entire requirements process. Starting with the business model, the speakers identified relevant deliverables and showed where business rules fit in with them. Specifically, they showed how business rules address the issues of motivation and guidance – in other words, the question of “why.” They detailed how you can use business rules to develop business tactics in a deliverable called a Policy Charter.
Continuing, they focussed on the system model and how business rules fit with each deliverable. Special emphasis was given to how business-perspective rules differ from system-perspective rules, and what you need to do to translate between the two. Practical refinements to system model deliverables were examined, not only to exploit business rule ideas, but also to maintain a consistent focus on validation and communication from the business perspective.
Finally, Mr. Ross and Ms. Lam offered guidelines for how business rules should be expressed, and how they can be managed more effectively.
Using Information Management to Sustain Data Warehouse
Information management is vital to sustaining a data warehouse. This means the DA areas are now accountable for more than the models. John Ladley reviewed how DA areas are on the front line of developing business cases, enabling new business processes, and assisting in culture change. Several key techniques were presented that leverage the strengths of DA areas (abstraction, modeling, governance) to help organizations better align and sustain data warehouses.
How to Develop an Enterprise Data Strategy
Sid Adelman Associates