University of Virginia
Department of Information Technology and Communication (ITC)
Applications and Data Services Division (ADS), Data Services
June 1999
Contents
- Foreword and Scope
- Knowledge Management Framework for Data Administration
- Data Administration Service Model
- Glossary
- Appendix A: Knowledge Domains for the University of Virginia
- Works Cited
Acknowledgement
I would like to express my gratitude to the following individuals:
- Deborah Mills
- Manager, Enabling Technologies, University of Virginia
- Terrell Jones
- Database Administrator, Data Services, University of Virginia
- Debbie Luzynski
- Database Administrator, Data Services, University of Virginia
- Gary Policastro
- Manager, Domain Development, Fairfax County Public Schools, Virginia
- Bethann Canada
- Director, Management Information Systems, Virginia Department of Education
- Ted Davis
- Director, Knowledge Asset Management, Fairfax County Public Schools, Virginia
These individuals have contributed to the development of this handbook by sharing their experiences, insight, and/or feedback.
Ed Tyler, Manager, Data Services
University of Virginia
June 1999
Foreword and Scope
The University of Virginia has embraced information as an institutional resource that is critical to its success. To protect the value of and to ensure accessibility to this resource, the University adopted the "Administrative Data Access Policy" in 1994. The cornerstone of the policy was the philosophy that information must be "available to all employees who have a legitimate need."
To nurture this philosophy and to leverage the University's investment in its information resources, Data Services has the responsibility to develop and implement effective and cost-efficient data administration practices. Data Services also delivers services and products that assist the University community in locating, accessing, and employing institutional data. Finally, Data Services serves as the leading advocate for guiding the data access philosophy into one that fosters knowledge management.
Therefore, the purpose of this document is to serve as a handbook for Data Services. The handbook describes the objectives and standards adopted for data management. The handbook is divided into three volumes as listed below. This volume describes the philosophy underlying the objectives and standards adopted. As such, it establishes the context for the remaining volumes.
- Volume 1: Definition of the Data Administration Service Model
- Volume 2: Standards for Data Administration
- Volume 3: Standards for Database Administration
Knowledge Management Framework for Data Administration
Every enterprise acquires and uses a variety of resources in order to achieve its goals. Knowledge is one such resource. Like any resource, knowledge has a life cycle spanning from creation to consumption and ending with obsolescence.
For purposes of data administration, knowledge is acquired when an individual applies his or her experiences, skills, and insights to information. To create knowledge, an individual integrates three components--data, context, and experience--as illustrated in Figure 1 below. To tap into the knowledge held by its employees, stakeholders and customers, an enterprise must foster an environment conducive to information sharing. Within such an environment, an enterprise has the means to leverage knowledge as an institutional resource.
Figure 1: Overview--Transformation of Data into an Institutional Knowledge Resource
The challenge lies in defining an effective and cost-efficient strategy that enables such an environment. Current research and development on such solutions has focused on knowledge management. Knowledge management is the "process by which individual learning and experience can be accessed, reflected upon, shared and utilized in order to foster enhanced individual knowledge and, thus, organizational value" (Coleman & Furey). The policies and strategies adopted to facilitate this process provide the framework for managing the components of knowledge.
Data administration facilitates the management of the data component of knowledge. Specifically, data administration is the organizational function that develops long-term conceptual plans for institutional data. Activities include strategic information resource planning, data standardization, data synchronization, and database development and maintenance. The primary products and services include the enterprise data architecture and the metadata--a catalog or index of the enterprise's knowledge resources. By defining, organizing, and protecting institutional data, the requirements of "controlling the acquisition, analysis, storage, retrieval, and distribution of data" are addressed (Newton & Wahl, 2-4, 2-6, B-3).
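As a rough illustration of what one entry in such a metadata catalog might hold, the sketch below models a data element with a standard name, definition, format, domain, steward, and storage location. The class names, field names, and sample values are hypothetical; the handbook does not prescribe a catalog schema, and the tools listed later in this volume have their own repository structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """One catalog entry for an institutional data element (hypothetical schema)."""
    name: str              # standard element name
    definition: str        # agreed business definition
    data_format: str       # storage format, e.g. "CHAR(9)"
    domain: str            # knowledge domain the element belongs to
    steward: str           # role accountable for the element
    databases: tuple = ()  # databases in which the element is stored


class MetadataCatalog:
    """In-memory index of data elements, keyed by standard name."""

    def __init__(self):
        self._entries = {}

    def register(self, element):
        # Reject a second definition under the same name so that conflicting
        # definitions must be resolved rather than silently overwritten.
        if element.name in self._entries:
            raise ValueError(f"duplicate data element: {element.name}")
        self._entries[element.name] = element

    def lookup(self, name):
        return self._entries[name]

    def names_in_domain(self, domain):
        return sorted(e.name for e in self._entries.values() if e.domain == domain)


# Sample entry; every value here is invented for illustration.
catalog = MetadataCatalog()
catalog.register(DataElement(
    name="STUDENT_ID",
    definition="Identifier assigned to a student at first enrollment.",
    data_format="CHAR(9)",
    domain="Student",
    steward="Registrar",
    databases=("registration_db",),
))
```

Keying the catalog by standard name is one simple way to surface definition conflicts early; a production repository would also need versioning and access control, which this sketch omits.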
Data administration contains two related processes--database administration and tools administration. Database administration controls the "content, design, and use of one or more databases to avoid uncontrolled redundancies and to enhance development" of information technology solutions (Newton & Wahl, B-4). While the data administrator assumes the role of an architect, the database administrator builds the knowledge repository from the blueprints drafted. Thus, unlike data administration, database administration has a technical orientation that focuses on the physical databases deployed and on the tools used to manage, access, and manipulate data (Newton & Wahl, 2-4, 2-6).
Tools administration identifies, deploys, and controls the utilities that facilitate data management. Tools include software that assists in information modeling, database management, metadata management, data propagation, data access and control, and performance monitoring and tuning. The tools used must be consistent with the knowledge management strategies adopted.
Figure 2 depicts the umbrella-like effect of knowledge management on data management practices. The depiction is based on the concept described by Judith Newton and Daniel Wahl in their NIST publication, Manual for Data Administration (2-2).
Figure 2: Knowledge Management Influence
Data Administration Service Model
The primary goal of data management is to facilitate the systematic process by which knowledge is acquired and applied as an institutional resource. The remainder of this volume describes the University's data administration service model.
Service Provider
Data Services within the Applications and Data Services Division of Information Technology and Communication (ITC) delivers data management services and products to the University community.
Charter
The responsibilities for Data Services were defined in paragraph 4.10 of the Administrative Data Access Policy adopted by the University:
"Data Management develops and applies standards for the management of institutional data and for ensuring that data are accessible to those who need it. The manager of the Data Management group chairs the Data Steward and Data Custodian committees and works very closely with these committees on formulation of data policies, standards, and procedures.
Data Management works with the Data Stewards to establish long-term direction for effectively using information resources to support University goals and objectives.
Data Management creates logical data models of applications. These models are ultimately used to create an institution-wide data model that cross-references data across applications and encourage data sharing.
Data Management develops a standard method for naming and defining data. It also facilitates conflict resolution in data definitions.
Data Management makes institutional data available to authorized users in a manner consistent with established data access rules and decisions. It develops views of data as directed by the data committees. The group ensures that the technical integrity of the data is maintained and, in conjunction with the ITC Security Team, that data security requirements are met."
Principles
The following principles guide the assessment and selection of the practices and standards adopted for data administration.
- Knowledge is critical to the success of the University.
- The needs of individuals who require knowledge are infinitely diverse.
- Knowledge must be transformed into a strategic, shareable institutional resource.
- The knowledge management practices adopted must be effective and cost efficient.
Objectives
Data Services must achieve and sustain the following objectives.
- Deploy strategies that transform knowledge into a strategic, shareable institutional resource.
- Use a systematic approach to model information and to design databases.
- Create an architecture that consolidates conceptual and physical data models to the information needs and functions of the University (Newton & Wahl, 4-1).
- Promote active data stewardship with defined accountability.
- Promote data consistency and standardization throughout the University by developing standards for names, definitions, values, formats, and metadata (Newton & Wahl, 4-1).
- Minimize duplication in collecting, processing, storing, and distributing information (Newton & Wahl, 4-1).
- Improve the quality, accuracy, integrity, and relevancy of institutional information (Newton & Wahl, 4-1).
- Improve knowledge management and access through the use of appropriate resources, methods, tools, and technologies (Newton & Wahl, 4-1).
- Adopt effective and cost-efficient data management tools and practices.
- Promote and/or deliver products and services that empower knowledge workers to access and apply institutional information.
- Educate the University.
- Increase awareness of the value of institutional information.
- Advocate the strategic value of knowledge management.
- Encourage and facilitate knowledge sharing within the University and among its partners (Newton & Wahl, 4-1).
- Measure the quality, effectiveness, and cost efficiency of the products and services delivered.
- Require each member of Data Services to maintain currency of technical knowledge; sustain proficiency of skills; acquire familiarity with and an understanding of the University's functions; and maintain an awareness of industry trends.
- Provide opportunities to gain first-hand knowledge of the University's functions.
- Provide technical staff development opportunities.
- Accommodate self-motivated educational pursuits.
- Establish collaborative partnerships with external entities.
Use-Case
The use-case for data administration is summarized in Table 1. The use-case provides an overview of the work involved by describing the primary processes; the products or services delivered; the customers or beneficiaries; and the partners involved. The data administration use-case has three high-level processes.
Table 1: Data Administration Use-Case
Process
Foster Data Administration Service Model
Process Steps
- Identify stakeholders
- Assess environment, such as policies, practices, and technology, for managing institutional data
- Define and/or adjust vision, objectives, direction, architecture, and other products and services
- Obtain commitment and buy-in
- Communicate and advocate data management
- Monitor and evaluate service model
Products and/or Services Delivered
Data management:
- Direction
- Architecture
- Standards
- Life cycle
- Policies
- Methods
Process Customers
Enterprise
Note: The enterprise is defined as the set of organizational entities, each of which is responsible for one or more institutional functions or services.
The students (prospective, current, and graduated) and their families, patients, researchers, and the community (local and global) are not included within the enterprise as defined. Rather, they are viewed as customers of the University.
Process Partners
- Applications and Data Services, ITC
- Communications and Systems, ITC
- Computing Support Services, ITC
- Operations, ITC
- Stewards
Note: The business partners, foundations, and state and federal agencies are not process partners but are factors in the environment that impact knowledge management.
The Vice President and Chief Information Officer and the senior managers of ITC are not process partners but are an approval authority.
Process
Define Domain-Specific Data Management Strategy
Process Steps
- Identify stakeholders
- Assess data management problem, needs, and environment
- Develop business case (does not include problem solution definition)
- Develop strategy for data management
- Identify stewards, agents, and system sponsors
- Educate stewards, agents, and system sponsors
- Define concept of operations (function and technology)
- Define implementation approach alternatives
- Evaluate and select implementation approach
- Attain commitment and buy-in (includes funding and other resources)
- Monitor and evaluate domain architecture
Products and/or Services Delivered
Domain architecture
Notes: The products and services are the result of applying the data administration service model to a domain.
Process Customers
Domain
Notes: A domain is defined as a set of knowledge and processes that share common properties within an enterprise.
A brief description of the domains identified for the University is provided in Appendix A.
Process Partners
- Applications and Data Services, ITC
- Budget and Administration, ITC
- Communications and Systems, ITC
- Computing Support Services, ITC
- Operations, ITC
- Stewards (including domain sponsor and interfacing domain stewards)
Process
Implement and Maintain Data Management Solutions
Process Steps
- Develop detailed implementation plan, including tasks, resources, and schedules
- Execute plan
- Develop business process
- Define data
- Develop and/or acquire tools
- Establish databases
- Develop and/or deliver training
- Test solutions
- Install capability
- Evaluate progress and adjust plans
- Monitor, evaluate, and maintain solutions
Products and/or Services Delivered
Data management capabilities, such as processes, tools, skills, databases, mechanisms, and metadata
Process Customers
Individuals who use or manage processes, tools, and/or data of the domain
Process Partners
- Applications and Data Services, ITC
- Communications and Systems, ITC
- Computing Support Services, ITC
- Operations, ITC
- Stewards (including domain sponsor and interfacing domain stewards)
- Vendors
- Support teams such as LSP (Local Support Partner)
- Other entities such as state agencies (e.g. Virginia Retirement System) and foundations
Skill and Knowledge Set
To support the data administration processes described, the following set of skills and knowledge is required. The skills and experiences described are conceptual and do not imply a one-to-one correspondence with current University positions. Rather, the descriptions are provided for planning purposes.
- Enterprise Domain Architect Skill/Knowledge Set
- Characterized as having an "Enterprise" or institutional orientation rather than a technology focus
- Ensures that the institution has the means to leverage its knowledge resources
- Possesses in-depth knowledge of enterprise information resource management (knowledge management)
- Develops and sustains the enterprise domain architecture (plan)
- Establishes plans, strategies, and standards for data/domain management, database management, and data management tool administration
- Introduces new industry practices for knowledge management
- Has familiarity with knowledge management technologies but does not possess in-depth technical knowledge
- Communicates knowledge management plans and strategies
- Obtains commitment and buy-in
- Fosters knowledge management program and model
- Obtains resources
- Prepares budget and project plans
- Supervises technical staff
- Domain Architect Skill/Knowledge Set
- Characterized as having a "Domain" or business area orientation
- Ensures that the data management solutions deployed satisfy domain requirements and are consistent with the domain architecture
- Possesses in-depth knowledge of knowledge requirements within specific domains
- Provides project management services for deploying data management solutions
- Develops conceptual models and data security requirements
- Captures metadata
- Has familiarity with database technologies
- Possesses extensive experience and skills in problem definition and solution assessment
- Excels in oral and written communication, team management, and collaborating with technical resources and with business area customers
- Has familiarity with knowledge management philosophies
- Has familiarity with system development lifecycle
- Enforces established standards
- System Database Administrator Skill/Knowledge Set
- Characterized as "DBMS Heavy" but "Application Light"
- Ensures availability, integrity, and usability of database environment
- Possesses in-depth knowledge of internal workings of DBMS software
- Has extensive experience and skills with DBMS software
- Installs database management software releases and patches
- Monitors and tunes database instance for performance
- Determines backup and recovery strategies from a DBMS perspective
- Determines system resource requirements
- Provides technical DBMS leadership and consulting service
- Is aware of the applications accessing the database but does not have detailed knowledge of those applications
- Designs physical databases (new and enhancements)
- Possesses knowledge and experience with database connectivity technologies
- Possesses knowledge and/or experience with system administration and network administration
- Understands impact of system and network changes on database environment
- Establishes and sustains database security strategy
- Serves as Application Database Analyst for applications not having a dedicated Application Database Analyst
- Application Database Analyst Skill/Knowledge Set
- Characterized as "DBMS Light" but "Application Heavy"
- Ensures the integrity and protection of data within the database from unauthorized access or misuse
- Is aware of general workings of DBMS internals
- Has sufficient skills to create and alter tables
- Has sufficient skills to maintain database user security
- Has sufficient knowledge to participate effectively in database design sessions
- Possesses in-depth knowledge of application software
- Has extensive experience and skills with database programming
- Has extensive experience and skills with tools to sustain application
- For commercial-off-the-shelf (COTS) solutions, has skills and knowledge to apply patches or install new releases
- Has extensive knowledge of business area or problem supported by application
- Populates database with data, including data transformations
- Understands impact of business process changes on database
Data Management Tool Set
The following tables identify the current tools used to deliver or support data management services.
Table 2: Tools for Data Management Planning and Modeling
| Tools | Logical Modeling | Metadata Administration |
|---|---|---|
| Oracle Designer | Yes | Yes |
| Platinum Technology ERwin | Yes | |
| Platinum Technology Repository | Yes | |
Table 3: Tools for Deploying and Sustaining Data Management Solutions
| Tools | Physical Database Modeling | Data Server | Database Administration (Monitoring/Tuning) |
|---|---|---|---|
| Cincom Supra | Yes | | |
| Embarcadero DBARTISAN | Yes | | |
| Oracle Database Server, Enterprise Edition | Yes | | |
| Oracle Developer | Yes | | |
| Oracle Enterprise Manager | Yes | | |
| Platinum Technology ERwin | Yes | | |
| Sybase Adaptive Server Enterprise | Yes | | |
Table 4: Tools for Data Access, Query, Manipulation, and Analysis
| Tools | Database Connectivity | Data Query | Data Extract, Transform, Import | On-line Analysis (OLAP) |
|---|---|---|---|---|
| Brio Query | Yes | Yes | | |
| Embarcadero DBARTISAN | Yes | | | |
| Embarcadero RapidSQL | Yes | | | |
| Oracle SQL*NET | Yes | | | |
| Sybase JCONNECT | Yes | | | |
| Sybase Open Client | Yes | | | |
Data Architecture
To satisfy the University's information requirements, databases are deployed according to a two-layer data architecture scheme. The architecture recognizes the differentiation that exists between the data that support operational processes and the data that support analysis and decision making. The architecture also recognizes the value-added service provided by implementing an active repository of metadata.
- Layer 1: Operational Support
- Databases of the operational support layer contain transaction-oriented data that are critical to the daily management of the University. Examples of the processes supported by operational databases include student registration, event scheduling, facilities maintenance, payroll processing, parking enforcement, and procurement. The data contained in operational databases are often volatile (subject to frequent updates), detailed, and generally have a short life span. From a database design perspective, data are generally normalized to the third normal form.
- Layer 2: Decision/Data Analysis Support
- Databases of the decision and data analysis support layer contain subject-oriented data critical for monitoring, assessment and planning. Examples of activities supported by databases of this layer include program monitoring, facilities planning, longitudinal research, program assessment, strategic planning, trend analysis, and forecasting. The data of this layer are often static (no or few updates), summarized or derived from operational data, and historical in nature with a relatively long life span. The database design strategies employed reflect the techniques underlying data warehouses and data marts, such as denormalized data structures and star schemas.
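The contrast between the two layers can be sketched with Python's built-in `sqlite3` module. The table and column names (`student`, `course`, `registration`, `dim_term`, `dim_course`, `fact_enrollment`) and the sample rows are hypothetical and are not drawn from any University system. The sketch only shows normalized, transaction-level tables in the operational layer feeding a summarized star schema in the decision support layer.

```python
import sqlite3


def build_layers():
    """Create both layers in one in-memory database (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Layer 1: operational support (normalized, one row per transaction)
        CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE registration (
            student_id INTEGER NOT NULL REFERENCES student(student_id),
            course_id  INTEGER NOT NULL REFERENCES course(course_id),
            term       TEXT    NOT NULL,
            PRIMARY KEY (student_id, course_id, term)
        );

        -- Layer 2: decision support (star schema: one fact table, two dimensions)
        CREATE TABLE dim_term   (term_key INTEGER PRIMARY KEY, term TEXT UNIQUE NOT NULL);
        CREATE TABLE dim_course (course_key INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE fact_enrollment (
            term_key       INTEGER NOT NULL REFERENCES dim_term(term_key),
            course_key     INTEGER NOT NULL REFERENCES dim_course(course_key),
            enrolled_count INTEGER NOT NULL,
            PRIMARY KEY (term_key, course_key)
        );
    """)
    conn.executemany("INSERT INTO student VALUES (?, ?)",
                     [(1, "A. Student"), (2, "B. Student"), (3, "C. Student")])
    conn.executemany("INSERT INTO course VALUES (?, ?)",
                     [(10, "Databases"), (20, "Statistics")])
    conn.executemany("INSERT INTO registration VALUES (?, ?, ?)",
                     [(1, 10, "1999-Fall"), (2, 10, "1999-Fall"),
                      (3, 20, "1999-Fall"), (1, 20, "2000-Spring")])
    return conn


def load_decision_layer(conn):
    """Summarize operational rows into the star schema."""
    conn.execute("INSERT INTO dim_term (term) "
                 "SELECT DISTINCT term FROM registration ORDER BY term")
    conn.execute("INSERT INTO dim_course (course_key, title) "
                 "SELECT course_id, title FROM course")
    conn.execute("""
        INSERT INTO fact_enrollment (term_key, course_key, enrolled_count)
        SELECT t.term_key, r.course_id, COUNT(*)
        FROM registration AS r
        JOIN dim_term AS t ON t.term = r.term
        GROUP BY t.term_key, r.course_id
    """)
    conn.commit()


def enrollment_by_term(conn):
    """Trend-style query answered from the summarized layer alone."""
    rows = conn.execute("""
        SELECT t.term, SUM(f.enrolled_count)
        FROM fact_enrollment AS f
        JOIN dim_term AS t ON t.term_key = f.term_key
        GROUP BY t.term
        ORDER BY t.term
    """).fetchall()
    return dict(rows)
```

Calling `build_layers()`, then `load_decision_layer(conn)`, then `enrollment_by_term(conn)` returns per-term totals without touching the transaction tables. In practice the two layers would normally live in separate databases with a scheduled extract between them; a single in-memory database is used here only to keep the sketch self-contained.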
Data Administration Service Model Diagram
The following diagram provides a graphic overview of the service model described in this handbook.
Data Administration Service Model
Glossary of Terms
- agent
- an individual to whom authority has been delegated by the steward or system sponsor to act on behalf of the steward or system sponsor.
The data steward is accountable for defining and ensuring data quality. However, the steward may delegate the task of data capture and maintenance to another individual. Serving as the steward's agent, this individual has the responsibility to ensure the data are complete and accurate. For example, in the K-12 education environment, the local school agency's Registrar is the steward for student personal attributes. However, each principal within the local school agency serves as the Registrar's agent in recording and maintaining the student personal attribute data.
- attribute
- a fact that is stored or encapsulated by an object (Yourdon & Argila, 9)
- context
- the collection of rules, meanings, conventions, and technologies at a given point in time that are relevant to an individual or organization and that are associated with data
- customer
- a person, group of individuals, or an organization that is the ultimate destination or receiver of services and products delivered by the enterprise
- data
- the facts about a person, place, thing, or event (Coleman & Furey)
An alternative definition offered by Newton and Wahl defines data as representations of "facts, concepts, or instructions in a formalized manner suitable for communication, interpretation, or processing by humans" or by technology (B-3).
- data administration
- the organizational function that develops long-term conceptual plans for institutional data
- data steward
- an individual who has institutional responsibility and accountability for an entity and its attributes
Under the knowledge management model, the data steward has evolved into a domain steward. The concepts of accountability and responsibility are the same; however, a domain steward has broader scope.
- database
- a "collection of interrelated data, often with controlled redundancy, organized according to a schema to serve one or more applications" (Newton & Wahl, B-8)
- database administration
- the process that controls the "content, design, and use of one or more databases to avoid uncontrolled redundancies and to enhance development" of information technology solutions (Newton & Wahl, B-4)
- domain
- a set of knowledge and processes that share common properties within an enterprise
- domain model
- a plan that describes the strategy for managing a set of knowledge and processes that share common properties within an enterprise
- domain steward
- an individual who has institutional responsibility and accountability for a domain
The concepts of accountability and responsibility of the domain steward are similar to those of the data steward. However, a domain steward has broader scope.
- enterprise
- the set of organizational entities, each of which is responsible for one or more institutional functions or services
- enterprise domain architecture
- a plan that describes the aggregate strategy for managing knowledge as an institutional resource
- experience
- the collection of skills, insights, and lessons learned by an individual or by a group of individuals that are applied to information (Coleman & Furey)
- information
- "data in a context relevant to an individual, team, or organization" (Coleman & Furey)
Newton and Wahl offer an alternative, yet similar definition in which information is the "meaning assigned to data by means of the known conventions used in their representation" (B-10).
- information resource
- the collection of information "created manually or by automated means that an enterprise treats as a resource for decision making and problem solving" (Newton & Wahl, B-12)
- knowledge
- an awareness about a person, place, thing, or event that is acquired by an individual through the application of data, context, and experience
- knowledge asset
- the representation of knowledge in a formalized manner suitable for communication, interpretation, or processing by humans or by technology
- knowledge base
- a collection of knowledge assets organized in order to maintain a well-informed work force, boost productivity, gain competitive advantage, and achieve organizational goals (Stuart)
- knowledge management
- the "process by which individual learning and experience can be accessed, reflected upon, shared and utilized in order to foster enhanced individual knowledge and, thus, organizational value" (Coleman & Furey)
- object
- an "independent, asynchronous, concurrent entity which knows things, does work and collaborates with other objects to perform the functions of a system" (Yourdon & Argila, 17)
- process
- the method or work flow to accomplish an objective
- process partner
- a person, group of individuals, or organization that contributes or adds value to the product or service delivered to the customer
- service
- the work performed by an object (Yourdon & Argila, 9)
- steward
- an individual who is either a data steward, a domain steward or both
- system sponsor
- an individual who has the accountability for ensuring the security, integrity, and availability of an application and who has the authority to prioritize enhancements to the application
- technology
- the mechanisms employed to capture, use, present, or share data
Technology may assume any form from the simple to the complex. Examples range from the words and illustrations on a page of a document to an object-oriented, Web-enabled database containing text, audio, graphics, and video clips.
- tools administration
- the process that identifies, deploys, and controls the utilities that facilitate data management
Appendix A
Knowledge Domains for the University of Virginia
The Data Administration Use-Case (Table 1) is domain-centric in that the solutions deployed are targeted to support the needs of specific operational areas. However, the solutions deployed must remain consistent with the University's knowledge management philosophy that information must be "available to all employees who have a legitimate need."
As of June 1999, formal work on defining the University's knowledge domains has not begun. However, based on previous work with other educational institutions and on an overview of the University's current information resources, it is possible to define a "starter domain set". Thus, the knowledge domains described below are offered as an initial set from which architecture development may proceed.
Key:
- Knowledge Domain
- Scope Definition
Domain Set:
- Student
- Set of knowledge and processes that are associated with the delivery of services to prospective, registered, former, and graduated students
- Instruction
- Set of knowledge and processes that are associated with learning (the delivery of curriculum to students)
- Curriculum
- Set of knowledge and processes that are associated with programs of study and with teaching/learning methods
- Patient
- Set of knowledge and processes that are associated with the delivery of health care services to individuals
- Human Resource
- Set of knowledge and processes that are associated with the recruitment, assignment, and retention of individuals employed by the University
- Fiscal Resource
- Set of knowledge and processes that are associated with the acquisition, allocation, and monitoring of the University's monetary assets
- Infrastructure Resource
- Set of knowledge and processes that are associated with the acquisition, construction, use, maintenance and disposition of facilities and other capital assets
- Consumable Resource
- Set of knowledge and processes that are associated with the procurement, distribution, use, and disposition of consumable goods and services
- External Resource
- Set of knowledge and processes that are associated with "foreign" assets that support or contribute to the University's mission
- Knowledge Resource
- Set of data and processes that are associated with research and with the acquisition and distribution of information as an institutional asset
- Governance
- Set of knowledge and processes that are associated with the external environment that influences the University's mission and operations
Works Cited
Coleman, David and Furey, Deborah. "Collaborative Infrastructure for Knowledge Management (Part I)." On-line, January 21, 1997. http://www.collaborate.com/tip1096.html.
Newton, Judith J. and Wahl, Daniel C. Manual for Data Administration. Washington, D.C.: National Institute of Standards and Technology (NIST), 1993.
Stuart, Anne. "Reality Check--Knowledge Management." CIO Magazine. On-line, January 21, 1997. http://www.cio.com/CIO/06196_uneasy_l.htlm.
University of Virginia, ITC Data Stewards Committee. "Administrative Data Access Policy." On-line, March 30, 1999. http://www.itc.virginia.edu/department/committees/ds/policy.html.
Yourdon, Edward and Argila, Carl. Case Studies in Object Oriented Analysis and Design. Upper Saddle River, New Jersey: Prentice Hall, 1996.
