Data Services Handbook

Volume 1: Definition of the Data Administration Service Model

University of Virginia
Department of Information Technology and Communication (ITC)
Applications and Data Services Division (ADS), Data Services
June 1999

Acknowledgement

I would like to express my gratitude to the following individuals:

Deborah Mills
Manager, Enabling Technologies, University of Virginia
Terrell Jones
Database Administrator, Data Services, University of Virginia
Debbie Luzynski
Database Administrator, Data Services, University of Virginia
Gary Policastro
Manager, Domain Development, Fairfax County Public Schools, Virginia
Bethann Canada
Director, Management Information Systems, Virginia Department of Education
Ted Davis
Director, Knowledge Asset Management, Fairfax County Public Schools, Virginia

These individuals have contributed to the development of this handbook by sharing their experiences, insight, and/or feedback.

Ed Tyler, Manager, Data Services
University of Virginia
June 1999

Foreword and Scope

The University of Virginia has embraced information as an institutional resource that is critical to its success. To protect the value of and to ensure accessibility to this resource, the University adopted the "Administrative Data Access Policy" in 1994. The cornerstone of the policy was the philosophy that information must be "available to all employees who have a legitimate need."

To nurture this philosophy and to leverage the University's investment in its information resources, Data Services has the responsibility to develop and implement effective and cost efficient data administration practices. Data Services also delivers services and products that assist the University community in locating, accessing, and employing institutional data. Finally, Data Services serves as the leading advocate for guiding the data access philosophy into one that fosters knowledge management.

Therefore, the purpose of this document is to serve as a handbook for Data Services. The handbook describes the objectives and standards adopted for data management. The handbook is divided into three volumes as listed below. This volume describes the philosophy underlying the objectives and standards adopted. As such, it establishes the context for the remaining volumes.

  • Volume 1: Definition of the Data Administration Service Model
  • Volume 2: Standards for Data Administration
  • Volume 3: Standards for Database Administration

Knowledge Management Framework for Data Administration

Every enterprise acquires and uses a variety of resources in order to achieve its goals. Knowledge is one such resource. Like any resource, knowledge has a life cycle spanning from creation to consumption and ending with obsolescence.

For purposes of data administration, knowledge is acquired when an individual applies his or her experiences, skills, and insights to information. To create knowledge, an individual integrates three components--data, context, and experience--as illustrated in Figure 1 below. To tap into the knowledge held by its employees, stakeholders and customers, an enterprise must foster an environment conducive to information sharing. Within such an environment, an enterprise has the means to leverage knowledge as an institutional resource.

Figure 1: Overview--Transformation of Data into an Institutional Knowledge Resource

Animated image illustrating how data is transformed into an institutional knowledge resource

The challenge lies in defining an effective and cost efficient strategy that enables such an environment. The current research and development for such solutions have focused on the area of knowledge management. Knowledge management is the "process by which individual learning and experience can be accessed, reflected upon, shared and utilized in order to foster enhanced individual knowledge and, thus, organizational value" (Coleman & Furey). The policies and strategies adopted to facilitate this process provide the framework for managing the components of knowledge.

Data administration facilitates the management of the data component of knowledge. Specifically, data administration is the organizational function that develops long-term conceptual plans for institutional data. Activities include strategic information resource planning, data standardization, data synchronization, and database development and maintenance. The primary products and services include the enterprise data architecture and the metadata--a catalog or index of the enterprise's knowledge resources. By defining, organizing, and protecting institutional data, the requirements of "controlling the acquisition, analysis, storage, retrieval, and distribution of data" are addressed (Newton & Wahl, 2-4, 2-6, B-3).
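As a rough illustration of metadata serving as "a catalog or index" of data resources, the following minimal Python sketch registers and looks up data element descriptions. It is not a structure defined by Data Services or by Newton and Wahl: the class names (DataElement, MetadataCatalog), the fields, and the two sample entries are all hypothetical, chosen only to show how a name, a definition, a steward, and a format might be recorded together.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DataElement:
    """One catalog entry. All fields are illustrative assumptions."""
    name: str          # standardized element name
    definition: str    # agreed business definition
    steward: str       # party accountable for the element
    data_format: str   # expected representation


class MetadataCatalog:
    """A tiny in-memory index of data elements (hypothetical)."""

    def __init__(self) -> None:
        self._elements: Dict[str, DataElement] = {}

    def register(self, element: DataElement) -> None:
        # Names are compared case-insensitively so that the same element
        # cannot be cataloged twice under different capitalization.
        key = element.name.lower()
        if key in self._elements:
            raise ValueError(f"duplicate data element name: {element.name}")
        if not element.definition.strip():
            raise ValueError(f"missing definition for: {element.name}")
        self._elements[key] = element

    def lookup(self, name: str) -> Optional[DataElement]:
        return self._elements.get(name.lower())

    def search(self, term: str) -> List[str]:
        # Return the names of elements whose name or definition mentions term.
        term = term.lower()
        return sorted(
            e.name
            for e in self._elements.values()
            if term in e.name.lower() or term in e.definition.lower()
        )


def build_sample_catalog() -> MetadataCatalog:
    """Populate a catalog with two made-up entries for demonstration."""
    catalog = MetadataCatalog()
    catalog.register(DataElement(
        name="student_birth_date",
        definition="The date on which a student was born.",
        steward="Registrar",
        data_format="YYYY-MM-DD",
    ))
    catalog.register(DataElement(
        name="employee_hire_date",
        definition="The date on which an employee began employment.",
        steward="Human Resources",
        data_format="YYYY-MM-DD",
    ))
    return catalog
```

With the sample catalog, build_sample_catalog().search("student") returns only the student element, while registering a second element under an existing name raises an error. That check is one simple way a catalog could discourage conflicting definitions.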

Data administration contains two related processes--database administration and tools administration. Database administration controls the "content, design, and use of one or more databases to avoid uncontrolled redundancies and to enhance development" of information technology solutions (Newton & Wahl, B-4). While the data administrator assumes the role of an architect, the database administrator builds the knowledge repository from the blueprints drafted. Thus, unlike data administration, database administration has a technical orientation that focuses on the physical databases deployed and on the tools used to manage, access, and manipulate data (Newton & Wahl, 2-4, 2-6).

Tools administration identifies, deploys, and controls the utilities that facilitate data management. Tools include software that assists in information modeling, database management, metadata management, data propagation, data access and control, and performance monitoring and tuning. The tools used must be consistent with the knowledge management strategies adopted.

Figure 2 depicts the umbrella-like effect of knowledge management on data management practices. The depiction is based on the concept described by Judith Newton and Daniel Wahl in their NIST publication, Manual for Data Administration (2-2).

Figure 2: Knowledge Management Influence

Hierarchically grouped umbrellas show relationship between knowledge management and data management practices

Data Administration Service Model

The primary goal of data management is to facilitate the systematic process by which knowledge is acquired and applied as an institutional resource. The remainder of this volume describes the University's data administration service model.

Service Provider

Data Services within the Applications and Data Services Division of Information Technology and Communication (ITC) delivers data management services and products to the University community.

Charter

The responsibilities for Data Services were defined in paragraph 4.10 of the Administrative Data Access Policy adopted by the University:

"Data Management develops and applies standards for the management of institutional data and for ensuring that data are accessible to those who need it. The manager of the Data Management group chairs the Data Steward and Data Custodian committees and works very closely with these committees on formulation of data policies, standards, and procedures.

Data Management works with the Data Stewards to establish long-term direction for effectively using information resources to support University goals and objectives.

Data Management creates logical data models of applications. These models are ultimately used to create an institution-wide data model that cross-references data across applications and encourage data sharing.

Data Management develops a standard method for naming and defining data. It also facilitates conflict resolution in data definitions.

Data Management makes institutional data available to authorized users in a manner consistent with established data access rules and decisions. It develops views of data as directed by the data committees. The group ensures that the technical integrity of the data is maintained and, in conjunction with the ITC Security Team, that data security requirements are met."

Principles

The following principles guide the assessment and selection of the practices and standards adopted for data administration.

  1. Knowledge is critical to the success of the University.
  2. The needs of individuals who require knowledge are infinitely diverse.
  3. Knowledge must be transformed into a strategic, shareable institutional resource.
  4. The knowledge management practices adopted must be effective and cost efficient.

Objectives

Data Services must achieve and sustain the following objectives.

  1. Deploy strategies that transform knowledge into a strategic, shareable institutional resource.
    • Use a systematic approach to model information and to design databases.
    • Create an architecture that consolidates conceptual and physical data models to the information needs and functions of the University (Newton & Wahl, 4-1).
    • Promote active data stewardship with defined accountability.
    • Promote data consistency and standardization throughout the University by developing standards for names, definitions, values, formats, and metadata (Newton & Wahl, 4-1).
    • Minimize duplication in collecting, processing, storing, and distributing information (Newton & Wahl, 4-1).
    • Improve the quality, accuracy, integrity, and relevancy of institutional information (Newton & Wahl, 4-1).
    • Improve knowledge management and access through the use of appropriate resources, methods, tools, and technologies (Newton & Wahl, 4-1).
    • Adopt effective and cost efficient data management tools and practices.
  2. Promote and/or deliver products and services that empower knowledge workers to access and apply institutional information.
  3. Educate the University.
    • Increase awareness of the value of institutional information.
    • Advocate the strategic value of knowledge management.
    • Encourage and facilitate knowledge sharing within the University and among its partners (Newton & Wahl, 4-1).
  4. Measure the quality, effectiveness, and cost efficiency of the products and services delivered.
  5. Require each member of Data Services to maintain current technical knowledge, sustain proficiency of skills, acquire familiarity with and an understanding of the University's functions, and remain aware of industry trends.
    • Provide opportunities to gain first-hand knowledge of the University's functions.
    • Provide technical staff development opportunities.
    • Accommodate self-motivated educational pursuits.
    • Establish collaborative partnerships with external entities.

Use-Case

The use-case for data administration is summarized in Table 1. The use-case provides an overview of the work involved by describing the primary processes; the products or services delivered; the customers or beneficiaries; and the partners involved. The data administration use-case has three high-level processes.

Table 1: Data Administration Use-Case

  1. Process

    Foster Data Administration Service Model

    Process Steps
    1. Identify stakeholders
    2. Assess environment, such as policies, practices, and technology, for managing institutional data
    3. Define and/or adjust vision, objectives, direction, architecture, and other products and services
    4. Obtain commitment and buy-in
    5. Communicate and advocate data management
    6. Monitor and evaluate service model
    Products and/or Services Delivered

    Data management:

    1. Direction
    2. Architecture
    3. Standards
    4. Life cycle
    5. Policies
    6. Methods
    Process Customers

    Enterprise

    Note: The enterprise is defined as the set of organizational entities of which each is responsible for one or more institutional functions or services.

    The students (prospective, current, and graduated) and their families, patients, researchers, and the community (local and global) are not included within the enterprise as defined. Rather, they are viewed as customers of the University.

    Process Partners
    1. Applications and Data Services, ITC
    2. Communications and Systems, ITC
    3. Computing Support Services, ITC
    4. Operations, ITC
    5. Stewards

    Note: The business partners, foundations, and state and federal agencies are not process partners but are factors in the environment that impact knowledge management.

    The Vice President and Chief Information Officer and the senior managers of ITC are not process partners but are an approval authority.

  2. Process

    Define Domain-Specific Data Management Strategy

    Process Steps
    1. Identify stakeholders
    2. Assess data management problem, needs, and environment
    3. Develop business case (does not include problem solution definition)
    4. Develop strategy for data management
      • Identify stewards, agents, and system sponsors
      • Educate stewards, agents, and system sponsors
      • Define concept of operations (function and technology)
      • Define implementation approach alternatives
      • Evaluate and select implementation approach
    5. Attain commitment and buy-in (includes funding and other resources)
    6. Monitor and evaluate domain architecture
    Products and/or Services Delivered

    Domain architecture

    Note: The products and services are the result of applying the data administration service model to a domain.

    Process Customers

    Domain

    Notes: A domain is defined as a set of knowledge and processes that share common properties within an enterprise.

    A brief description of the domains identified for the University is provided in Appendix A.

    Process Partners
    1. Applications and Data Services, ITC
    2. Budget and Administration, ITC
    3. Communications and Systems, ITC
    4. Computing Support Services, ITC
    5. Operations, ITC
    6. Stewards (including domain sponsor and interfacing domain stewards)
  3. Process

    Implement and Maintain Data Management Solutions

    Process Steps
    1. Develop detailed implementation plan, including tasks, resources, and schedules
    2. Execute plan
      • Develop business process
      • Define data
      • Develop and/or acquire tools
      • Establish databases
      • Develop and/or deliver training
      • Test solutions
      • Install capability
    3. Evaluate progress and adjust plans
    4. Monitor, evaluate, and maintain solutions
    Products and/or Services Delivered

    Data management capabilities, such as processes, tools, skills, databases, mechanisms, and metadata

    Process Customers

    Individuals who use or manage processes, tools, and/or data of the domain

    Process Partners
    1. Applications and Data Services, ITC
    2. Communications and Systems, ITC
    3. Computing Support Services, ITC
    4. Operations, ITC
    5. Stewards (including domain sponsor and interfacing domain stewards)
    6. Vendors
    7. Support teams such as LSP (Local Support Partner)
    8. Other entities such as state agencies (e.g. Virginia Retirement System) and foundations

Skill and Knowledge Set

To support the data administration processes described, the following set of skills and knowledge is required. The skills and experiences described are conceptual and do not imply a one-to-one correspondence with current University positions. Rather, the descriptions are provided for planning purposes.

  1. Enterprise Domain Architect Skill/Knowledge Set
    • Characterized as having an "Enterprise" or institutional orientation rather than a technology focus
    • Ensures that the institution has the means to leverage its knowledge resources
    • Possesses in-depth knowledge of enterprise information resource management (knowledge management)
    • Develops and sustains the enterprise domain architecture (plan)
    • Establishes plans, strategies, and standards for data/domain management, database management, and data management tool administration
    • Introduces new industry practices for knowledge management
    • Has familiarity with knowledge management technologies but does not possess in-depth technical knowledge
    • Communicates knowledge management plans and strategies
    • Obtains commitment and buy-in
    • Fosters knowledge management program and model
    • Obtains resources
    • Prepares budget and project plans
    • Supervises technical staff
  2. Domain Architect Skill/Knowledge Set
    • Characterized as having a "Domain" or business area orientation
    • Ensures that the data management solutions deployed satisfy domain requirements and are consistent with the domain architecture
    • Possesses in-depth knowledge of knowledge requirements within specific domains
    • Provides project management services for deploying data management solutions
    • Develops conceptual models and data security requirements
    • Captures metadata
    • Has familiarity with database technologies
    • Possesses extensive experience and skills in problem definition and solution assessment
    • Excels in oral and written communication, team management, and collaborating with technical resources and with business area customers
    • Has familiarity with knowledge management philosophies
    • Has familiarity with system development lifecycle
    • Enforces established standards
  3. System Database Administrator Skill/Knowledge Set
    • Characterized as "DBMS Heavy" but "Application Light"
    • Ensures availability, integrity, and usability of database environment
    • Possesses in-depth knowledge of internal workings of DBMS software
    • Has extensive experience and skills with DBMS software
    • Installs database management software releases and patches
    • Monitors and tunes database instance for performance
    • Determines backup and recovery strategies from a DBMS perspective
    • Determines system resource requirements
    • Provides technical DBMS leadership and consulting service
    • Is aware of the applications accessing the database but does not have detailed knowledge of those applications
    • Designs physical databases (new and enhancements)
    • Possesses knowledge and experience with database connectivity technologies
    • Possesses knowledge and/or experience with system administration and network administration
    • Understands impact of system and network changes on database environment
    • Establishes and sustains database security strategy
    • Serves as Application Database Analyst for applications not having a dedicated Application Database Analyst
  4. Application Database Analyst Skill/Knowledge Set
    • Characterized as "DBMS Light" but "Application Heavy"
    • Ensures the integrity and protection of data within the database from unauthorized access or misuse
    • Is aware of general workings of DBMS internals
    • Has sufficient skills to create and alter tables
    • Has sufficient skills to maintain database user security
    • Has sufficient knowledge to participate effectively in database design sessions
    • Possesses in-depth knowledge of application software
    • Has extensive experience and skills with database programming
    • Has extensive experience and skills with tools to sustain application
    • For commercial-off-the-shelf (COTS) solutions, has skills and knowledge to apply patches or install new releases
    • Has extensive knowledge of business area or problem supported by application
    • Populates database with data, including data transformations
    • Understands impact of business process changes on database

Data Management Tool Set

The following tables identify the current tools used to deliver or support data management services.

Table 2: Tools for Data Management Planning and Modeling

Tools                          | Logical Modeling | Metadata Administration
Oracle Designer                | Yes              | Yes
Platinum Technology ERwin      | Yes              |
Platinum Technology Repository |                  | Yes

Table 3: Tools for Deploying and Sustaining Data Management Solutions

Tools                                      | Physical Database Modeling | Data Server | Database Administration (Monitoring/Tuning)
Cincom Supra                               |                            | Yes         |
Embarcadero DBARTISAN                      |                            |             | Yes
Oracle Database Server, Enterprise Edition |                            | Yes         |
Oracle Developer                           | Yes                        |             |
Oracle Enterprise Manager                  |                            |             | Yes
Platinum Technology ERwin                  | Yes                        |             |
Sybase Adaptive Server Enterprise          |                            | Yes         |

Table 4: Tools for Data Access, Query, Manipulation, and Analysis

Tools                 | Database Connectivity | Data Query | Data Extract, Transform, Import | On-line Analysis (OLAP)
Brio Query            |                       | Yes        |                                 | Yes
Embarcadero DBARTISAN |                       |            | Yes                             |
Embarcadero RapidSQL  |                       | Yes        |                                 |
Oracle SQL*NET        | Yes                   |            |                                 |
Sybase JCONNECT       | Yes                   |            |                                 |
Sybase Open Client    | Yes                   |            |                                 |
Data Architecture

To satisfy the University's information requirements, databases are deployed according to a two-layer data architecture scheme. The architecture recognizes the differentiation that exists between the data that support operational processes and the data that support analysis and decision making. The architecture also recognizes the value-added service provided by implementing an active repository of metadata.

Layer 1: Operational Support
Databases of the operational support layer contain transaction-oriented data that are critical to the daily management of the University. Examples of the processes supported by operational databases include student registration, event scheduling, facilities maintenance, payroll processing, parking enforcement, and procurement. The data contained in operational databases are often volatile (subject to frequent updates), detailed, and generally have a short life span. From a database design perspective, data are generally normalized to the third normal form.
Layer 2: Decision/Data Analysis Support
Databases of the decision and data analysis support layer contain subject-oriented data critical for monitoring, assessment and planning. Examples of activities supported by databases of this layer include program monitoring, facilities planning, longitudinal research, program assessment, strategic planning, trend analysis, and forecasting. The data of this layer are often static (no or few updates), summarized or derived from operational data, and historical in nature with a relatively long life span. The database design strategies employed reflect the techniques underlying data warehouses and data marts, such as denormalized data structures and star schemas.
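The contrast between the two layers can be sketched with Python's standard sqlite3 module. This is only an illustration under stated assumptions: every table name, column name, and sample row is hypothetical and does not describe any University database. The operational tables are normalized and hold individual registrations, while the decision support tables form a small star schema whose fact table holds counts summarized from the operational data.

```python
import sqlite3

SCHEMA = """
-- Layer 1 (operational): normalized, transaction-oriented tables.
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE course (
    course_id  INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    department TEXT NOT NULL
);
CREATE TABLE registration (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    term       TEXT NOT NULL,
    PRIMARY KEY (student_id, course_id, term)
);

-- Layer 2 (decision support): a star schema with one fact table.
CREATE TABLE dim_course (
    course_key INTEGER PRIMARY KEY,
    title      TEXT,
    department TEXT
);
CREATE TABLE dim_term (
    term_key INTEGER PRIMARY KEY,
    term     TEXT UNIQUE
);
CREATE TABLE fact_enrollment (
    course_key     INTEGER REFERENCES dim_course(course_key),
    term_key       INTEGER REFERENCES dim_term(term_key),
    enrolled_count INTEGER NOT NULL
);
"""


def record_operational_data(conn: sqlite3.Connection) -> None:
    """Insert a few made-up transactions into the operational layer."""
    conn.executemany("INSERT INTO student VALUES (?, ?)",
                     [(1, "Student A"), (2, "Student B"), (3, "Student C")])
    conn.executemany("INSERT INTO course VALUES (?, ?, ?)",
                     [(10, "US History", "HIST"),
                      (11, "World History", "HIST"),
                      (20, "Calculus", "MATH")])
    conn.executemany("INSERT INTO registration VALUES (?, ?, ?)",
                     [(1, 10, "1999-Fall"), (2, 10, "1999-Fall"),
                      (3, 11, "1999-Fall"), (1, 20, "1999-Fall"),
                      (2, 20, "2000-Spring")])


def load_decision_layer(conn: sqlite3.Connection) -> None:
    """Derive summarized, subject-oriented data from the operational layer."""
    conn.execute("INSERT INTO dim_course (course_key, title, department) "
                 "SELECT course_id, title, department FROM course")
    conn.execute("INSERT INTO dim_term (term) "
                 "SELECT DISTINCT term FROM registration ORDER BY term")
    conn.execute("""
        INSERT INTO fact_enrollment (course_key, term_key, enrolled_count)
        SELECT r.course_id, t.term_key, COUNT(*)
        FROM registration AS r
        JOIN dim_term AS t ON t.term = r.term
        GROUP BY r.course_id, t.term_key
    """)


def enrollment_by_department(conn: sqlite3.Connection) -> list:
    """Answer a trend-style question from the star schema only."""
    return conn.execute("""
        SELECT c.department, t.term, SUM(f.enrolled_count)
        FROM fact_enrollment AS f
        JOIN dim_course AS c ON c.course_key = f.course_key
        JOIN dim_term AS t ON t.term_key = f.term_key
        GROUP BY c.department, t.term
        ORDER BY c.department, t.term
    """).fetchall()


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    record_operational_data(conn)
    load_decision_layer(conn)
    return enrollment_by_department(conn)
```

In this sketch the analysis query never touches the registration table. Keeping analytical queries away from volatile transaction data is one reason an architecture might separate the two layers.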

Data Administration Service Model Diagram

The following diagram provides a graphic overview of the service model described in this handbook.

Data Administration Service Model

flowchart of Data Administration Service Model

Glossary of Terms

agent
an individual to whom authority has been delegated by the steward or system sponsor to act on behalf of the steward or system sponsor.

The data steward is accountable for defining and ensuring data quality. However, the steward may delegate the task of data capture and maintenance to another individual. Serving as the steward's agent, this individual has the responsibility to ensure the data are complete and accurate. For example, in the K-12 education environment, the local school agency's Registrar is the steward for student personal attributes. However, each principal within the local school agency serves as the Registrar's agent in recording and maintaining the student personal attribute data.

attribute
a fact that is stored or encapsulated by an object (Yourdon & Argila, 9)
context
the collection of rules, meanings, conventions, and technologies at a given point in time that are relevant to an individual or organization and that are associated with data
customer
a person, group of individuals, or an organization that is the ultimate destination or receiver of services and products delivered by the enterprise
data
the facts about a person, place, thing, or event (Coleman & Furey)

An alternative definition offered by Newton and Wahl defines data as representations of "facts, concepts, or instructions in a formalized manner suitable for communication, interpretation, or processing by humans" or by technology (B-3).

data administration
the organizational function that develops long-term conceptual plans for institutional data
data steward
an individual who has institutional responsibility and accountability for an entity and its attributes

Under the knowledge management model, the data steward has evolved into a domain steward. The concepts of accountability and responsibility are the same; however, a domain steward has broader scope.

database
a "collection of interrelated data, often with controlled redundancy, organized according to a schema to serve one or more applications" (Newton & Wahl, B-8)
database administration
the process that controls the "content, design, and use of one or more databases to avoid uncontrolled redundancies and to enhance development" of information technology solutions (Newton & Wahl, B-4)
domain
a set of knowledge and processes that share common properties within an enterprise
domain model
a plan that describes the strategy for managing a set of knowledge and processes that share common properties within an enterprise
domain steward
an individual who has institutional responsibility and accountability for a domain

The accountability and responsibility of the domain steward are similar to those of the data steward. However, a domain steward has a broader scope.

enterprise
the set of organizational entities of which each is responsible for one or more institutional functions or services
enterprise domain architecture
a plan that describes the aggregate strategy for managing knowledge as an institutional resource
experience
the collection of skills, insights, and lessons learned by an individual or by a group of individuals that are applied to information (Coleman & Furey)
information
"data in a context relevant to an individual, team, or organization" (Coleman & Furey)

Newton and Wahl offer an alternative, yet similar definition in which information is the "meaning assigned to data by means of the known conventions used in their representation" (B-10).

information resource
the collection of information "created manually or by automated means that an enterprise treats as a resource for decision making and problem solving" (Newton & Wahl, B-12)
knowledge
an awareness about a person, place, thing, or event that is acquired by an individual through the application of data, context, and experience
knowledge asset
the representation of knowledge in a formalized manner suitable for communication, interpretation, or processing by humans or by technology
knowledge base
a collection of knowledge assets organized in order to maintain a well-informed work force, boost productivity, gain competitive advantage, and achieve organizational goals (Stuart)
knowledge management
the "process by which individual learning and experience can be accessed, reflected upon, shared and utilized in order to foster enhanced individual knowledge and, thus, organizational value" (Coleman & Furey)
object
an "independent, asynchronous, concurrent entity which knows things, does work and collaborates with other objects to perform the functions of a system" (Yourdon & Argila, 17)
process
the method or work flow to accomplish an objective
process partner
a person, group of individuals, or organization that contributes or adds value to the product or service delivered to the customer
service
the work performed by an object (Yourdon & Argila, 9)
steward
an individual who is either a data steward, a domain steward or both
system sponsor
an individual who has the accountability for ensuring the security, integrity, and availability of an application and who has the authority to prioritize enhancements to the application
technology
the mechanisms employed to capture, use, present, or share data

Technology may assume any form from the simple to the complex. Examples of technology range from the words and illustrations on a page of a document to an object-oriented, Web-enabled database containing text, audio, graphics, and video clips.

tools administration
the process that identifies, deploys, and controls the utilities that facilitate data management

Appendix A

Knowledge Domains for the University of Virginia

The Data Administration Use-Case (Table 1) is domain-centric in that the solutions deployed are targeted to support the needs of specific operational areas. However, the solutions deployed must remain consistent with the University's knowledge management philosophy that information must be "available to all employees who have a legitimate need."

As of June 1999, formal work on defining the University's knowledge domains has not begun. However, based on previous work with other educational institutions and on an overview of the University's current information resources, it is possible to define a "starter domain set". Thus, the knowledge domains described below are offered as an initial set from which architecture development may proceed.

Key:

Knowledge Domain
Scope Definition

Domain Set:

Student
Set of knowledge and processes that are associated with the delivery of services to prospective, registered, former, and graduated students
Instruction
Set of knowledge and processes that are associated with learning (the delivery of curriculum to students)
Curriculum
Set of knowledge and processes that are associated with programs of study and with teaching/learning methods
Patient
Set of knowledge and processes that are associated with the delivery of health care services to individuals
Human Resource
Set of knowledge and processes that are associated with the recruitment, assignment, and retention of individuals employed by the University
Fiscal Resource
Set of knowledge and processes that are associated with the acquisition, allocation, and monitoring of the University's monetary assets
Infrastructure Resource
Set of knowledge and processes that are associated with the acquisition, construction, use, maintenance and disposition of facilities and other capital assets
Consumable Resource
Set of knowledge and processes that are associated with the procurement, distribution, use, and disposition of consumable goods and services
External Resource
Set of knowledge and processes that are associated with "foreign" assets that support or contribute to the University's mission
Knowledge Resource
Set of data and processes that are associated with research and with the acquisition and distribution of information as an institutional asset
Governance
Set of knowledge and processes that are associated with the external environment that influences the University's mission and operations

Works Cited

Coleman, David and Furey, Deborah. "Collaborative Infrastructure for Knowledge Management (Part I)." On-line, January 21, 1997. http://www.collaborate.com/tip1096.html.

Newton, Judith J. and Wahl, Daniel C. Manual for Data Administration. Washington, D.C.: National Institute of Standards and Technology (NIST), 1993.

Stuart, Anne. "Reality Check--Knowledge Management." CIO Magazine. On-line, January 21, 1997. http://www.cio.com/CIO/06196_uneasy_l.htlm.

University of Virginia, ITC Data Stewards Committee. "Administrative Data Access Policy." On-line, March 30, 1999. http://www.itc.virginia.edu/department/committees/ds/policy.html.

Yourdon, Edward and Argila, Carl. Case Studies in Object Oriented Analysis and Design. Upper Saddle River, New Jersey: Prentice Hall, 1996.

© 2008 by the Rector and Visitors of the University of Virginia.