*****

     As of 4/11/2007, HSM service is no longer provided by TSM
     and an IBM tape robot.  HSM service is now provided by an
     ADIC StorNext filesystem.  The information below should
     only be used for historical reference.

                    *****

Hierarchical Storage Management at ITC   (last updated 7/5/2006)

Overview

In 1999 ITC began supplying Hierarchical Storage Management (HSM) services to UVa clients in order to provide large amounts of permanent storage space in a cost-effective manner.

User accounts are not available on the HSM for general computing use. Instead, accounts are established as needed to give remote machines network access to data stored in the HSM. Remote machines access the data via the NFS protocol (typically Unix machines) or the NetBIOS/SMB/CIFS protocols (NT or Windows machines).

Hardware

The HSM hardware includes an IBM 3494 tape robot, along with an IBM RS6000 F50 to control the 3494. The F50 currently runs Tivoli Storage Manager (TSM) 5.1.6.3 on AIX 5.1ML2.

The 3494 contains 204 data tapes, each with a 40 GB (gigabyte) native capacity, for a total capacity of between 8 TB (terabyte) and 24 TB, depending on data compression. Tapes are stored in and retrieved from storage cells by a robotic device which identifies tapes by laser scans of bar code labels. The robot hardware is controlled by an IBM PC inside the 3494 cabinet, which in turn interfaces with the TSM software on the RS6000. The 3494 also contains two IBM 3590 model E1A tape drives. More details about the IBM 3494 tape robot can be found here. Installation and administration of TSM are described in the publicly available PDF and HTML manuals. The HSM is administered by the ITC Unix Systems group.
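The capacity range quoted above can be checked with a short calculation. This is only a sketch: the 3:1 compression ratio is an assumption inferred from the 8 TB to 24 TB range, not a figure stated for the tape drives, and 1 TB is taken to be 1000 GB.

```python
TAPES = 204
NATIVE_GB_PER_TAPE = 40

def library_capacity_tb(compression_ratio=1.0):
    """Total tape capacity in TB (1 TB = 1000 GB) at a given compression ratio."""
    return TAPES * NATIVE_GB_PER_TAPE * compression_ratio / 1000

print(library_capacity_tb())     # uncompressed: 8.16
print(library_capacity_tb(3.0))  # assumed 3:1 compression: 24.48
```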

User files are stored in IBM SSA (RAID 5) disk arrays to prevent data loss due to disk failure.

Proper Use of the HSM

The HSM should be used to store a reasonable number of archives of infrequently-accessed files. Any other use can cause poor performance of the HSM.

The maximum size of a file in the HSM is about 2 GB (2,147,483,647 bytes). An attempt to create a file larger than this fails with an I/O error.

The HSM should not be used to store large numbers of files with the expectation that all or most of them can be retrieved in a reasonable amount of time. It can take a significant amount of time to recall a file. The amount of time depends on whether the file resides in its original space-managed filesystem, or a copy must be retrieved from the disk cache, or from tape. The older a file is, the more likely it is to be resident on tape, requiring the longest recall time. In a test, 6 small files were recalled in rapid succession, requiring 15, 37, 106, 223, 105, and 105 seconds, or an average of about 100 seconds per file.

Suppose a project which produced a report has terminated. All 1000 data files used by the project are copied to the HSM and are then removed from local disk. Weeks later, a researcher finds that some important numbers are missing from the report. The project must be temporarily revived to regenerate the report. Unfortunately, at about 100 seconds per file, the researcher would have to wait over a day to recall all 1000 files. Even larger numbers of files would cause proportionally more delay, and the assumption that a tape drive is continuously available for recalls during the required extended time period may not be valid.
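The "over a day" figure follows from the six recall times measured in the test above. The sketch below assumes files are recalled one at a time and that each takes the measured average; the function name is illustrative.

```python
# Recall times in seconds from the six-file test described above.
recall_seconds = [15, 37, 106, 223, 105, 105]

def estimate_recall_hours(file_count, samples=recall_seconds):
    """Estimate total recall time in hours, assuming sequential recalls
    that each take the average of the measured samples."""
    average = sum(samples) / len(samples)
    return file_count * average / 3600

average_seconds = sum(recall_seconds) / len(recall_seconds)
print(f"average recall: {average_seconds:.1f} s")                 # average recall: 98.5 s
print(f"1000 files: {estimate_recall_hours(1000):.1f} hours")     # 1000 files: 27.4 hours
```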

Rather than storing a large number of files, the files should be combined into a few archive files, each smaller than 2 GB, using a utility such as tar or cpio. The archive files should then be stored. When it is time to recall a collection of files, the archive files can be recalled relatively quickly, and the desired files extracted.
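One way to produce such archives is sketched below using Python's standard tarfile module. The function names, the archive naming scheme, and the 16 MB safety margin are assumptions made for illustration; the margin is meant to leave room for tar headers and padding and may need to be larger for very many small files.

```python
import os
import tarfile

# Stay below the HSM's file size limit of about 2 GB.
ARCHIVE_LIMIT = 2_147_483_647
MARGIN = 16 * 1024 * 1024  # assumed allowance for tar headers and padding

def plan_archives(sizes, limit=ARCHIVE_LIMIT - MARGIN):
    """Group (path, size) pairs, in input order, into batches whose total
    size stays within limit. A single file larger than limit ends up in a
    batch of its own and must be split by other means."""
    batches, current, total = [], [], 0
    for path, size in sizes:
        if current and total + size > limit:
            batches.append(current)
            current, total = [], 0
        current.append(path)
        total += size
    if current:
        batches.append(current)
    return batches

def write_archives(paths, prefix):
    """Write prefix-000.tar, prefix-001.tar, ... and return the archive names."""
    sizes = [(p, os.path.getsize(p)) for p in paths]
    names = []
    for i, batch in enumerate(plan_archives(sizes)):
        name = f"{prefix}-{i:03d}.tar"
        with tarfile.open(name, "w") as tar:
            for path in batch:
                tar.add(path)
        names.append(name)
    return names
```

The archives would be written to local disk first and then copied to the HSM, so that each file stored there is a single large archive.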

The HSM should not be used to store files that are used regularly. Otherwise, a recall operation may often be necessary. Unproductive "wars" between applications can occur as each application attempts to use more space than is available in a filesystem.

Do not expect to be able to retrieve arbitrarily large amounts of data and have all the data accessible at the same time. The size of each space-managed filesystem in which users may store files is 64 GB. If 100 GB of data is stored in 50 files of 2 GB each, the retrieval of one or two files would not be a problem. However, if an attempt is made to retrieve all the files, only the first 32 could be retrieved successfully (assuming no other user is concurrently retrieving files). Once the filesystem is full, the retrieval of the 33rd file would cause one of the first 32 to be migrated again to the disk cache (or even to tape), the retrieval of the 34th would cause one of the first 33 to be migrated, and so on. The end result would be that only 32 of the 50 files are resident in the filesystem at any one time.
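The effect can be illustrated with a small simulation. It assumes that, when space runs out, the file recalled longest ago is migrated first; the HSM's actual selection policy is not described here, so the particular files left resident are illustrative, while the count of 32 follows from the sizes alone.

```python
from collections import deque

def simulate_recalls(file_sizes_gb, filesystem_gb=64):
    """Recall files in order and return the indexes still resident at the end.

    Assumption: the file recalled longest ago is migrated first when
    space is needed.
    """
    resident = deque()  # (index, size) pairs, oldest recall on the left
    used = 0
    for index, size in enumerate(file_sizes_gb):
        # Migrate older files until the new one fits.
        while resident and used + size > filesystem_gb:
            _, evicted_size = resident.popleft()
            used -= evicted_size
        resident.append((index, size))
        used += size
    return [index for index, _ in resident]

still_resident = simulate_recalls([2] * 50)
print(len(still_resident))  # 32
```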

The proper way to deal with the recall of the 100 GB is to access only a few (at most 32) of the 2 GB files at a time. If it really is necessary to have all 100 GB available for simultaneous use, then the recalled files must be copied to another machine, again a few at a time, since the filesystems in the HSM are not large enough to accommodate the application.

Data Storage and Access

Users on remote machines create files within space-managed filesystems on the RS6000. If available space in one of the filesystems becomes low, files are copied (migrated) to a large disk cache until enough space is made available. The cache is emptied to tape as needed. When an attempt is made to access a previously migrated file, HSM software automatically retrieves (recalls) the file contents either from the disk cache or from tape.

Data retrieval (recall) is automatically initiated when a migrated file is accessed. The recall may take several minutes if the file must be copied from tape; in this case, a tape must be mounted by the robot, the tape must be positioned, and the file must be copied to disk.

On some machines, it is possible for access to a migrated file to fail. For example, on Solaris you may see the message

      file temporarily unavailable on the server, retrying...

A recall operation is initiated but the accessing program does not wait for the operation to complete. Such failures may be eliminated by causing a recall to be done before the program is run. The command

    /common/uva/bin/hsmread  /path/to/file1  /path/to/file2 ...

may be used to repeatedly attempt to read the last byte of each file until the read is successful, at which time it is guaranteed that the entire file is disk resident. While this should eliminate most failures, it is possible that after a file is recalled, it is again migrated because other large files have subsequently been recalled.
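The retry-until-readable idea can be sketched as follows. This is not the source of hsmread, whose implementation is not shown here; the function name, the number of attempts, and the delay between attempts are assumptions.

```python
import os
import time

def wait_until_readable(path, attempts=60, delay=10.0):
    """Try to read the last byte of path until the read succeeds.

    Returns True once a read succeeds, or False if every attempt fails.
    attempts and delay are illustrative defaults, not values from hsmread.
    """
    for attempt in range(attempts):
        try:
            with open(path, "rb") as f:
                if os.path.getsize(path) > 0:
                    f.seek(-1, os.SEEK_END)  # position on the last byte
                    f.read(1)                # forces a recall if migrated
            return True
        except OSError:
            # Pause before retrying, but not after the final attempt.
            if attempt + 1 < attempts:
                time.sleep(delay)
    return False
```

A program could call this for each of its input files before starting work, keeping in mind the caveat above that a recalled file may be migrated again.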

When disk space becomes low and a file is migrated to the disk cache or to tape, the file is replaced with a "stub file". A stub file appears to Unix commands such as ls to be the original file; it is actually a small file containing the first 4095 bytes of the original file. If an attempt is made to access more than the first 4095 bytes, a recall is performed, and the original file contents are restored.
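Because the stub holds the start of the file, a program that only needs a file's header can limit itself to the first 4095 bytes. The sketch below does this with an unbuffered read so that Python itself does not read ahead past the requested length. Whether the NFS or SMB client reads ahead on its own is outside the program's control, so this is an assumption-laden illustration rather than a guaranteed way to avoid a recall.

```python
STUB_BYTES = 4095  # leading portion of the file kept in a stub, per the text above

def peek_header(path, length=STUB_BYTES):
    """Return up to the first STUB_BYTES bytes of a file.

    buffering=0 makes Python issue a read for exactly the requested
    length instead of filling a larger internal buffer.
    """
    length = min(length, STUB_BYTES)
    with open(path, "rb", buffering=0) as f:
        return f.read(length)
```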

Filesystems

Three space-managed filesystems are available to remote users. The first (20 GB) contains files which are maintained by ITC staff, and which can be read, but not modified, by remote machines. The filesystem may use up to 600 GB of migration storage.

Files in the other two filesystems (64 GB each) may be modified by remote machines. NFS protocol is subject to security problems; to minimize the potential for problems, one of the filesystems is accessible only to ITC machines. These are machines to which only ITC staff has root access. The other filesystem is accessible to non-ITC machines. Each filesystem may use up to 1.2 TB of migration storage.

Backup of HSM Data

In order to preserve user data in case of hardware failure or accidental removal, files written into an HSM-managed filesystem are backed up nightly to an offsite location. The HSM has been configured to require the existence of a backup of a file before the file can be migrated from the filesystem. A downside to this requirement is that it limits the rate at which data can be written into the filesystem: since an HSM filesystem can contain only 64 GB at a time, no more than 64 GB can be written into it before a backup must be done. Backups are done once per day, so a maximum of 64 GB of new or modified data can be put into the filesystem in a day. Exceeding this rate causes all space in the filesystem to be exhausted until the next backup is done.
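The daily limit makes it easy to estimate how long a large transfer must be spread out. The sketch below assumes the whole 64 GB is available each day, that is, no other user is writing to the same filesystem; the function name is illustrative.

```python
import math

DAILY_LIMIT_GB = 64  # one filesystem's worth of new data per nightly backup

def days_to_load(total_gb, daily_limit_gb=DAILY_LIMIT_GB):
    """Minimum number of days needed to write total_gb of new data."""
    return math.ceil(total_gb / daily_limit_gb)

print(days_to_load(100))   # 2
print(days_to_load(1000))  # 16
```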

The following is a typical scenario: From a remote machine, file F is created or modified within an HSM-managed filesystem. That night, F is backed up offsite. Once the backup is performed, two copies of F exist: one on disk, the other offsite. Subsequently, if filesystem space is needed, F can be migrated to the disk cache or to tape. Duplicate copies of migrated files are maintained in the HSM, so soon after F is migrated three copies of F exist. If a file is accidentally removed, it can be restored from the backups. Send e-mail to systems@Virginia.EDU if a file restore is needed.

Many Unix filesystems on machines administered by ITC are backed up with the ITC Unix backup software. The HSM does not use this software; commercial Tivoli backup software is used instead, which provides the following:

Point-in-Time Restores

Typical Unix filesystems are high-speed multipurpose storage areas. Files are instantly available, may appear, disappear or change frequently and unpredictably, and may be of vastly different sizes. Because the nature of a filesystem includes change, and some changes are undesirable in retrospect, the ability to restore files to their state at a certain point in time is important. Point-in-time restores are an integral part of the ITC Unix backup software for typical Unix filesystems. A system of periodic full and multilevel incremental backups allows the recovery of files as they existed at discrete times, with granularity becoming coarser as time into the past increases.

In an HSM-managed filesystem, modifications of files can be slow and awkward because they may require retrievals from tape. Furthermore, storage of many small files is not advised because a separate tape access is needed for the retrieval of each file. An HSM filesystem is therefore best suited to simply storing static large files. If the files stored are static, and only new files are added, point-in-time restores are not needed. In reality, occasional modifications or removals may be expected. However, the frequency of such changes, and the percentage of total HSM data affected, is expected to be low. Consequently, backups for point-in-time restores would only be needed at long intervals, and one may question the need for them at all. Tivoli software can be made to produce long-term backups for point-in-time restores. However, producing the backups is resource-intensive and adds new administrative requirements. Perhaps most significantly, it is not feasible given current resources. Therefore, point-in-time restores for the HSM are only available for the past 30 days.

Further Information

Click here for more information about the HSM, such as how to request space or establish remote access.

© 2008 by the Rector and Visitors of the University of Virginia.
