GEOSS Components and Description
GEOSS consists of a relational database, a web interface, and an analysis suite, along with a repository for each user's files and a comprehensive security system. The gene expression workflow begins when a researcher describes an experimental protocol in minimal form. When samples are ready, the researcher creates an order for the microarray research center, which at the University of Virginia (UVa) is the Biomedical Research Facility (BRF). The BRF hybridizes the samples as specified by the researcher and imports the resulting data into GeneX. At this stage the data is available to the researcher and can be exported, analyzed, or both. Our Analysis Tree package allows users to build a flow chart (yes, there is a graphical flow chart on the screen) of routines to analyze data and produce reports. Except for system administrators, all interaction with the system is via secure web pages. End users need no special software. All data is warehoused on powerful, secure servers.
Data is loaded by curators (personnel in the microarray center) via a web interface. Data is visible only to the researcher who owns it, unless the researcher explicitly grants permissions to others. The security system allows all users to create groups, to control group membership, and to enable or disable group read and write permissions. Permissions apply separately to studies/experimental conditions, orders, data, and derived data (files).
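The permission rules described above can be illustrated with a small Python sketch. This is not the GEOSS implementation (GEOSS is written in Perl); the `Resource` class, the `GROUPS` table, and the `is_allowed` function are hypothetical names, and the sketch assumes each resource carries one group with separate read and write switches.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical membership table: group name -> set of user names.
GROUPS = {"lab": {"bob"}}


@dataclass
class Resource:
    """A study, order, data set, or derived file (assumed model)."""
    kind: str
    owner: str
    group: Optional[str] = None   # group the owner has shared with, if any
    group_read: bool = False      # group members may read
    group_write: bool = False     # group members may write


def is_allowed(user, resource, action):
    """Return True if `user` may perform `action` ("read" or "write")."""
    if action not in ("read", "write"):
        raise ValueError("unknown action: " + action)
    if user == resource.owner:
        return True  # the owner always has access
    members = GROUPS.get(resource.group, set())
    if user not in members:
        return False  # default: visible only to the owner
    return resource.group_read if action == "read" else resource.group_write


# Alice shares her data read-only with the "lab" group.
data = Resource(kind="data", owner="alice", group="lab", group_read=True)
```

Because each resource carries its own flags, sharing a data set in this sketch does not share the order or study it came from, which mirrors the statement that permissions apply separately to each kind of object.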
Our Analysis Tree allows the user to build a graphical representation of the flow of data through various modules of an analysis. This system will be documented elsewhere.
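As a rough illustration of data flowing through a tree of analysis modules, here is a minimal Python sketch. It is not the Analysis Tree code itself; the `Node` class, `run_tree`, and the three routines are hypothetical stand-ins for real analysis routines (which in GEOSS are mostly written in R).

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One box in the flow chart: a named routine plus downstream nodes."""
    name: str
    routine: Callable[[list], list]
    children: List["Node"] = field(default_factory=list)


def run_tree(node, data, results=None):
    """Run a node on `data`, then feed its output to every child node."""
    if results is None:
        results = {}
    output = node.routine(data)
    results[node.name] = output
    for child in node.children:
        run_tree(child, output, results)
    return results


# Hypothetical routines standing in for real analysis modules.
def filter_low(values):
    return [v for v in values if v >= 10.0]


def log2_transform(values):
    return [math.log2(v) for v in values]


def mean_report(values):
    return [sum(values) / len(values)]


tree = Node("filter", filter_low, [
    Node("log2", log2_transform, [Node("mean", mean_report)]),
    Node("count", lambda values: [len(values)]),
])
results = run_tree(tree, [4.0, 16.0, 64.0, 2.0])
```

Each node sees only its parent's output, so branches such as "log2" and "count" in this sketch work from the same filtered data without affecting each other.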
The servers run Fedora Linux. GEOSS is written in Perl. Most of the analysis routines are written in R, some in C, and some in Perl. We use PostgreSQL as our relational database. Its support for transactions and its high availability were critical to the project. Users must log in to the system. All web pages are accessed via SSL, so all data traveling between the user's web browser and the server is secure from eavesdropping.
GEOSS requires several Perl modules that are not standard in Perl 5.8, as well as R and, of course, Apache. GEOSS is compact: all of its Perl amounts to only 11,000 lines of code. Installation has been vastly simplified and should take only a few hours.
While GEOSS is theoretically portable to Windows NT, 2000, or XP, a few aspects of such a port would be nontrivial.
History of GeneX
There are currently three parallel GeneX projects. All three systems are composed of a relational database, a web-based user interface, and an analysis system. All three teams interact on a regular basis, and share as much technology as possible. We also get help from several other groups for additional analysis modules and testing. The three projects are:
- GeneX-Lite
- GEOSS (formerly known as GeneX 1.x and GeneX Va)
- GeneX 2
GeneX began as a project at the National Center for Genome Resources (NCGR). Their web site is located at: http://www.ncgr.org/
The first useful version was GeneX 1.4, which NCGR released to the public domain. An open source project was started and hosted on SourceForge. A smaller open source project continued at NCGR and has become GeneX-Lite.
A small team here at the University of Virginia tried to rush GeneX 1.x into production, but we found it necessary to make extensive changes. GeneX 1.x has been renamed GeneX Virginia (or the shorter GEOSS).
In the meantime, Caltech and Open Informatics began work on GeneX 2 by modifying the 1.4 schema to have a high degree of MAGE compliance.
GeneX-Lite
GeneX-Lite is NCGR's latest contribution to the NSF grant effort. The design of GeneX-Lite is based on lessons learned from the GeneX 1.x system. The primary problems with the prototype system were with the Curation Tool, system installation, and data loading. Many parts of the GeneX 1.x system were good, such as the web interface and the integration with the analysis tools. We are currently working with the TIGR Multiple Experiment Viewer to provide an interface to GeneX-Lite. This tool has many normalization and analysis tools built in and provides a simple interface for adding more.
GeneX-Lite was built with a simple data loading mechanism as the heart. This data loading mechanism can process tab-delimited files of arbitrary format. The storage of the data in the database is much more efficient than was the case for GeneX 1.x. Consequently, data loading with GeneX-Lite is much quicker than with GeneX 1.x.
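To give a sense of what loading a tab-delimited file of arbitrary format can look like, here is a minimal Python sketch. It is not the GeneX-Lite loader; `load_tab_delimited` and the sample columns are hypothetical, and the sketch assumes the first row of the file names the columns.

```python
import csv
import io


def load_tab_delimited(stream):
    """Read a tab-delimited table whose first row names the columns.

    Returns the column names and one dict per data row. The column set is
    taken from the file itself, so no layout is fixed in advance.
    """
    reader = csv.DictReader(stream, delimiter="\t")
    rows = [dict(row) for row in reader]
    return reader.fieldnames, rows


# Hypothetical sample data; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    "spot\tgene\tsignal\n"
    "1\tYAL001C\t812.5\n"
    "2\tYAL002W\t97.0\n"
)
columns, rows = load_tab_delimited(sample)
```

Values are kept as strings here; a real loader would still need to decide which columns hold measurements and convert them before storing them in the database.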
Annotation is very generalized in GeneX-Lite. Annotation of any kind may be attached to Experiments, Array Layouts, Array Measurements, Features, and Measurement Factors. The annotation mechanism is currently being enhanced to provide controlled vocabularies similar to those of the original GeneX 1.x system. When this is accomplished correctly, the system will be MIAME and MAGE compliant.
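One way to picture generalized annotation is a single store keyed by the kind of object and its identifier. The Python sketch below is only an illustration under that assumption; `AnnotationStore`, its methods, and the example key and value are hypothetical and do not reflect the actual GeneX-Lite schema.

```python
from collections import defaultdict

# The object kinds named in the text as accepting annotation.
ANNOTATABLE = {
    "Experiment",
    "Array Layout",
    "Array Measurement",
    "Feature",
    "Measurement Factor",
}


class AnnotationStore:
    """Keeps free-form (key, value) annotations for any annotatable object."""

    def __init__(self):
        self._notes = defaultdict(list)

    def attach(self, entity_type, entity_id, key, value):
        """Attach one annotation to the object (entity_type, entity_id)."""
        if entity_type not in ANNOTATABLE:
            raise ValueError("cannot annotate: " + entity_type)
        self._notes[(entity_type, entity_id)].append((key, value))

    def lookup(self, entity_type, entity_id):
        """Return a copy of the annotations for one object, oldest first."""
        return list(self._notes.get((entity_type, entity_id), []))


store = AnnotationStore()
store.attach("Experiment", 1, "organism", "yeast")
```

A controlled vocabulary could be layered onto a store like this by checking `key` and `value` against an approved list inside `attach`.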
Installation of the GeneX-Lite system is much easier than the original system. Minimal configuration is required to have GeneX-Lite up and running.
Another important feature of GeneX-Lite is that the core functions are all accessible via the command line. This means that repetitive loading and annotation could be automated through the use of scripts. The core functions are also wrapped with a user-friendly GUI.
A web interface is being developed as well as integration with analysis and visualization tools. Enhanced security is also planned. The current GeneX-Lite works with both Postgres and Oracle databases. Porting to another database would not be difficult due to the simple and flexible design of GeneX-Lite.
GeneX-Lite is supported on Solaris, Linux, Windows, and Macintosh (OS X).
This information, the latest version of GEOSS, instructions, and more can be found on the GEOSS Home Page.
Please email comments or suggestions about this page to achs-secretary@virginia.edu.
Academic Computing Health Sciences
Box 800555
Charlottesville, VA 22908
(434) 982-4025
