Introduction to S-PLUS – Part I
Research Computing Support Group
1.1 The Various Windows That You Will Use
2 Opening Existing S-PLUS Data File
4 Defining Labels and Declaring Missing Values
5 Generating Simple Summary Statistics
6 Comparing Gender Differences in Mean Salary
10 Generating a Cross-Classification Table
12 Generating a Simple Linear Regression
15 Appendices
15.2 Statistical Computing at UVA
The two primary windows you will use in S-PLUS are the S-PLUS Commands Window and the S-PLUS Object Explorer. The Commands window allows you to access the powerful S-PLUS programming language. You can modify existing functions or create new ones tailored to your specific analysis needs. Using the customization features of S‑PLUS, any function may be executed from a dialog that is invoked by a menu item or toolbar button.
To have these two windows open each time you start S-PLUS, modify the general settings of S-PLUS. The steps are below, along with a screen capture showing how to open the OPTIONS menu and select the two windows to be opened at startup.


This will open these two windows the next time we start S-PLUS, but how do we open them now? You can do it in two simple steps:
1. To open the Object Explorer, click on the FILE menu, then choose NEW, and in that menu select “Object Explorer”; either double-click on it, or highlight it and click the OK button. The picture below shows this menu choice highlighted.

2. To open a Commands window, move your mouse pointer to the WINDOW menu, click on it, then point and click on the “Commands Window” choice. The picture below shows this being selected.

Nota Bene: You can open these two windows in any order. It doesn’t matter which you open first.
The S-PLUS environment is object-oriented. This means that everything in S-PLUS is a distinct, editable object. Some of these objects such as data sets and functions are automatically stored by S-PLUS in internal databases. Other types of objects only exist in the current session and must be saved in order to be stored permanently. Graph sheets and scripts are examples of objects that must be saved to disk or they are “lost” at the end of the session.
Chapter 2 of The S-PLUS Programmer’s Guide for Windows provides an overview of the Data Objects in S-PLUS.
The class of an object specifies how that object is represented in S-PLUS, what actions are permitted on that object and how those actions are performed. The most common classes of objects are numeric, character, factor, list and data.frame. Some of the most frequently used object types in S-PLUS are:
· vectors - ordered strings of data values. All data must be of the same mode, i.e., all numeric, all character, all logical, etc.
· matrices - rectangular arrays, with rows and columns, of like-mode data.
· data frames - rectangular arrays. Columns and rows have identifying names. Columns which correspond to variables may be of different modes.
· lists - ordered collections of objects which may be of different types and modes. For example, a data frame, matrices, and vectors may be joined to form a list.
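A minimal sketch of the four object types just listed, as they might be created at the Commands window prompt. The values and object names (ages, initials, m, people, bundle) are made up for illustration and are not part of the course data. The code uses only basic S language functions and is assumed to behave the same way in S-PLUS as it does in R.

```r
# Toy values for illustration only; they are not part of the course data.
ages <- c(25, 31, 47)                       # vector: every element is numeric
initials <- c("a", "b", "c")                # vector: every element is character
m <- matrix(1:6, nrow = 2, ncol = 3)        # matrix: 2 rows, 3 columns, one mode
people <- data.frame(initial = initials, age = ages)   # columns of different modes
bundle <- list(ages = ages, m = m, people = people)    # list holding mixed objects

mode(ages)       # reports the mode of the vector
dim(m)           # number of rows and columns of the matrix
names(bundle)    # names of the components stored in the list
```

Typing the name of any of these objects at the prompt prints its contents.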
The Object Explorer is the S-PLUS interface for manipulating and visually organizing objects into a meaningful structure.
To see the sample data objects, perform these steps:


The Object Explorer provides a detailed map of S-PLUS. This two-paned window displays the data sets, graphs, functions, and other objects in your S-PLUS session. The left pane shows a hierarchical tree view of the objects in the current session. The right pane displays the objects that are contained in a left pane selection—just as in the Windows 95 Explorer. By using the Object Explorer, you can easily select data, functions, and objects to simplify the preparation of your analysis.
Double-click on “air” to view the air data. This opens the air data in a new S-PLUS window, the Data Window. In S-PLUS the Data Window is the primary tool for viewing and manipulating data.
The S-PLUS Data window resembles a spreadsheet. Data can either be typed directly into a new Data window or you can open an existing data set into the window. S-PLUS comes with several sample data sets stored in internal databases.
The Data window displays data sets in an editable spreadsheet format. It handles data in a column-oriented manner. Data can be edited within the Data window. Columns can also be copied or moved from one Data window to another. This allows you to easily manipulate your data for a wide variety of operations and analyses.
When a Data window is opened, the Data window toolbar is automatically displayed under the standard toolbar. The Data window toolbar contains buttons for editing commands frequently used when working with data. Sliding the mouse over an icon in the Data window toolbar causes a popup message to appear, which gives a brief explanation of the function of that button.

Choosing File from the standard menu, and then New allows you to open one of several types of S-PLUS Windows.

The Script window is designed for creating, modifying, and running scripts of S-PLUS commands. Script windows are an alternative to the Commands window. The Commands window is interactive, and commands typed in the window are immediately evaluated through the interpreter, with the output shown below each command. The Script window, on the other hand, lets you type a set of commands and functions, and only evaluates them on demand.
The Report window is similar to the Script window. Both are primarily text windows which can be opened and saved via the File menu and are editable. Unlike the Script window, the Report window does not deal with programs or scripts. The Report window is a place-holder for the text output resulting from any statistical operation. Error messages and warnings are sometimes placed in a Report window. When the Commands window is closed and a dialog is launched, output is directed to the Report window. Text in the Report window can be formatted before cutting and pasting it into another application.
The Chapter utility in S-PLUS creates a working directory that contains data you may want to import to or export from S-PLUS. It contains a .Data directory to hold data objects, metadata objects, and help files.
Start the Chapter utility by clicking on File, Chapters, and Attach/Create Chapter. Enter C:\temp for the Chapter Folder that you want to place in the search path. If the Chapter Folder doesn't exist, you can create one and attach it in this dialog.
temp is used as a default label for the folder. This is the name that will appear in the Object Explorer SearchPath. You can also specify the Position where you want to attach the Chapter Folder. Set the Position to 1.

The first way to enter commands is by using the pull-down menus and toolbar buttons. These are easy to use and have Help options built directly into each procedure.

The second way to issue commands is by using the Commands Window in S-PLUS. The Commands Window is much like the Script window, which we will use later, except that it is interactive. In conjunction with menu options, it is useful in exploring data. The Commands Window can be invoked by clicking on the icon in the standard toolbar. If the window is in the background, it can be brought to the foreground by clicking on the icon twice.

We will work more with S-PLUS commands in the second session. A simple way to use S-PLUS is as a calculator. As shown in the picture below, you can have S-PLUS add “2+3” or give you the cosine of 4. Try it:
1. Click on the Commands Window. At the S-PLUS “>” (greater-than) prompt, type 2+3 and press Enter. S-PLUS will give you the result.
2. At the new S-PLUS prompt, type cos(4) and press Enter. Again, the result is displayed. This command demonstrates a common trait of S-PLUS commands and functions: they consist of a keyword followed by parentheses that enclose the arguments.
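For reference, the two calculator commands from the steps above would look roughly like this when typed at the prompt. The ">" prompt itself is not typed, and the comments are annotations added here. Note that cos() takes its argument in radians, not degrees.

```r
2 + 3     # addition; the result is 5
cos(4)    # cosine of 4, where the argument is in radians
```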

The Commands window is often used in conjunction with text copied from the History log or History commands window. The History windows are also excellent resources for obtaining prototypical code for script files.
The third way to enter commands is to issue text commands from within the S-PLUS Script window. Each Script window has two panes. The upper pane is the "program" pane in which you can enter or copy commands. The lower pane is for script output. When you run your script, all output, such as Print commands and warnings and errors, appears in this pane. Text commands can come from one of three places:
We will work more with the Script window in the second session. For now here’s a sample Script window:

In order to answer our question about the local bank, we will need data about the people who work for the bank. This data is already contained in an S-PLUS data file called bank.sdd.
To download the files for the class to your local machine, perform these steps for each file. First, right-click your mouse on the file name below and select the option to "Save Target As" (using Windows Internet Explorer). Save the file to the local directory C:\temp.
These files are currently available at http://www.itc.virginia.edu/research/splus/training/splus6
Once you have saved the files on your local machine, open the bank data file (bank.sdd) in S-PLUS as follows:
1. On the menu bar, click on FILE, and then click on “OPEN…”, which is the second choice on this menu.
2. Choose the file you want to open by browsing to it and click open. You may need to use the standard Windows navigation aids in this window to navigate to the C:\TEMP directory where you saved this data file.
A data set is made up of observations or cases. One case or observation is the basic unit of all data one wishes to analyze. A case consists of all the different data values for a particular subject, animal, time point, etc. Variables are made up of the data values that describe a particular characteristic for all of the cases. How you arrange your data and what you decide is the unit of analysis (the case or observation) is critically important! The questions you can ask and the analyses you can perform to get the answers are in large part determined by how you arrange your data.
S-PLUS lets you edit your data sets as columns of information that can be displayed in Data windows. You can have many different Data windows, each displaying a different data set. A Data window is similar to a spreadsheet, but is column-oriented rather than cell-oriented, meaning that most of the operations work on columns as units. Data windows provide access to powerful features for editing and transforming data.

This example illustrates how cases and variables are organized in the S-PLUS Data window. Cases appear in the rows of the Data Window; in this example, each row represents a bank employee. Variables appear in the columns of the Data Window, with the variable name at the top of the column. Note that S-PLUS function words and other reserved words cannot be used as variable names. Variable names must start with a letter and may contain any combination of letters, numbers and periods. The first case is a male born in 1952, with 15 years of education, earning $57K in job category 3, et cetera.
In addressing our research question, the variables that we would first like to look at are GENDER and SALARY. These are shown above in the second and sixth columns of the Data Window.
A basic procedure for taking a summary look at a variable is to look at the number of cases associated with each value of a variable (or the frequency of each value for a variable in the data set). The Crosstabulations procedure in S-PLUS generates a table with this information. Let's generate a frequency table for the variable GENDER:


The output from this procedure is the following:
*** Crosstabulations ***
Call:
crosstabs(formula = ~ Gender, data = bankdata, na.action = na.fail, drop.unused.levels = T)
474 cases in table
+-------+
|N      |
|N/Total|
+-------+
Gender |
       |       |RowTotl|
-------+-------+-------+
1      |255    |255    |
       |0.54   |0.54   |
-------+-------+-------+
2      |215    |215    |
       |0.45   |0.45   |
-------+-------+-------+
99     | 4     |4      |
       |0.0084 |0.0084 |
-------+-------+-------+
ColTotl|474    |474    |
       |1      |       |
-------+-------+-------+
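The N/Total entries in this table are each count divided by the total number of cases. As a rough check, the proportions can be recomputed at the Commands window from the counts printed in the output; the object names counts, total and props are chosen here for illustration only.

```r
# Counts copied from the crosstabulation output for Gender.
counts <- c(255, 215, 4)
names(counts) <- c("1", "2", "99")

total <- sum(counts)        # number of cases in the table
props <- counts / total     # proportion of cases in each category
round(props, 4)
```

The four cases coded 99 account for less than one percent of the data, which is why their proportion prints with more decimal places than the others.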
There are several problems with this GENDER variable. First, it is not obvious which category of GENDER represents men and which women. Second, there are three categories (1, 2, and 99), but only two genders (male and female). To interpret this frequency table, we would have to rely on a codebook that identified the meaning of each category. The coding for GENDER in a codebook would look something like this:
|
GENDER: |
You should add this type of information about each variable to the data file, so that it is always available and so that your output is more easily interpretable. Adding such information involves defining variable labels and missing values, which is what we'll do next.
A good data set will include variable labels that provide a fuller description of each variable. In our example, we may want to give a fuller description of the variable GENDER such as "Gender of the respondent."
Let's create a variable label for GENDER. The steps are shown here:

The description will appear as a ToolTip when you pause the cursor over the column name.

Any variable for which a valid value cannot be read from raw data or computed is assigned the system-missing value. S-PLUS fills the empty cells with “NA”.
For some types of variables (especially continuous variables), we will want to obtain summary statistics other than the number of cases in each category of the variable. For example, we might be interested in the mean, median, or standard deviation of a particular variable. The variable SALARY has too many values for a frequency table to have any meaning, but we would be interested in knowing things like the mean of SALARY and its highest and lowest values, as an aid to knowing whether the data were entered correctly. S-PLUS can generate such information through the Summary Statistics procedure:



The output from the summary statistics procedure is the following:
*** Summary Statistics for data in: bank ***
SALARY
Min: 15750.00
Mean: 34419.57
Max: 135000.00
Total N: 474.00
NA's : 0.00
Std Dev.: 17075.66
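The same kinds of quantities can be computed one at a time at the Commands window. The sketch below uses a short vector of hypothetical salaries (not the values in bank.sdd) with one missing value, and computes the standard deviation as the square root of the variance.

```r
# Hypothetical salaries for illustration only; NA marks a missing value.
salary <- c(21000, 35000, 57000, NA)

ok <- salary[!is.na(salary)]    # keep only the non-missing values

min(ok)                 # lowest value
mean(ok)                # arithmetic mean
max(ok)                 # highest value
length(salary)          # total N, including missing values
sum(is.na(salary))      # number of NA's
sqrt(var(ok))           # standard deviation
```

Dropping the missing values explicitly, as done here, makes it clear which cases each statistic is based on.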
While the Summary Statistics procedure gives us a general picture of the SALARY variable, what we're really interested in is the difference in salaries between men and women. S-PLUS can also provide us with the means for various groups using the Summary Statistics procedure. Let's see how mean salaries differ for men versus women:
1. From the menu bar, select Statistics – Data Summaries - Summary Statistics.
2. Highlight the variable SALARY in the Variables window.
3. Highlight the variable GENDER in the Group Variables window. Click “Statistics” on the menu of the Summary Statistics window.
4. In the Statistics dialog box, check the boxes beside the mean, number of rows and number of missing rows.
5. Click on "OK" to close this dialog box and submit the command request to S-PLUS.

The output from this procedure should be as follows:
*** Summary Statistics for data in: bank ***
GENDER:1
SALARY
Mean: 41342.67
Total N: 255.00
NA's : 0.00
Std Dev.: 19541.93
------------------------------------------------------
GENDER:2
SALARY
Mean: 26045.558
Total N: 215.000
NA's : 0.000
Std Dev.: 7572.996
------------------------------------------------------
GENDER:99
SALARY
Mean: 43175.00
Total N: 4.00
NA's : 0.00
Std Dev.: 18609.56
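Group means of this kind can also be sketched at the Commands window with tapply(), which applies a function to one variable separately within each level of another. The salaries and gender codes below are hypothetical toy values, not rows from the bank data.

```r
# Hypothetical salaries and gender codes, for illustration only.
salary <- c(57000, 40200, 21450, 21900, 45000)
gender <- c(1, 2, 2, 2, 1)

group.means <- tapply(salary, gender, mean)      # mean salary per gender code
group.sizes <- tapply(salary, gender, length)    # number of rows per gender code

group.means
group.sizes
```

The result is indexed by the group codes, so group.means["1"] picks out the mean for the group coded 1.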
As you can see, there is indeed a difference in salaries between men and women within the bank: on average, men make about $15,300 more than women ($41K vs. $26K). Possible explanations for this difference might include education, previous job experience, age, length of tenure with the bank, and sexual discrimination.
Let's evaluate the plausibility of education as an explanation of the difference in salaries between men and women: Is there a difference in the percentage of males and females who have completed high school and college?
To answer that question, let's start by looking at the actual education variable already in our data set. Look at the variable EDUC in the Data window. What type of variable is EDUC? Is the current variable coded according to whether the individual has completed high school and college? No. The current variable indicates the number of years of education completed by the respondent.
We can, however, change this variable so that only three categories are present: less than 12 years of education (from which we infer that the respondent did not complete high school), 12 to 15 years (from which we infer at least a high school diploma), and 16 years or more (from which we infer that the respondent has a college degree). This process of regrouping values is called recoding, and that's what we'll do next.
Variables can be recoded in one of two basic ways.
Into New Variable (generally preferred): Creates a new variable with new values based on the values of the original variable. In our example, a new variable would be created with three values (less than high school, high school degree, college degree or higher). We will name the new variable EDUC2. The value of the new variable would be based on the value of the original variable EDUC. The original variable EDUC would remain in the data set and its values would continue to represent the number of years of education completed by the respondent.
Into Same Variable: Overwrites the original variable, replacing the original values with the new values that you have specified. In our example, the old values of EDUC, which represent the number of years of education completed by the respondent, would be replaced by the new values (less than high school, high school degree, college degree or higher).
Unless you have some compelling reason to do otherwise, it is almost always better to Recode Into a New Variable. Doing so preserves the original values should you ever want to use the original values or recode in another manner.
When analyzing data, we often want to view quantitative or numeric data by way of categories. Right-click on the column for variable EDUC. Select Properties. You should see a dialog like this:

EDUC is represented in a double precision column and is a quantitative (numeric) variable. We selected Properties to get information on the EDUC variable - not make changes to EDUC. Click Cancel and no changes will be made.
Now let's recode the variable EDUC into a new categorical variable, EDUC2. The menu in S-PLUS distinguishes between grouping through creating categories and value-by-value recoding. We will use the Create Categories method:
1. From the menu bar, select Data – Create Categories

By default the data in the EDUC2 column appears as a string in terms of the left and right delimiters. In the expression 11+ thru 15, 11+ does not include 11. Here is how the new column EDUC2 appears:

Suppose we wish to identify categories with levels 0,1, and 2 by making the three following changes or recodings:
0+ thru 11 to 0
11+ thru 15 to 1
15+ thru 21 to 2
We would like to use the menu’s Recode dialog three times for the recoding, but this does not produce the proper recode, and no error message indicates the correct action.
We can, however, right-click on the column for EDUC2, select Properties, and edit the factor levels to "0", "1", "2".
Both the creation of EDUC2 and the desired labels for levels can also be completed in the Commands window with:
> bank$EDUC2 <- factor(cut(bank$EDUC, breaks=c(0, 11, 15, 21)), labels=c("0","1","2"))
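To see how the category boundaries listed earlier (0+ thru 11, 11+ thru 15, 15+ thru 21) treat values that fall exactly on a break point, the following sketch applies the same cut() and factor() pattern to a short hypothetical vector of years of education. It assumes that cut() closes each interval on the right, which is what the "11+ thru 15" labels suggest; the names educ, groups and educ2 are illustrative.

```r
# Hypothetical years of education, chosen to sit on and around the break points.
educ <- c(8, 11, 12, 15, 16, 21)

# Intervals are open on the left and closed on the right:
# (0, 11], (11, 15], (15, 21]
groups <- cut(educ, breaks = c(0, 11, 15, 21))
educ2 <- factor(groups, labels = c("0", "1", "2"))

table(educ2)    # number of toy cases falling in each category
```

With these breaks, 11 years falls in category 0 and 12 years in category 1, which matches the "less than 12 years" definition of the lowest group.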
Now let’s generate some statistics to check that the new variable looks correct. Use the standard menu to run a crosstabulation on the variable EDUC2. The Crosstabulations procedure should yield these results in a new report window:
*** Crosstabulations ***
Call:
crosstabs(formula = ~ EDUC2, data = bank, na.action = na.fail, drop.unused.levels = T)
474 cases in table
+-------+
|N      |
|N/Total|
+-------+
EDUC2  |
       |       |RowTotl|
-------+-------+-------+
0      | 53    |53     |
       |0.11   |0.11   |
-------+-------+-------+
1      |312    |312    |
       |0.66   |0.66   |
-------+-------+-------+
2      |109    |109    |
       |0.23   |0.23   |
-------+-------+-------+
ColTotl|474    |474    |
       |1      |       |
-------+-------+-------+
Now let’s recode the GENDER variable to replace “1, 2, 99” with “MALE, FEMALE, NA”, respectively. In order to do this, we need to define what type of data is stored in this variable. We do not want to overwrite GENDER, so first we will make a copy of GENDER to GENDER2, then recode GENDER2. Since GENDER is a categorical variable, we will change the data type of GENDER2 accordingly.
1. Highlight the bankdata object. Choose Insert, Column.

Name the variable GENDER2 and enter GENDER as the Fill Expression:

2. Right-click on the GENDER2 heading to highlight this variable and, from the menu bar, select Data-Change Data Type. Type in “FACTOR” in the new type field, then click OK.

3. From the menu bar, select Data-Recode. Select the GENDER2 column, click on “1” in the current value field, and type “MALE” in the new value field. Click on Apply to replace the old values with the new values. Repeat the same procedure for females (“FEMALE”) and missing values (NA). Click OK when you are done.
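A command-line sketch of the same idea, using a short hypothetical vector of gender codes rather than the bank data: factor() attaches labels to the codes named in levels, and any code not listed there (here 99) is assumed to become NA. The object names gender and gender2 are illustrative.

```r
# Hypothetical gender codes: 1 = male, 2 = female, 99 = missing.
gender <- c(1, 2, 99, 1, 2)

# Codes not listed in levels (here 99) become NA in the factor.
gender2 <- factor(gender, levels = c(1, 2), labels = c("MALE", "FEMALE"))

table(gender2)         # counts of MALE and FEMALE
sum(is.na(gender2))    # number of missing values
```

Because the original vector gender is left untouched, this follows the same "recode into a new variable" practice recommended earlier.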

Now that we have recoded the variable EDUC, we can determine the difference in the percentage of males and females who have completed high school and college. Again, making this comparison will help us assess the plausibility that differences in educational levels actually explain the difference in salaries between men and women.
Since both the new education variable and the gender variable are categorical variables, the appropriate procedure to assess differences in educational level across the two genders is to generate a Cross-Classification table. We do this in the following manner.
1. Under Statistics-Data Summaries, click on Crosstabulations.
2. In the dialog box which opens, highlight the variables GENDER and EDUC2 in the variables box, and click OK.
*** Crosstabulations ***
Call:
crosstabs(formula = ~ GENDER + EDUC2, data = bank, na.action = na.exclude, drop.unused.levels = T)
470 cases in table
+----------+
|N         |
|N/RowTotal|
|N/ColTotal|
|N/Total   |
+----------+
GENDER |EDUC2
       |0+thr11|11+th15|15+th21|RowTotl|
-------+-------+-------+-------+-------+
MALE   | 23    |150    | 82    |255    |
       |0.09   |0.59   |0.32   |0.54   |
       |0.43   |0.48   |0.77   |       |
       |0.049  |0.32   |0.17   |       |
-------+-------+-------+-------+-------+
FEMALE | 30    |160    | 25    |215    |
       |0.14   |0.74   |0.12   |0.46   |
       |0.57   |0.52   |0.23   |       |
       |0.064  |0.34   |0.053  |       |
-------+-------+-------+-------+-------+
ColTotl|53     |310    |107    |470    |
       |0.11   |0.66   |0.23   |       |
-------+-------+-------+-------+-------+
Test for independence of all factors
Chi^2 = 28.41314 d.f.= 2 (p=6.763404e-007)
Yates' correction not used
From this table, we can see that while 32% of male employees have a college degree or higher, only 12% of female employees have such an education. Thus, it is at least possible that education might account for salary differences between male and female bank employees.
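The row percentages quoted here, and the chi-square statistic reported under the table, can be recomputed by hand from the cell counts. The sketch below copies the six counts from the output and uses only basic matrix arithmetic; the object names are illustrative, and the chi-square is the ordinary Pearson statistic without Yates' correction.

```r
# Counts copied from the GENDER by EDUC2 table in the output.
counts <- matrix(c(23, 150, 82,
                   30, 160, 25), nrow = 2, byrow = T)
dimnames(counts) <- list(c("MALE", "FEMALE"), c("0+thr11", "11+th15", "15+th21"))

row.totals <- apply(counts, 1, sum)    # cases per gender
col.totals <- apply(counts, 2, sum)    # cases per education category
n <- sum(counts)                       # cases in the table

# Row proportions: each count divided by its row total.
row.props <- counts / row.totals
round(row.props, 2)

# Pearson chi-square: compare observed counts with the counts expected
# if gender and education were independent.
expected <- outer(row.totals, col.totals) / n
chi2 <- sum((counts - expected)^2 / expected)
chi2
```

The last column of row.props holds the college-degree shares for men and women, and chi2 should agree with the Chi^2 value printed by the Crosstabulations procedure.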
In order to fully assess whether education accounts for the salary difference, we can utilize a procedure called linear regression. Using regression, we can also examine the influence of the other factors that we hypothesized might account for salary differences: age, previous job experience, and length of tenure with the bank.
While there are already variables in the data set for job experience (PREVEXP) and tenure (JOBTIME), there is currently no variable for age, only one for date of birth. To remedy this situation, we will compute a new variable for age.
We will compute a new variable, AGE, in the following way:
1. From the menu bar, select Data - Transform.
2. Type the name of the new variable that you will create in the box under the "Target Column" heading.
3. Create the mathematical expression that represents the new variable by typing that expression into the box under "Expression".

4. Click on "OK" when done.
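The expression used in the course appears only in the screen capture and is not reproduced in this text. As one possible approach, and purely as an assumption, age can be approximated by subtracting the year of birth from a reference year. Both birth.year and ref.year below are hypothetical names and values, not columns of the bank data.

```r
# Hypothetical birth years and reference year; neither comes from the course data.
birth.year <- c(1952, 1958, 1929)
ref.year <- 2000                 # assumed year in which age is measured

age <- ref.year - birth.year     # approximate age in whole years
age
```

If the data set stores a full date of birth rather than a year, the year would first have to be extracted from the date before an expression like this could be used.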
Now that we have created the variable, AGE, we can proceed with the Linear Regression.
We could investigate a cross-tab and plot of each pair of variables. But since we've already shown the possible relevance of at least three explanatory variables, let's turn to a method which allows us to consider all of these variables simultaneously.

To generate a Linear Regression that addresses our research question, let’s first make two new data sets from our bank data – one for females and one for males.
This creates two new objects, BANKGEN.1 and BANKGEN.2.
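At the Commands window, a split of this kind can be sketched by selecting the rows that match each gender code. The data frame bank.toy below is a hypothetical miniature stand-in for the bank data, built only for illustration.

```r
# Hypothetical miniature version of the bank data, for illustration only.
bank.toy <- data.frame(GENDER = c(1, 2, 2, 1, 99),
                       SALARY = c(57000, 40200, 21450, 45000, 30750))

# Keep the rows whose GENDER code matches; all columns are retained.
BANKGEN.1 <- bank.toy[bank.toy$GENDER == 1, ]    # males
BANKGEN.2 <- bank.toy[bank.toy$GENDER == 2, ]    # females

nrow(BANKGEN.1)
nrow(BANKGEN.2)
```

Rows with the missing-value code 99 end up in neither subset, so they drop out of both regressions.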
First, we’ll generate a linear regression for males (BANKGEN.1). Open the Regression dialog box and indicate our dependent variable and various independent variables that we believe predict the dependent variable. The steps are as follows.
1. From the menu bar, select Statistics - Regression - Linear.

2. Choose BANKGEN.1 for the Data Set.
3. Click on SALARY in the Dependent box; choose all of our predictor variables (i.e., AGE, EDUC, JOBTIME, and PREVEXP) by holding down the Control key and clicking on the variables in the Independent box. GENDER is not included because it is constant within each of the two data sets.
4. Click "OK" when done.

The output from this procedure is as follows:
*** Linear Model ***
Call: lm(formula = SALARY ~ EDUC + JOBTIME + PREVEXP + AGE, data = BANKGEN.1, na.action = na.exclude)
Residuals:
   Min     1Q Median   3Q   Max
-28047 -10178  -2147 8053 67208
Coefficients:
                  Value Std. Error t value Pr(>|t|)
(Intercept) -57548.9250 12042.3934 -4.7789   0.0000
EDUC          4250.4127   351.1139 12.1055   0.0000
JOBTIME        119.6168    91.3030  1.3101   0.1914
PREVEXP        -51.6270    27.1415 -1.9021   0.0583
AGE            747.8231   288.2533  2.5943   0.0100
Residual standard error: 14720 on 249 degrees of freedom
Multiple R-Squared: 0.4437
F-statistic: 49.65 on 4 and 249 degrees of freedom, the p-value is 0
1 observations deleted due to missing values
Next perform this process on BANKGEN.2, the data for females. Your output should look like this:
*** Linear Model ***
Call: lm(formula = SALARY ~ EDUC + JOBTIME + PREVEXP + AGE, data = BANKGEN.2, na.action = na.exclude)
Residuals:
  Min    1Q Median   3Q   Max
-9476 -4229  -1057 2832 25607
Coefficients:
                Value Std. Error t value Pr(>|t|)
(Intercept) 9332.7805  4529.7120  2.0603   0.0406
EDUC        1573.5586   194.6606  8.0836   0.0000
JOBTIME       27.9902    45.2959  0.6179   0.5373
PREVEXP        1.9628     6.8998  0.2845   0.7763
AGE         -111.1421    49.0685 -2.2650   0.0245
Residual standard error: 6314 on 210 degrees of freedom
Multiple R-Squared: 0.3179
F-statistic: 24.47 on 4 and 210 degrees of freedom, the p-value is 1.11e-016
For women, the fitted regression equation is:
salary = 9332.7805 + 1573.56*educ + 27.99*jobtime + 1.96*prevexp - 111.14*age
This suggests that, once we have controlled for all of these variables simultaneously, each additional year of education increases women's salaries by about $1573.56, each additional month of job time increases their salaries by $27.99, and each additional unit of previous experience adds $1.96, but each additional year of age reduces their salary by an average of $111.14. Men, by contrast, get almost three times the benefit for each year of education ($4250.41 vs. $1573.56), more than four times the benefit for job time ($119.62 vs. $27.99), and benefit rather than detriment for age (+$747.82 vs. -$111.14), but detriment rather than benefit for previous experience (-$51.63 vs. +$1.96).
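To make the two fitted equations concrete, the sketch below plugs one hypothetical employee into both sets of coefficients, copied from the regression outputs for BANKGEN.1 and BANKGEN.2. The employee's characteristics (16 years of education, JOBTIME 80, PREVEXP 24, age 40) are invented for illustration, and the object names are assumptions.

```r
# Coefficients copied from the two regression outputs, in the order
# (Intercept), EDUC, JOBTIME, PREVEXP, AGE.
b.men   <- c(-57548.9250, 4250.4127, 119.6168, -51.6270,  747.8231)
b.women <- c(  9332.7805, 1573.5586,  27.9902,   1.9628, -111.1421)

# Hypothetical employee: 16 years of education, JOBTIME 80, PREVEXP 24, age 40.
# The leading 1 multiplies the intercept.
x <- c(1, 16, 80, 24, 40)

pred.men   <- sum(b.men * x)      # fitted salary under the men's equation
pred.women <- sum(b.women * x)    # fitted salary under the women's equation
pred.men - pred.women             # gap for identical characteristics

b.men[2:5] / b.women[2:5]         # ratio of each slope, men to women
```

The slope ratios give a quick way to check comparisons such as "almost three times the benefit for each year of education".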
These regression results indicate that education does have a significant effect on salary: the coefficient for education is significant for both men and women. However, even after controlling for the other variables, gender continues to matter: each additional year of education is worth an additional $1574 for women and $4250 for men, on average. While some other coefficients are not statistically significant, the coefficients for age are significant at the 5% level for both groups and also indicate gender disparity: each additional year of age increases men's salaries by $748 but lowers women's salaries by $111. This suggests the possibility of either some other unmodeled difference between the sexes or sexual discrimination on the part of the bank.
The regression results indicate that education does have a significant effect on salary (significance level of the parameter estimate is displayed in the column labeled "Pr(>|t|)"). However, even after controlling for the effects of educational differences between the sexes, gender continues to have a significant effect. Specifically, female employees make almost $9000 less than male employees, after controlling for education, previous job experience, and tenure with the bank. This suggests the possibility of either some other unmodeled difference between the sexes or sexual discrimination on the part of the bank.
S-PLUS saves data automatically and when you restart, all data from previous sessions are ready for use. Nevertheless, when you are finished with your analyses, it is important to save all of your work. This may include the S-PLUS data file that you have modified, the output from the procedures, and the text commands that you have entered into the Script window.
An S-PLUS data file contains the actual data, variable and value labels, and missing values that appear in the S-PLUS Data window. You should definitely save this file UNLESS you do not want to keep any of the modifications that you have made to your data. By default, the names of S-PLUS data files are given the extension ".SDD".
The output from your procedures appears in the S-PLUS Report window. These results can be saved to a file. By default, the names of S-PLUS output files are given the extension ".SRP".
If you have submitted commands to S-PLUS using the Script window, you may want to save the commands that you have typed there. These commands can be kept for future reference (indicating what precisely you did) and to replicate the same analyses. By default, the names of S-PLUS script files are given the extension ".SSC".
Right now, we will only be concerned with saving the data file that you have modified during this course. You may save the data file by activating the data window, going to the File menu, selecting Save, and typing in a file name (again, S-PLUS data files should always end with the ".SDD" suffix in order to identify them to S-PLUS).
You can call the Research Computing Support Center between the hours of 9:00 a.m. and 5:00 p.m., Monday through Friday, at 243-8800. You may also send email, at any time, to res-consult@virginia.edu, either to ask questions or to arrange an appointment with a Statistical Computing Consultant.
You can rearrange the position of windows in S-PLUS any way you wish.
To bring a partially hidden window to the front, click once anywhere in that window. (Note that some other window now becomes partially hidden.)
To find a window that is not even partially visible, choose WINDOWS from the menu bar, and then select the item you wish to view. (You can either use the mouse to click on one of the numbered options, or press the appropriate number on the keyboard.)
To move any window:
· Point to the title of the window with your mouse.
· Press the left mouse button down.
· Drag the window to the desired spot.
· Release the mouse button.
To resize/reshape any window:
· Slide the mouse to an edge (or corner) until the cursor changes to a double arrow.
· Press the left mouse button down and hold it down.
· Drag the cursor, holding the button down, and notice the dotted-line box that indicates the new possible size/shape.
· Release the mouse button when the dotted box reaches the size/shape you want.
What is a statistical package? It is a computer program or set of programs that provides many different statistical procedures within a unified framework. The advantages of such packages are many. They are much easier to use than most programming languages. They allow you to run complex analyses without getting bogged down in the details of computations, and because of wide use, they are less likely to have unknown "bugs." The principal disadvantage of such packages is that they sometimes make doing statistics too easy. It is possible to apply complex procedures inappropriately or to properly apply a procedure and then misinterpret the results. They also do such a nice job presenting output that the unwary user may be lulled into a sense of complacency, leading to a failure to detect errors (such as reading the data incorrectly). A more common problem, however, is the over-analysis of data. Since analyses are so simple to run, it is very easy to generate a huge pile of output, with the numbers you really need lost somewhere in the middle.
What statistical packages are available at UVA and supported by ITC? Information Technology and Communication (ITC) provides access to a number of general-purpose statistical packages. They include MINITAB, SPSS, Amos, Lisrel/PRELIS, S-PLUS, and SAS. Additionally, ITC has site licenses to provide faculty, staff, and in some cases, students with statistical software for Linux, Windows or Macintosh. The licensing conditions and requirements vary, so if you have any questions about getting a copy of one of these statistical packages, please contact the statistical computing consultants at res-consult@virginia.edu or by calling the Research Computing Support Center at 243-8800. For more information, please see http://www.itc.virginia.edu/researchers/services.html