Tutorial: Getting Started on the Birch Linux Cluster

This tutorial is designed for researchers who are new to the Birch Linux cluster. It covers basic information about the cluster, as well as how to create and submit batch jobs using the PBS resource management software. It also contains sample job command files that can be used as templates for running jobs under PBS.


The Birch Linux Cluster at UVA

The Birch Linux Cluster is a 32-node distributed-memory multi-processor system. Each node of the cluster contains two 2.4 GHz Intel Xeon Pentium IV processors with 256KB of cache (per cpu) and 2 GB of RAM (per node). The nodes are interconnected with Gigabit Ethernet (60-110 Mbytes/sec bandwidth, 50-200 usec latency) and Myrinet (up to 250 Mbytes/sec bandwidth, 9 usec or less latency).

The Birch Linux cluster uses Red Hat Linux as its operating system and the Portable Batch System (PBS) software to distribute the computational workload across the nodes. PBS is a batch job scheduling application that provides the facility for building, submitting, and processing batch jobs on the cluster.

Jobs are submitted to the cluster by creating a PBS job command file that specifies certain attributes of the job, such as how long the job is expected to run and how many nodes of the cluster are requested. PBS then schedules when the job is to start running on the cluster, runs and monitors the job at the scheduled time, and returns any output to the user once the job completes.


Logging on to the Cluster

Logins to the Linux cluster can be done through the machine birch.itc.virginia.edu by slogin or ssh. This places you on the head node of the cluster, which acts as the control console for any interactive work such as source code editing, compilation, and submitting jobs through PBS. Applications should not be run interactively on birch.itc, but rather on other machines. When you log on to the cluster you should be in your blue.unix home directory.

Important notice for Windows users: do not use a standard Windows editor such as Notepad to edit files that will be used on the Linux or other Unix systems. The two systems use different sequences of control characters to mark the end of line (EOL). If you are using the clusters from a Windows system, there are a number of options:

More information about using Unix systems from Windows machines can be found at www.itc.virginia.edu/research/unix/windows_tools.
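
If a file edited on Windows has already been copied to the cluster, the extra carriage-return characters can be removed on the Linux side. The following is a minimal sketch using the standard tr utility; the function name strip_cr and the file names in the comment are illustrative only and are not part of the cluster software.

```shell
# strip_cr: copy standard input to standard output, deleting every
# carriage return (the extra character in Windows CR+LF line endings).
strip_cr() {
    tr -d '\r'
}

# Example use (file names are placeholders):
#   strip_cr < pbs_script_windows.sh > pbs_script.sh
```

Writing to a new file rather than overwriting the original keeps the unconverted copy available if something goes wrong.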


Configuring Your Account

Use of the Birch Cluster assumes familiarity with the Unix/Linux software environment. In order to use PBS for batch job submission, it may be necessary to configure some of your Unix account startup files. General information about the Unix operating system can be found at the URL www.itc.virginia.edu/research/unixbasics.html.

When a job is submitted to the cluster through PBS, a new login to your account is initiated, and any initialization commands in your startup files (.profile, .variables.ksh, .kshrc, etc.) are executed. PBS shells are not interactive, so interactive commands such as tset and stty must be disabled. If these precautions are not taken, error messages will be written to the batch job's error file and your program may not run.

The recommended procedure to disable the interactive sections of the startup files is to test the environment variable PBS_ENVIRONMENT, which is set when PBS runs. If the variable has been set, meaning a PBS job has initiated the login, the interactive parts of the startup files are skipped.

Below is an example of a .profile file configured for use with PBS on the Birch cluster.

# The following command exports variables set here to your user shell.
set -a

# This command runs your ".variables.ksh" file.
. ${HOME}/.variables.ksh

# Exclude interactive commands & umenu if LL_JOB is TRUE (SP batch job)
# or PBS_ENVIRONMENT is set (PBS batch jobs on any architecture)
if [ -z "$LL_JOB" -a -z "$PBS_ENVIRONMENT" ] ; then

# Make /home/$USER the initial Linux command prompt directory
cd /home/$USER

# Interactive lines such as control key and terminal settings go here 


# Close exclusion of interactive section (SP and PBS batch job requirement)
fi
The following link shows a complete .profile modified to run PBS jobs using the Korn shell. If you are using the tcsh shell, the following link shows a .login modified to run PBS jobs. You should also make sure any stty commands are done inside the PBS exclusion test in the .profile or .login.

Note: if you have trouble using the man command on Birch, in your .variables.ksh file replace the line

PAGER=/usr/bin/more
with
PAGER=more
This should work on all systems since more is normally in the path automatically.

Note that tcsh users may get the warning "Warning: no access to tty, thus no job control in this shell" as part of their PBS job output. This is documented on page 18 of the PBSPro User's Guide and does not affect the job.

To allow access to the PBS commands and manual pages, the appropriate paths have been added to the system PATH and MANPATH environment variables. Users should make sure they are including the system PATH and MANPATH variables as part of their account PATH and MANPATH variables (e.g. in .variables.ksh, PATH=${HOME}/bin:${PATH}:/home/loadl/bin:.).

Users may need to modify their PAGER variable (typically in the .variables.ksh file) to be /bin/more so that the man command will work correctly on the cluster.




Using Modules to Load Software

The Birch cluster uses modules to manage the setting of paths and other environment variables for particular software packages, such as the compilers and the MPICH environment. In particular, Birch offers two sets of compilers and two message layers. These are reflected in four available modules, one of which must be loaded in order to use MPI:
module load mpich-eth-gnu
module load mpich-gm-gnu
module load mpich-eth-intel
module load mpich-gm-intel

In general, the Intel compiler offers better performance, as well as the only Fortran 90/95 compiler on the cluster, but there are circumstances under which the gnu compiler might be chosen. Whether the Ethernet or Myrinet message layer will be preferable depends upon the application. Most "typical" code will probably run better under Myrinet, but code with relatively little communication relative to computations may well perform better under Ethernet. Both should be tried when a new code is introduced onto the cluster.
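
Since the four module names follow a regular pattern, a small helper can build the name from a compiler and a network choice, which is convenient when trying a new code under both message layers. This is only a sketch: the function name mpich_module and its argument order are assumptions made for illustration.

```shell
# mpich_module COMPILER NETWORK
#   COMPILER: gnu | intel      NETWORK: eth (Ethernet) | gm (Myrinet)
# Prints the matching MPICH module name, or fails on an unknown choice.
mpich_module() {
    case "$1" in
        gnu|intel) ;;
        *) echo "unknown compiler: $1" >&2; return 1 ;;
    esac
    case "$2" in
        eth|gm) ;;
        *) echo "unknown network: $2" >&2; return 1 ;;
    esac
    echo "mpich-$2-$1"
}

# Example use on Birch:
#   module load "$(mpich_module intel gm)"
```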

The module command has a number of options, some of which are synonyms. For example, module add is synonymous with module load.

A full listing of the available modules can be obtained by typing

module which

Executing module which on Birch at a particular time yields

icc/7.0              : loads the Intel C++ Compiler Environment
ifc/7.0              : loads the Intel Fortran Compiler Environment
imsl/5.0             : loads the IMSL scientific library
mpich-eth-gnu/1.2.4  : loads the mpich environment for Gnu over Ethernet
mpich-eth-intel/1.2.4: loads the mpich environment for Intel over Ethernet
mpich-eth-pgi/1.2.5  : loads the CRAY Freeware GNU mpich environment
mpich-gm-gnu/1.2.4..9: loads the mpich environment for Gnu over Myrinet
mpich-gm-intel/1.2.4..9: loads the mpich environment for Intel over Myrinet
pgi/4.0              : loads the PGI Compiler Environment
pgi/5.0              : loads the PGI Compiler Environment



Compiling on Birch

Programs for which the user has the source code must first be compiled on birch.itc to run on the cluster. The Intel Compilers are licensed by ITC to run on Linux platforms at the University. The Intel compilers available on the Birch Cluster are:
icc [options] file.c, file.C, file.cpp, file.cxx	(C, C++)
					
ifort [options] file.f, file.f90, file.for, file.ftn	(Fortran 77/90/95)
					
For a complete list of options consult the relevant compiler man page, e.g. man ifc from your account on birch.itc. More detailed information about the Intel compilers can be found at
www.itc.virginia.edu/research/intel/
Documentation links are provided near the end of the page. Information about installing these compilers on your own Linux workstation can also be found on this Web page.

To compile parallel programs, the open source MPI (Message Passing Interface) libraries MPICH have been compiled with both the gnu and the Intel compilers. The appropriate module must be loaded in order to set up the correct compiler environment. MPICH is specific to compiler and to networking protocol. For example, to use an MPICH compiled with the Intel compiler over the Myrinet networking protocol, which is available only on Birch, the command would be

module load mpich-gm-intel
Once a module is loaded, the following commands should be used to compile programs that use MPI code:
mpicc [options] file.c                     (C)
				     
mpiCC [options] file.C                     (C++)
				      
mpif77 [options] file.f                    (Fortran 77)
				     
mpif90 [options] file.f90                  (Fortran 90)
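
The choice of wrapper can be automated in a build script. The sketch below picks a wrapper from the source file extension; the function name mpi_wrapper is hypothetical, and mapping the .f90 extension to mpif90 is an assumption based on common Fortran naming practice.

```shell
# mpi_wrapper FILE: print the MPICH compiler wrapper for a source file,
# chosen from its extension (patterns are case-sensitive: .c is C, .C is C++).
mpi_wrapper() {
    case "$1" in
        *.c)   echo mpicc  ;;
        *.C)   echo mpiCC  ;;
        *.f)   echo mpif77 ;;
        *.f90) echo mpif90 ;;
        *) echo "no MPI wrapper known for: $1" >&2; return 1 ;;
    esac
}

# Example use, after loading one of the mpich modules:
#   "$(mpi_wrapper hello.c)" -o hello hello.c
```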
					

The following webpage provides information on using the MPI libraries.

www.itc.Virginia.EDU/research/mpi/
Once you have obtained an executable version of a program you want to run, whether it's source code you've compiled yourself or a third party software package, you must use the PBS resource manager to run the code on the cluster.


Portable Batch System (PBS)

The PBS resource management system handles the management and monitoring of the computational workload on the Birch cluster. Users submit "jobs" to the resource management system where they are queued until the system is ready to run them. PBS selects which jobs to run, when, and where, according to a predetermined site policy meant to balance competing user needs and to maximize efficient use of the cluster resources.

To use PBS, you create a batch job command file, which you submit to the PBS server to run on the cluster. A batch job file is simply a shell script containing the set of commands you want run on some set of cluster compute nodes. It also contains directives which specify the characteristics (attributes), and resource requirements (e.g. number of compute nodes and maximum runtime) that your job needs. Once you create your PBS job file, you can reuse it if you wish or modify it for subsequent runs.

PBS also provides a special kind of batch job called interactive-batch. An interactive-batch job is treated just like a regular batch job in that it is queued up, and must wait for resources to become available before it can run. Once it is started, however, the user's terminal input and output are connected to the job in what appears to be an rlogin session to one of the compute nodes. Many users find this useful for debugging their applications or for limited computational steering.

PBS provides two user interfaces for batch job submission: a command line interface (CLI) and a graphical user interface (GUI). Both interfaces provide the same functionality and you can use either one to interact with PBS. The CLI lets you type commands at the system prompt. The GUI is a graphical point-and-click interface; it is invoked with the command xpbs. A screen shot of xpbs is here.
The xpbs interface is composed of three windows. The first is the "Hosts Panel", which displays the hostnames of the machines running PBS servers to which jobs can be submitted. In the case of the Birch cluster, the PBS server is running on the front-end login host birch.itc.virginia.edu and is labeled lc1. The second window is the "Queues Panel", which displays information about the queues managed by the server host selected in the "Hosts Panel". It shows the single queue "workq" in this example. The third window is the "Jobs Panel", which displays information about jobs that are found in the queue(s) selected from the Queues listbox.

Further information about how to configure and use the xpbs interface can be found in Chapter 5 of the PBS Pro User Guide. The remainder of this tutorial will focus on the PBS command line interface. More detailed information about using PBS can be found in the PBS Pro User Guide.


PBS Job Command File

To submit a job to run on the cluster, a PBS job command file must be created. The job command file is a shell script that contains PBS directives which are preceded by #PBS. The following is an example of a PBS command file to run a serial job, which would require only 1 processor on one node. In this example, the executable to be run is named serial_executable.

#!/bin/sh
#PBS -l nodes=1:ppn=1
#PBS -l walltime=12:00:00
#PBS -o output_filename
#PBS -j oe
#PBS -m bea
#PBS -M userid@virginia.edu

cd $PBS_O_WORKDIR
./serial_executable

The first line identifies this file as a shell script. The next several lines are PBS directives, which must precede any commands to be executed by the shell (here, the last two lines). The PBS directives illustrated are explained in the table below:

PBS Directive                         Function   

#PBS -l nodes=1:ppn=1          Specifies a PBS resource requirement of
                               1 compute node and 1 processor per node.

#PBS -l walltime=12:00:00      Specifies a PBS resource requirement of 
                               12 hours of wall clock time to run the job.

#PBS -o output_filename        Specifies the name of the file where job
                               output is to be saved. May be omitted to
                               generate filename appended with jobid number.

#PBS -j oe                     Specifies that job output and error messages
                               are to be joined in one file.

#PBS -m bea                    Specifies that PBS send email notification
                               when the job begins (b), ends (e), or 
                               aborts (a). 

#PBS -M userid@virginia.edu    Specifies the email address where PBS
                               notification is to be sent.

#PBS -V                        Specifies that all environment variables
                               are to be exported to the batch job.

It is not necessary to use the -j (join) directive; sometimes it is helpful to keep the output and error files separate. If the -o or -e directives are not specified, PBS assigns each file a name consisting of the name of the script followed by .o (output) or .e (error) and the job ID number, as in the Error_Path job16x2.e1363 shown in the qstat -f example later in this tutorial. This makes it possible for several runs to write to their standard output and standard error files without overwriting one another's results.

The following is an example of a PBS email notification to the user at the end of the job:

Date: Mon, 21 Oct 2002 23:06:47 -0400
From: adm 
To: userid@virginia.edu
Subject: PBS JOB 1187.lc1

PBS Job Id: 1187.lc1
Job Name:   script.sh
Execution terminated
Exit_status=0
resources_used.cpupercent=88
resources_used.cput=00:00:52
resources_used.mem=64248kb
resources_used.ncpus=1
resources_used.vmem=81036kb
resources_used.walltime=01:02:14

Note that the walltime-used information in the email should be used to accurately estimate the walltime resource requirement in the PBS job command file for future job submissions so that PBS can more effectively schedule the job. When submitting a particular PBS job for the first time, the walltime requirement should be overestimated to prevent premature job termination. The walltime measurement corresponds closely to the job cpu time since each job is allocated its own processor for execution.
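
The walltime estimate described above can be scripted. The following sketch takes the resources_used.walltime value from a previous run and adds a safety margin to produce a value for the walltime directive. It is not part of PBS; the function names and the default margin of 25 percent are assumptions chosen for illustration.

```shell
# to_seconds HH:MM:SS: print the equivalent number of seconds.
to_seconds() {
    h=${1%%:*}; rest=${1#*:}; m=${rest%%:*}; s=${rest#*:}
    # Drop one leading zero so that "08" is not read as an invalid octal number.
    h=${h#0}; m=${m#0}; s=${s#0}
    echo $(( ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} ))
}

# pad_walltime USED [PERCENT]: print USED plus PERCENT extra (default 25),
# formatted as HH:MM:SS for use in "#PBS -l walltime=...".
pad_walltime() {
    used=$(to_seconds "$1")
    pct=${2:-25}
    total=$(( used * (100 + pct) / 100 ))
    printf '%02d:%02d:%02d\n' \
        $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

# Example use with the value from the email above:
#   pad_walltime 01:02:14
```

A larger margin is appropriate when run times vary with the input data.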

After the PBS directives in the command file, the shell executes a change directory command to $PBS_O_WORKDIR, a PBS variable indicating the directory where the PBS job was submitted and nominally where the program executable is located. Other shell commands can be executed as well. In the last line, the executable itself is invoked.

If the executable is a parallel program using the Message Passing Interface (MPI), then it will require multiple processors of the cluster to run, and this is specified in the PBS nodes resource requirement. In addition, the MPI script mpiexec is used to invoke the parallel executable. The following is an example of a PBS command file to run a parallel (MPI) job over Myrinet:

#!/bin/sh
#PBS -l nodes=4:ppn=2
#PBS -l walltime=12:00:00
#PBS -o output_filename
#PBS -j oe
#PBS -m abe
#PBS -M userid@virginia.edu

cd $PBS_O_WORKDIR

mpiexec -comm mpich-gm executable_parallel

In this case the PBS nodes resource requirement specifies 2 processors per node on 4 nodes, for a total of 8 processors. The default behavior of mpiexec is to use all processors assigned by PBS, so it is not necessary to specify a processor number. See the manual page (man mpiexec) for more information about this command.
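
The processor count implied by a nodes request is simply nodes multiplied by ppn. The following sketch computes it for the simple forms used in this tutorial; the function name total_procs is hypothetical, and treating a missing ppn as 1 is an assumption about the PBS default. Requests with additional node properties are not handled.

```shell
# total_procs SPEC: SPEC is a nodes request such as "4:ppn=2".
# Prints the total number of processors the request asks for.
total_procs() {
    nodes=${1%%:*}
    case "$1" in
        *:ppn=*) ppn=${1##*ppn=} ;;
        *)       ppn=1 ;;          # assumption: one processor per node if ppn is absent
    esac
    echo $(( nodes * ppn ))
}

# Example use:
#   total_procs 4:ppn=2
```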

Parallel jobs should always specify a nodes requirement of 2 processors per node to efficiently partition the compute nodes for these jobs.

The PBS job command file can be given any name, although it is usually given a .sh extension to indicate it is a shell script, or perhaps a .sub extension to indicate it is a script to be submitted by qsub. The link pbs_script.sh is an example PBS job script that runs the High Performance Linpack benchmark across 4 nodes using the input file HPL.dat. You can download these to your cluster account and use them to test PBS job submission described below. Remember to change the userid placeholder in the PBS email directive to your own.


Submitting a Job

The PBS qsub command is used to submit job command files for scheduling and execution. For example, to submit your job with a PBS command file called "pbs_script.sh", the syntax would be
lc1: /home/uconsult $ qsub pbs_script.sh
 
1354.lc1.itc.virginia.edu

lc1: /home/uconsult $ 
Notice that upon successful submission of a job, PBS returns a job identifier of the form jobid.lc1.itc.virginia.edu, where jobid is an integer number assigned by PBS to that job. You'll need the job identifier for any actions involving the job, such as checking the job status, deleting the job, or specifying job dependencies as discussed below.

There are many options to the qsub command, as can be seen by typing man qsub at the Linux command prompt on birch.itc or by examining the PBS Pro User Guide. Three of the more useful ones are the -W option for allowing specification of additional job attributes, the -I option which declares that the job is to be run "interactively", and the -l option which allows resource requirements to be listed as part of the qsub command. These are discussed below.

Specifying Job Dependencies

The -W option allows for the specification of additional job attributes. In particular, the "-W depend=dependency_list" option to qsub defines the dependency between multiple jobs, which is useful if the jobs need to execute in a certain order. For example, if pbs_script2.sh should not start executing until pbs_script1.sh successfully completes because it needs a file that pbs_script1.sh creates, then these two jobs should be submitted to PBS in the following manner:

lc1: /home/uconsult $ qsub pbs_script1.sh

543.lc1.itc.virginia.edu

lc1: /home/uconsult $ qsub -W depend=afterok:543 pbs_script2.sh

544.frontend-0
After pbs_script1.sh is submitted, PBS returns the job identifier number, which is then used as part of the dependence argument list when pbs_script2.sh is submitted. The "afterok" argument in the dependency list indicates that the job identified as 543 must complete successfully before pbs_script2.sh will start.
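
The two submissions can be combined in a script so that the job number does not have to be copied by hand. The following is a sketch under the assumption that qsub prints only the job identifier on standard output, as in the transcript above; the function names job_number and submit_chain are hypothetical.

```shell
# job_number ID: strip the server suffix from a PBS job identifier,
# e.g. 543.lc1.itc.virginia.edu becomes 543.
job_number() {
    echo "${1%%.*}"
}

# submit_chain FIRST SECOND: submit FIRST, then submit SECOND so that it
# starts only after FIRST has completed successfully (afterok).
submit_chain() {
    first_id=$(qsub "$1") || return 1
    qsub -W depend=afterok:"$(job_number "$first_id")" "$2"
}

# Example use on the cluster front-end:
#   submit_chain pbs_script1.sh pbs_script2.sh
```

The function prints the identifier of the second job, which can in turn be used to chain a third.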

Other options for arguments of the dependency list are detailed in Chapter 8 of PBS Pro User Guide as well as the online manual page for qsub; type man qsub at the Linux command prompt.

Submitting an Interactive Job

The -I option of qsub declares that a job has to be run "interactively." The job will be queued and scheduled as any PBS batch job, but when executed, the standard input, output, and error streams of the job are connected through qsub to the terminal session in which qsub is running. Interactive jobs with PBS should be used only for the purposes of testing and debugging the user's code, e.g. in cases using the TotalView debugger.

Once the PBS interactive job is executed, the terminal session will be logged into one of the compute nodes allocated by PBS. The executable can then be invoked manually from the Linux command prompt.

To ensure that a PBS interactive job is executed quickly, a small number of nodes and a short wallclock time should be specified. These reduced resource requirements can be listed as arguments of qsub with the -l option.

The following is an example of running the High Performance Linpack Benchmark as an interactive PBS job using 4 processors (2 nodes with 2 processors each) and 10 minutes of walltime. Note that the terminal session is actually logged into node compute-0-4.

lc1: /home/uconsult $ qsub -I -l nodes=2:ppn=2 -l walltime=00:10:00

qsub: waiting for job 1352.lc1.itc.virginia.edu to start

qsub: job 1352.lc1.itc.virginia.edu ready

localstorage is in /jobtmp/pbstmp.1352.lc1.itc.virginia.edu

compute-0-4: /home/uconsult $ mpirun -np 4 -machinefile $PBS_NODEFILE \
                                            /opt/hpl-eth/bin/xhpl 

============================================================================
HPLinpack 1.0  --  High-Performance Linpack benchmark  --  September 27, 2000
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Labs.,  UTK
============================================================================

   [further output not shown]

compute-0-4: /home/uconsult $ exit

lc1: /home/uconsult $
An interactive PBS job submission should require no more than 4 processors (2 nodes, 2 processors each) for testing/debugging purposes. In addition, an interactive PBS job will not terminate until the user exits the terminal session. The allocated nodes will remain reserved as long as the terminal session is open, so it is extremely important that users exit their interactive sessions as soon as their debugging is done so that their nodes are returned to the available pool of processors.

Job Submission Policies

Because its primary mission is to support parallel computing, Birch is configured to favor such jobs. Users are restricted to a maximum of 4 jobs at one time in order to keep nodes available for parallel jobs. ITC reserves the right to make changes to the scheduling policy and/or the queue configuration in order to increase utilization of the cluster or to ensure that parallel jobs can be run.

The Aspen cluster has no such restriction, and should be the primary resource for users who must run a large number of serial jobs.

All PBS jobs submitted by users of the cluster will go to one execution queue called workq; the scheduler will first sort them by giving higher run priorities to jobs requiring shorter walltime and smaller node resource requirements. The scheduler further modifies these priorities based on a fair-share algorithm that tries to guarantee that on average, all users will get an equal amount of computing time. Finally, jobs that have been waiting to run for more than 24 hours will be considered "starving" and assigned a higher priority.

PBS is currently configured to limit the maximum amount of walltime a single job can use to 168 hours. When that time limit is reached, the job will be terminated whether it has completed or not. This ensures that no one job can monopolize cluster compute nodes indefinitely. The time limit underscores the need for users to implement some type of save-restart mechanism in their code so they can restart the job close to where it was stopped and not lose all the work done up to that point. The following URL provides some guidelines for implementing save-restart in your code:
www.scd.ucar.edu/docs/chinook/save.html
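
Save-restart is normally implemented inside the application itself, but the idea can be illustrated at the level of a job script that runs a sequence of steps. In the sketch below, the last completed step is recorded in a checkpoint file, so a resubmitted job resumes after it instead of starting over. The function names, the checkpoint file, and the notion of numbered steps are all assumptions made for illustration.

```shell
# run_step N: placeholder for one unit of work; replace with a call to
# your own program.
run_step() {
    echo "running step $1"
}

# run_steps CHECKPOINT_FILE LAST_STEP
# Runs steps up to LAST_STEP, recording each completed step in
# CHECKPOINT_FILE. If the file already exists, work resumes after the
# step recorded in it.
run_steps() {
    ckpt=$1
    last=$2
    done_step=0
    if [ -f "$ckpt" ]; then
        done_step=$(cat "$ckpt")
        done_step=${done_step:-0}
    fi
    step=$(( done_step + 1 ))
    while [ "$step" -le "$last" ]; do
        run_step "$step" || return 1
        # Record progress only after success; write to a temporary file and
        # rename it so an interrupted job never leaves a half-written file.
        echo "$step" > "$ckpt.tmp" && mv "$ckpt.tmp" "$ckpt"
        step=$(( step + 1 ))
    done
}

# Example use inside a PBS job script:
#   cd $PBS_O_WORKDIR
#   run_steps progress.ckpt 100
```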

PBS also imposes a limit on the number of processors users can request, based on how busy the cluster is. If no jobs are waiting in the queue to run, a user can request up to 64 processors. If the cluster becomes busy and jobs are waiting in the queue to run, the processor limit is reduced automatically to 32 processors in order to improve turnaround time.
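
A request can be checked against these limits before it is submitted. The sketch below encodes the limits quoted in this section (64 processors when the queue is empty, 32 when jobs are waiting, 168 hours of walltime); the function name check_request and the "busy" argument are hypothetical, and the actual limits are enforced by PBS and may change.

```shell
# check_request NODES PPN HOURS [busy]
# Succeeds if NODES*PPN processors for HOURS hours is within the limits
# described above; pass "busy" as the fourth argument to apply the
# reduced processor limit used when jobs are waiting in the queue.
check_request() {
    procs=$(( $1 * $2 ))
    if [ "${4:-idle}" = "busy" ]; then limit=32; else limit=64; fi
    if [ "$procs" -gt "$limit" ]; then
        echo "too many processors: $procs > $limit"
        return 1
    fi
    if [ "$3" -gt 168 ]; then
        echo "walltime too long: $3 h > 168 h"
        return 1
    fi
    echo "ok: $procs processors, $3 h"
}

# Example use:
#   check_request 16 2 24
```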

The PBS configuration and scheduling policies used on the cluster will be periodically reviewed and modified as needed to ensure efficient and equitable use of this high performance computing resource.

Researchers with extraordinary needs for the cluster, either in terms of extended compute time or number of nodes, should contact the Research Computing Support Group at res-consult@virginia.edu to discuss making special arrangements to meet those needs.


Displaying Job Status

The qstat -a command is used to obtain status information about jobs submitted to PBS.

lc1: /home/uconsult $ qstat -a

						    Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
1363.lc1	uconsult workq    job16x2     19094  16  32    --  00:20 R 00:02
1364.lc1	teh1m    workq    job12x2      7149  12  24    --  00:16 R 00:01
1365.lc1	teh1m    workq    job8x2       4166   8  16    --  00:12 R 00:00
1366.lc1	uconsult workq    job20x2       --   20  40    --  00:28 Q   -- 
1368.lc1	uconsult workq    STDIN       30942   2   4    --  00:10 R 00:02

lc1: /home/uconsult $ 
The first five fields of the display are self-explanatory. Note that job ID 1368 has a jobname of STDIN which is short for standard input, indicating that it is an interactive job. The sixth and seventh fields titled NDS and TSK in the above display indicate the total number of nodes and processors respectively required by each job. The ninth field indicates the required walltime (hrs:min.) and the last field shows the elapsed runtime. The tenth field titled S indicates the state of the job. The job state can have the following values:
State              Definition

E          Job is exiting after having run
H          Job is held
Q          Job is queued, eligible to run or be routed
R          Job is Running
T          Job is in transition (being moved to a new location)
W          Job is waiting for its requested execution time to be reached
S          Job is suspended
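
The state column can be used to filter the qstat -a listing. The following sketch prints the IDs of jobs in a given state; it assumes the column layout shown above (state in the tenth whitespace-separated field, job names without spaces), and the function name jobs_in_state is hypothetical.

```shell
# jobs_in_state STATE: read "qstat -a" output on standard input and print
# the IDs of jobs whose state column (field 10) equals STATE.
# Header lines are skipped because they do not start with a job number.
jobs_in_state() {
    awk -v st="$1" '$1 ~ /^[0-9]+\./ && $10 == st { print $1 }'
}

# Example use on the cluster front-end:
#   qstat -a | jobs_in_state Q
```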

The following example shows how to use the qstat -f command to get detailed information on a specific job using its job identification number.

lc1: /home/uconsult $ qstat -f 1363
Job Id: 1363.lc1.itc.virginia.edu
Job_Name = job16x2
Job_Owner = uconsult@lc1.itc.virginia.edu
resources_used.cpupercent = 82
resources_used.cput = 00:01:59
resources_used.mem = 83384kb
resources_used.ncpus = 32
resources_used.vmem = 124920kb
resources_used.walltime = 00:02:33
job_state = R
queue = workq
server = lc1.itc.virginia.edu
Checkpoint = u
ctime = Fri Oct 25 03:00:41 2002
Error_Path = lc1.itc.virginia.edu:/h1/u/uc/uconsult/linux_cluster/job16x2.e1363
exec_host = compute-1-0/0+compute-0-15/0+compute-0-14/0+compute-0-13/0+comp
ute-0-12/0+compute-0-11/0+compute-0-10/0+compute-0-9/0+compute-0-8/0+co
mpute-0-7/0+compute-0-6/0+compute-0-5/0+compute-0-4/0+compute-0-3/0+com
pute-0-2/0+compute-0-1/0+compute-1-0/1+compute-0-15/1+compute-0-14/1+co
mpute-0-13/1+compute-0-12/1+compute-0-11/1+compute-0-10/1+compute-0-9/1
+compute-0-8/1+compute-0-7/1+compute-0-6/1+compute-0-5/1+compute-0-4/1+
compute-0-3/1+compute-0-2/1+compute-0-1/1
Hold_Types = n
Join_Path = oe
Keep_Files = n
Mail_Points = e
mtime = Fri Oct 25 03:00:42 2002
Output_Path = lc1.itc.virginia.edu:/h1/u/uc/uconsult/linux_cluster/16x2
Priority = 0
qtime = Fri Oct 25 03:00:41 2002
Rerunable = True
Resource_List.ncpus = 32
Resource_List.neednodes = 16:ppn=2
Resource_List.nodect = 16
Resource_List.nodes = 16:ppn=2
Resource_List.walltime = 20:00:00
session_id = 19094
Variable_List = PBS_O_HOME=/home/uconsult,PBS_O_LANG=en_US,
PBS_O_LOGNAME=uconsult,
PBS_O_PATH=/home/uconsult/bin:/usr/pbs/bin:/usr/share/mpi/bin:/uva/bin
:/usr/pgi/linux86/bin:/bin:/usr/bin:/usr/local/bin:/usr/bin/X11:/usr/X1
1R6/bin:.,PBS_O_MAIL=/var/spool/mail/uconsult,PBS_O_SHELL=/bin/ksh,
PBS_O_HOST=frontend-0,PBS_O_WORKDIR=/h1/u/uc/uconsult/linux_cluster,
PBS_O_SYSTEM=Linux,PBS_O_QUEUE=workq
comment = Job run at started on Fri Oct 25 at 03:00
etime = Fri Oct 25 03:00:41 2002

For further information about the qstat command, type man qstat on the cluster front-end machine lc0.itc or see the PBS Pro User Guide.
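
When only one attribute from the qstat -f listing is needed, it can be extracted with a short filter. This is a sketch that assumes the "name = value" layout shown above; the function name job_attr is hypothetical, and for values that wrap over several lines (such as exec_host) only the first line is returned.

```shell
# job_attr NAME: read "qstat -f" output on standard input and print the
# value of attribute NAME, i.e. the text after " = " on the first line
# that starts with NAME (leading whitespace is ignored).
job_attr() {
    awk -v name="$1" '
        { line = $0; sub(/^[ \t]+/, "", line) }
        index(line, name " = ") == 1 {
            print substr(line, length(name) + 4)
            exit
        }
    '
}

# Example use on the cluster front-end:
#   qstat -f 1363 | job_attr resources_used.walltime
```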




Canceling a Job

PBS provides the qdel command for deleting jobs from the system using the job identification number, as shown below.
lc1: /home/uconsult/linux_cluster $   qstat -a

						    Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
1361.lc1	uconsult workq    job16x2     18136  16  32    --  48:00 R 00:01


lc1: /home/uconsult/linux_cluster $ qdel 1361
lc1: /home/uconsult/linux_cluster $ qstat -a

						    Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
1361.lc1	uconsult workq    job16x2     18136  16  32    --  48:00 E 00:01

For further information about the qdel command, type man qdel on the cluster front-end machine lc0.itc or see the PBS Pro User Guide.



Sample PBS Command Scripts

In this section are a number of sample PBS command files for different types of jobs.

Large Scratch/Output Files

A perl script called tmpsync has been installed on the Birch Linux Cluster to allow users to run programs that generate large scratch or output files without exceeding their home directory disk space quota. The PBS command file below shows how tmpsync can be used with the scatter/collect options to distribute/collect files associated with a parallel program to/from disk space on the cluster compute nodes. Once the program has completed, the front option to tmpsync can then be used to copy output files from the master compute node to /bigtmp on the frontend node (birch.itc). Files older than 72 hours are removed from /bigtmp, so users should download their output files to their own longer-term storage. See the section File Transfer to and from the Cluster for details about copying files.

Note that on the Birch Cluster, /bigtmp is local to birch.itc and does not exist on the compute nodes. Also note that if the program is serial rather than parallel, the scatter and collect operations of tmpsync would not be needed since there would only be one execution node.

#!/bin/sh
#PBS -l nodes=2:ppn=2
#PBS -l walltime=00:02:00
#PBS -j oe
#PBS -m ea
#PBS -M uconsult@virginia.edu

#Load module for mpi
module add mpich-gm-intel

# Define variable for local storage on compute nodes associated with the job
LS="/jobtmp/pbstmp.$PBS_JOBID"

# Copy executable (e.g. xhpl) and data files (e.g. HPL.dat) from your 
# home directory to local storage on the master compute node
cd $LS
/bin/cp $HOME/xhpl .
/bin/cp $HOME/HPL.dat .

# If parallel program, synchronize local storage from master compute node 
# to slave compute nodes
/usr/bin/tmpsync -scatter

# Run parallel program using Ethernet 
mpiexec -comm mpich-p4 ./xhpl > xhpl_out

# If parallel program, synchronize local storage from slave compute nodes 
# to master compute node
/usr/bin/tmpsync -collect

# Copy all files from local storage on master compute node to
# /bigtmp/pbstmp.$PBS_JOBID on the front-end, where they can be
# examined and/or downloaded.
/usr/bin/tmpsync -front

Note: in this script, there should be no spaces around the equals sign in the line LS="/jobtmp/pbstmp.$PBS_JOBID".


File Transfer to and from the Cluster

Disk space on the home directory is extremely limited, and space on /bigtmp is temporary. Once your jobs have run, you will need to transfer your files to your local system for permanent storage. File transfer to and from the cluster should be done using a secure method such as scp or rsync.

If you are transferring to and from a Unix system (this includes Linux), the following are examples of transferring files from a directory mydirectory on the cluster front-end node birch.itc to a remote host, initiating the transfer either from birch.itc or from the remote host. These examples use the ksh line continuation character \ immediately followed by a newline.

Transfer from birch.itc (local source and remote destination):

/uva/bin/scp mydirectory/* \
userid@remote_host.virginia.edu:/home/userid/myoutput/.
Note: userid@ may be omitted if the user's id is the same on both systems. The colon after the hostname is essential, however.
/uva/bin/rsync -e ssh -a mydirectory/. \
 userid@remote_host.virginia.edu:/home/userid/myoutput/.    
Transfer to remote_host (remote source and local destination):
/uva/bin/scp2 userid@birch.itc.virginia.edu:mydirectory/* \
/home/userid/myoutput/.

/uva/bin/rsync -e ssh -a  \
userid@birch.itc.virginia.edu:mydirectory/. \
/home/userid/myoutput/.

Mac OSX with Darwin includes scp and rsync, so these commands can be run inside the terminal application exactly as in the Unix examples above.

From a Windows system, use SecureFX, a commercial product available to students, faculty, and staff. The cluster runs ssh2; it does not run an ftp daemon, so sftp is the correct protocol for file transfers to the cluster frontend.

© 2008 by the Rector and Visitors of the University of Virginia.
