Introduction to HPC

What is HPC?
When should I use it?
Choosing the right facility
How to apply
How to use it
Checklist

What is HPC?

High-performance computing (HPC) facilities allow researchers to log in remotely and submit compute jobs - essentially the programs that you want to run - from the command line. They are generally large clusters of many processors with a queue-based job submission system. They have many open-source and some commercial software packages installed, and are also often used to run users' own code. Researchers are generally allocated compute time through a merit-based allocation process resembling a mini-grant, but it is usually also possible to get a small amount of time to try the system out, without going through the full allocation process.

At the most basic level, the HPC systems available for your use are just large, very capable, Linux servers. If you're familiar with Linux, any flavour of Unix, or the command line on Mac OS X, you'll have no trouble getting up to speed with HPC. If you're a Windows user the learning curve will be a little steeper, but plenty of help is available to get you on your way.

When should I use it?

HPC systems are most often useful for researchers:

  • with parallelisable tasks (either using multi-threaded software, or just many tasks which can be submitted independently and simultaneously).
  • with computing jobs needing more memory (RAM) than available on their local systems.
  • needing access to software packages offered by the HPC facilities.

If you are running out of memory on your desktop system or can no longer process your data quickly enough, then HPC could be for you.

HPC works best for jobs that can be run in a non-interactive batch mode. These jobs can be inserted into a queue, and when the time comes for them to be executed they can run without any user intervention. This means that HPC systems rarely sit idle and it has the benefit that you can enqueue your jobs and get on with other things while you wait for them to execute. Many programs have a batch mode, but not all do. If you're accustomed to using your program interactively, you'll have to learn how to use its batch mode in order to use it for HPC.

Choosing the right facility

As a researcher you have access to a number of facilities:

  • The National Facility at NCI (National Computational Infrastructure), which serves researchers nationally.
  • Intersect's McLaren facility, which serves researchers in NSW.
  • Facilities local to your institution or faculty.

For details about the last option, please contact your institution.

A few factors come into play when selecting between the National Facility and Intersect's McLaren facility:

  • The availability of "specialised facilities" tailored to your needs.
  • The availability of your software prerequisites.
  • The extent of your processing needs.
  • The nature and extent of your memory needs.

To make the right choice, take the following steps:

  1. Have a look at NCI's Specialised Facilities. At the time of writing, NCI offers specialised Bioinformatics and Imaging and Visualisation facilities. If your research is in either of these areas, then NCI could well be the right choice for you.
  2. Have a look at the software available at the National Facility and on McLaren. The following links go to the same page - you can navigate between machines using the drop-down menu on the page:
  3. Don't be dissuaded if your required piece of software is not listed. Researchers can request to have new software packages installed, although requests cannot always be fulfilled.
  4. Have a look at the hardware available at the National Facility and on McLaren:
    • Vayu is a Sun Constellation Cluster with 1492 nodes, each containing two quad-core Nehalem processors, for a total of 11,936 cores. It has 37 TB RAM and 800 TB disk space, and was commissioned in 2010.
    • McLaren is a shared-memory SGI Altix 4700 with 128 dual-core CPUs. It has 1 TB RAM and 12 TB disk space, and was commissioned in 2008.
  5. Gain a more detailed understanding of the hardware involved here:
  6. Consider the memory requirements of your research problem. McLaren is a large-scale shared memory system while NCI’s Vayu is a distributed memory system. This means that McLaren is suited to certain classes of research applications, where access to large amounts of memory is a key performance requirement.

Example decision paths

  • Alice has a large amount of data she wants to process with Matlab. In her case, she cannot use Matlab's open source equivalent Octave because of compatibility issues. After checking the software listings, she chooses the National Facility because it supports Matlab.
  • Bob has an extremely memory intensive application. While the National Facility's Vayu cluster has the most memory, he opts for McLaren because his application can make use of McLaren's shared memory architecture.
  • Carlos wants to conduct a large-scale parametric search. His research problem is such that he can execute many instances of the same program at the same time, giving each a small portion of the problem space to search. He wrote the program himself and it doesn't have any specific software dependencies. Although he could run his application on McLaren, he opts for the National Facility's Vayu cluster because it has a larger number of cores.
  • Dorothy wants to conduct computationally intensive gene sequencing. She applies to the National Facility so she can take advantage of the specialised bioinformatics facility.


How to apply

So you've chosen your facility - how do you gain access to it? The easiest way is to get a start-up account on the system you want to try. All HPC accounts are limited in terms of the amount of computing resources (e.g. processing time) you have access to. This is referred to as your quota, and when it runs out you'll have to apply for more. Start-up accounts have a fairly limited quota, but it is enough for you to assess if HPC is right for you, i.e. to install your application, submit a job, and see if it all runs smoothly.

Once you've successfully completed your evaluation using a start-up account, you can apply for a much larger quota. This application will be subject to appraisal along with other applications, and time will be granted on the basis of merit. Start-up accounts let you skip the full grant assessment process and give HPC a go.

A typical start-up account will give you 500 service units of computing resources. One SU usually corresponds to one CPU core-hour, so if you run a parallelised program that uses four processes and runs for one hour before finishing, you will have used 4 SUs of your quota.
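The arithmetic above can be sketched in a few lines of shell. This is a minimal illustration that assumes the one-SU-per-core-hour rule holds and that run times are whole hours (shell arithmetic is integer-only); the function name su_used is made up for this example.

```shell
# su_used CORES HOURS: service units consumed, assuming 1 SU = 1 CPU core-hour.
su_used() {
    echo $(( $1 * $2 ))
}

quota=500              # typical start-up allocation mentioned in the text
used=$(su_used 4 1)    # four processes running for one hour
echo "Used: ${used} SUs, remaining: $(( quota - used )) SUs"
```

For the four-process, one-hour example this reports 4 SUs used and 496 remaining.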

To try out:

How to use it

Once your start-up account request has been processed, you should receive the username and password required to log in to your account. You can then move on to connecting to the machine and submitting your first compute job.

When getting started on Vayu, see the User Guide at https://nf.nci.org.au/facilities/userguide/ for instructions on how to log in and use the job-submission system. If using McLaren, follow the instructions below.

Connecting to the machines

There are two key ways to access the HPC machines. The first is via an interactive terminal that allows you to issue commands, such as adding your compute job to the queue. The second is for transferring files between your computer and the HPC machine, for example the data you wish to process and, once the job is complete, the results files.

To connect interactively you'll need a Secure Shell (SSH) client. If you're on Linux or Mac OS X, chances are that you can just open a terminal and issue one of the following commands:

ssh mclaren.ac3.edu.au
ssh vayu.nci.org.au

When prompted, enter your username and password. If you're running Windows, you'll need to download an SSH client. A good free option is PuTTY.

To transfer files to and from your machine, you'll need a client that supports the SSH File Transfer Protocol (SFTP). Plain FTP is not the same protocol, so the ordinary ftp command will not do. On Linux and Mac OS X, and on Windows systems with an OpenSSH client installed, you can do this from the command line:

sftp mclaren.ac3.edu.au
sftp vayu.nci.org.au

But you might find it easier to use a graphical file-transfer program that supports SFTP, such as FileZilla.
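For one-off copies, scp (a companion to ssh in OpenSSH, and a different tool from the SFTP clients described above) is another option. The sketch below only builds and prints the commands so that they can be reviewed before use. The username abc123 and both file names are hypothetical, and it assumes your HPC username differs from your local one, which is why the user@host form is used.

```shell
# Hypothetical account name; replace with the username issued to you.
HPC_TARGET="abc123@mclaren.ac3.edu.au"

# Upload: local file -> your home directory on the HPC machine.
upload_cmd="scp input_data.tar.gz ${HPC_TARGET}:"
# Download: results file on the HPC machine -> current local directory.
download_cmd="scp ${HPC_TARGET}:results.tar.gz ."

# Print the commands for review; run them yourself once the names are right.
echo "$upload_cmd"
echo "$download_cmd"
```

The same user@host form can be given to ssh and sftp.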

More on Service Units

As noted above, in general, one service unit (SU) corresponds to one CPU core-hour computing on McLaren and the National Facility's Vayu cluster. In each merit allocation round (every six months), about one million SUs are available on McLaren and 2.2 million SUs on Vayu. Resource charging on McLaren is modified by the amount of memory used by your program, via the following formula:

U = W * max [n, m/4]

where U is the usage charge, W is the wallclock time (i.e. actual elapsed time) used, n is the number of CPUs allocated, and m is the memory allocated in gigabytes (GB). A job that uses more than 4GB of memory per CPU is therefore charged as though it held m/4 CPUs. If you are using more than 4GB of memory in a single-processor application, you should consider parallelising your application to reduce your wall clock time.
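As a rough worked example, the charging formula can be evaluated with a small shell function. This is a sketch only: the function name mclaren_charge is invented here, and awk is used because plain shell arithmetic cannot handle fractional values of m/4.

```shell
# mclaren_charge WALL_HOURS NCPUS MEM_GB
# Evaluates U = W * max(n, m/4) as given in the text.
mclaren_charge() {
    awk -v w="$1" -v n="$2" -v m="$3" 'BEGIN {
        eff = (m / 4 > n) ? m / 4 : n   # effective CPU count after the memory adjustment
        print w * eff
    }'
}

mclaren_charge 10 1 16   # one CPU, 16 GB, 10 hours: memory term dominates
mclaren_charge 10 8 16   # eight CPUs, 16 GB, 10 hours: CPU count dominates
```

The first call prints 40 (charged as four CPUs) and the second prints 80, which shows why a memory-hungry single-CPU job costs as much per hour as a four-CPU job.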

Running jobs

All machines run the (ANU-)PBS batch queue system.

You can run jobs interactively for testing purposes by entering the commands directly on the command-line, but you will usually run jobs by putting the commands in a script file and submitting them to a queue. Interactively running jobs can be killed by the system automatically.

Important commands for the batch queue system are:

qsub <script name>   submit a job

qstat -f <job number>   see job status

qstat -u <user name>   list all jobs owned by a user

qstat -a   view all jobs

nqstat -a   view all jobs

qdel <job number>   delete a job from the queue
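These commands are often combined. The helper below is a sketch that relies on the usual PBS behaviour of qsub printing the new job's identifier on standard output; check that this holds on your system. The function name submit_and_show is made up for illustration, and the block only defines the function, since qsub and qstat exist only on the HPC machine.

```shell
# submit_and_show SCRIPT: submit a PBS script, then show the new job's status.
submit_and_show() {
    local job_id
    # Assumes qsub prints the identifier of the newly queued job.
    job_id=$(qsub "$1") || return 1
    echo "Submitted job ${job_id}"
    qstat -f "$job_id"
}
```

On the HPC machine it would be called as: submit_and_show myjob.pbs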

A simple PBS script example:

#!/bin/bash (1)

#PBS -A projectname (2)

#PBS -l walltime=02:00:00,vmem=2GB (3)

#PBS -j oe (4)

#PBS -l ncpus=4 (5)

#PBS -wd (6)

ulimit -s unlimited (7)

./yourprogram.exe (8)

Here's a line-by-line explanation:

(1) uses the bash shell to execute the script

(2) runs the job under the specified project (this is a requirement to submit a job)

(3) sets the limit for walltime (2h) and virtual memory

(4) combines the output and error of the PBS run into one file

(5) sets the number of cores for this job

(6) changes the directory to the directory from which the PBS script was submitted (working directory)

(7) ensures a large stack-size for the given job process

(8) executable commands should be placed from this point onwards

For additional sample scripts, see the directory /usr/local/doc.
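When many independent runs of the same program are needed, for example a parameter search, one job script per parameter value can be generated rather than written by hand. The following is a sketch under stated assumptions: projectname and yourprogram.exe are the placeholders from the example above, the --param option is a hypothetical argument of your own program, and each run is assumed to need a single core.

```shell
# make_job_script PARAM: write a PBS script that runs one instance of the program.
# "projectname", "yourprogram.exe" and the --param option are placeholders.
make_job_script() {
    local param="$1"
    local script="job_${param}.pbs"
    cat > "$script" <<EOF
#!/bin/bash
#PBS -A projectname
#PBS -l walltime=02:00:00,vmem=2GB
#PBS -l ncpus=1
#PBS -j oe
#PBS -wd
./yourprogram.exe --param ${param}
EOF
    echo "$script"
}

# Generate one script per parameter value and print each file name.
for p in 1 2 3; do
    make_job_script "$p"
done
```

Each generated file can then be submitted with qsub on the HPC machine.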

By default, McLaren sets the following job limits:

  • Walltime: 200h
  • RAM: 64GB / 16 cores. If you want to use 128GB you have to ask for at least 32 cores.
  • Cores: no limit, but you might have to wait quite some time for the job to progress through the queue if it needs more than 32 cores.
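The RAM limit above implies a minimum core count for a given memory request. The sketch below assumes the 64GB per 16 cores ratio scales linearly, i.e. 4GB per core, which is consistent with the 128GB and 32 cores figure but is otherwise an inference. The function name min_cores_for_mem is invented for this example.

```shell
# min_cores_for_mem MEM_GB: smallest core count whose memory allowance covers MEM_GB,
# assuming the 64 GB / 16 cores limit scales linearly (4 GB per core).
min_cores_for_mem() {
    local gb_per_core=4
    echo $(( ($1 + gb_per_core - 1) / gb_per_core ))   # integer ceiling division
}

min_cores_for_mem 64    # prints 16
min_cores_for_mem 128   # prints 32
min_cores_for_mem 100   # prints 25
```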


Taking it further

You've selected the right facility, you've got your start-up account, you've run a test compute job and everything looks fine. How do you get more computing resources (i.e. service units) so that you can run a full-scale compute job and get some real results?

Irrespective of whether you're using Intersect's McLaren or NCI's Vayu, computing resources are granted in merit allocation rounds held periodically. You can apply in the following rounds:

  • McLaren merit allocation rounds (two per year). Rounds open in February for the period April-September, and in August for October-March. Further details can be obtained on the Intersect HPC Resource Allocation page, and the application form can be obtained here.
  • NCI merit allocation round. This occurs annually in October for allocations starting the following January. A mid-term call may be held in April for new projects for the July to December period. Further details and an application form can be obtained here.
  • Intersect has a partner share arrangement with NCI. This provides a second means of accessing the NCI facilities. Consider applying directly through NCI in the first instance and making an application through Intersect if your request to NCI is not granted. Further details can be obtained on the Intersect HPC Resource Allocation page, and the application form can be obtained here.

Checklist

It is recommended that you work through this checklist to help assess if HPC is of benefit to you:

  1. Estimate your computing needs. Work up a rough estimate of the number of CPU hours your compute job will require. As a guide, a dual-core desktop machine running flat out will provide you with about 1500 core hours (service units) per month. If your current computing infrastructure can handle your computing needs in the timeframe of your research project, then skilling up in HPC might cost you time overall.
  2. Leave plenty of time. If you think HPC could be of use to you, then start investigating it early in your research project. It can take time to assess if your software will run on HPC and to become comfortable in using the command line, module, and queueing systems if you've never used similar systems before. Finally, because HPC is a shared resource, your job will most likely spend some time in the queue awaiting execution.
  3. Check for available software. Verify that your chosen facility supports the software your job depends on. This is particularly important in the case of commercial software which may not be available due to licensing costs.
  4. Consider the nature of your research problem. Can it inherently take advantage of the parallelisation of HPC? If not, can you run multiple instances at the same time with different parameters or data to process? If not, then your compute job may not be able to take advantage of HPC's capabilities.
  5. Check that your software can be used non-interactively. If your processing involves interacting with the program, for instance, clicking buttons or selecting input files using a file browser, then you'll need to find a way to do those things from the command line using a batch mode, if your program supports it.
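The estimate in step 1 can be worked through as follows. This is a back-of-envelope sketch: it assumes a 30-day month and a desktop that runs continuously, the function names are invented, and the 20,000 core-hour job is a hypothetical figure.

```shell
# desktop_core_hours CORES DAYS: core-hours a desktop delivers running flat out.
desktop_core_hours() {
    echo $(( $1 * 24 * $2 ))
}

# months_on_desktop JOB_CORE_HOURS CORES: whole 30-day months needed, rounded up.
months_on_desktop() {
    local per_month
    per_month=$(desktop_core_hours "$2" 30)
    echo $(( ($1 + per_month - 1) / per_month ))
}

desktop_core_hours 2 30      # dual-core desktop over a 30-day month
months_on_desktop 20000 2    # hypothetical job needing 20,000 core-hours
```

The first call prints 1440, in line with the "about 1500" guide figure, and the second prints 14, i.e. more than a year of desktop time for a job of that size.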

For help, contact hpc_support@intersect.org.au (for McLaren or for Vayu/XE) or help@nf.nci.org.au (for Vayu/XE).