High performance computing

URI offers two clusters for high performance computing: Seawulf is the educational cluster and Bluewaves is the research cluster. For more information on HPC at URI, see the HPC website.

Using HPC

There are two ways to run jobs on a cluster. The first is to run them interactively, just as you do on your own computer. The advantage is that you can work easily and directly; the disadvantage is that a long-running job ties up your session, and taking advantage of the cluster's computing power for long jobs is why you are using it in the first place. The second way, covered later, is to submit a batch job with a script.

You are now logged in on the "head node". A computing cluster is just that: multiple computers attached together. It would be inefficient for many people to all work on this one computer, and doing so could even crash the cluster. Instead, request an interactive session on a compute node.
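The interactive-session command referred to below is not shown in this copy of the lesson. On a SLURM-based cluster (the scheduler implied by sbatch, squeue, and sinfo later in this lesson), the request might look like the following sketch; the exact command and flags can vary by site, so check your cluster's documentation.

```shell
# A sketch of requesting an interactive session on a SLURM cluster:
# 1 node (-N 1), 1 core (-n 1), for 8 hours (-t 8:00:00),
# with a pseudo-terminal (--pty) running a bash shell.
srun -t 8:00:00 -N 1 -n 1 --pty bash
```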

The above command launches an interactive session with 1 core for 8 hours, running a bash shell.
When you request an interactive session, it is assigned to one of the available partitions. Partitions offer different options. On Seawulf, the debug queue is for short jobs and is meant to be available without waiting; the general queue may require a wait, but you may run your job for as long as necessary.

View partitions by typing sinfo.

You can now run almost all the same commands you do on your own computer. Because the cluster is running Linux (CentOS), a few may be slightly different; for example, both man ls and ls --help work on Linux, while some other operating systems support only one of the two. Additionally, while you have access to many installed programs, you need to load them before you can use them.
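For example, loading software typically works through the environment modules system (a sketch; the module name below is taken from the batch script later in this lesson, and availability depends on your cluster):

```shell
# See which software modules are available on the cluster
module avail
# Load a module before using the program it provides
module load StringTie/1.3.3-foss-2016b
# Confirm which modules are currently loaded
module list
```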

wget https://hpc-carpentry.github.io/hpc-intro/files/bash-lesson.tar.gz

tar -xvf bash-lesson.tar.gz

If you are running analyses that will take some time, you should not use an interactive job. Instead, write a script to submit a batch job. Your script might look like this:

#!/bin/bash
#SBATCH -t 1:00:00
#SBATCH -p general # partition (queue)
#SBATCH -N 1 # number of nodes
#SBATCH -n 1 # number of cores

module load StringTie/1.3.3-foss-2016b

touch SomeRandomNumbers.txt
for i in {1..100000}; do
  echo $RANDOM >> SomeRandomNumbers.txt
done

sort -n SomeRandomNumbers.txt > SortedRandomNumbers.txt

You are already familiar with the shebang line. All lines starting with #SBATCH indicate parameters related to job submission. #SBATCH -p general specifies the partition. #SBATCH -t 1:00:00 allows the job to run for up to 1 hour. #SBATCH -N 1 indicates you need 1 node. #SBATCH -n 1 indicates you need 1 processor on that node. There are many other possible SBATCH parameters not included here.
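With those parameters saved in a file, submitting the job is a single command. A sketch follows; the script name submit.sh is just an example, and SLURM will print the new job's id when the submission succeeds.

```shell
# Submit the batch script above (saved here as submit.sh) to the scheduler
sbatch submit.sh
```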

If you submit a job that you need to cancel, first run squeue to get the job id, then run scancel <jobid>.
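For example (the job id 123456 below is hypothetical; substitute the id that squeue reports for your job):

```shell
# List your own jobs; the job id appears in the first column
squeue -u $USER
# Cancel the job by id
scancel 123456
```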

Note: if you are working with any large or important data, you should not store it in your home directory. Additional storage is connected to the cluster, and access can be arranged by the HPC manager.

Now check how busy the node is by running htop.