Campus Cluster Usage Guide


Overview

This article is written for people who intend to use the campus cluster at UIUC for computation-heavy work, especially Huang group members. I will use electron ptychographic reconstruction as an example. The article covers:

I. Introduction to campus cluster

II. Getting started, step-by-step guide

III. Other useful Slurm commands

IV. Common mistakes and additional information

V. My ptycho reconstruction setup

For template scripts, see my github repo https://github.com/chiahao3/ptycho

Last modified: 2022/11/15 by Chia-Hao Lee


I. Introduction to campus cluster

1. Why campus cluster?

The campus cluster provides powerful hardware (CPUs, GPUs, large memory) so that we can run larger-scale image and diffraction simulations; ptychographic, tomographic, and cryo-EM reconstructions; and machine-learning-related applications.

2. How do I use campus cluster, and is it free?

You need to either 1. become an investor (by purchasing hardware) or 2. join an existing investor. The College of Engineering has already invested, so we can join without paying extra (though it’s probably paid for by the overhead from all incoming funding anyway). See this link for a list of current investors.

3. What is the workflow while using campus cluster?

  1. Log in to the campus cluster; you’ll land on the head node
  2. Get your script, data, code ready on the campus cluster storage
  3. Submit your job; it will be placed in the queue by Slurm (the scheduling software) and executed on the compute node(s) once the requested computing resources are available
  4. Collect the computation results/files when the job is done
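As a concrete sketch of this round trip from a local terminal (the NetID netid, the dataset folder, and the job script job.sbatch below are placeholders):

    # 1. Log in to the head node
    ssh netid@cc-login.campuscluster.illinois.edu

    # 2. Copy data and scripts up from your local machine (run these locally)
    scp -r my_dataset/ netid@cc-login.campuscluster.illinois.edu:/scratch/users/netid/
    scp job.sbatch netid@cc-login.campuscluster.illinois.edu:~/

    # 3. On the head node: submit the job and watch the queue
    sbatch job.sbatch
    squeue --me

    # 4. When it finishes, pull the results back (run locally)
    scp -r netid@cc-login.campuscluster.illinois.edu:/scratch/users/netid/results ./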

4. What is the hardware spec of campus cluster?

You may only use the hardware in the partitions that you’re granted access to. For College of Engineering students, the primary queue is eng-research. Therefore, we can only access the following partitions:

  1. eng-research
    • 72 compute nodes providing 1,764 CPU cores
      • 9 dual-socket Intel Xeon E5-2680v4 Broadwell CPU nodes w/ 256GB RAM (28 cores per node, 252 total cores) w/ EDR InfiniBand interconnects
      • 63 dual-socket Intel Xeon E5-2690v3 Haswell CPU nodes w/ 256GB RAM (24 cores per node, 1,512 total cores) w/ EDR InfiniBand interconnects
  2. eng-research-gpu
    • 7 Compute nodes providing 280 CPU cores and 14 NVIDIA GPUs:
      • 7 dual-socket Intel Xeon Gold 6148 Skylake CPU nodes w/ 192GB RAM & 2 NVIDIA V100 16GB GPUs (280 total CPU cores) w/HDR InfiniBand interconnects
  3. secondary
    • 23 nodes with GPUs, 3 of which have high-end V100 GPU cards
  4. test
    • Short turnaround time, but no GPU capability
  5. mrsec (optional; you need to request it separately, and it contains only CPU nodes, so it is not very useful for ptycho reconstructions)

For a list of the common nodes with high-end GPUs that we have access to, and the full set of accessible GPU nodes with their performance, see our Google Spreadsheet.
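If you’d rather check this from the command line than the spreadsheet, sinfo can print the GPUs (GRES), memory, and CPU count of the nodes in a partition; this is a generic Slurm query, not anything campus-cluster-specific:

    # One line per node: node name, GRES (GPU type and count), memory (MB), CPUs
    sinfo -p eng-research-gpu -N -o "%N %G %m %c"

    # Same query for the secondary partition
    sinfo -p secondary -N -o "%N %G %m %c"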

5. Where can I learn more about the campus cluster?

There is a huge amount of wonderful documentation and manuals on how to use Slurm for cluster computing. I’ll just list a few here.


II. Getting started, step-by-step guide

1. Get access to the campus cluster

  1. Request access through this link and choose eng-research as the primary queue if you’re affiliated with the College of Engineering. The application usually gets approved within 2 business days, after which you will be able to log in with your NetID.

2. Log in to the campus cluster

  1. Install an SSH client to connect to the campus cluster. I recommend Bitvise for Windows; it has a GUI for SFTP file transfer, which is very beginner-friendly.


  2. Open the SSH client, login with:

    Host: cc-login.campuscluster.illinois.edu, port:22

    username: NetID

    It will prompt you for a password; use your AD password.


  3. Open a terminal (command line interface, CLI) or SFTP window (file transfer)
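If you prefer a plain terminal over Bitvise (e.g., on macOS/Linux, or Windows PowerShell with OpenSSH), the equivalent login and file transfer commands are:

    # Log in (replace NetID with your own); you'll be asked for your AD password
    ssh NetID@cc-login.campuscluster.illinois.edu

    # Transfer a file without the GUI, e.g. upload a job script
    scp myscript.sbatch NetID@cc-login.campuscluster.illinois.edu:~/

    # Or browse files interactively
    sftp NetID@cc-login.campuscluster.illinois.edu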

3. Prepare your data, scripts, and programs (packages)

  1. Plan your overall workflow for data, output files, source code, packages, scripts, and job files
    1. Are you computing with Matlab, C++, Python, or compiled executables?
    2. Do you need to input data, feed in parameters, or both?
    3. If you need to provide input data, where do you store them?
    4. Where do you want to save your output files? With the input data, or in a separate folder?
    5. How would you manage your job files?
    6. Plan and practice these on your local machine first!
  2. Understand the differences between the campus cluster file systems (/home, /scratch, /project)
    1. /home: 5GB soft limit, 7GB hard limit, 7 days grace period, no purging (deleting)
    2. /scratch: 5TB soft limit, 10TB hard limit, purge files older than 30 days
    3. /project: Depends on the investment. You need to ask the technical representatives for access to their specific folder. For engineering students, contact techrep@engr.illinois.edu to be granted access to the engineering storage.
  3. Upload your data, scripts, and packages to the corresponding locations
    1. Using a GUI will probably be much faster than cp/mv commands if you’re not familiar with Bash on Linux.
    2. Regarding the questions in 3.a, my setup is as follows:
      • I compute with a Matlab package (fold_slice from Yi Jiang, for electron ptychography). I save this under my home directory as /home/NetID/fold_slice
      • We need both input data and input parameters. Since the input data are large 4D-STEM datasets, it’s better to store them temporarily under /scratch/users/NetID/
      • Input parameters are tied to each job submission and are much smaller in size, so I put them with my job scripts under /home/ptycho_script/
        • Inside the ptycho_script/ folder, create job folders for organization, like job_20221109_Themis_1108_WSe2_16_EMPAD/
        • Within each job folder, I put all the relevant job scripts, log files, and MATLAB scripts.
      • The default PtychoShelves reconstruction output directory is inside the input project folder, and therefore also under /scratch/users/NetID/
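Staging everything onto the cluster can then look roughly like this (a sketch reusing the folder names above; rsync over SSH is my assumption for the transfer tool, but scp or the Bitvise SFTP GUI works just as well):

    # On the head node: create the job-script folder (in /home) and the data folder (in /scratch)
    mkdir -p ~/ptycho_script/job_20221109_Themis_1108_WSe2_16_EMPAD
    mkdir -p /scratch/users/$USER/20221108_TEM_ThemisZ_80kV_WSe2_16_EMPAD

    # From the local workstation: push the 4D-STEM data to /scratch and the job scripts to /home
    rsync -avP 20221108_TEM_ThemisZ_80kV_WSe2_16_EMPAD/ \
        NetID@cc-login.campuscluster.illinois.edu:/scratch/users/NetID/20221108_TEM_ThemisZ_80kV_WSe2_16_EMPAD/
    rsync -avP job_20221109_Themis_1108_WSe2_16_EMPAD/ \
        NetID@cc-login.campuscluster.illinois.edu:ptycho_script/job_20221109_Themis_1108_WSe2_16_EMPAD/

    # /scratch purges files older than 30 days; list files that will soon be at risk
    find /scratch/users/$USER -type f -mtime +25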

4. Submit your job to the campus cluster

  1. Log in to the campus cluster with a terminal console (command line interface, CLI); you’ll land on the head node.


  2. Change directory to where your job script is, using $ cd <directory>. For example, $ cd ptycho_script/job_20221109_Themis_1108_WSe2_16_EMPAD/
  3. Submit your sbatch script file using $ sbatch <script_name>. For example, $ sbatch eth_2task_V100_runPtycho_01.sbatch
    • You may specify the node and dependency during submission
    • Once you submit the job, you should get a message in the command line like “Submitted batch job 7238715”

5. Monitor your job status

  1. Use squeue --format="%.20i %.13P %.15j %.8u %.8T %.10M %.9l %.6D %R %S" --me to monitor the status of your submitted job; it could be either “Pending” or “Running”
    • For pending jobs, it’ll show an expected START_TIME, which is assigned by the scheduling algorithm at that moment. The expected START_TIME might change if users modify their job run times while executing. Usually, your job will get executed earlier than the initially assigned time because people cancel their jobs (or their jobs fail).
    • If your job is already running, you may check the output txt log files using SFTP
  2. Use sacct --format=jobid,jobname,partition,nodelist,ncpus,start,end,elapsed,state,MaxRSS,ReqMem -j <JobID> to monitor the allocated resources of the job
    • This shows information including the number of CPUs, start/end time, requested and maximum used memory, and whether your tasks ran concurrently (i.e., with the same start time)


  3. Use scancel <JobID> to cancel your job if needed
    • Or scancel -u <UserName> to cancel all your jobs
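If you’d rather not re-run these commands by hand, a small loop on the head node refreshes the view for you (purely a convenience sketch):

    # Re-run squeue for your own jobs every 60 seconds (Ctrl+C to stop)
    watch -n 60 'squeue --me'

    # Or, if watch is not available, a plain Bash loop
    while true; do clear; squeue --me; sleep 60; done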

To summarize, there are only 4 main commands that you need to know:

  1. sbatch to submit your job
  2. squeue to check on your submitted job
  3. sacct to check on your running and completed job
  4. scancel to cancel your job

III. Other useful Slurm commands

  • To submit to a specific node or with a dependency (to avoid the Parallel Computing Toolbox license error)

      sbatch -w <HostName> <ScriptName>
      # Example: sbatch -w ccc0077 eth_2task_V100_runPtycho_01.sbatch
    
      sbatch --dependency=afterany:<JobID> <ScriptName>
      # Presumably the license will free up once the previous job finishes
    
  • To remove a queued job or delete a running job identified by JobID:

      scancel <JobID>
    
  • To show the node information

      scontrol show node=<HostName>
    
  • To display details of a specific active (running/pending) job identified by JobID:

      scontrol show job <JobID>
    
  • To show the default Slurm configuration of campus cluster

      scontrol show configuration
    
  • To change the time limit of a job

      scontrol update jobid=<JobID> TimeLimit=<NewTimeLimit>
      # Example: scontrol update jobid=5868183 TimeLimit=1-10:00:00
    
    
  • To write the submitted script of a submitted job

      scontrol write batch_script <JobID> [OptionalFilename]
    
  • To view the submitted jobs in a partition (e.g., eng-research-gpu), sorted by priority

      squeue -p <Partition> -S -P -o "%.12i %.16P %.8j %.8u %.8T %.10M %.9l %.6D %R %S %e"
    
  • To view job and node status more easily using .bashrc and pestat (technically these are Bash tips, not Slurm)

      # Add the following to .bashrc under $HOME to save some typing; note that you need to download "pestat" first to use that alias
      alias squeue='squeue --format="%.20i %.13P %.15j %.8u %.8T %.10M %.9l %.6D %R %S"'
      alias pestatGPU='pestat -G -n ccc0078,ccc0079,ccc0080,ccc0081,ccc0082,ccc0083,ccc0084,golub121,golub122,golub123,golub124,golub125,golub126,golub127,golub128,golub305,golub346,golub347,golub348,golub349,golub378,golub379,golub380,ccc0037,ccc0060,ccc0076,ccc0077,ccc0215,ccc0286,ccc0287'
    
      # alias creates shortcuts to these user-defined commands with a specific format
    


    Screenshot of pestatGPU. Note that “Freemem” has nothing to do with the scheduling or computation. The allocatable/available memory is roughly 95% of the mounted Memsize. See this link for more information about allocatable memory and free memory.

  • To view the partitions (queues) that users have the ability to submit batch jobs to

      sinfo -s -o "%.14R %.12l %.12L %.5D"
    
  • To view the accounting information for all jobs of a user:

      sacct -u <UserName> --format=jobid,jobname,partition,ncpus,start,end,elapsed,state
    
  • To start an interactive session

      srun --partition=<Partition> --time=00:30:00 --gres=gpu:V100:2 --pty /bin/bash
      exit  # type this to leave the computing node
    

IV. Common mistakes and additional information

Common mistakes

  • Wrong path, file doesn’t exist, or a missing “/” when joining directory paths
  • Wrong file version
  • The requested node has none of the resources you need (e.g., a GPU) or not enough of them (e.g., memory)
  • Wrong line breaks (Windows vs. Unix)
    • The line break format is different in Windows and Linux; avoid Windows Notepad, use WordPad or Notepad++ if possible
  • Windows/Linux compatibility
    • The Linux file system is case SENSITIVE
    • Windows uses \ (backslash) for file paths, while Linux uses / (forward slash)
  • Maximum number of users for Distrib_Computing_Toolbox reached (limited Matlab licenses)
    • You may try while true; do <your_command>; sleep <interval_in_seconds>; done to keep submitting your job at certain intervals, or just set up a dependency (see the examples below)
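As a concrete version of those two workarounds (the script name and job ID below simply reuse the examples from earlier in this guide):

    # Option 1: keep resubmitting at a fixed interval (every 30 minutes here);
    # stop it with Ctrl+C once one of the submissions runs successfully
    while true; do sbatch eth_2task_V100_runPtycho_01.sbatch; sleep 1800; done

    # Option 2: chain the new job behind an already-submitted one, so the MATLAB
    # licenses are released before the new job starts (7238715 is the earlier JobID)
    sbatch --dependency=afterany:7238715 eth_2task_V100_runPtycho_01.sbatch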

Additional information

  • The hierarchy of the campus cluster is:
    • The cluster has a few partitions (queues), like eng-research and eng-research-gpu
    • Each partition has a number of compute nodes (the unit for resource requests), like ccc0076
    • Each compute node has a few CPUs (typically 2 sockets), potentially a few GPUs, tons of memory, and an interconnect to a larger storage
    • Each CPU has tens of independent cores (like 20 cores)
    • Each core is a physical unit that can process a single task/process at a time
    • “Threading” is the technique of virtually fitting more tasks onto each physical core
  • Each node can host only 1 user, so a node is also called a Host. This is the basic computation unit for a user. A node should be thought of as a physical entity (like a computer on a shelf)
  • Users request the resources they need so that a suitable node can be assigned.
    • If you request 1 CPU and 1 GPU but the node has 4 CPUs and 4 GPUs, you will not be able to access/use the remaining 3 CPUs and 3 GPUs.
    • The scheduler only grants you access to the specified resources on the node, not the entire node.
    • The usage fee is usually calculated per CPU-hour, so if you’re only using 1 CPU and the remaining 19 CPUs are idling, you will not be charged for the entire node.

The difference between jobs, steps, and tasks

https://stackoverflow.com/questions/46506784/how-do-the-terms-job-task-and-step-relate-to-each-other

job consists in one or more steps, each consisting in one or more tasks each using one or more CPU.

A job is like a feast that contains multiple dishes; each dish needs multiple steps, and each step may have a few tasks to process the ingredients.

Jobs can be big or small: a small job might run on a single node with 1 step and 1 task on a single CPU core, while a big job might have multiple steps on multiple nodes, each node using multiple CPU cores to process multiple tasks.

Jobs are typically created with the sbatch command, steps are created with the srun command, tasks are requested, at the job level with --ntasks or --ntasks-per-node, or at the step level with --ntasks. CPUs are requested per task with --cpus-per-task. Note that jobs submitted with sbatch have one implicit step; the Bash script itself.
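To make the terminology concrete, here is a minimal, hypothetical batch script annotated with where the job, the steps, and the tasks live (hostname is just a stand-in for real work):

    #!/bin/bash
    #SBATCH --job-name=demo        # this whole submission is ONE job
    #SBATCH --nodes=1
    #SBATCH --ntasks=2             # the job may run 2 tasks concurrently
    #SBATCH --cpus-per-task=4      # each task gets 4 CPU cores
    #SBATCH --time=00:10:00

    # The body of this script is the job's implicit (batch) step
    echo "Job $SLURM_JOB_ID started on $(hostname)"

    # Each srun launches one explicit step; each step here runs a single task
    srun --ntasks=1 hostname &     # step 0: 1 task, 4 CPUs
    srun --ntasks=1 hostname &     # step 1: 1 task, 4 CPUs
    wait                           # wait for both steps to finish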


V. My ptycho reconstruction setup

1. Workflow

  1. Do experiments on the microscope, and keep the experimental parameters in the project name
    • Like 0124_18mrad_cl185mm_20.5Mx_df0nm_10pA_step128/
    • To keep things tidy, I save all my EMPAD project folders under my own chiahao3/ folder. All the fresh EMPAD project folders are moved to experiment subfolders like 20221108_TEM_ThemisZ_80kV_WSe2_16_EMPAD/ at the end of the session. Therefore, there’s no need to create a bunch of project folders under the main empad_projects/ folder.
  2. Transfer the EMPAD data from the microscope computer to the support PC
  3. Upload the EMPAD data from the support PC to the cloud Box storage using Globus
    • No need to wait for the file transfer in person; everything can be done remotely!
    • Contact CQ to get access to the Globus endpoints on the Themis/Talos support PC
    • Although we can upload the files directly from the support PC to the campus cluster using Globus, I still prefer checking the data on my local workstation before uploading to the campus cluster
    • Box Sync will automatically download the files to my local workstation for further inspection/processing
  4. Log the experiment parameters into an EMPAD data summary Google spreadsheet
  5. Determine process priorities and reconstruction parameters
  6. Create corresponding script files
    • Matlab script runPtycho_cc.m
    • Slurm job script eth_2task_V100_runPtycho_01.sbatch
  7. Upload the job script files to home/ptycho_script/ and exp data to scratch/
  8. Log in to the head node and change directory to home/ptycho_script/<job_folder>; this is where the Slurm .sbatch job scripts are stored
  9. Submit your job scripts using the sbatch command
  10. Check on the job progress using squeue, sacct, and the txt log files/images with SFTP
    • The txt log files provide a convenient way of checking the reconstruction status, including any license issue, memory error, reconstruction error, or expected end time.
    • Typically I refresh the SFTP view to see if the txt file size gets larger (or check from a terminal; see the snippet after this list)
    • You may check the reconstructed images directly through SFTP as well
  11. Download the reconstruction results to the local workstation, which will be automatically synced to the cloud Box storage for permanent storage. (Don’t forget that files in scratch/ get purged after 30 days)
  12. Keep notes about the reconstruction results in the EMPAD data summary Google spreadsheet and presumably make a summary PPT slide
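If you happen to have a terminal open on the head node anyway, the progress check in step 10 also works without SFTP (the log file name below is just a placeholder):

    # See whether the txt log files in the job folder are still growing
    ls -lh ~/ptycho_script/job_20221109_Themis_1108_WSe2_16_EMPAD/*.txt

    # Follow one log live (Ctrl+C to stop)
    tail -f ~/ptycho_script/job_20221109_Themis_1108_WSe2_16_EMPAD/log_task01.txt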

2. Folder structure

  • Local workstation (Use Box Sync to sync with unlimited cloud Box storage)
    • D:\Box Sync\04_Scripting_Work\Ptychography\ptycho_script\ccluster_script\
      • Create separate job folders to store the scripts of ptycho reconstructions and the log files
  • Campus cluster
    • /home/chiahao3/ (This directory will not get purged but is limited to 7GB)


      • Save the fold_slice package here. The path to the package needs to be specified in the runPtycho_cc.m Matlab script
      • Create ptycho_script/ folder to keep all the job folders uploaded from local workstation


    • /scratch/users/chiahao3 (Files older than 30 days in this directory will be purged; 10TB hard limit)


      • Save the experiment folders here. Each experiment folder contains tens of EMPAD project folders. Each EMPAD project folder holds the raw data, and the reconstruction output will be saved here as well.


        • Use different scan_number to create different result folders


3. Code descriptions

  • Matlab script runPtycho_cc.m
    • I combined prepare_data.m (which generates the probes) and ptycho_electron.m into a single runPtycho_cc.m, with some rearrangement to put all common parameters at the front of the script.
    • I also added a few features, including jobID, dScan (downsampling factor), and output param files (in case we keep reusing the script file while changing parameters)
    • There are just too many parameters (experimental and reconstruction) to package the script into a function and pass a parameter file, so I decided to create an all-in-one script with the parameters inside to keep track of every major reconstruction trial.
  • Slurm job script eth_2task_V100_runPtycho_01.sbatch
    • Although each job script can contain tons of tasks across multiple nodes, I prefer using 1 node per job to keep things clearer. Also, cancelling a job is easier if it’s running on a single node.
    • Note that we choose a node with 2 GPU cards, so each sbatch script has 2 srun commands to run 2 independent ptycho reconstructions concurrently, 1 reconstruction per GPU.
    • In the script header, the partition (queue), number of CPUs, GPUs, memory, and tasks are explicitly specified (see the skeleton after this list).
  • Additional scripts that might be handy
    • Grab the reconstructed object/probe images by walking through every EMPAD scan folder
    • Read the EMPAD scan folder names and generate corresponding Matlab scripts
    • Read the EMPAD scan folder names and generate corresponding
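For orientation, here is a skeleton of what such a 2-task, 2-GPU job script can look like. This is a hedged sketch rather than the actual eth_2task_V100_runPtycho_01.sbatch: the module name, the MATLAB invocation, the resource numbers, and the runPtycho_cc_01.m / runPtycho_cc_02.m driver names are assumptions you would adapt to your own setup.

    #!/bin/bash
    #SBATCH --job-name=ptycho_2task
    #SBATCH --partition=eng-research-gpu     # queue with the 2x V100 nodes
    #SBATCH --nodes=1                        # keep each job on a single node
    #SBATCH --ntasks=2                       # 2 reconstructions in parallel
    #SBATCH --cpus-per-task=4                # assumption: adjust to your workload
    #SBATCH --gres=gpu:V100:2                # both V100 cards on the node
    #SBATCH --mem=170G                       # assumption: stay under the node's allocatable RAM
    #SBATCH --time=2-00:00:00                # 2 days of wall time (assumption)
    #SBATCH --output=log_%j.txt

    module load matlab                       # assumption: the module name may differ

    # One reconstruction per GPU; the two driver scripts are placeholders for
    # copies of runPtycho_cc.m with different parameters
    srun --ntasks=1 --gres=gpu:V100:1 --exclusive \
        matlab -nodisplay -nosplash -r "run('runPtycho_cc_01.m'); exit" > log_task01.txt 2>&1 &
    srun --ntasks=1 --gres=gpu:V100:1 --exclusive \
        matlab -nodisplay -nosplash -r "run('runPtycho_cc_02.m'); exit" > log_task02.txt 2>&1 &
    wait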

4. Template scripts

See my github repo https://github.com/chiahao3/ptycho

  • Matlab script runPtycho_cc.m
  • Slurm job script eng_2task_V100_runPtycho_01.sbatch
  • ptycho_recon_image_stack_generator.ipynb
