COMP Slurm Docs

Requirements

  • Confirm with your Professor or TA that your course has access to Slurm
  • Confirm that you have an active Computer Science account
  • Confirm that you can ssh to mimi.cs.mcgill.ca with SSH keys from your client machine

Resources

Host            GPU Specs
gpu-teach-01    10 x NVIDIA RTX A2000 12GB
gpu-teach-02    10 x NVIDIA RTX A2000 12GB
gpu-teach-03    4 x NVIDIA RTX 5000 32GB
gpu-grad-01     10 x NVIDIA RTX A5000 24GB
gpu-grad-02     8 x NVIDIA RTX A5000 24GB

Limits

(normal QOS; your account's default may vary)

  • Jobs per user = 2
  • Maximum run time = 4 hours
  • Maximum CPU cores = 16
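
The batch files later on this page request run time in D-HH:MM form (for example 0-2:00). As a rough sketch, the bash helpers below convert such a string to minutes and compare it with the 4-hour limit listed above. The function names to_minutes and within_limit are hypothetical, only the D-HH:MM form is handled, and the 240-minute value assumes the normal QOS.

```bash
# Convert a Slurm time string in D-HH:MM form (e.g. 0-2:00) to minutes.
# Only this one form is handled here; Slurm accepts several others.
to_minutes() {
    local spec=$1 days hours mins
    if [[ $spec =~ ^([0-9]+)-([0-9]{1,2}):([0-9]{2})$ ]]; then
        days=${BASH_REMATCH[1]}
        hours=${BASH_REMATCH[2]}
        mins=${BASH_REMATCH[3]}
        # 10# forces base 10 so values like "08" are not read as octal
        echo $(( 10#$days * 1440 + 10#$hours * 60 + 10#$mins ))
    else
        echo "unsupported time format: $spec" >&2
        return 1
    fi
}

# Succeed if the requested time fits the 4-hour limit (assumed: normal QOS).
within_limit() {
    local limit_minutes=240 requested
    requested=$(to_minutes "$1") || return 2
    (( requested <= limit_minutes ))
}
```

For example, `within_limit 0-2:00 && echo "ok to submit"` prints the message, while `within_limit 0-5:00` returns a non-zero status.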

Easy HowTo

  • ssh cs-username@mimi.cs.mcgill.ca - Connect to a mimi node
  • module load slurm - Load Slurm module
  • srun -p all --mem=1GB -t 1:00:00 --ntasks=1 batch.sh - Run your code/commands

Info

Check the state of the Slurm cluster using: sinfo
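
If the default sinfo listing is too broad, a per-node view can be filtered. The sketch below is one possible approach, not an official recipe: idle_gpu_nodes is a hypothetical helper name, and it assumes GPU nodes advertise a generic resource containing the string "gpu" and that free nodes report the state "idle".

```bash
# List nodes that advertise a GPU resource and are currently idle.
# sinfo flags: -N one line per node, -h no header,
#              -o "%N %G %T" prints node name, generic resources (GRES), state.
idle_gpu_nodes() {
    sinfo -N -h -o "%N %G %T" |
        awk '$2 ~ /gpu/ && $3 == "idle" { print $1 }' |
        sort -u   # a node in several partitions is listed once per partition
}
```

Run `idle_gpu_nodes` on mimi after `module load slurm`; an empty result means no node matched both conditions at that moment.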

Advanced Usage

  • ssh to mimi.cs.mcgill.ca
  • Create your own batch file, e.g. myfile.sh
#!/bin/bash
#
#SBATCH -p all # partition (queue)
#SBATCH -c 4 # number of cores
#SBATCH --mem=4G # memory per node
#SBATCH --propagate=NONE # IMPORTANT for long jobs
#SBATCH -t 0-2:00 # time (D-HH:MM)
#SBATCH -o slurm.%N.%j.out # STDOUT
#SBATCH -e slurm.%N.%j.err # STDERR
#SBATCH --qos=QOS_FOR_COURSE_OR_PI # Ask your TA/PI which QOS to use
#SBATCH --account=SEMESTER-COURSE # Ask your TA/PI for which account to use
module load miniconda/miniconda-fall2024 # Load necessary modules
# Add your Python runs, etc. below

Check the supported Python modules on the Slurm nodes (modules.md)

  • module load slurm - Load Slurm module
  • submit your job
sbatch myfile.sh
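
When scripting around a submission it helps to capture the job ID. The following bash sketch wraps sbatch with its --parsable option; submit_job is a hypothetical helper name, and the commented usage lines assume you are on mimi with the Slurm module loaded.

```bash
# Submit a batch file and print only the numeric job ID.
# --parsable makes sbatch print "jobid" or "jobid;cluster" instead of
# the usual "Submitted batch job <id>" sentence.
submit_job() {
    local out
    out=$(sbatch --parsable "$1") || return 1
    echo "${out%%;*}"   # drop the optional ";cluster" suffix
}

# Example use:
#   jobid=$(submit_job myfile.sh)
#   squeue -j "$jobid"      # show the job's state
#   scancel "$jobid"        # cancel it if it is no longer needed
```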

Using VScode Remotely with Slurm

  • Confirm you have met the above requirements

Setup mimi account

  • ssh to mimi.cs.mcgill.ca
  • On mimi run the command: vscode-remote-setup

  • Create your own vscode file, e.g. myvscode.sh (you can copy the contents below and customize them)
#!/bin/bash
#
#SBATCH -p all # partition (queue)
#SBATCH -c 4 # number of cores
#SBATCH --mem=4G # memory per node
#SBATCH --gpus=1 # Make sure that number is within what is allowed by the QOS
#SBATCH --propagate=NONE # IMPORTANT for long jobs
#SBATCH -t 0-2:00 # time (D-HH:MM)
#SBATCH -o slurm.%N.%j.out # STDOUT
#SBATCH -e slurm.%N.%j.err # STDERR
#SBATCH --qos=QOS_FOR_COURSE_OR_PI # Ask your TA/PI which QOS to use
#SBATCH --account=SEMESTER-COURSE # Ask your TA/PI for which account to use
#SBATCH --signal=B:TERM@60 # send SIGTERM to the batch shell 60 seconds before the time limit
### store relevant environment variables to a file in the home folder
env | awk -F= '$1~/^(SLURM|CUDA|NVIDIA_)/{print "export "$0}' > ~/.slurm-envvar.bash

module load dropbear # Necessary module to access the Slurm node

cleanup() {
    echo "Caught signal - removing SLURM env file"
    rm -f ~/.slurm-envvar.bash
}
trap 'cleanup' SIGTERM

### start the dropbear SSH server
# Make sure you change PORT_CHANGE_ME to the port given when you ran vscode-remote-setup
dropbear \
    -r ~/.dropbear/server-key -F -E -w -s -p PORT_CHANGE_ME \
    -P ~/.dropbear/var/run/dropbear.pid

Make sure you have updated the following placeholders in the above file:

  • QOS_FOR_COURSE_OR_PI
  • SEMESTER-COURSE
  • PORT_CHANGE_ME (use value from vscode-remote-setup)
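
Editing the three placeholders by hand works fine. As an alternative, here is a minimal bash sketch that substitutes them with sed and writes the result to standard output. fill_placeholders is a hypothetical helper, the values in the usage line are made up, and the substitution assumes none of the values contain "/" or "&".

```bash
# Print a copy of a template batch file with the three placeholders replaced.
# Arguments: template file, QOS, account, port.
# Every occurrence is replaced, including ones inside comments.
fill_placeholders() {
    local template=$1 qos=$2 account=$3 port=$4
    sed -e "s/QOS_FOR_COURSE_OR_PI/$qos/g" \
        -e "s/SEMESTER-COURSE/$account/g" \
        -e "s/PORT_CHANGE_ME/$port/g" \
        "$template"
}
```

For instance, `fill_placeholders myvscode.sh demo-qos demo-account 2222 > myvscode.filled.sh` leaves the template untouched and writes the customized copy to a new file.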

Now you are ready to submit your vscode job to enable remote access.

  • module load slurm - Load Slurm module
  • submit your job
sbatch myvscode.sh
  • Take note of the job ID printed by sbatch and run the following to get the NODE_NAME to access with VScode
squeue --job Your_Job_ID
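
To extract just the node name rather than reading it from the table, squeue can be asked for a single column. This is a sketch under the assumption that the job runs on one node; job_node is a hypothetical helper name.

```bash
# Print the node a job is running on, or fail if none is assigned yet.
# squeue flags: -j select the job, -h no header, -o "%N" print only the node list.
job_node() {
    local node
    node=$(squeue -j "$1" -h -o "%N") || return 1
    # Strip whitespace; a pending job has an empty node list.
    node=${node//[[:space:]]/}
    [ -n "$node" ] || return 1
    echo "$node"
}
```

A call such as `job_node Your_Job_ID` prints the NODE_NAME once the job is running and returns a non-zero status while the job is still pending.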

Setup your VScode machine

From the previous step you will need to know:

  • CS_USERNAME
  • PORT_CHANGE_ME
  • NODE_NAME

Add the following to ~/.ssh/config on your client/home machine, replacing CS_USERNAME, NODE_NAME and PORT_CHANGE_ME with your own values

Host mimi
    User CS_USERNAME
    HostName mimi.cs.mcgill.ca

Host cs-slurm
    HostName NODE_NAME
    ProxyJump mimi
    User CS_USERNAME
    Port PORT_CHANGE_ME
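
Because NODE_NAME can change with each new job, it may be convenient to generate these two entries instead of editing them by hand. The bash function below is only an illustration: print_ssh_config is a hypothetical name, and the arguments in the usage note are made-up values.

```bash
# Print the two ssh config entries for the given username, node and port.
print_ssh_config() {
    local user=$1 node=$2 port=$3
    cat <<EOF
Host mimi
    User $user
    HostName mimi.cs.mcgill.ca

Host cs-slurm
    HostName $node
    ProxyJump mimi
    User $user
    Port $port
EOF
}
```

Running `print_ssh_config jdoe gpu-teach-01 2222` on your client machine prints the block so you can review it before pasting it into ~/.ssh/config. Replace any earlier cs-slurm entry rather than adding a second one, since ssh uses the first matching value it finds.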

You can now use VScode with the Remote-SSH Extension and select cs-slurm as the remote host

  • Note: you might need to enter your SSH passphrase twice if you have not added your SSH key to your agent
  • Note: you will have to accept the SSH host key the first time you connect after running vscode-remote-setup

If you run into problems, please send an email to science.it@mcgill.ca with the subject [SLURM].