Running dockerized Jupyter Lab/Notebook on HPC systems
This post illustrates the steps to run any dockerized Jupyter Lab/Notebook on an HPC system.
Creating the required Apptainer Container
For this example, we will use the NGIMS Metagenomics-Analysis-of-Biofilm-Microbiome project, as it has a Dockerfile with all the definitions and dependencies required to run the notebooks. The rest of the process assumes you have built the Docker image from the Dockerfile. Details on how to build a Docker image from a Dockerfile are found here.
Since most HPCs do not support Docker containers (primarily for security reasons: Docker requires root privileges, whereas Apptainer does not), we need to create an Apptainer image from the Docker image. Steps to create an Apptainer image based on Docker containers (either private or public) are found here.
If you have a Linux system and would like to try running Apptainer locally, the steps to install Apptainer on WSL (or any Debian-based Linux distro) are found here.
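As a quick sketch of the prerequisite step above (building the Docker image from the project's Dockerfile), assuming a hypothetical image tag `biofilm-notebook`:

```shell
# Hypothetical tag; run from the directory containing the Dockerfile.
docker build -t biofilm-notebook:latest .
# Confirm the image now exists locally.
docker images biofilm-notebook
```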
Apptainer Image from a Docker Image:
The following command converts a locally available Docker image to an Apptainer image (a common approach when you have both Docker and Apptainer available on your system):
apptainer build <apptainer_image_name>.sif docker-daemon://<docker_image_name>:<tag>
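For instance, assuming the local Docker image is tagged `biofilm-notebook:latest` (a hypothetical name), the conversion would look like:

```shell
# Build an Apptainer SIF image from the local Docker daemon's image.
apptainer build biofilm-notebook.sif docker-daemon://biofilm-notebook:latest
```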
To get the Docker Image from Docker Hub (Public Image)
apptainer pull docker://<docker image>
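For example, pulling the public jupyter/datascience-notebook image (the image the script below happens to reference) yields a file named after the image and tag, `datascience-notebook_latest.sif`:

```shell
# Pull a public image from Docker Hub; Apptainer writes <name>_<tag>.sif
apptainer pull docker://jupyter/datascience-notebook
ls datascience-notebook_latest.sif
```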
To get the Docker Image from Docker Hub (Private Image)
We first need to log into Docker Hub to pull a private Docker image (i.e., one stored in your own account). To log in, we can use the following command:
apptainer remote login --username myuser docker://docker.io
Running the above command opens a prompt asking for a token or password.
Note: This type of login stores the token/password to /home/myuser/.apptainer/remote.yaml
We can use a one-off login if we do not want to store the token. For a one-off login, use the --docker-login flag:
apptainer pull --docker-login docker://myuser/private
Jupyter from the Container
Once the Apptainer image is built, the Jupyter Notebook can be launched from within the container. We need to create an SSH tunnel to access the notebook hosted on the HPC. The following script can be used to launch the lab/notebook environment and create the tunnel.
#!/bin/bash
PASSWORD_LOCATION=${HOME}/.jupyter/jupyter_notebook_config.json
CHECK_PASSWORD=true
TIMELIMIT="02:00:00"
CPUS_PER_TASK=2
MEM_PER_CPU=3800
ENVIRONMENT="jupyter" # default conda environment name (see the -n option)
JOB_NAME="apptainer-jupyter-notebook"
LOGIN_HOST=$(hostname -s)
NODE_TYPE="nodes" # change this to the partition to be used
GPU_TYPE="ampere" # change the GPU type as per your cluster
NO_OF_GPU=1 # change as per necessity
function check_python {
    if command -v python &> /dev/null ; then
        echo "found python"
        python_exe=python
    elif command -v python3 &> /dev/null ; then
        echo "found python3"
        python_exe=python3
    else
        echo "Missing python and python3, we need a python interpreter to continue."
        echo "Exiting..."
        exit 1
    fi
}
function password_set {
    $python_exe - << END
import sys
import json
with open("$PASSWORD_LOCATION") as config_file:
    data = json.load(config_file)
for n in data['NotebookApp']:
    if 'password' == n:
        sys.exit(0)
sys.exit(1)
END
}
# Countdown function to delay jupyter notebook startup but still show output to user
countdown()
(
    IFS=:
    set -- $*
    secs=$(( ${1#0} * 3600 + ${2#0} * 60 + ${3#0} ))
    while [ $secs -gt 0 ]
    do
        sleep 1 &
        printf "\r%02d:%02d:%02d" $((secs/3600)) $(( (secs/60)%60 )) $((secs%60))
        secs=$(( secs - 1 ))
        wait
    done
    echo
)
# export function to make sure it can be used in subshells
export -f countdown
# Get all the passed in options
# Leading colon enables silent error handling so the \? and : cases below fire
while getopts ":dhn:t:c:m:" opt; do
    case ${opt} in
        d )
            CHECK_PASSWORD=false
            ;;
        n )
            ENVIRONMENT=$OPTARG
            ;;
        h )
            echo "Usage: $(basename $0) [-h] [-d] [-n ENVIRONMENT] [-c CPUS] [-t TIMELIMIT] [-m MEM]"
            echo " -h Display this message"
            echo " -d Don't check to see if a password is set (default checks for password)"
            echo " -n Name of conda environment (default=jupyter) that jupyter is"
            echo "    installed into. If you pass -n base it will use the base environment"
            echo " -t Timelimit of job to run (default=${TIMELIMIT})"
            echo " -c Number of cpus per task to pass to srun (default=$CPUS_PER_TASK)"
            echo " -m Memory per cpu (default=${MEM_PER_CPU} megs)"
            exit 0
            ;;
        t )
            TIMELIMIT=$OPTARG
            ;;
        c )
            CPUS_PER_TASK=$OPTARG
            ;;
        m )
            MEM_PER_CPU=$OPTARG
            ;;
        \? )
            echo "Invalid option: $OPTARG" 1>&2
            exit 1
            ;;
        : )
            echo "Invalid option: $OPTARG requires an argument" 1>&2
            exit 1
            ;;
    esac
done
shift $((OPTIND - 1))
# check for python or python3 and set the proper binary name
check_python
# Check for password
if $CHECK_PASSWORD ; then
    # Assume we do need to set it
    NEED_TO_SET=true
    # See if the config file exists
    if [ -d "$(dirname $PASSWORD_LOCATION)" ] && [ -f "$PASSWORD_LOCATION" ]; then
        if password_set; then
            # Password is already set
            NEED_TO_SET=false
        fi
    elif [ ! -d "$(dirname $PASSWORD_LOCATION)" ]; then
        # If the directory for the jupyter notebook config file that contains
        # the password does not exist, jupyter will probably crash and
        # complain. Ensure the folder exists.
        mkdir -p "$(dirname $PASSWORD_LOCATION)"
    fi
    if $NEED_TO_SET; then
        echo "You need to set a password for jupyter notebook before you begin"
        echo "Running \`jupyter notebook password\`"
        jupyter notebook password
    fi
fi
# Start an interactive job on which to run the Jupyter Notebook
srun -c $CPUS_PER_TASK -t $TIMELIMIT --mem-per-cpu $MEM_PER_CPU -J $JOB_NAME -p $NODE_TYPE --pty bash -c '
#source ~/anaconda3/etc/profile.d/conda.sh
#conda activate '$ENVIRONMENT'
module load apptainer
LOCAL_JN_PORT=$(expr 50000 + ${SLURM_JOBID: -4})
SSH_CTL=$TMPDIR/.ssh-tunnel-control
ssh -f -g -N -M -S $SSH_CTL \
-R *:$LOCAL_JN_PORT:localhost:$LOCAL_JN_PORT \
'$LOGIN_HOST'
echo
echo "========================================================"
echo
echo "Your Jupyter Notebook is now running"
echo "To Connect:"
echo "1) Mac/Linux/MobaXterm users: run the following command FROM A NEW LOCAL TERMINAL WINDOW (not this one)"
echo
echo "ssh -L${LOCAL_JN_PORT}:localhost:${LOCAL_JN_PORT} $USER@'$LOGIN_HOST'.lawrence.usd.edu"
echo
echo "For other users (PuTTY, etc) create a new SSH session and tunnel port $LOCAL_JN_PORT to localhost:$LOCAL_JN_PORT"
echo
echo "========================================================"
echo "Starting Jupyter Notebook in..."
countdown "00:00:10"
echo "========================================================"
echo
echo "Connect to your jupyter notebook with the following links."
echo "You need to use the password that was set for your jupyter notebook instance."
echo " http://localhost:$LOCAL_JN_PORT/ or"
echo " http://127.0.0.1:$LOCAL_JN_PORT/"
echo
echo "If you do not remember your password, you can set a new password by:"
echo " 1) exiting with Ctrl+c"
echo " 2) loading your conda jupyter environment"
echo " conda activate '$ENVIRONMENT'"
echo " 3) setting a password"
echo " jupyter notebook password"
echo " 4) re-running this startup script"
echo
sleep 5
echo "========================================================"
echo
apptainer run datascience-notebook_latest.sif jupyter notebook --no-browser --port=$LOCAL_JN_PORT
sleep 3
ssh -S $SSH_CTL -O exit localhost > /dev/null 2>&1
sleep 3
exit'
Important notes: The script uses several parameters that can be modified as necessary.
TIMELIMIT defines how long the Jupyter Notebook will run; the time format is HH:MM:SS
CPUS_PER_TASK defines the number of CPU cores to be used
MEM_PER_CPU defines the RAM per CPU core, specified in MB. Append G to the number to specify GB (e.g., 8G). To allocate all the available memory on the node, use 0.
apptainer run datascience-notebook_latest.sif jupyter notebook --no-browser --port=$LOCAL_JN_PORT
This is the line in the script that actually starts the container, so the name of the .sif file should be changed as necessary.
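As a side note, the script derives a unique port for each job from the Slurm job ID (50000 plus the job ID's last four digits), so concurrent jobs don't collide on the login node. A minimal sketch with a made-up job ID:

```shell
# Made-up job ID for illustration; on the cluster, Slurm sets this variable.
SLURM_JOBID=1234567
# Same arithmetic as the script: 50000 + last four digits of the job ID.
LOCAL_JN_PORT=$(expr 50000 + ${SLURM_JOBID: -4})
echo "$LOCAL_JN_PORT"   # 54567
```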
NB: Apptainer allows us to access GPUs; to use a GPU, the --nv
flag must be provided when running the container. But first, we have to change how the resources are allocated, since the above srun command requests an interactive job on a compute (CPU) node. To change the type of node where the Apptainer container starts:
srun -c $CPUS_PER_TASK -t $TIMELIMIT --mem-per-cpu $MEM_PER_CPU -J $JOB_NAME -p $NODE_TYPE --gres=gpu:$GPU_TYPE:$NO_OF_GPU --pty bash -c '
Hence, the command to start the notebook with GPU support would be
apptainer run --nv datascience-notebook_latest.sif jupyter notebook --no-browser --port=$LOCAL_JN_PORT
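Before launching Jupyter, one way to confirm the container actually sees the GPU (assuming the node has the NVIDIA driver installed) is to run nvidia-smi inside it:

```shell
# --nv binds the host's NVIDIA driver and devices into the container.
apptainer exec --nv datascience-notebook_latest.sif nvidia-smi
```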
Once all the changes are made, save the script with a .sh
extension on the HPC cluster, in the same directory as the .sif
file.
You can also start the Jupyter Lab server using:
apptainer run datascience-notebook_latest.sif jupyter lab --no-browser --port=$LOCAL_JN_PORT
Starting the notebook
To start the notebook, use the command
bash <bash_file_name>.sh
Creating an SSH Tunnel
To create an SSH tunnel, copy and paste the line from the screen into a new terminal in MobaXterm (or any terminal); the line starts with ssh and has the following pattern.
ssh -L5XXXX:localhost:5XXXX <usd_username>@'<login_node>'.lawrence.usd.edu
Launch the Notebook from the Local Machine
Once the tunnel is in place and Jupyter Lab starts, the notebook can be opened on the local machine using the link printed on screen, which has the following pattern.
http://127.0.0.1:5XXXX
P.S.: Thanks to Bill Conn of the USDRcg team for the core script.