HPC with Apptainer

This guide describes how to run the Thesis framework on a shared HPC cluster (for example an HLS-HPC / SLURM system) using Apptainer (formerly Singularity) together with NeuroDesk transparent-singularity. This is how the pipeline was actually run on the author’s cluster.

For the local-workstation and Docker install paths — and a quick comparison of when to use each — see Installation and Setup Guide.

Why containers on HPC

  • No root required. Apptainer runs unprivileged, so you can install the entire neuroimaging toolchain in user space without asking an administrator to add system packages.

  • Reproducible. Each tool (FSL, FreeSurfer, ANTs, and — for the MRtrix3 workflows — MRtrix3) is pinned to an exact container version, so a run on the cluster matches a run anywhere else with the same images.

  • GPU passthrough. Apptainer exposes the host GPUs to the container with --nv, so GPU-accelerated steps (e.g. probtrackx2_gpu, FireANTs) work where hardware allows.

  • Transparent binaries. transparent-singularity generates thin wrapper scripts that put bet, probtrackx2, recon-all, antsRegistration, mri_synthseg, etc. on your PATH. The framework calls them as if they were installed natively — no per-call apptainer exec boilerplate.
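
To make the wrapper idea concrete, the sketch below writes such a wrapper by hand. It is illustrative only: make_wrapper and CONTAINER_RUNTIME are hypothetical names invented for this example, and the scripts that transparent-singularity actually generates differ in detail.

```shell
#!/usr/bin/env bash
# Illustrative stand-in for a generated wrapper script.
# make_wrapper and CONTAINER_RUNTIME are hypothetical names for this sketch.

# make_wrapper <wrapper_dir> <tool> <image>
# Writes <wrapper_dir>/<tool>, a script that forwards all of its arguments
# to the same tool inside <image> via "<runtime> exec".
make_wrapper() {
  local dir="$1" tool="$2" image="$3"
  mkdir -p "$dir"
  cat > "$dir/$tool" <<EOF
#!/usr/bin/env bash
"\${CONTAINER_RUNTIME:-apptainer}" exec "$image" "$tool" "\$@"
EOF
  chmod +x "$dir/$tool"
}

# Example (paths are placeholders):
#   make_wrapper "$ROOT/containers/demo" bet /path/to/fsl.sif
#   PATH="$ROOT/containers/demo:$PATH" bet in.nii.gz out.nii.gz
```

With the wrapper directory on PATH, a call to bet looks native to the caller while the work happens inside the image.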

Prerequisites

  • Apptainer / Singularity must already be available on the cluster (apptainer --version or singularity --version). It is normally provided by the site as a module or a system package — see the Apptainer install docs if it is not.

  • transparent-singularity — the wrapper-generator repository, cloned once per tool: NeuroDesk/transparent-singularity.

  • Bulk storage — somewhere with tens of GB free for the container images, conda environment, and Apptainer caches. On most clusters this is a project/group filesystem rather than $HOME.
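
The first prerequisite can be checked with a small helper. This is a minimal sketch; find_container_runtime is a hypothetical name, and it only tests whether either command is on PATH, not whether the installed version is recent enough.

```shell
#!/usr/bin/env bash
# find_container_runtime: print the first of apptainer/singularity found on
# PATH, or return 1 if neither is available. (Hypothetical helper name.)
find_container_runtime() {
  local candidate
  for candidate in apptainer singularity; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "Neither apptainer nor singularity is on PATH; check your site's modules" >&2
  return 1
}

# Example:
#   runtime=$(find_container_runtime) && "$runtime" --version
```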

Storage layout

Keep small, frequently read configuration in $HOME and put the bulk data (containers, conda environment, caches) on the group filesystem. Throughout this guide $ROOT is a placeholder for that bulk storage root — set it to your own path, for example:

export ROOT=/path/to/bulk/storage     # e.g. /rg/<group>/users/<you>

The setup creates the following layout under $ROOT:

$ROOT/
├── containers/        # one transparent-singularity wrapper dir per tool
├── conda/
│   ├── envs/          # the pipeline conda environment
│   └── pkgs/          # conda package cache
├── .apptainer/
│   ├── cache/         # APPTAINER_CACHEDIR
│   └── tmp/           # APPTAINER_TMPDIR
└── freesurfer/
    ├── subjects/      # SUBJECTS_DIR
    └── license.txt    # FreeSurfer license (you provide this)

A shared env.sh

The key to a reproducible setup is a single env.sh that is sourced by both login shells and sbatch jobs, so interactive and batched runs see exactly the same environment. It exports the Apptainer cache/tmp locations, a bind path so containers can see the group filesystem, the FSL output type, the FreeSurfer subjects directory and license, and — crucially — prepends each transparent-singularity wrapper directory to PATH.

# $ROOT/env.sh — sourced by login shells AND sbatch jobs
export ROOT=/path/to/bulk/storage

# Apptainer caches off $HOME
export APPTAINER_CACHEDIR="$ROOT/.apptainer/cache"
export APPTAINER_TMPDIR="$ROOT/.apptainer/tmp"

# Make the group filesystem visible inside every container
export SINGULARITY_BINDPATH=/path/to/group/filesystem   # e.g. /rg/<group>

# Tool environment
export FSLOUTPUTTYPE=NIFTI_GZ
export SUBJECTS_DIR="$ROOT/freesurfer/subjects"
export FS_LICENSE="$ROOT/freesurfer/license.txt"

# Put the transparent-singularity wrapper binaries on PATH
for d in "$ROOT"/containers/*/; do
  [ -d "$d" ] && PATH="$d:$PATH"
done
export PATH
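
Sourcing env.sh more than once in the same session (for example from ~/.bashrc and then again by hand) prepends the same wrapper directories each time. That is harmless but clutters PATH. A guarded variant of the loop might look like the sketch below; path_prepend is a hypothetical helper name, not part of the framework.

```shell
#!/usr/bin/env bash
# path_prepend <dir>: put <dir> at the front of PATH unless it is already
# listed. (Hypothetical helper name.)
path_prepend() {
  local dir="${1%/}"          # drop the trailing slash left by the */ glob
  case ":$PATH:" in
    *":$dir:"*) ;;            # already on PATH: nothing to do
    *) PATH="$dir:$PATH" ;;
  esac
}

# Possible replacement for the PATH loop in env.sh.
for d in "${ROOT:-/path/to/bulk/storage}"/containers/*/; do
  [ -d "$d" ] && path_prepend "$d"
done
export PATH
```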

Source it once from ~/.bashrc, guarded so it is only sourced when the file exists:

# in ~/.bashrc
[ -f /path/to/bulk/storage/env.sh ] && source /path/to/bulk/storage/env.sh

In SLURM batch scripts, source the same file at the top of the job:

#!/usr/bin/env bash
#SBATCH --gpus=1
source /path/to/bulk/storage/env.sh
thesis run -w full_pipeline -p 114823 -c full_pipeline
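
env.sh does not activate the conda environment, so the job above relies on the thesis command already being reachable inside the job. The sketch below generates a per-subject job script that activates the environment explicitly. Its assumptions: write_job_script is a hypothetical helper, conda is on PATH inside the job, the environment lives at $ROOT/conda/envs/thesis, and -p takes the subject ID as in the example above.

```shell
#!/usr/bin/env bash
# write_job_script <subject_id> <out_file>
# Writes a SLURM batch script for one subject. (Hypothetical helper name.)
write_job_script() {
  : "${ROOT:?set ROOT to the bulk storage root first}"
  local subject="$1" out="$2"
  cat > "$out" <<EOF
#!/usr/bin/env bash
#SBATCH --gpus=1
#SBATCH --job-name=thesis_${subject}
source "${ROOT}/env.sh"
# Non-interactive batch shells do not read ~/.bashrc, so load the conda
# shell hook explicitly before activating the environment.
source "\$(conda info --base)/etc/profile.d/conda.sh"
conda activate "${ROOT}/conda/envs/thesis"
thesis run -w full_pipeline -p ${subject} -c full_pipeline
EOF
}

# Example:
#   write_job_script 114823 job_114823.sh && sbatch job_114823.sh
```

Generating one script per subject keeps each submission inspectable after the fact, at the cost of a few small files.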

Installing the tool containers

For each tool, clone transparent-singularity and run its install script. The script pulls the named container and generates the wrapper binaries into a per-tool directory; the --singularity-opts '--nv' flag enables GPU passthrough.

cd "$ROOT/containers"
git clone https://github.com/neurodesk/transparent-singularity

# Run once per tool/version you need:
./transparent-singularity/run_transparent_singularity.sh \
  fsl_6.0.7.22_20260416 --singularity-opts '--nv'

./transparent-singularity/run_transparent_singularity.sh \
  freesurfer_8.2.0_20260601 --singularity-opts '--nv'

./transparent-singularity/run_transparent_singularity.sh \
  ants_2.6.5_20260225 --singularity-opts '--nv'

Container names follow the NeuroDesk <tool>_<version>_<date> convention. Before running the install script, look up the exact <tool>_<version>_<date> string in the NeuroDesk applications list (or the NeuroDesk docs) and pin that build; the example names above will fail to pull if those dates are not published. Each invocation creates a $ROOT/containers/<name>/ directory whose wrapper scripts are picked up by the PATH loop in env.sh.
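
A malformed name is easier to catch before a pull is attempted. The sketch below splits a name into its three parts. It assumes the tool part contains no underscore and the date is eight digits, which is inferred from the example names above rather than from a NeuroDesk specification. parse_container_name is a hypothetical helper.

```shell
#!/usr/bin/env bash
# parse_container_name <name>
# Prints "<tool> <version> <date>" for a <tool>_<version>_<date> name and
# returns 1 for anything else. (Hypothetical helper name.)
parse_container_name() {
  local name="$1" date rest tool version
  date="${name##*_}"       # text after the last underscore
  rest="${name%_*}"        # everything before the last underscore
  tool="${rest%%_*}"       # text before the first underscore
  version="${rest#*_}"     # whatever sits in between
  case "$date" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
    *) echo "bad or missing date in: $name" >&2; return 1 ;;
  esac
  if [ "$rest" = "$name" ] || [ "$tool" = "$rest" ] || [ -z "$tool" ] || [ -z "$version" ]; then
    echo "not a <tool>_<version>_<date> name: $name" >&2
    return 1
  fi
  printf '%s %s %s\n' "$tool" "$version" "$date"
}

# Example:
#   parse_container_name fsl_6.0.7.22_20260416   # prints: fsl 6.0.7.22 20260416
```

The check is purely syntactic; it cannot tell whether a build with that name is actually published.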

This recipe installs FSL, FreeSurfer, and ANTs — the tools the ProbTrackX2 pipeline needs. For the mrtrix3 / full_pipeline_mrtrix3 workflows, add an MRtrix3 container the same way, e.g.:

./transparent-singularity/run_transparent_singularity.sh \
  mrtrix3_3.0.4_20260101 --singularity-opts '--nv'

Adjust the version and date to a build published on NeuroDesk, and add the name to the CONTAINERS array in the idempotent script below.

Conda environment for the pipeline

The external tools live in containers, but the framework itself still needs a Python environment. Point conda’s package and environment directories at $ROOT (so nothing lands in $HOME), use conda-forge with strict channel priority, and create a minimal Python environment. We deliberately create a bare env here (just Python + pip) rather than from environment.yml: on HPC the neuroimaging tools (FSL, ANTs, FreeSurfer) come from the containers, so the conda ants package shipped by environment.yml is redundant. The Python dependencies are then pulled by pip install -e . from the package metadata.

conda config --add pkgs_dirs "$ROOT/conda/pkgs"
conda config --add envs_dirs "$ROOT/conda/envs"
conda config --add channels conda-forge
conda config --set channel_priority strict

conda create -y -p "$ROOT/conda/envs/thesis" python=3.11 pip
conda activate "$ROOT/conda/envs/thesis"
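
Before installing anything with pip, it can help to confirm that the intended environment is the active one, so packages do not land in a base or system Python. A minimal sketch, relying on the CONDA_PREFIX variable that conda sets on activation; check_active_env is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# check_active_env <expected_prefix>
# Succeeds only if the active conda environment is <expected_prefix>.
# (Hypothetical helper name.)
check_active_env() {
  local expected="${1%/}"
  if [ "${CONDA_PREFIX:-}" = "$expected" ]; then
    return 0
  fi
  echo "expected conda env $expected, but active env is ${CONDA_PREFIX:-none}" >&2
  return 1
}

# Example:
#   check_active_env "$ROOT/conda/envs/thesis" && pip --version
```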

Then install the package itself from a clone of the repository; pip resolves the Python dependencies (nipype, nibabel, numpy, scipy, pydantic, …) declared in pyproject.toml:

git clone https://gitlab.liu.se/pasje442/thesis.git
cd thesis
pip install -e ".[dev]"

To train the learned_atlas cohort workflow on the cluster, also install the ml extra (pip install -e ".[dev,ml]"); it brings in torch>=2.0.0, which the training step uses on the GPU exposed via --nv. See Installation and Setup Guide (section 2.3) for the details of this optional extra.

FreeSurfer license

mri_synthseg and other FreeSurfer tools require a (free) license file. Obtain one from the FreeSurfer site and drop it at the path env.sh points to:

cp /path/to/license.txt "$ROOT/freesurfer/license.txt"
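
A missing license usually shows up only when a FreeSurfer tool fails partway through a job, so a quick check up front is cheap insurance. The sketch below tests that the file exists, is non-empty, and is readable. It does not validate the license contents. check_fs_license is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# check_fs_license [path]
# Checks the given path, or $FS_LICENSE when no argument is passed.
# (Hypothetical helper name.)
check_fs_license() {
  local license="${1:-${FS_LICENSE:-}}"
  if [ -z "$license" ]; then
    echo "FS_LICENSE is not set" >&2
    return 1
  fi
  if [ ! -s "$license" ]; then
    echo "license file is missing or empty: $license" >&2
    return 1
  fi
  if [ ! -r "$license" ]; then
    echo "license file is not readable: $license" >&2
    return 1
  fi
  echo "license OK: $license"
}

# Example (after sourcing env.sh):
#   check_fs_license
```

The file must also be visible inside the container, which holds here as long as $ROOT sits under the bind path exported in env.sh.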

Verification

After sourcing env.sh, confirm the wrapped binaries resolve to the container scripts and the conda environment is active:

which bet probtrackx2 recon-all antsRegistration mri_synthseg
thesis info        # checks FSL and ANTs detection

# A quick functional smoke test
bet --help
probtrackx2 --help
antsRegistration --version
mri_synthseg --help

If any binary is missing, re-check that its $ROOT/containers/<name>/ directory exists and that env.sh has been sourced in the current shell.
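
The same check can be scripted so that it also flags binaries that resolve to something outside the wrapper directories, such as a stray native install earlier on PATH. A minimal sketch; check_wrappers and its output labels are hypothetical.

```shell
#!/usr/bin/env bash
# check_wrappers <containers_dir> <binary>...
# Reports where each binary resolves and returns 1 if any is missing or
# resolves outside <containers_dir>. (Hypothetical helper name.)
check_wrappers() {
  local root="${1%/}" bin path status=0
  shift
  for bin in "$@"; do
    path="$(command -v "$bin" 2>/dev/null)" || path=""
    case "$path" in
      "$root"/*) echo "ok       $bin -> $path" ;;
      "")        echo "missing  $bin"; status=1 ;;
      *)         echo "foreign  $bin -> $path"; status=1 ;;
    esac
  done
  return "$status"
}

# Example:
#   check_wrappers "$ROOT/containers" bet probtrackx2 recon-all antsRegistration mri_synthseg
```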

Reference: idempotent setup script

The blocks above can be assembled into a single idempotent script. Run it once to provision the cluster; re-running it is safe and only fills in what is missing.

#!/usr/bin/env bash
# setup_neuro_tools.sh — neuroimaging toolchain for an HPC cluster.
# Config lives in $HOME; bulk data (containers, conda, caches) lives on $ROOT.
set -euo pipefail

ROOT=/path/to/bulk/storage              # bulk storage root
NEED_GB=40                              # free space required for the containers
CONTAINERS=(
  fsl_6.0.7.22_20260416
  freesurfer_8.2.0_20260601
  ants_2.6.5_20260225
  # mrtrix3_3.0.4_20260101   # uncomment for the mrtrix3 / full_pipeline_mrtrix3 workflows
)
TS_REPO=https://github.com/neurodesk/transparent-singularity

# 1. Create the storage layout.
mkdir -p "$ROOT"/{containers,conda/{envs,pkgs},.apptainer/{cache,tmp},freesurfer/subjects}

# 2. Space gate — bail out early if the filesystem is too full.
avail_gb=$(df -BG --output=avail "$ROOT" | tail -1 | tr -dc '0-9')
[ "$avail_gb" -ge "$NEED_GB" ] || { echo "Need ${NEED_GB} GB free on $ROOT"; exit 1; }

# 3. Write $ROOT/env.sh (sourced by login shells and sbatch jobs).
cat > "$ROOT/env.sh" <<EOF
export ROOT="$ROOT"
export APPTAINER_CACHEDIR="\$ROOT/.apptainer/cache"
export APPTAINER_TMPDIR="\$ROOT/.apptainer/tmp"
export SINGULARITY_BINDPATH=/path/to/group/filesystem
export FSLOUTPUTTYPE=NIFTI_GZ
export SUBJECTS_DIR="\$ROOT/freesurfer/subjects"
export FS_LICENSE="\$ROOT/freesurfer/license.txt"
for d in "\$ROOT"/containers/*/; do [ -d "\$d" ] && PATH="\$d:\$PATH"; done
export PATH
EOF

# 4. Add a guarded source line to ~/.bashrc (only once).
grep -qF "source $ROOT/env.sh" ~/.bashrc 2>/dev/null || \
  echo "[ -f $ROOT/env.sh ] && source $ROOT/env.sh" >> ~/.bashrc
source "$ROOT/env.sh"

# 5. Conda channels and directories on $ROOT.
conda config --add pkgs_dirs "$ROOT/conda/pkgs"
conda config --add envs_dirs "$ROOT/conda/envs"
conda config --add channels conda-forge
conda config --set channel_priority strict

# 6. Install each tool container via transparent-singularity (GPU passthrough).
cd "$ROOT/containers"
[ -d transparent-singularity ] || git clone "$TS_REPO"
for name in "${CONTAINERS[@]}"; do
  [ -d "$ROOT/containers/$name" ] || \
    ./transparent-singularity/run_transparent_singularity.sh \
      "$name" --singularity-opts '--nv'
done

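# 7. Conda environment for the pipeline (bare Python; deps come from pip install -e .).
#    Guarded: conda create -y on an existing prefix would replace the environment.
[ -d "$ROOT/conda/envs/thesis/conda-meta" ] || \
  conda create -y -p "$ROOT/conda/envs/thesis" python=3.11 pip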

# 8. Verify the key binaries are on PATH.
for bin in bet probtrackx2 recon-all antsRegistration mri_synthseg; do
  command -v "$bin" >/dev/null || echo "WARNING: $bin not on PATH"
done

echo "Done. Remember to drop your FreeSurfer license at $ROOT/freesurfer/license.txt"

Replace /path/to/bulk/storage and /path/to/group/filesystem with your cluster’s paths, and adjust the container versions in CONTAINERS to builds that are actually published on NeuroDesk.