Installation and Setup Guide#

The main way to install the framework is from the project’s GitLab package registry: pip install thesis against the LiU GitLab PyPI index (section 1). Every tagged release is built and published there by CI, so you get a versioned wheel without cloning the source. Use the source/editable install (section 2) only when you are developing the framework itself.

Whichever install you choose, you then run the framework in one of three execution environments:

  1. Local workstation — a Python environment plus the package, with the external neuroimaging tools installed separately. Best for development and interactive runs.

  2. Docker container — a single GPU-capable image that bundles FSL (CPU and GPU), ANTs, MRtrix3, and SynthSeg. The same image runs CPU-only on a plain docker run and GPU-accelerated with docker run --gpus all. Best for reproducible batch processing. See section 3.

  3. HPC via Apptainer — pull each tool as its own Apptainer/Singularity container using NeuroDesk transparent-singularity, then run under SLURM. See HPC with Apptainer.

Quick pick: install from the GitLab registry (section 1) for normal use; use the source install (section 2) for development; use the Docker image for reproducible batch processing (CPU or GPU); use the HPC/Apptainer path when you have no root on a shared cluster.



2. Source / Editable Install (development)#

Use this path when you are modifying the framework. Most workflows also require external neuroimaging tools (FSL, MRtrix3, FreeSurfer/SynthSeg) that conda does not install for you. Read section 2.4 and install the tools your workflows need before running anything.

2.1 Clone the repository#

git clone https://gitlab.liu.se/pasje442/thesis.git
cd thesis

2.2 Create the conda environment#

environment.yml installs the Python dependencies and ANTs (ants>=2.6.5 from conda-forge):

conda env create -f environment.yml
conda activate thesis

If you prefer not to use conda, you can use a virtual environment instead, but you must then install ANTs yourself (see the prerequisites below):

python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate

2.3 Install the package (editable)#

pip install -e ".[dev]"          # development dependencies
pip install -e ".[dev,docs]"     # also include documentation dependencies
pip install -e ".[ml]"           # PyTorch, only for the learned_atlas workflow

If you only want to run the framework (not modify it), prefer the registry install in section 1 instead of cloning and editable-installing.

The ml extra brings in torch>=2.0.0 and is required only for training the learned_atlas cohort workflow. torch is lazy-imported inside that workflow’s training step, so the CLI, package import, and every other workflow work without it — install [ml] only if you run learned_atlas.
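The lazy-import pattern described above can be sketched as follows. This is a minimal illustration, not the framework's actual code: the helper name require_optional, the function training_step, and the error message are hypothetical. Only the module name torch and the ml extra come from the text above.

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import an optional dependency only when it is actually needed.

    Hypothetical helper; the real learned_atlas workflow may do this differently.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this step; install the [{extra}] extra"
        ) from exc


def training_step():
    # The import happens at call time, so importing the package or running
    # any other workflow never touches torch.
    torch = require_optional("torch", extra="ml")
    return torch.__version__
```

Because the import lives inside the function body, a missing torch only surfaces when training_step is called, and the error names the extra to install.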

2.4 Prerequisites (external tools)#

The framework orchestrates several external neuroimaging tools. Each tool’s install is more involved than a one-liner, so the canonical install page is linked below — follow it and make sure the named binaries are on your PATH. Only the tools your chosen workflows use are required.

| Tool | What it is for / which workflows need it | Install |
| --- | --- | --- |
| FSL | ProbTrackX2 tractography and preprocessing (probtrackx2, bedpostx, dtifit, topup, eddy, fslmaths). Needed by hcp, preprocess, the ProbTrackX2 backend of tract_synthseg/full_pipeline, and the tract_similarity* and qc workflows. | Official FSL install docs |
| ANTs | Registration and transform application (antsRegistration, antsApplyTransforms, N4, DenoiseImage). Needed by registration, transform, atlas_to_patient, full_pipeline, and the N4 step of preprocess. | Provided by environment.yml (conda). Otherwise conda install -c conda-forge ants (conda-forge). venv users: install the ANTs release binaries and put them on your PATH. |
| FreeSurfer / SynthSeg | Brain segmentation via mri_synthseg. Needed by synthseg, tract_synthseg, the SynthSeg step of preprocess, and full_pipeline. | FreeSurfer download & install (ships mri_synthseg) or standalone SynthSeg |
| MRtrix3 | MRtrix3 tractography (5ttgen, dwi2response, dwi2fod, mtnormalise, tckgen, tcksift2, tckmap). Needed by mrtrix3 and the MRtrix3 backend of tract_synthseg/full_pipeline. | MRtrix3 download |
| FireANTs (optional) | Alternative GPU registration backend, used when registration.method: fireants. Requires a CUDA GPU when registration.fireants.device: cuda (the default). | FireANTs (pip install fireants) |
| FSLeyes (optional) | QC viewer the registration workflow can auto-open when registration.viewer.auto_open: true. Bundled with FSL. | Installed with FSL (above) |

As the comment on the ANTs conda pin notes, versions older than 2.6.5 (e.g. 2.2.0) write sform_code=0 into transformed images, which can crash ProbTrackX2 with the error inv(): matrix is singular. Keep the ants>=2.6.5 pin.
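If ANTs comes from somewhere other than environment.yml, the minimum version can be checked programmatically. The sketch below is an illustration under assumptions: parse_version and ants_is_supported are hypothetical names, and the code assumes the text printed by antsApplyTransforms --version contains a dotted x.y.z string, which may vary between builds.

```python
import re

MIN_ANTS = (2, 6, 5)  # minimum version named in this guide


def parse_version(text: str):
    """Return the first x.y.z found in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def ants_is_supported(version_output: str) -> bool:
    """True if the reported ANTs version is at least MIN_ANTS."""
    version = parse_version(version_output)
    return version is not None and version >= MIN_ANTS
```

To use it, capture the tool's output (for example with subprocess.run(["antsApplyTransforms", "--version"], capture_output=True, text=True).stdout) and pass the string to ants_is_supported. Comparing integer tuples avoids the string-comparison trap where "2.10.0" sorts before "2.6.5".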

2.5 Verify the installation#

thesis --version
python -c "import thesis; print(thesis.__version__)"

# External tools (verify the ones your workflows use)
antsApplyTransforms --version
probtrackx2 --help
mrconvert --version       # MRtrix3
mri_synthseg --help       # FreeSurfer / SynthSeg
thesis info

thesis info prints the version, Python/platform details, dependency versions, and checks for FSL and ANTs only. It does not verify MRtrix3, FreeSurfer/SynthSeg, FSLeyes, or FireANTs — confirm those with the per-tool commands above.

A healthy report shows the package version and a found status for the tools you use, for example:

thesis 0.5.0
...
FSL:  found (/usr/local/fsl)
ANTs: found (antsApplyTransforms 2.6.5)

A not found for a tool you do not use is expected and harmless — only the tools required by your chosen workflows (see the table in section 2.4) need to resolve.
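Since thesis info covers only FSL and ANTs, a small script can report PATH resolution for the remaining tools in one pass. This is a sketch, not part of the framework: locate_tools and report are hypothetical names, and the binaries are the ones used in the per-tool commands above. It only confirms that a binary resolves on PATH, not that the tool runs correctly.

```python
import shutil

# One representative binary per tool, taken from the verification commands above.
TOOL_BINARIES = {
    "ANTs": "antsApplyTransforms",
    "FSL": "probtrackx2",
    "MRtrix3": "mrconvert",
    "FreeSurfer/SynthSeg": "mri_synthseg",
}


def locate_tools(binaries=TOOL_BINARIES):
    """Map each tool name to its resolved binary path, or None if not on PATH."""
    return {tool: shutil.which(binary) for tool, binary in binaries.items()}


def report(found):
    """Format one status line per tool."""
    return [f"{tool}: {path or 'not found'}" for tool, path in found.items()]
```

Calling print("\n".join(report(locate_tools()))) prints one line per tool. As with thesis info, a not found entry matters only for tools your workflows use.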

2.6 Run your first workflow#

With the package installed and verified, run a workflow once to confirm the pipeline wiring end to end.

  1. Place your input data. Each workflow reads a per-patient input directory under paths.inputs_dir. The default is data/raw/<patient_id>/; the hcp protocol overrides this to data/processed/hcp/<patient_id>/ (see Configuration Guide). Override either with the paths.inputs_dir config key.

  2. Get a subject. The examples throughout the docs use the HCP subject 114823. Bring your own HCP-preprocessed subject and lay it out as the hcp protocol expects:

    data/processed/hcp/114823/
      T1w/
        T1w_acpc_dc_restore_1.25.nii.gz
        Diffusion/            # bvals, bvecs, data.nii.gz, nodif_brain_mask.nii.gz
        Diffusion.bedpostX/   # merged samples, dyads, ...
    

    (HCP-preprocessed subjects are distributed by the Human Connectome Project; the Docker image bundles Aspera Connect for that download — see section 3.)

  3. Dry-run first, then run for real. --dry-run builds the workflow graph without executing any tool, so it surfaces configuration and path errors in seconds:

    thesis run -w hcp -p 114823 -c default --dry-run
    thesis run -w hcp -p 114823 -c default
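Steps 2 and 3 can be scripted so that a missing input is caught even before the dry run. The sketch below is an illustration: missing_inputs and run_command are hypothetical helpers, and the expected entries are copied from the hcp layout shown above (the real workflow may require more files than the ones listed there).

```python
from pathlib import Path

# Relative paths from the hcp layout shown above.
EXPECTED = [
    "T1w/T1w_acpc_dc_restore_1.25.nii.gz",
    "T1w/Diffusion/bvals",
    "T1w/Diffusion/bvecs",
    "T1w/Diffusion/data.nii.gz",
    "T1w/Diffusion/nodif_brain_mask.nii.gz",
    "T1w/Diffusion.bedpostX",
]


def missing_inputs(subject_dir):
    """Return the expected entries that do not exist under subject_dir."""
    root = Path(subject_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


def run_command(workflow, patient, config, dry_run=False):
    """Build the argument list for `thesis run` as used in this guide."""
    cmd = ["thesis", "run", "-w", workflow, "-p", patient, "-c", config]
    if dry_run:
        cmd.append("--dry-run")
    return cmd
```

A typical use is to stop if missing_inputs("data/processed/hcp/114823") is non-empty, then call subprocess.run(run_command("hcp", "114823", "default", dry_run=True), check=True) and repeat without dry_run only when that succeeds.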
    

For the full menu of workflows and what each one needs, see the Workflow Usage Guide and the CLI Reference.


3. Docker Container (CPU and GPU)#

The repository ships a single, GPU-capable, multi-stage Dockerfile assembled from pinned upstream tool images (a “dependent-container” build): ANTs comes from antsx/ants, MRtrix3 from mrtrix3/mrtrix3, SynthSeg/SynthStrip from freesurfer/freesurfer, and FSL (CPU and CUDA GPU binaries) from the FSL public conda channel. The same image runs CPU-only on a plain docker run and GPU-accelerated with docker run --gpus all.

3.1 Build (or pull)#

Tagged stable releases are published to the project’s GitLab container registry by CI (registry.gitlab.liu.se/pasje442/thesis:<tag> / :latest). You can also build it locally:

docker build -t thesis:latest .

3.2 Run on CPU#

No special flags — the image runs on any host. With no GPU visible, the framework’s GPU auto-detection (core/gpu.py) finds no nvidia-smi and falls back to the CPU probtrackx2:

docker run --rm \
  -v /path/to/data:/input:ro \
  -v /path/to/results:/output \
  -v /path/to/workdir:/scratch \
  thesis:latest \
  run -w hcp -p 114823 -c cloud

3.3 Run on GPU#

Add --gpus all. This requires the NVIDIA Container Toolkit on the host (register it once with sudo nvidia-ctk runtime configure --runtime=docker && sudo systemctl restart docker). The container ships no NVIDIA driver — the toolkit injects the host driver and nvidia-smi at run time. FSL’s GPU tools statically link the CUDA 11.0 runtime, so no CUDA toolkit is installed in the image; only a compatible host driver is needed.

docker run --rm --gpus all \
  -v /path/to/data:/input:ro \
  -v /path/to/results:/output \
  -v /path/to/workdir:/scratch \
  thesis:latest \
  run -w hcp -p 114823 -c cloud
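The CPU and GPU invocations differ only in the --gpus all flag, which makes them easy to generate from a batch script. The helper below is a hypothetical sketch (docker_run_args is not part of the framework); the mount points and image tag are the ones used in the commands above.

```python
def docker_run_args(input_dir, output_dir, scratch_dir, thesis_args,
                    gpu=False, image="thesis:latest"):
    """Build the `docker run` argument list used in sections 3.2 and 3.3."""
    cmd = ["docker", "run", "--rm"]
    if gpu:
        cmd += ["--gpus", "all"]  # needs the NVIDIA Container Toolkit on the host
    cmd += [
        "-v", f"{input_dir}:/input:ro",   # input data, mounted read-only
        "-v", f"{output_dir}:/output",    # results
        "-v", f"{scratch_dir}:/scratch",  # working directory
        image,
    ]
    # Everything after the image name is passed to the thesis entrypoint.
    return cmd + list(thesis_args)
```

Passing the result to subprocess.run(..., check=True) executes the container. Building the command as a list avoids shell quoting problems with paths that contain spaces.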

With --gpus all, core/gpu.py detects nvidia-smi reporting a CUDA version and the probtrackx2_gpu11.0 binary on $FSLDIR/bin, and selects the GPU tractography path (the same mechanism that picks bedpostx_gpu and eddy_cuda).

nvidia-smi must be visible inside the container. GPU selection is gated on nvidia-smi reporting CUDA Version: X.Y, which is provided by the utility driver capability. The image sets NVIDIA_DRIVER_CAPABILITIES=compute,utility for you — do not override it to just compute, or the GPU is silently unused even when CUDA compute works.
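The two-part gate described above (a CUDA Version banner from nvidia-smi plus the GPU binary under $FSLDIR/bin) can be sketched as follows. This is not the contents of core/gpu.py, which may differ in detail: the function names cuda_version_from_smi and pick_probtrackx2 are hypothetical, and the caller is assumed to supply the captured nvidia-smi output as a string (empty when the command is unavailable).

```python
import os
import re


def cuda_version_from_smi(smi_output: str):
    """Extract 'X.Y' from a 'CUDA Version: X.Y' banner, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def pick_probtrackx2(smi_output: str, fsldir: str) -> str:
    """Choose the GPU binary only when both gates described above pass."""
    gpu_binary = os.path.join(fsldir, "bin", "probtrackx2_gpu11.0")
    if cuda_version_from_smi(smi_output) and os.path.isfile(gpu_binary):
        return gpu_binary
    return os.path.join(fsldir, "bin", "probtrackx2")  # CPU fallback
```

The sketch shows why a compute-only capability setting leads to a silent fallback: without the utility capability there is no nvidia-smi output to parse, so the first gate fails and the CPU binary is returned without any error.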

On the workstation use docker run --gpus all; on the HPC cluster use Apptainer --nv (HPC with Apptainer) — both expose the host GPUs to the same GPU-aware binaries.

3.4 docker compose#

For ergonomic, repeatable runs, docker-compose.yml defines two services that share the one image — thesis (CPU) and thesis-gpu (with the NVIDIA GPU reservation):

# Build once
docker compose build

# CPU run
docker compose run --rm thesis run -w hcp -p 114823 -c cloud

# GPU run (host needs the NVIDIA Container Toolkit)
docker compose run --rm thesis-gpu run -w hcp -p 114823 -c cloud

Mount paths and the image tag are env-overridable, e.g. INPUT_DIR=/data OUTPUT_DIR=/results docker compose run --rm thesis-gpu info.

3.5 What the image bundles#

  • FSL (FSL public conda channel): CPU binaries (fsl-avwutils, fsl-bet2, fsl-flirt, fsl-fast, fsl-fdt, fsl-ptx2, fsl-eddy, fsl-topup) and the CUDA 11.0 GPU binaries (fsl-ptx2-cuda-11.0 → probtrackx2_gpu11.0, fsl-fdt-cuda-11.0 → xfibres_gpu driven by bedpostx_gpu, fsl-eddy-cuda-11.0 → eddy_cuda). fsl-fast is included because 5ttgen fsl (the MRtrix3 default 5TT algorithm) calls FSL fast.

  • ANTs 2.6.5: copied from antsx/ants:2.6.5 (antsRegistration, antsApplyTransforms, N4BiasFieldCorrection).

  • MRtrix3 3.0.8: copied from mrtrix3/mrtrix3:3.0.8 (mrconvert, 5ttgen, dwi2response, dwi2fod, mtnormalise, tckgen, tcksift2, tckmap, tckinfo, maskfilter, 5tt2gmwmi, …).

  • SynthSeg + SynthStrip: mri_synthseg, mri_synthstrip (the default brain-extraction tool) and their models/launcher copied from freesurfer/freesurfer (plus TensorFlow CPU and surfa), to avoid the full FreeSurfer tarball.

  • Aspera Connect (ascp) for HCP data download.

  • The thesis package itself.

Mount points are /input, /output, and /scratch; the container runs as a non-root user. ENTRYPOINT is thesis and the default CMD is info. The config directory is copied to /config and every *.example.yaml is renamed to *.yaml at build time, so cloud.yaml is available inside the image (read it for the default container paths).

GPU notes: SynthSeg/SynthStrip run on CPU TensorFlow in this image (their GPU acceleration is not bundled). FireANTs and FSLeyes are not bundled; for FireANTs GPU registration use a local install or the HPC/Apptainer setup.

FreeSurfer license: mri_synthseg and mri_synthstrip run license-free on FreeSurfer 8.x, so the image sets no FS_LICENSE. If a future/strict FreeSurfer build demands a license, run with -e FS_LICENSE=/license.txt -v /path/license.txt:/license.txt:ro (the compose file exposes an optional FS_LICENSE env var for this).

Image size: the single GPU-capable image is not smaller than the old CPU-only one. It trades size for simplicity and GPU support: the nvidia/cuda runtime base adds ~2 GB over a plain ubuntu base, and the re-added MRtrix3 + ANTs trees and the FSL GPU/vtk-base packages add more. The benefit is one image that works on both CPU and GPU hosts with no rebuild.


4. HPC via Apptainer#

On clusters where you have no root access, run each external tool as its own Apptainer/Singularity container using NeuroDesk transparent-singularity, then run the pipeline under SLURM. The full, idempotent recipe (bulk storage layout, env.sh, container install, conda env, and verification) is documented in HPC with Apptainer.


Configuration#

Hardware configuration#

Copy config/hardware.example.yaml to config/hardware.yaml and adjust it for your system (there is no checked-in hardware.yaml; it is generated from the template):

cp config/hardware.example.yaml config/hardware.yaml

Example:

hardware:
  threads: 8
  memory_gb: 16
  gpu_enabled: false
  n_gpu_procs: 1
  n_gpus: 1

Key fields:

  • threads: CPU threads available to workflow execution

  • memory_gb: memory budget for scheduling and local processing

  • gpu_enabled: enable GPU-aware node selection and scheduling

  • n_gpu_procs: concurrent GPU worker slots exposed to Nipype

  • n_gpus: physical GPUs visible to each worker slot
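For scripts that read this file, the fields above map naturally onto a small typed structure. The dataclass below is a sketch under assumptions: HardwareConfig and from_mapping are hypothetical names, the defaults simply mirror the example values above rather than the framework's real defaults, and the >= 1 checks are assumed sanity rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareConfig:
    threads: int = 8           # CPU threads available to workflow execution
    memory_gb: int = 16        # memory budget
    gpu_enabled: bool = False  # GPU-aware node selection and scheduling
    n_gpu_procs: int = 1       # concurrent GPU worker slots
    n_gpus: int = 1            # physical GPUs visible to each slot

    def __post_init__(self):
        # Assumed sanity rules; the framework's own validation may differ.
        for name in ("threads", "memory_gb", "n_gpu_procs", "n_gpus"):
            if getattr(self, name) < 1:
                raise ValueError(f"hardware.{name} must be >= 1")

    @classmethod
    def from_mapping(cls, config: dict) -> "HardwareConfig":
        """Build from a parsed config dict with a top-level 'hardware' key."""
        return cls(**config.get("hardware", {}))
```

The input dict would typically come from parsing config/hardware.yaml with a YAML library. Unknown keys raise a TypeError, which catches misspelled field names early.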

The config files merge hierarchically: default.yaml → hardware.yaml → protocol config → config/patients/{patient_id}.yaml → CLI flags. Path values may be relative (resolved from the project root) or absolute, and ~ and $ENV_VAR are expanded.
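A layered merge of this kind is commonly implemented as a recursive dictionary merge. The sketch below assumes that layers further along the chain override earlier ones and that nested sections merge key by key; both are assumptions about the framework's behavior. The names deep_merge, merge_layers, and resolve_path are hypothetical.

```python
import os
from pathlib import Path


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where override wins; nested dicts merge recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_layers(*layers: dict) -> dict:
    """Merge layers in order, lowest precedence first."""
    result: dict = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


def resolve_path(value: str, project_root: str) -> Path:
    """Expand $ENV_VAR and ~, then anchor relative paths at project_root."""
    expanded = Path(os.path.expanduser(os.path.expandvars(value)))
    return expanded if expanded.is_absolute() else Path(project_root) / expanded
```

With this approach a hardware.yaml that sets only hardware.threads leaves hardware.memory_gb from default.yaml intact, instead of replacing the whole hardware section.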


Running Tests#

pytest tests/unit/                              # unit tests only
pytest tests/                                   # all tests
pytest tests/ -m "not slow"                     # skip slow tests
pytest tests/ --cov=src/thesis --cov-report=term-missing   # with coverage
pytest tests/unit/test_bedpostx_paths.py -v     # a single test file

Building Documentation#

make docs            # sphinx-build -b html docs docs/build/html
make docs-strict     # warnings-as-errors (used in CI)

View the output:

open docs/build/html/index.html       # macOS
xdg-open docs/build/html/index.html   # Linux
start docs\build\html\index.html      # Windows

Development Setup#

Pre-commit hooks#

pre-commit install

This runs black, isort, flake8, and mypy on every commit, and the non-slow unit tests (pytest) on push.

Code quality#

black src/thesis tests/      # format (line length: 100)
isort src/thesis tests/      # sort imports
flake8 src/thesis tests/     # lint
mypy src/thesis              # type check
make check                   # lint + type-check verification

Troubleshooting#

Import errors#

  • Verify the package is installed (pip show thesis); if it is not, install it with pip install -e ".[dev]".

  • Confirm the thesis conda environment is active.

FSL / ANTs not found#

  • Verify the binaries are on your PATH.

  • Run thesis info to inspect FSL/ANTs detection (other tools are not checked there; verify them with the per-tool commands in section 2.5).

  • Check workflow-specific paths in config/default.yaml, config/protocols/*.yaml, and config/patients/*.yaml.