Tract Similarity Analysis Workflow

The thesis.workflows.tract_similarity package compares a patient’s native-space ProbTrackX2 tract density against the cohort mean atlas warped into the same patient’s native space. Metrics span four families: overlap (Dice, Jaccard), voxelwise correlation (Pearson, Spearman, cosine), spatial distance (Hausdorff-95, mean surface, centroid) and distribution similarity (NMI, symmetric KL, Bhattacharyya).

Invoke with thesis run -w tract_similarity -p <patient_id> for per-patient metrics and thesis run -w tract_similarity_cohort for cohort aggregation.

The companion tract_similarity_sweep workflow performs a cohort-wide grid search over the two binarisation thresholds (subject_threshold.value, atlas_threshold.value) and reports the cell that maximises mean (or median) Dice across all patients. Outputs: sweep_per_patient.csv, sweep_summary.csv, sweep_best.json, and an optional sweep_heatmap.png. Invoke with thesis run -w tract_similarity_sweep -c tract_similarity_sweep.

thesis.workflows.tract_similarity

Tract similarity analysis workflow.

Compares a patient’s native-space probtrackx2 tract density volume against the cohort mean atlas warped into the same patient’s native space, producing overlap, correlation, spatial-distance, and distribution-similarity metrics.

Registered as thesis run -w tract_similarity -p <patient_id> -c <config> for per-patient metrics, thesis run -w tract_similarity_cohort -c <config> for cohort-level aggregation, and thesis run -w tract_similarity_hcp_loo -c <config> for per-HCP-subject leave-one-out metrics vs the cohort atlas.

thesis.workflows.tract_similarity.workflow

Tract similarity analysis Nipype workflows.

Two workflows are registered here:

  • tract_similarity — per-patient. Loads the patient’s native-space probtrackx2 output and the warped cohort mean atlas, computes four families of similarity metrics (overlap, voxelwise correlation, spatial distance, distribution similarity), and writes metrics.json alongside the four NIfTI volumes used to compute those metrics: subject_normalized.nii.gz (waytotal-normalized probtrackx2 density), atlas_normalized.nii.gz (normalized warped atlas), subject_mask.nii.gz (thresholded subject binary mask), and atlas_mask.nii.gz (thresholded atlas binary mask).

  • tract_similarity_cohort — cohort-level. Aggregates every patient’s metrics.json into summary.csv, per_patient.csv, and outliers.json.

Prerequisites (verified pre-flight):

  • hcp workflow has produced tractography/probtrackx2/fdt_paths.nii.gz and waytotal for the patient (per-patient mode).

  • atlas -> atlas_to_patient has warped a cohort mean map into the patient’s native space (per-patient mode).

thesis.workflows.tract_similarity.workflow.build_workflow(*, config, context)

Build the per-patient tract_similarity workflow.

Inputs (probtrackx density, warped atlas) live under context.output_dir and are located via the ts.probtrackx_relpath and ts.atlas_relpath settings from config. Both settings support hemisphere-split layouts and glob patterns.

Parameters:
Return type:

Workflow

thesis.workflows.tract_similarity.workflow.build_cohort_workflow(*, config, context)

Build the cohort-level tract_similarity aggregation workflow.

Parameters:
Return type:

Workflow

thesis.workflows.tract_similarity.workflow.verify_requirements(config, context)

Pre-flight checks for the per-patient tract_similarity workflow.

Parameters:
Return type:

List[str]

thesis.workflows.tract_similarity.workflow.verify_cohort_requirements(config, context)

Pre-flight checks for the cohort aggregation workflow.

Parameters:
Return type:

List[str]

thesis.workflows.tract_similarity.workflow.apply_threshold(vol, mode, value)

Compute the binarisation cutoff for a volume.

mode="fraction" returns value * max(volume) (with a tiny floor to handle empty volumes); mode="absolute" returns value unchanged. Shared by the per-patient workflow and the threshold sweep so the cutoff formula has a single source of truth.

Parameters:
Return type:

float
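
The cutoff rule described above can be sketched as follows. This is an illustrative re-implementation, not the package's code: the function name threshold_cutoff, the floor value 1e-12, and the strict ">" comparison used to build the mask are all assumptions.

```python
import numpy as np


def threshold_cutoff(vol, mode, value, floor=1e-12):
    """Hypothetical sketch of the binarisation cutoff rule."""
    if mode == "fraction":
        # value * max(volume); the floor (assumed value) keeps an
        # all-zero volume from producing a cutoff of exactly zero.
        return max(value * float(np.max(vol)), floor)
    if mode == "absolute":
        # The configured value is used unchanged.
        return float(value)
    raise ValueError(f"unknown threshold mode: {mode!r}")


vol = np.array([0.0, 0.2, 0.5, 1.0])
cutoff = threshold_cutoff(vol, "fraction", 0.25)
mask = vol > cutoff  # strict comparison is an assumption
```

Keeping the formula in one function means the per-patient workflow and the sweep cannot drift apart on how a threshold value is interpreted.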

thesis.workflows.tract_similarity._metrics

Similarity metrics between two normalized 3-D tractography volumes.

All functions take two numpy.ndarray volumes of identical shape (already normalized to [0, 1]) and return plain float values suitable for JSON serialisation. Volumes may be continuous densities or binary masks depending on the metric family:

  • Overlap metrics take binary masks (callers threshold upstream).

  • Correlation / distribution metrics take continuous volumes.

  • Spatial-distance metrics take binary masks and a voxel-size tuple in mm.

The Dice implementation is imported from thesis.workflows.atlas.qc_metrics to avoid duplication.

thesis.workflows.tract_similarity._metrics.overlap_metrics(mask_probtrackx, mask_atlas)

Compute binary-overlap metrics between two masks of identical shape.

Parameters:
  • mask_probtrackx (ndarray) – Thresholded probtrackx2 mask (bool or 0/1).

  • mask_atlas (ndarray) – Thresholded atlas mask (bool or 0/1), identical shape.

Return type:

dict

Returns:

Dict with dice, jaccard, volume_ratio, volume_abs_diff, sensitivity, precision. Volumes are measured in voxel counts, not physical units.
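
A minimal sketch of how such overlap metrics are commonly computed. The conventions here are assumptions, not confirmed by the source: the atlas is treated as the reference for sensitivity, the probtrackx2 mask is the denominator for precision, and volume_ratio is subject over atlas. The name overlap_sketch is hypothetical.

```python
import numpy as np


def overlap_sketch(mask_probtrackx, mask_atlas):
    """Hypothetical overlap metrics on two binary masks of identical shape."""
    a = np.asarray(mask_probtrackx, dtype=bool)
    b = np.asarray(mask_atlas, dtype=bool)
    inter = int(np.count_nonzero(a & b))
    na, nb = int(np.count_nonzero(a)), int(np.count_nonzero(b))
    union = na + nb - inter
    nan = float("nan")
    return {
        "dice": 2.0 * inter / (na + nb) if (na + nb) else nan,
        "jaccard": inter / union if union else nan,
        "volume_ratio": na / nb if nb else nan,       # assumed: subject / atlas
        "volume_abs_diff": abs(na - nb),              # in voxels
        "sensitivity": inter / nb if nb else nan,     # assumed: atlas is reference
        "precision": inter / na if na else nan,
    }


overlap = overlap_sketch([1, 1, 1, 0], [0, 1, 1, 1])
```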

thesis.workflows.tract_similarity._metrics.correlation_metrics(vol_probtrackx, vol_atlas)

Compute voxelwise similarity metrics on continuous volumes.

Parameters:
  • vol_probtrackx (ndarray) – Normalized probtrackx2 density volume (float).

  • vol_atlas (ndarray) – Normalized atlas density volume (float), identical shape.

Return type:

dict

Returns:

Dict with pearson, spearman, cosine. Values in [-1, 1]. Returns NaN for metrics undefined on the given volumes (e.g., zero variance).
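
The three voxelwise metrics can be illustrated with NumPy alone. This is a sketch under assumptions: the helper names are hypothetical, Spearman is computed as Pearson on average ranks (ties share a rank), and whether the package masks out background voxels first is not stated in the source, so none is applied here.

```python
import numpy as np


def _pearson(x, y):
    xc, yc = x - x.mean(), y - y.mean()
    denom = np.linalg.norm(xc) * np.linalg.norm(yc)
    # Zero variance on either side leaves the coefficient undefined.
    return float(xc @ yc / denom) if denom > 0 else float("nan")


def _average_ranks(x):
    # 1-based ranks; tied values receive the mean of the ranks they span.
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)
    return (last - (counts - 1) / 2.0)[inverse]


def correlation_sketch(vol_a, vol_b):
    """Hypothetical voxelwise similarity on flattened continuous volumes."""
    x = np.asarray(vol_a, dtype=float).ravel()
    y = np.asarray(vol_b, dtype=float).ravel()
    norm = np.linalg.norm(x) * np.linalg.norm(y)
    return {
        "pearson": _pearson(x, y),
        "spearman": _pearson(_average_ranks(x), _average_ranks(y)),
        "cosine": float(x @ y / norm) if norm > 0 else float("nan"),
    }


corr = correlation_sketch([0, 1, 2, 3], [0, 1, 4, 9])
flat = correlation_sketch([1, 1, 1], [0, 1, 2])
```

In the example, the second volume is a monotone but non-linear function of the first, so the rank correlation reaches 1 while Pearson stays below it; the constant volume yields NaN for both correlation coefficients.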

thesis.workflows.tract_similarity._metrics.distance_metrics(mask_probtrackx, mask_atlas, voxel_size_mm=(1.0, 1.0, 1.0))

Compute spatial-distance metrics between two binary masks.

Parameters:
  • mask_probtrackx (ndarray) – Thresholded probtrackx2 mask (bool or 0/1).

  • mask_atlas (ndarray) – Thresholded atlas mask (bool or 0/1), identical shape.

  • voxel_size_mm (Tuple[float, float, float]) – Physical voxel size (sx, sy, sz) in mm for distance conversion. Extracted from the NIfTI header by callers.

Return type:

dict

Returns:

Dict with hausdorff_95, mean_surface, centroid (all in mm). Returns NaN for an empty mask side.
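
To show how voxel_size_mm enters the distance conversion, here is a sketch of the centroid distance only. The function name is hypothetical; the Hausdorff-95 and mean-surface distances are typically computed from surface voxels with a distance transform and are left out of this example.

```python
import numpy as np


def centroid_distance_mm(mask_a, mask_b, voxel_size_mm=(1.0, 1.0, 1.0)):
    """Hypothetical centroid distance between two binary masks, in mm."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    if not a.any() or not b.any():
        # An empty mask has no centroid.
        return float("nan")
    spacing = np.asarray(voxel_size_mm, dtype=float)
    # Mean voxel index per axis, scaled to physical units.
    centroid_a = np.argwhere(a).mean(axis=0) * spacing
    centroid_b = np.argwhere(b).mean(axis=0) * spacing
    return float(np.linalg.norm(centroid_a - centroid_b))


a = np.zeros((4, 4, 4), dtype=bool)
b = np.zeros((4, 4, 4), dtype=bool)
a[0, 0, 0] = True
b[3, 0, 0] = True
distance = centroid_distance_mm(a, b, voxel_size_mm=(2.0, 1.0, 1.0))
```

The two voxels sit three indices apart along the first axis, and with 2 mm spacing on that axis the distance comes out as 6 mm rather than 3.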

thesis.workflows.tract_similarity._metrics.distribution_metrics(vol_probtrackx, vol_atlas, n_bins=64)

Compute distribution-level similarity metrics on continuous volumes.

The volumes are treated as probability distributions over voxels (each is normalised to sum to 1 over its non-negative support). For NMI, both sides are discretised to n_bins equal-width bins over [0, max].

Parameters:
  • vol_probtrackx (ndarray) – Normalized probtrackx2 density (non-negative float).

  • vol_atlas (ndarray) – Normalized atlas density (non-negative float), identical shape.

  • n_bins (int) – Histogram bin count for NMI.

Return type:

dict

Returns:

Dict with nmi, kl_symmetric, bhattacharyya.
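
A hedged sketch of two of these quantities follows. Several choices are assumptions rather than facts about the package: the symmetric KL is taken as the sum KL(p‖q) + KL(q‖p), bhattacharyya is returned as the Bhattacharyya coefficient (1 for identical distributions) rather than a distance, and a small eps guards against log(0). NMI is omitted.

```python
import numpy as np


def distribution_sketch(vol_a, vol_b, eps=1e-12):
    """Hypothetical distribution-level similarity on non-negative volumes."""
    p = np.clip(np.asarray(vol_a, dtype=float).ravel(), 0.0, None) + eps
    q = np.clip(np.asarray(vol_b, dtype=float).ravel(), 0.0, None) + eps
    # Treat each volume as a probability distribution over voxels.
    p /= p.sum()
    q /= q.sum()
    kl_pq = float(np.sum(p * np.log(p / q)))
    kl_qp = float(np.sum(q * np.log(q / p)))
    return {
        "kl_symmetric": kl_pq + kl_qp,                      # assumed: sum, not mean
        "bhattacharyya": float(np.sum(np.sqrt(p * q))),     # assumed: coefficient
    }


same = distribution_sketch([0.1, 0.4, 0.5], [0.1, 0.4, 0.5])
disjoint = distribution_sketch([1.0, 0.0], [0.0, 1.0])
```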

thesis.workflows.tract_similarity._io

Volume loading, normalisation, and resolution helpers.

Handles:

  • Loading fdt_paths.nii.gz and dividing by a scalar waytotal to produce a probability-like density in approximately [0, 1].

  • Loading a warped atlas mean volume and min-max normalising it to [0, 1].

  • Extracting voxel spacing (mm) from a NIfTI header for distance metrics.

  • Resolving the warped-atlas file from either an explicit path or a glob pattern. The ANTs-generated suffix (e.g. _SyN_template_to_patient) varies with the registration transform type, so the glob fallback is useful.

thesis.workflows.tract_similarity._io.discover_patient_dirs(input_dir, *, sort=True)

Return the patient subdirectories under a cohort output root.

A patient directory is any immediate subdirectory of input_dir whose name is not one of the reserved cohort scratch names (cohort, temp, work). Shared by the cohort task bodies and verifiers so the scan rule has a single source of truth.

Parameters:
  • input_dir (Path | str) – Cohort output root to scan.

  • sort (bool) – When True (default), return the directories sorted by name.

Return type:

list[Path]

Returns:

List of patient directory paths.
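
The scan rule is simple enough to sketch with pathlib. The function name list_patient_dirs and the directory names in the example (sub-01, sub-02) are hypothetical; only the three reserved names come from the description above.

```python
import tempfile
from pathlib import Path

RESERVED_NAMES = {"cohort", "temp", "work"}  # reserved scratch names listed above


def list_patient_dirs(input_dir, *, sort=True):
    """Hypothetical sketch of the scan rule; not the package's actual code."""
    dirs = [
        p for p in Path(input_dir).iterdir()
        if p.is_dir() and p.name not in RESERVED_NAMES
    ]
    return sorted(dirs, key=lambda p: p.name) if sort else dirs


with tempfile.TemporaryDirectory() as root:
    for name in ("sub-02", "cohort", "sub-01", "work"):
        (Path(root) / name).mkdir()
    # Plain files at the top level are ignored.
    (Path(root) / "notes.txt").write_text("not a directory")
    found = [p.name for p in list_patient_dirs(root)]
```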

thesis.workflows.tract_similarity._io.load_fdt_paths_normalized(fdt_path, waytotal_path)

Load fdt_paths and divide by waytotal to give a probability-like volume.

Parameters:
  • fdt_path (Path | str) – Path to fdt_paths.nii.gz.

  • waytotal_path (Path | str) – Path to the waytotal text file.

Return type:

Tuple[ndarray, ndarray]

Returns:

Tuple (volume, affine) where volume is float32 in approximately [0, 1] (may exceed 1 in occasional high-count voxels) and affine is the 4x4 NIfTI affine from the fdt image.

thesis.workflows.tract_similarity._io.load_probtrackx_volume(probtrackx_dir, fdt_name='fdt_paths.nii.gz', waytotal_name='waytotal')

Load a patient’s probtrackx2 density, summing hemispheres when present.

Auto-detects layout:

  • Single run: probtrackx_dir/fdt_paths.nii.gz + probtrackx_dir/waytotal.

  • both-separately: sums the waytotal-normalised volumes from probtrackx_dir/left/ and probtrackx_dir/right/.

Matches how atlas._io._build_patient_volume combines hemispheres so that comparisons against the cohort mean atlas are like-for-like.

Parameters:
  • probtrackx_dir (Path | str) – Directory under the patient output (e.g. tractography/probtrackx2).

  • fdt_name (str) – Name of the fdt_paths file within each run directory.

  • waytotal_name (str) – Name of the waytotal text file within each run directory.

Return type:

Tuple[ndarray, ndarray]

Returns:

Tuple (volume, affine). Affine is taken from the first run loaded.

Raises:

FileNotFoundError – If no valid (fdt_paths, waytotal) pair is found.
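
The hemisphere-combination rule can be sketched on in-memory arrays, leaving file loading aside. Both function names are hypothetical, and rejecting a non-positive waytotal is an assumption about error handling. The point illustrated is the order of operations: each run is normalised by its own waytotal before the volumes are summed.

```python
import numpy as np


def normalize_by_waytotal(counts, waytotal):
    """Divide streamline counts by the run's waytotal (hypothetical helper)."""
    if waytotal <= 0:
        raise ValueError("waytotal must be positive")  # assumed behaviour
    return np.asarray(counts, dtype=np.float32) / np.float32(waytotal)


def combine_hemispheres(runs):
    """Sum waytotal-normalised volumes; runs is a list of (counts, waytotal)."""
    return sum(normalize_by_waytotal(counts, waytotal) for counts, waytotal in runs)


left = (np.array([10.0, 0.0]), 100.0)
right = (np.array([0.0, 50.0]), 200.0)
combined = combine_hemispheres([left, right])
```

Normalising after summing would weight the hemisphere with the larger waytotal more heavily, which is why the per-run division comes first in this sketch.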

thesis.workflows.tract_similarity._io.load_atlas_normalized(atlas_path)

Load a warped atlas map and min-max normalise to [0, 1].

Parameters:

atlas_path (Path | str) – Path to the warped atlas NIfTI in patient native space.

Return type:

Tuple[ndarray, ndarray]

Returns:

Tuple (volume, affine) where volume is float32 in [0, 1].
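
Min-max normalisation itself is a one-line rescaling, sketched here on an in-memory array. The function name is hypothetical, and returning all zeros for a constant volume is an assumption; the source does not say how that case is handled.

```python
import numpy as np


def min_max_normalize(vol):
    """Rescale a volume linearly so its minimum maps to 0 and maximum to 1."""
    vol = np.asarray(vol, dtype=np.float32)
    lo, hi = float(vol.min()), float(vol.max())
    if hi <= lo:
        # Constant volume: no range to rescale (assumed handling).
        return np.zeros_like(vol)
    return (vol - lo) / (hi - lo)


normalized = min_max_normalize([2.0, 4.0, 6.0])
```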

thesis.workflows.tract_similarity._io.resolve_atlas_file(base_dir, path_or_glob)

Resolve the warped atlas file, supporting either an explicit path or a glob.

Parameters:
  • base_dir (Path | str) – Patient output directory (used as the root for relative paths/globs).

  • path_or_glob (str) – Either an explicit relative path (e.g. "atlas_in_patient_space/atlas_mean_SyN_template_to_patient_in_patient_space.nii.gz") or a glob pattern (e.g. "atlas_in_patient_space/atlas_mean*.nii.gz").

Return type:

Path

Returns:

Resolved absolute path.

Raises:

thesis.workflows.tract_similarity._io.voxel_size_mm(affine)

Extract per-axis voxel spacing in mm from a NIfTI 4x4 affine.

Parameters:

affine (ndarray)

Return type:

Tuple[float, float, float]
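
A common way to recover voxel spacing from an affine is to take the Euclidean norm of each column of its upper-left 3x3 block, which stays correct when the axes are rotated. The sketch below assumes that approach; the function name is hypothetical and the source does not confirm the package uses this exact formula.

```python
import numpy as np


def voxel_size_from_affine(affine):
    """Per-axis voxel spacing as the column norms of the 3x3 block."""
    affine = np.asarray(affine, dtype=float)
    return tuple(float(s) for s in np.linalg.norm(affine[:3, :3], axis=0))


# Axis-aligned affine with 2 x 1.5 x 3 mm voxels.
spacing = voxel_size_from_affine(np.diag([2.0, 1.5, 3.0, 1.0]))

# Rotating the axes by 90 degrees about z leaves the spacing unchanged.
rotation = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
rotated = np.eye(4)
rotated[:3, :3] = rotation @ np.diag([2.0, 1.5, 3.0])
rotated_spacing = voxel_size_from_affine(rotated)
```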

thesis.workflows.tract_similarity.sweep

Cohort-wide grid search over tract_similarity binarisation thresholds.

Loads every patient’s normalised probtrackx2 density and warped atlas once, evaluates Dice across the configured 2-D grid of (subject_threshold.value, atlas_threshold.value), aggregates Dice across the cohort per grid cell, and reports the cell that maximises the chosen aggregation (mean or median) along with a CSV per-patient table, a CSV grid summary, a JSON best-cell record, and an optional heatmap PNG.

Registered as thesis run -w tract_similarity_sweep -c <config>. The sweep runs once for the whole cohort: there is no per-patient fan-out and no IdentityInterface gate.
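
The grid search described above can be sketched in a few lines. Everything here is illustrative: the function names, the use of fraction-mode thresholds with a strict ">" comparison, and the toy volumes are assumptions. The aggregation function is passed in, so np.nanmedian can replace np.nanmean.

```python
import numpy as np


def dice(a, b):
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else float("nan")


def sweep_sketch(pairs, subject_values, atlas_values, aggregate=np.nanmean):
    """Hypothetical cohort sweep; pairs is a list of (subject, atlas) volumes."""
    grid = np.empty((len(pairs), len(subject_values), len(atlas_values)))
    for p, (subj, atlas) in enumerate(pairs):
        for i, sv in enumerate(subject_values):
            for j, av in enumerate(atlas_values):
                # Fraction-mode cutoffs: value * max(volume).
                grid[p, i, j] = dice(subj > sv * subj.max(), atlas > av * atlas.max())
    summary = aggregate(grid, axis=0)  # one aggregated Dice per grid cell
    i, j = np.unravel_index(np.nanargmax(summary), summary.shape)
    return summary, (subject_values[i], atlas_values[j])


pairs = [
    (np.array([0.1, 0.5, 1.0, 0.0]), np.array([0.0, 0.6, 1.0, 0.05])),
    (np.array([0.0, 0.4, 1.0, 0.2]), np.array([0.0, 0.5, 1.0, 0.0])),
]
summary, best = sweep_sketch(pairs, [0.05, 0.3], [0.01, 0.3])
```

Loading each volume once and re-thresholding it per cell, as the description states, keeps the cost of the sweep dominated by cheap array comparisons rather than file reads.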

thesis.workflows.tract_similarity.sweep.build_sweep_workflow(*, config, context)

Build the cohort-wide tract_similarity threshold sweep workflow.

Parameters:
Return type:

Workflow

thesis.workflows.tract_similarity.sweep.verify_sweep_requirements(config, context)

Pre-flight checks for the cohort-wide sweep.

Parameters:
Return type:

List[str]

thesis.workflows.tract_similarity.hcp_loo

Cohort-scope workflow: per-HCP-subject leave-one-out similarity vs the cohort atlas.

Produces, for each numeric (HCP) subject under the cohort output directory, a metrics.json (and optionally four NIfTI volumes) under <subject>/<tract_similarity.output_subdir>/ mirroring the per-patient tract_similarity workflow. The per-subject reference is a leave-one-out cohort mean atlas, built in memory by subtracting that subject’s volume from the precomputed stack sum.

Registered as thesis run -w tract_similarity_hcp_loo -c <profile>.
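
The leave-one-out construction mentioned above reduces to one subtraction per subject once the stack sum is known. The sketch below assumes the cohort mean is a plain arithmetic mean over subjects; the function name is hypothetical.

```python
import numpy as np


def leave_one_out_means(volumes):
    """Return one leave-one-out mean per subject, stacked along axis 0."""
    stack = np.stack(volumes).astype(np.float64)
    n = stack.shape[0]
    if n < 2:
        raise ValueError("leave-one-out needs at least two subjects")
    total = stack.sum(axis=0)  # computed once for the whole cohort
    # Subtract each subject's own volume, then average over the other n - 1.
    return (total[None] - stack) / (n - 1)


volumes = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
loo = leave_one_out_means(volumes)
```

Compared with recomputing the mean from scratch for every subject, this costs a single pass over the stack plus one subtraction per subject.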

thesis.workflows.tract_similarity.hcp_loo.verify_requirements(config, context)

Pre-flight checks for tract_similarity_hcp_loo.

Returns a list of human-readable error strings; an empty list means the workflow is ready to build. The CLI surfaces these to the user verbatim.

Return type:

list

thesis.workflows.tract_similarity.hcp_loo.build_workflow(*, config, context)

Build the cohort-scope HCP-LOO tract similarity workflow.

Return type:

Workflow