Configuration
Hierarchical YAML configuration system built on Pydantic v2.
The package re-exports the most commonly used configuration classes, while the canonical API documentation lives on the concrete submodules below.
thesis.core.config.manager
Configuration manager for hierarchical config loading and merging.
- class thesis.core.config.manager.ConfigManager
Bases: object
Manages configuration loading, merging, and validation.
Supports hierarchical configuration with defaults, environment-specific overrides, and patient-specific configs. Uses Pydantic for validation and type safety.
Config hierarchy (later configs override earlier):
1. Default config
2. Hardware config
3. Protocol config
4. Patient-specific config
5. Runtime overrides
Example
>>> manager = ConfigManager(config_dir="./config")
>>> config = manager.load_config("default")
>>> config = manager.add_overrides(config, patient_id="DTI_LDF001")
- load_config(config_name='default', patient_id=None, protocol=None, overrides=None, protocol_required=False)
Load a configuration with optional overrides.
- Parameters:
config_name (str) – Name of the base config to load
patient_id (Optional[str]) – Optional patient ID for patient-specific config
protocol (Optional[str]) – Optional protocol name for protocol-specific config
overrides (Optional[Mapping[str, object]]) – Optional dictionary of runtime overrides
protocol_required (bool) – If True, raise ConfigurationError when protocol is given but no matching file exists. Set this when the protocol was explicitly requested by the user (e.g. via --protocol) rather than supplied as a workflow’s default_protocol fallback.
- Return type:
PipelineConfig
- Returns:
Validated PipelineConfig object
- Raises:
ConfigurationError – If the base config_name file is missing, or if protocol_required is True and the protocol file is missing.
Example
>>> config = manager.load_config(
...     config_name="default",
...     patient_id="DTI_LDF001",
...     overrides={"preprocessing": {"threads": 8}}
... )
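The protocol_required behaviour can be illustrated with a minimal sketch. It is not the ConfigManager implementation: the in-memory protocol_files mapping stands in for files on disk, and resolve_protocol_layer and the local ConfigurationError class are hypothetical names introduced only for this example.

```python
from typing import Mapping, Optional


class ConfigurationError(Exception):
    """Local stand-in for the package's ConfigurationError (illustration only)."""


def resolve_protocol_layer(
    protocol: Optional[str],
    protocol_files: Mapping[str, dict],
    protocol_required: bool = False,
) -> dict:
    """Return the protocol layer, or {} when no protocol layer applies."""
    if protocol is None:
        return {}
    if protocol in protocol_files:
        return dict(protocol_files[protocol])
    if protocol_required:
        # Explicit user request (e.g. --protocol): a missing file is an error.
        raise ConfigurationError(f"protocol config not found: {protocol}")
    # Workflow default_protocol fallback: a missing file is skipped.
    return {}


# Hypothetical protocol content, for illustration only.
available = {"dti": {"tractography": {"n_samples": 5000}}}
print(resolve_protocol_layer("dti", available))
print(resolve_protocol_layer("missing", available))
```

Raising only when the protocol was explicitly requested keeps workflow defaults forgiving while still surfacing typos in user-supplied protocol names.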
thesis.core.config.validators
Pydantic models for configuration validation.
These models ensure that configurations have the correct structure and valid values before being used in processing pipelines.
- class thesis.core.config.validators.NormalizationMethod
Method for normalizing streamline density volumes before atlas combination.
- Variables:
WAYTOTAL – Divide by waytotal (total streamline count). Default method for the FSL ProbTrackX2 backend; produces values representing the fraction of total streamlines passing through each voxel.
MAX – Divide by the maximum voxel value in the volume. Scales each volume to [0, 1] range.
SOFTMAX – Apply softmax normalization (exp(x) / sum(exp(x))). Produces a probability distribution that sums to 1.0 across the volume.
STREAMLINE_DENSITY – MRtrix3 counterpart of WAYTOTAL. Divides by the value stored in waytotal, which the MRtrix3 workflow writes as the sum of SIFT2 per-streamline weights when SIFT2 is enabled (matching the SIFT2-weighted TDI numerator) or the raw streamline count when SIFT2 is off. Produces a streamline fraction in [0, 1].
- WAYTOTAL = 'waytotal'
- MAX = 'max'
- SOFTMAX = 'softmax'
- STREAMLINE_DENSITY = 'streamline_density'
- __new__(value)
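The arithmetic behind each method can be sketched in plain Python. This is an illustration under stated assumptions, not the package's code: the volume is a flat list rather than a NIfTI array, the str/Enum base classes are assumed, and the normalize helper is a hypothetical name.

```python
import math
from enum import Enum
from typing import List


class NormalizationMethod(str, Enum):
    # Member values mirror the documented ones; the base classes are an assumption.
    WAYTOTAL = "waytotal"
    MAX = "max"
    SOFTMAX = "softmax"
    STREAMLINE_DENSITY = "streamline_density"


def normalize(values: List[float], method: NormalizationMethod,
              waytotal: float = 1.0) -> List[float]:
    """Normalize a flattened density volume (hypothetical helper)."""
    if method in (NormalizationMethod.WAYTOTAL, NormalizationMethod.STREAMLINE_DENSITY):
        # Both divide by the stored waytotal; they differ in how the backend writes it.
        if waytotal <= 0:
            raise ValueError("waytotal must be positive")
        return [v / waytotal for v in values]
    if method is NormalizationMethod.MAX:
        peak = max(values)
        return [v / peak for v in values] if peak > 0 else list(values)
    if method is NormalizationMethod.SOFTMAX:
        shift = max(values)  # subtract the maximum for numerical stability
        exps = [math.exp(v - shift) for v in values]
        total = sum(exps)
        return [e / total for e in exps]
    raise ValueError(f"unknown method: {method}")


volume = [0.0, 50.0, 200.0]
print(normalize(volume, NormalizationMethod.WAYTOTAL, waytotal=1000.0))
print(normalize(volume, NormalizationMethod.MAX))
print(sum(normalize(volume, NormalizationMethod.SOFTMAX)))
```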
- class thesis.core.config.validators.BaseConfig
Bases: BaseModel
Base config model that rejects unknown fields to catch typos early.
- Parameters:
data (Any)
- class thesis.core.config.validators.PathConfig
Bases: BaseConfig
Configuration for file paths and directories.
The three read-only/output roots are independent; none is a parent of the others:
- Variables:
inputs_dir – Per-patient source data root. The patient’s input directory is inputs_dir / <patient_id>.
assets_dir – Cohort-shared, read-only assets (templates, atlases, ROIs, reference images). Anchors DataFile/DataDir lookups.
output_dir – Root for all workflow outputs/derivatives. The patient’s output directory is output_dir / <patient_id>.
scratch_dir – Optional temporary/scratch directory override. When unset, scratch defaults to output_dir / <patient_id> / temp.
log_dir – Directory for runtime logs.
scripts_dir – Optional directory of user-supplied workflow scripts (*.py).
- Parameters:
data (Any)
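The per-patient layout described above can be sketched with pathlib. The patient_dirs helper is a hypothetical name, the example roots are made up, and treating an explicit scratch_dir as used verbatim is an assumption.

```python
from pathlib import Path
from typing import Dict, Optional


def patient_dirs(inputs_dir: Path, output_dir: Path, patient_id: str,
                 scratch_dir: Optional[Path] = None) -> Dict[str, Path]:
    """Derive per-patient directories from the independent roots (illustrative)."""
    patient_output = output_dir / patient_id
    return {
        "input": inputs_dir / patient_id,
        "output": patient_output,
        # Assumption: an explicit scratch_dir is used as given; otherwise the
        # documented default output_dir / <patient_id> / temp applies.
        "scratch": scratch_dir if scratch_dir is not None else patient_output / "temp",
    }


# Hypothetical roots, for illustration only.
dirs = patient_dirs(Path("/data/inputs"), Path("/data/derivatives"), "DTI_LDF001")
print(dirs["input"])
print(dirs["scratch"])
```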
- class thesis.core.config.validators.HardwareConfig
Bases: BaseConfig
Configuration for hardware and computational resources.
- Variables:
threads – CPU threads available to workflow execution.
memory_gb – Memory budget in gigabytes.
gpu_enabled – Whether GPU-aware execution is enabled.
gpu_device – Optional GPU device index.
n_gpu_procs – Number of scheduler GPU worker slots.
n_gpus – Physical GPUs exposed per worker slot.
- Parameters:
data (Any)
- class thesis.core.config.validators.AtlasConfig
Bases: BaseConfig
Atlas workflow configuration.
- Parameters:
data (Any)
- normalization_method: NormalizationMethod
- class thesis.core.config.validators.AtlasQCConfig
Bases: BaseConfig
Configuration for atlas workflow QC outputs.
- Variables:
enabled – Whether atlas QC generation is enabled.
generate_group_plots – Whether cohort-level atlas summary plots are requested.
generate_patient_reports – Whether per-patient atlas QC outputs are requested.
outlier_sd_threshold – Number of standard deviations from the cohort mean used to flag atlas-derived outliers.
subject_density_threshold – Minimum normalized streamline density for a subject voxel to be considered present.
group_core_threshold – Minimum proportion of subjects that must have a valid streamline (above subject_density_threshold) for a voxel to be part of the core.
leave_one_out – Whether to compute leave-one-out statistics for each subject.
leave_one_out_min_subjects – Minimum number of subjects required to run leave-one-out validation.
compute_cv – Whether to compute Coefficient of Variation (CV) maps.
cv_mean_threshold – Fraction of atlas mean maximum required to compute CV.
compute_log_stats – Whether to compute statistics in log space.
log_offset – Small offset added to zero values before log transformation.
- Parameters:
data (Any)
- class thesis.core.config.validators.S3Config
Bases: BaseConfig
Configuration for S3 data download.
- Variables:
enabled – Whether S3 download is enabled.
bucket – S3 bucket name.
region – AWS region.
prefix – Bucket prefix (e.g., ‘HCP_1200’).
cache_policy – Download behavior when files exist locally.
max_retries – Maximum retry attempts for failed downloads.
retry_backoff – Exponential backoff multiplier for retries.
required_patterns – File patterns that must be downloaded.
optional_patterns – File patterns that should be downloaded if available.
- Parameters:
data (Any)
- class thesis.core.config.validators.NipypeConfig
Bases: BaseConfig
Configuration for Nipype workflow execution.
- Variables:
working_dir – Base working directory.
crash_dir – Optional crash dump directory.
plugin – Nipype execution plugin.
plugin_args – Plugin arguments passed to workflow.run.
stop_on_first_crash – Whether execution stops after the first crash.
remove_unnecessary_outputs – Whether intermediate outputs are cleaned.
keep_inputs – Whether node input files remain in the working directory.
hash_method – Caching hash strategy.
use_profiler – Whether the Nipype profiler is enabled.
- Parameters:
data (Any)
- class thesis.core.config.validators.PreprocessingConfig
Bases: BaseConfig
Configuration for preprocessing pipeline.
- Variables:
denoise – Whether denoising is enabled.
bias_correction – Whether bias correction is enabled.
brain_extraction – Whether brain extraction is enabled.
brain_extraction_method – Selected extraction backend.
motion_correction – Whether motion correction is enabled.
eddy_correction – Whether eddy-current correction is enabled.
topup – Whether TOPUP distortion correction is enabled.
acq_params – Optional TOPUP acquisition parameters.
- Parameters:
data (Any)
- class thesis.core.config.validators.FireantsRegistrationConfig
Bases: BaseConfig
Configuration for the FireANTs registration backend.
- Variables:
device – Torch device string used by FireANTs.
scales – Multi-resolution registration scales.
affine_iterations – Iterations per scale for affine registration.
deformable_iterations – Iterations per scale for deformable registration.
optimizer – FireANTs optimizer name.
affine_lr – Learning rate for affine registration.
deformable_lr – Learning rate for deformable registration.
cc_kernel_size – Cross-correlation kernel size.
deformation_type – FireANTs deformation model.
dtype – Torch dtype name.
loss_type – Similarity metric for the staged registration (cc/mi/mse/fusedcc/fusedmi); auto-fused on CUDA.
normalize – Min-max normalize intensities to [0, 1] before registration.
- Parameters:
data (Any)
- class thesis.core.config.validators.RegistrationViewerConfig
Bases: BaseConfig
Configuration for registration QC viewer launching.
- Variables:
enabled – Whether viewer launching is enabled.
backend – Viewer backend name.
auto_open – Whether the viewer should be launched automatically.
overlay_opacity – Default overlay opacity for the transformed patient image.
- Parameters:
data (Any)
- class thesis.core.config.validators.RegistrationJobConfig
Bases: BaseConfig
Configuration for a single named registration job.
A job overrides the shared RegistrationConfig defaults for one patient-to-template registration. All override fields default to None (meaning “inherit the shared value”); only name is required. The fireants field, when set, is a sparse dict merged onto registration.fireants via model_copy(update=...) (which re-validates).
- Variables:
name – Unique job name used in node/output naming and referenced by transforms.jobs[*].from_registration.
method – Optional registration-backend override.
moving_modality – Optional moving-image modality override.
moving_image – Optional explicit moving-image path override.
fixed_image – Optional fixed-image path override.
interpolation – Optional interpolation-mode override.
metric – Optional similarity-metric override.
transform_type – Optional transform-family override.
use_float – Optional float-precision override.
fireants – Optional sparse FireANTs override block.
- Parameters:
data (Any)
- classmethod validate_method(v)
Validate the per-job registration method (None inherits the shared value).
- classmethod validate_moving_modality(v)
Validate the per-job moving-image modality (None inherits the shared value).
- classmethod validate_interpolation(v)
Validate the per-job interpolation method (None inherits the shared value).
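The “None inherits the shared value” rule can be sketched with dataclasses. This is a simplified stand-in, not the Pydantic models: SharedDefaults, JobOverride, resolve_job, and every default value shown are hypothetical, and only three of the documented override fields are modelled.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class SharedDefaults:
    # Hypothetical subset of the shared registration fields, with made-up defaults.
    method: str = "ants"
    interpolation: str = "Linear"
    metric: str = "MI"


@dataclass(frozen=True)
class JobOverride:
    name: str                             # the only required field
    method: Optional[str] = None          # None means "inherit the shared value"
    interpolation: Optional[str] = None
    metric: Optional[str] = None


def resolve_job(shared: SharedDefaults, job: JobOverride) -> SharedDefaults:
    """Apply the job's non-None fields on top of the shared defaults."""
    updates = {
        f.name: getattr(job, f.name)
        for f in fields(shared)
        if getattr(job, f.name) is not None
    }
    return replace(shared, **updates)


resolved = resolve_job(SharedDefaults(), JobOverride(name="t1_to_template", method="fireants"))
print(resolved)
```

Keeping every override optional lets a job block stay sparse: only the fields that differ from the shared settings need to appear in the YAML.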
- class thesis.core.config.validators.RegistrationConfig
Bases: BaseConfig
Configuration for image registration.
- Variables:
enabled – Whether the registration workflow is enabled.
method – Registration backend.
moving_modality – Structural modality used as the moving image.
moving_image – Optional explicit moving-image override.
fixed_image – Template-space fixed image.
interpolation – Interpolation mode.
metric – Similarity metric.
transform_type – Transform family.
use_float – Whether float precision is used.
collapse_output_transforms – Whether ANTs should collapse output transforms.
write_composite_transform – Whether ANTs should emit composite transforms.
output_subdir – Output subdirectory under the patient output dir.
fireants – FireANTs backend settings.
viewer – Registration QC viewer settings.
- Parameters:
data (Any)
- fireants: FireantsRegistrationConfig
- viewer: RegistrationViewerConfig
- jobs: List[RegistrationJobConfig]
- class thesis.core.config.validators.SegmentationConfig
Bases: BaseConfig
Configuration for segmentation.
- Variables:
method – Segmentation backend.
tissue_types – Tissue classes to segment.
create_masks – Whether binary masks are emitted.
labels – Optional label-name to integer mapping.
- Parameters:
data (Any)
- class thesis.core.config.validators.AtlasSourceConfig
Bases: BaseConfig
Configuration for one atlas label-map source.
- Variables:
name – Unique source name.
roi_file – Template-space atlas image path.
transform – Optional named transform to use for this source.
label_file – Optional label CSV path.
waypoint_labels – ROI extraction mapping for this source.
- Parameters:
data (Any)
- class thesis.core.config.validators.AtlasTransformConfig
Bases: BaseConfig
Named atlas-to-patient transform configuration.
- Variables:
template_to_patient – One transform path or an ordered transform chain.
reference_image – Patient-space reference image for resampling.
- Parameters:
data (Any)
- class thesis.core.config.validators.TransformJobConfig
Bases: BaseConfig
Configuration for a single image transformation job.
A job maps a set of input images to a transformed output using a named transform direction from TransformsConfig. Multiple jobs can be defined to process different sets of images (e.g. mean maps, std maps, probability maps) in a single workflow run.
- Variables:
name – Unique job name used in output filenames and Nipype node names.
input_files – Explicit paths to the images to transform. Each path may contain {patient_id}, which is substituted at runtime.
direction – Transform direction: template_to_patient uses transforms.template_to_patient / transforms.reference_image; patient_to_template uses transforms.patient_to_template / transforms.template_reference_image.
interpolation – ANTs interpolation method applied to every input image.
output_subdir – Subdirectory under the patient output directory where transformed images are written.
- Parameters:
data (Any)
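The {patient_id} substitution and the direction-to-field mapping can be sketched as follows. The helper names and all paths are hypothetical; only the two direction names and the four transforms keys come from the description above.

```python
from typing import Dict, List, Tuple

# Each direction reads one transform key and one reference-image key
# from the transforms section, as described above.
DIRECTION_KEYS: Dict[str, Tuple[str, str]] = {
    "template_to_patient": ("template_to_patient", "reference_image"),
    "patient_to_template": ("patient_to_template", "template_reference_image"),
}


def expand_inputs(input_files: List[str], patient_id: str) -> List[str]:
    """Substitute the {patient_id} placeholder in each input path."""
    return [path.replace("{patient_id}", patient_id) for path in input_files]


def transform_and_reference(transforms: Dict[str, str], direction: str) -> Tuple[str, str]:
    """Pick the transform and reference image used by a direction."""
    if direction not in DIRECTION_KEYS:
        raise ValueError(f"unknown direction: {direction!r}")
    transform_key, reference_key = DIRECTION_KEYS[direction]
    return transforms[transform_key], transforms[reference_key]


# Hypothetical paths, for illustration only.
transforms = {
    "template_to_patient": "xfm/template_to_patient.h5",
    "reference_image": "T1w/T1w_brain.nii.gz",
    "patient_to_template": "xfm/patient_to_template.h5",
    "template_reference_image": "templates/template_brain.nii.gz",
}
print(expand_inputs(["atlas/{patient_id}_mean.nii.gz"], "DTI_LDF001"))
print(transform_and_reference(transforms, "patient_to_template"))
```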
- class thesis.core.config.validators.TransformsConfig
Bases: BaseConfig
Configuration for pre-computed transforms.
- Variables:
base_dir – Optional base directory prepended to relative transform paths.
patient_to_template – Patient-to-template warp path.
template_to_patient – Template-to-patient transform or transform chain.
reference_image – Patient-space reference image.
template_reference_image – Template-space reference image.
atlas_transforms – Named atlas-specific transform configurations.
jobs – Transform jobs executed by the standalone transform workflow.
- Parameters:
data (Any)
- atlas_transforms: Dict[str, AtlasTransformConfig]
- jobs: List[TransformJobConfig]
- class thesis.core.config.validators.SynthSegConfig
Bases: BaseConfig
Configuration for standalone or embedded SynthSeg execution.
- Variables:
t1_image – Optional explicit T1-weighted input path.
parc – Whether cortical parcellation output is requested.
robust – Whether robust mode is enabled.
fast – Whether fast mode is enabled.
vol – Whether a volumes CSV is written.
qc – Whether a QC CSV is written.
crop – Optional crop size.
cpu – Whether CPU mode is forced.
threads – CPU thread count used in CPU mode.
- Parameters:
data (Any)
- class thesis.core.config.validators.TractographyConfig
Bases: BaseConfig
Configuration for tractography.
- Variables:
method – Tractography backend.
run_tractography – Optional workflow-level execution toggle.
n_samples – Streamlines per seed voxel.
n_steps – Maximum steps per streamline.
step_length – Step length in millimetres.
curvature_threshold – Maximum curvature threshold.
hemisphere – Which hemisphere(s) to run tractography for.
atlas_sources – Optional atlas ROI source list.
roi_labels – Legacy single-atlas ROI configuration.
synthseg_roi_labels – Optional SynthSeg-derived ROI configuration.
force_dir – Whether ProbTrackX2 uses --forcedir.
opd – Whether ProbTrackX2 writes path distribution outputs.
- Parameters:
data (Any)
- atlas_sources: List[AtlasSourceConfig] | None
- classmethod warn_unprefixed_gpu_runtime_env(v)
Warn on keys that will not survive singularity exec --cleanenv.
Only SINGULARITYENV_/APPTAINERENV_-prefixed variables are forwarded into the container past --cleanenv. A bare key (e.g. LD_PRELOAD) would be set in the wrapper’s host shell and then silently stripped, so it would never reach probtrackx2: a confusing no-op we surface here rather than letting it fail at runtime.
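A minimal sketch of the prefix check, assuming the validated value is a plain mapping of environment variable names to values. The function names and the example paths are hypothetical; the actual validator may differ in what it inspects and how it reports.

```python
import warnings
from typing import Dict, List

# Only variables with these prefixes are forwarded past --cleanenv.
FORWARDED_PREFIXES = ("SINGULARITYENV_", "APPTAINERENV_")


def find_unprefixed_keys(env: Dict[str, str]) -> List[str]:
    """Return keys that would be stripped before reaching the container."""
    return [key for key in env if not key.startswith(FORWARDED_PREFIXES)]


def warn_unprefixed(env: Dict[str, str]) -> Dict[str, str]:
    """Warn once per bare key and return the mapping unchanged."""
    for key in find_unprefixed_keys(env):
        warnings.warn(
            f"{key!r} lacks a SINGULARITYENV_/APPTAINERENV_ prefix and will be "
            "dropped by 'singularity exec --cleanenv'",
            UserWarning,
            stacklevel=2,
        )
    return env


# Hypothetical values, for illustration only.
runtime_env = {
    "SINGULARITYENV_LD_PRELOAD": "/opt/lib/shim.so",
    "LD_PRELOAD": "/opt/lib/shim.so",
}
print(find_unprefixed_keys(runtime_env))
```

Warning rather than raising keeps existing configs loadable while still flagging settings that would have no effect inside the container.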
- class thesis.core.config.validators.ValidationConfig
Bases: BaseConfig
Configuration for post-processing validation checks.
- Variables:
check_rois – Whether warped ROI validation is enabled.
min_voxels – Minimum non-zero voxel count per ROI.
singularity_threshold – Minimum affine determinant magnitude.
volume_ratio_min – Lower acceptable warped-to-original voxel ratio.
volume_ratio_max – Upper acceptable warped-to-original voxel ratio.
- Parameters:
data (Any)
- class thesis.core.config.validators.QCConfig
Bases: BaseConfig
Configuration for QC visualisation outputs.
Controls automatic generation of quality-control overlay images (ROI placement on anatomical backgrounds, track density maps on template images) after workflow execution, as well as extended QC checks (SynthSeg quality, outlier detection, etc.).
- Variables:
generate_overlays – Whether to produce ROI overlay PNGs at the end of the HCP workflow.
track_density_thresholds – Percentile thresholds used when rendering track density figures (applied to non-zero voxels of fdt_paths.nii.gz).
synthseg_qc_threshold – Minimum acceptable SynthSeg QC score. Subjects below this threshold are flagged.
outlier_sd_threshold – Number of standard deviations from the batch mean to flag a subject as an outlier.
- Parameters:
data (Any)
- class thesis.core.config.validators.OutputSettingsConfig
Bases: BaseConfig
Configuration for CLI output behavior (YAML-level defaults).
CLI flags -v, -q, --summary, --no-progress override values set here at runtime.
- Variables:
verbosity – Default verbosity level ("quiet", "normal", "verbose").
summary – Summary detail ("off", "compact", "full").
progress – Whether to show progress bars/spinners. "auto" enables progress for TTY and disables for pipes/CI.
output_format – Output format ("human", "json").
- Parameters:
data (Any)
- class thesis.core.config.validators.SideThresholdConfig
Bases: BaseConfig
Binarisation threshold for one side (subject or atlas) of a tract comparison.
- Variables:
mode – "fraction" applies value * max(volume) as the cutoff; "absolute" uses value directly as a raw voxel-intensity cutoff.
value – Threshold value. Must be in (0, 1) when mode="fraction"; any positive number when mode="absolute".
- Parameters:
data (Any)
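The two modes can be sketched on a flattened volume. The helper names are hypothetical, and treating voxels equal to the cutoff as inside the mask (>=) is an assumption; the source does not say whether the comparison is inclusive.

```python
from typing import List


def threshold_cutoff(volume: List[float], mode: str, value: float) -> float:
    """Compute the voxel-intensity cutoff for one side of the comparison."""
    if mode == "fraction":
        if not 0.0 < value < 1.0:
            raise ValueError("value must lie in (0, 1) when mode='fraction'")
        return value * max(volume)
    if mode == "absolute":
        if value <= 0:
            raise ValueError("value must be positive when mode='absolute'")
        return value
    raise ValueError(f"unknown mode: {mode!r}")


def binarise(volume: List[float], mode: str, value: float) -> List[int]:
    """Binarise a volume; the inclusive comparison is an assumption."""
    cutoff = threshold_cutoff(volume, mode, value)
    return [1 if v >= cutoff else 0 for v in volume]


density = [0.0, 2.0, 5.0, 10.0]
print(binarise(density, "fraction", 0.5))
print(binarise(density, "absolute", 6.0))
```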
- class thesis.core.config.validators.HcpLooConfig
Bases: BaseConfig
Configuration for the tract_similarity_hcp_loo workflow.
Controls the per-HCP-subject leave-one-out comparison against the cohort atlas. LOO is mathematically valid at N=2 (the LOO atlas becomes the other subject’s volume), but a floor of 3 enforces a meaningful cohort reference.
- Variables:
minimum_subjects – Minimum cohort size required to run the workflow.
write_volumes – If True, write the four NIfTI volumes per subject (subject_normalized.nii.gz, atlas_normalized.nii.gz, subject_mask.nii.gz, atlas_mask.nii.gz) alongside metrics.json. If False, only metrics.json is written.
learned_prediction_relpath – Optional relative path under each subject directory to a learned-atlas per-subject prediction NIfTI (template space). Empty (default) disables the learned-template 3rd arm.
learned_support_threshold – Density threshold defining the held-out subject’s true support for the non-circular learned-arm metrics.
- Parameters:
data (Any)
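The leave-one-out construction can be sketched on flattened volumes, assuming the cohort atlas is a voxel-wise mean of the remaining subjects. Both function names are hypothetical, and the default floor of 3 follows the description above rather than a confirmed field default.

```python
from typing import List


def leave_one_out_mean(volumes: List[List[float]], held_out: int) -> List[float]:
    """Voxel-wise mean over every subject except the held-out one."""
    others = [v for i, v in enumerate(volumes) if i != held_out]
    if not others:
        raise ValueError("need at least two subjects")
    n_voxels = len(others[0])
    return [sum(v[k] for v in others) / len(others) for k in range(n_voxels)]


def loo_atlases(volumes: List[List[float]], minimum_subjects: int = 3) -> List[List[float]]:
    """Build one LOO atlas per subject, enforcing the cohort-size floor."""
    if len(volumes) < minimum_subjects:
        raise ValueError(f"cohort of {len(volumes)} is below the floor of {minimum_subjects}")
    return [leave_one_out_mean(volumes, i) for i in range(len(volumes))]


# With N=2 the LOO atlas is simply the other subject's volume.
print(leave_one_out_mean([[1.0, 0.0], [3.0, 2.0]], held_out=0))
print(loo_atlases([[0.0], [2.0], [4.0]]))
```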
- class thesis.core.config.validators.TractSimilarityConfig
Bases: BaseConfig
Configuration for the tract_similarity analysis workflow.
Controls the per-patient comparison between the native-space probtrackx2 tract density and the warped cohort mean atlas, plus the cohort-level aggregation of those per-patient metrics.
- Variables:
fdt_relpath – Relative path under the patient output directory to the native-space fdt_paths.nii.gz.
waytotal_relpath – Relative path to the waytotal text file.
atlas_relpath – Relative path (or glob pattern) under the patient output directory that points at the warped atlas mean volume.
subject_threshold – Binarisation threshold applied to the subject’s probtrackx2 volume before overlap / distance metrics.
atlas_threshold – Binarisation threshold applied to the warped atlas volume before overlap / distance metrics.
n_bins – Histogram bin count for normalised mutual information.
output_subdir – Subdirectory under the patient output directory where metrics.json is written.
cohort_output_subdir – Subdirectory under the cohort output directory where the aggregated summary is written.
- Parameters:
data (Any)
- subject_threshold: SideThresholdConfig
- atlas_threshold: SideThresholdConfig
- hcp_loo: HcpLooConfig
- class thesis.core.config.validators.ThresholdGridConfig
Bases: BaseConfig
Grid specification for one binarisation-threshold axis of the sweep.
Provide either start/stop/step (inclusive endpoint within float tolerance) or an explicit values list. All entries must lie in (0, 1) because the sweep operates in "fraction" mode.
- Parameters:
data (Any)
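One way to expand such a grid with an inclusive endpoint is sketched below. The grid_values helper, the tolerance of 1e-9, and the rounding to 10 decimals are assumptions made for illustration; the actual model may resolve the grid differently.

```python
from typing import List, Optional, Sequence


def grid_values(start: Optional[float] = None, stop: Optional[float] = None,
                step: Optional[float] = None,
                values: Optional[Sequence[float]] = None,
                tol: float = 1e-9) -> List[float]:
    """Expand a threshold grid from start/stop/step or an explicit list."""
    if values is not None:
        grid = list(values)
    else:
        if start is None or stop is None or step is None or step <= 0:
            raise ValueError("provide values, or start/stop/step with step > 0")
        # The tolerance keeps the endpoint when float round-off leaves the
        # step count just below an integer (e.g. 0.3 / 0.1).
        n_steps = int((stop - start) / step + tol)
        grid = [round(start + i * step, 10) for i in range(n_steps + 1)]
    if any(not 0.0 < v < 1.0 for v in grid):
        raise ValueError("all grid entries must lie in (0, 1)")
    return grid


print(grid_values(start=0.1, stop=0.5, step=0.1))
print(grid_values(values=[0.05, 0.25, 0.75]))
```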
- class thesis.core.config.validators.TractSimilaritySweepConfig
Bases: BaseConfig
Configuration for the tract_similarity_sweep cohort grid search.
Sweeps the subject_threshold.value and atlas_threshold.value knobs (both in "fraction" mode) and reports the cell that maximises the chosen aggregation of Dice across the cohort.
- Parameters:
data (Any)
- subject_threshold_grid: ThresholdGridConfig
- atlas_threshold_grid: ThresholdGridConfig
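The grid search can be sketched on toy data. Several simplifications are assumptions: volumes are flat lists, one atlas is shared by all subjects (the workflow compares each patient against its own warped atlas), the aggregation is the mean, the comparison with the cutoff is inclusive, and Dice of two empty masks is defined as 0.0. All function names are hypothetical.

```python
from itertools import product
from statistics import mean
from typing import Callable, List, Optional, Sequence, Tuple


def fraction_mask(volume: Sequence[float], fraction: float) -> List[int]:
    """Binarise in "fraction" mode: cutoff = fraction * max(volume)."""
    cutoff = fraction * max(volume)
    return [1 if v > 0 and v >= cutoff else 0 for v in volume]


def dice(a: Sequence[int], b: Sequence[int]) -> float:
    """Dice overlap of two binary masks (0.0 when both are empty, by assumption)."""
    total = sum(a) + sum(b)
    return 2.0 * sum(x * y for x, y in zip(a, b)) / total if total else 0.0


def sweep(subjects: Sequence[Sequence[float]], atlas: Sequence[float],
          subject_grid: Sequence[float], atlas_grid: Sequence[float],
          aggregate: Callable[[List[float]], float] = mean) -> Tuple[float, float, float]:
    """Return (score, subject_threshold, atlas_threshold) for the best cell."""
    best: Optional[Tuple[float, float, float]] = None
    for s_thr, a_thr in product(subject_grid, atlas_grid):
        atlas_mask = fraction_mask(atlas, a_thr)
        score = aggregate([dice(fraction_mask(s, s_thr), atlas_mask) for s in subjects])
        if best is None or score > best[0]:
            best = (score, s_thr, a_thr)
    if best is None:
        raise ValueError("both grids must be non-empty")
    return best


# Toy cohort, for illustration only.
subjects = [[1.0, 1.0, 8.0, 10.0], [0.0, 2.0, 9.0, 10.0]]
atlas = [0.0, 0.1, 0.9, 1.0]
print(sweep(subjects, atlas, [0.05, 0.5], [0.05, 0.5]))
```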
- class thesis.core.config.validators.HCPConfig
Bases: BaseConfig
Configuration for HCP preprocessed data.
- Variables:
diffusion_dir – Diffusion subdirectory under the subject input directory.
bedpostx_dir – BedpostX output directory.
t1_image – T1-weighted image path.
t2_image – Optional T2-weighted image path.
n_fibers – Number of BedpostX fibres.
mask_name – Default diffusion brain-mask filename.
mask_path – Optional full mask-path override.
- Parameters:
data (Any)
- class thesis.core.config.validators.PipelineConfig
Bases: BaseModel
Root configuration model for thesis pipelines.
Aggregates all sub-configurations and provides convenience methods.
- Variables:
paths – Filesystem paths.
hardware – Compute resources.
atlas – Atlas generation settings.
s3 – S3 data download settings (optional).
preprocessing – Generic preprocessing steps.
preprocess – Workflow-specific raw-to-HCP preprocessing settings.
registration – Image registration.
segmentation – Brain segmentation.
tractography – Tractography parameters.
hcp – HCP-specific overrides.
transforms – Pre-computed transform settings.
nipype – Nipype execution settings.
validation – ROI validation settings.
qc – QC generation settings.
atlas_qc – Atlas QC generation settings.
synthseg – Optional SynthSeg-specific overrides.
output – CLI output defaults.
patient_id – Optional patient identifier.
protocol – Optional protocol name.
- Parameters:
data (Any)
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow'}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- paths: PathConfig
- hardware: HardwareConfig
- atlas: AtlasConfig
- preprocessing: PreprocessingConfig
- registration: RegistrationConfig
- segmentation: SegmentationConfig
- tractography: TractographyConfig
- transforms: TransformsConfig
- nipype: NipypeConfig
- slurm: SLURMConfig
- validation: ValidationConfig
- atlas_qc: AtlasQCConfig
- synthseg: SynthSegConfig | None
- output: OutputSettingsConfig
- tract_similarity: TractSimilarityConfig
- tract_similarity_sweep: TractSimilaritySweepConfig
- classmethod from_dict(config_dict)
Create a PipelineConfig from a dictionary.
Convenience wrapper around model_validate that provides a domain-specific name and ensures the return type is narrowed.
- Parameters:
config_dict (Dict[str, Any]) – Dictionary of configuration values
- Return type:
PipelineConfig
- Returns:
Validated PipelineConfig object
Example
>>> config_dict = {"hardware": {"threads": 8}}
>>> config = PipelineConfig.from_dict(config_dict)
- merge_with(other)
Merge this config with another config or dict.
- Parameters:
other (Union[PipelineConfig, Dict[str, Any]]) – Another PipelineConfig or dict to merge
- Return type:
PipelineConfig
- Returns:
New PipelineConfig with merged values
Example
>>> merged = config.merge_with({"hardware": {"threads": 16}})
thesis.core.config.loaders
YAML configuration loaders and utilities.
- thesis.core.config.loaders.load_yaml(file_path)
Load a YAML configuration file.
- Parameters:
- Return type:
- Returns:
Dictionary with config data
- Raises:
FileNotFoundError – If file doesn’t exist
ValueError – If YAML is invalid
Example
>>> config = load_yaml("./config/default.yaml")
>>> print(config["preprocessing"]["threads"])
- thesis.core.config.loaders.save_yaml(config, file_path)
Save a configuration to a YAML file.
- Parameters:
- Return type:
Example
>>> config = {"threads": 4}
>>> save_yaml(config, "./my_config.yaml")
- thesis.core.config.loaders.merge_configs(*configs)
Merge multiple configurations with later configs overriding earlier ones.
Deep merges dictionaries, with later configs taking precedence.
- Parameters:
*configs (dict[str, object]) – Variable number of config dictionaries to merge
- Return type:
- Returns:
Merged dictionary
Example
>>> base = {"a": 1, "b": {"x": 2}}
>>> override = {"b": {"y": 3}, "c": 4}
>>> merged = merge_configs(base, override)
>>> print(merged["b"])  # {"x": 2, "y": 3}
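A recursive deep merge with this behaviour might look like the sketch below. It is not the merge_configs source: deep_merge is a hypothetical name, and replacing non-dict values (including lists) wholesale instead of concatenating them is an assumption.

```python
from copy import deepcopy
from typing import Any, Dict


def deep_merge(*configs: Dict[str, Any]) -> Dict[str, Any]:
    """Merge dictionaries left to right; later values override earlier ones."""
    merged: Dict[str, Any] = {}
    for config in configs:
        for key, value in config.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Both sides are dicts: merge their keys recursively.
                merged[key] = deep_merge(merged[key], value)
            else:
                # Scalars, lists, and type mismatches: the later value wins.
                merged[key] = deepcopy(value)
    return merged


# Hypothetical layers in hierarchy order: default, hardware, runtime overrides.
default = {"hardware": {"threads": 4, "memory_gb": 16}, "qc": {"generate_overlays": True}}
hardware = {"hardware": {"threads": 16}}
runtime = {"qc": {"generate_overlays": False}}
print(deep_merge(default, hardware, runtime))
```

Copying values on assignment means the input layers are never mutated, so the same default dictionary can be reused across patients.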