`preprocessing` and `preprocess` — preprocessing knobs#

The framework has two related top-level keys:

preprocessing — PreprocessingConfig (generic toggles, validated)
preprocess — Dict[str, Any] (workflow-specific config for the preprocess workflow, loosely typed to avoid a model-import cycle)

The preprocess workflow reads both: high-level toggles from preprocessing and detailed step config from preprocess. The detailed nested models live in src/thesis/workflows/preprocess/config.py.

`preprocessing` — generic toggles#

Schema: PreprocessingConfig in src/thesis/core/config/validators.py.

Field	Type	Default	Constraints	Description
`denoise`	`bool`	`True`	—	Apply denoising before further preprocessing.
`bias_correction`	`bool`	`True`	—	Apply N4-style bias-field correction.
`brain_extraction`	`bool`	`True`	—	Perform brain extraction on structural images.
`brain_extraction_method`	`str`	`"synthstrip"`	one of `bet` / `synthstrip` / `ants`	Brain-extraction backend.
`motion_correction`	`bool`	`True`	—	Apply motion correction.
`eddy_correction`	`bool`	`True`	—	Apply FSL eddy-current correction.
`topup`	`bool`	`True`	—	Run FSL TOPUP for susceptibility-distortion correction.
`acq_params`	`Dict[str, Any] | None`	`None`	—	Optional acquisition parameters passed to TOPUP. The typed equivalent is the workflow-specific `preprocess.acq_params` block described below.
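The table above can be mirrored in a few lines of plain Python. This is a minimal sketch using a standard-library dataclass rather than the project's actual PreprocessingConfig; the class name `PreprocessingToggles` and the error message are hypothetical, and only the fields, defaults, and the allowed `brain_extraction_method` values come from the table.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

# Allowed backends, copied from the Constraints column of the table.
ALLOWED_METHODS = ("bet", "synthstrip", "ants")


@dataclass
class PreprocessingToggles:
    """Hypothetical stand-in for PreprocessingConfig (not the project's class)."""

    denoise: bool = True
    bias_correction: bool = True
    brain_extraction: bool = True
    brain_extraction_method: str = "synthstrip"
    motion_correction: bool = True
    eddy_correction: bool = True
    topup: bool = True
    acq_params: Optional[Dict[str, Any]] = None

    def __post_init__(self) -> None:
        # Reject any backend that is not listed in the table.
        if self.brain_extraction_method not in ALLOWED_METHODS:
            raise ValueError(
                f"brain_extraction_method must be one of {ALLOWED_METHODS}, "
                f"got {self.brain_extraction_method!r}"
            )


# Override two toggles; everything else keeps its documented default.
toggles = PreprocessingToggles(brain_extraction_method="bet", topup=False)
```

Unset fields fall back to the defaults from the table, so a config only needs to list the toggles it changes.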

`preprocess` — workflow-specific blocks#

The preprocess key holds a free-form dict. Each subkey corresponds to a typed Pydantic model in src/thesis/workflows/preprocess/config.py. Use these models as a reference when authoring config/preprocess.yaml.

`preprocess.acq_params` — `AcqParamsConfig`#

Field	Type	Description
`bandwidth`	`float`	Effective readout bandwidth (Hz/Px).
`phase_encoding_dirs`	`int`	Number of phase-encoding lines; used to derive the readout time.

create_acqparams_file derives the readout time from bandwidth and phase_encoding_dirs and emits fixed AP/PA encoding rows, so the previous readout_time, echo_spacing, phase_encoding_direction_ap, and phase_encoding_direction_pa fields have been removed.
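To make the derivation concrete, here is a rough sketch of how a readout time and the two fixed encoding rows could be produced. It is not the project's create_acqparams_file: the formula `(lines - 1) / bandwidth`, the row sign convention (AP as `0 -1 0`, PA as `0 1 0`), and the function names are all assumptions chosen for illustration, so check the real implementation before relying on the numbers.

```python
from typing import List


def readout_time(bandwidth: float, phase_encoding_dirs: int) -> float:
    """Approximate total readout time in seconds.

    ASSUMPTION: echo spacing is taken as 1 / bandwidth and the readout spans
    (phase_encoding_dirs - 1) echo spacings. The project may use another rule.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if phase_encoding_dirs < 1:
        raise ValueError("phase_encoding_dirs must be at least 1")
    return (phase_encoding_dirs - 1) / bandwidth


def acqparams_rows(bandwidth: float, phase_encoding_dirs: int) -> List[str]:
    """Return one TOPUP-style row per encoding direction (AP first, then PA)."""
    rt = readout_time(bandwidth, phase_encoding_dirs)
    # ASSUMPTION: AP is encoded as "0 -1 0" and PA as "0 1 0".
    return [f"0 -1 0 {rt:.6f}", f"0 1 0 {rt:.6f}"]


rows = acqparams_rows(bandwidth=1923.077, phase_encoding_dirs=96)
```

Because both rows share the same readout time, only `bandwidth` and `phase_encoding_dirs` need to be configured, which matches the reduced field set in the table above.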

`preprocess.bet` — `BetConfig`#

Brain-extraction (FSL BET) options. These apply only when preprocessing.brain_extraction_method is bet; the SynthStrip path ignores them.

Field	Type	Description
`frac_dwi` / `frac_t1` / `frac_t2`	`float`	Fractional intensity threshold per modality (0–1).
`robust`	`bool`	Robust brain-center estimation (BET `-R`).
`padding`	`bool`	Pad end slices to improve BET on small FOVs (BET `-Z`). Mutually exclusive with `robust`; when both are set, `robust` is used and `padding` is ignored.
`radius`	`int | None`	Head radius in mm (BET `-r`); `None` lets BET estimate it.
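The precedence rule between `robust` and `padding` is easiest to see as a flag-building function. The sketch below is illustrative only: `bet_flags` is a hypothetical helper, not part of the project, and it assumes the per-modality threshold is passed through BET's `-f` option.

```python
from typing import List, Optional


def bet_flags(
    frac: float,
    robust: bool = False,
    padding: bool = False,
    radius: Optional[int] = None,
) -> List[str]:
    """Translate BetConfig-style options into BET command-line flags."""
    if not 0.0 <= frac <= 1.0:
        raise ValueError("frac must lie in the range 0 to 1")
    flags = ["-f", str(frac)]
    if robust:
        # robust wins when both robust and padding are set.
        flags.append("-R")
    elif padding:
        flags.append("-Z")
    if radius is not None:
        # Only pass a head radius when one is configured; otherwise BET estimates it.
        flags += ["-r", str(radius)]
    return flags


# Both switches set: only -R is emitted, -Z is dropped.
flags = bet_flags(0.5, robust=True, padding=True, radius=80)
```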

`preprocess.bedpostx` — `BedpostXConfig`#

BedpostX fibre-orientation estimation options. Authoritative defaults live in src/thesis/workflows/preprocess/config.py (BedpostXConfig).

Field	Type	Default	Constraints	Description
`n_fibres`	`int`	`2`	`1 ≤ x ≤ 3`	Number of fibre populations to model per voxel.
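`model`	`int`	`1`	`1 ≤ x ≤ 3`	Deconvolution model, following the FSL BedpostX numbering (`1` = ball and sticks, `2` = ball and sticks with a range of diffusivities, intended for multi-shell data, `3` = ball and zeppelins).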
`burn_in`	`int`	`1000`	`≥ 0`	MCMC burn-in jumps.
`n_jumps`	`int`	`1250`	`≥ 1`	Total MCMC jumps.
`sample_every`	`int`	`25`	`≥ 1`	Record a sample every Nth jump.
`use_gpu`	`bool`	`False`	—	Use CUDA-enabled BedpostX.
`weight`	`float`	`1.0`	`≥ 0.0`	ARD weight (only used by `model: 3`).
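The MCMC fields interact: with one sample recorded every `sample_every` jumps, the number of retained samples is `n_jumps` divided by `sample_every`. The following sketch checks the documented constraints and computes that count; `check_bedpostx` is a hypothetical helper, not the project's validator.

```python
def check_bedpostx(
    n_fibres: int = 2,
    model: int = 1,
    burn_in: int = 1000,
    n_jumps: int = 1250,
    sample_every: int = 25,
    weight: float = 1.0,
) -> int:
    """Validate the documented constraints and return the retained sample count."""
    if not 1 <= n_fibres <= 3:
        raise ValueError("n_fibres must be between 1 and 3")
    if not 1 <= model <= 3:
        raise ValueError("model must be between 1 and 3")
    if burn_in < 0:
        raise ValueError("burn_in must be >= 0")
    if n_jumps < 1 or sample_every < 1:
        raise ValueError("n_jumps and sample_every must be >= 1")
    if weight < 0.0:
        raise ValueError("weight must be >= 0.0")
    # One sample is kept every `sample_every` jumps after burn-in.
    return n_jumps // sample_every


n_samples = check_bedpostx()  # defaults: 1250 jumps, one sample per 25 jumps
```

With the defaults this yields 50 samples, so raising `sample_every` without raising `n_jumps` reduces the number of samples kept.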

`preprocess.synthseg` — `PreprocessSynthSegConfig`#

Options for SynthSeg as it runs inside the preprocess workflow. This block is separate from the top-level synthseg block, which configures the standalone workflow. The example at the end of this page sets its `parc` and `robust` keys.

`preprocess` workflow-control toggles#

The preprocess dict also carries top-level boolean toggles (defined on PreprocessConfig in src/thesis/workflows/preprocess/config.py) that gate whole stages of the pipeline:

Key path	Type	Default	Description
`preprocess.run_topup`	`bool`	`True`	Run FSL TOPUP distortion correction.
`preprocess.run_eddy`	`bool`	`True`	Run FSL eddy-current correction.
`preprocess.run_dtifit`	`bool`	`True`	Run DTIFit tensor estimation.
`preprocess.run_bedpostx`	`bool`	`True`	Run BedpostX fibre-orientation estimation.
`preprocess.run_synthseg`	`bool`	`True`	Run SynthSeg segmentation.
`preprocess.run_coregistration`	`bool`	`True`	Run the intra-subject coregistration chain (`dwi_to_t1`, `t2_to_t1`, and MNI-label warps). This is the single authoritative toggle referenced by the `registration` callout; it is distinct from the top-level patient→template `registration` block. (Previously named `run_registration`.)
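Since every toggle defaults to `True`, a stage runs unless the config explicitly disables it. A minimal sketch of that lookup, assuming the preprocess block is available as a plain dict (`enabled_stages` is a hypothetical helper name):

```python
from typing import Any, Dict, List

# Toggle keys from the table above, in pipeline order as listed there.
STAGE_TOGGLES = (
    "run_topup",
    "run_eddy",
    "run_dtifit",
    "run_bedpostx",
    "run_synthseg",
    "run_coregistration",
)


def enabled_stages(preprocess: Dict[str, Any]) -> List[str]:
    """Return stage names whose toggle is on; missing keys default to True."""
    return [
        key[len("run_"):]
        for key in STAGE_TOGGLES
        if preprocess.get(key, True)
    ]


# Only BedpostX is switched off; all other stages keep their default.
stages = enabled_stages({"run_bedpostx": False})
```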

`preprocess.registration` — `RegistrationChainConfig`#

ANTs registration steps embedded in the preprocess pipeline. Each entry is a RegistrationStepConfig (transform type, metric, interpolation, etc.). The whole chain is gated by preprocess.run_coregistration (see the toggles table above):

dwi_to_t1 — intra-subject DWI→T1 (Rigid). Always built when preprocess.run_coregistration is enabled; its composite transform is exported as the t1_to_dwi_transform output.
t2_to_t1 — intra-subject T2→T1 (Rigid). Built when a T2 image is present.
t1_to_template — patient T1→template (SyN). Built when a template fixed image is configured via the top-level registration.fixed_image (reused from the registration workflow). This lets a patient that was not used to build the cohort template still be registered to it. Its outputs are exported on the preprocess outputnode as t1_to_template_transform (composite transform) and t1_to_template_warped (T1 resampled onto the template grid).
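The build conditions for the three steps can be summarized as a small decision function. This is a sketch of the rules as described here, not the workflow's actual construction code; `planned_steps` and its parameters are hypothetical, and it assumes that `t1_to_template` is gated by `run_coregistration` along with the rest of the chain.

```python
from typing import List, Optional


def planned_steps(
    run_coregistration: bool,
    has_t2: bool,
    fixed_image: Optional[str],
) -> List[str]:
    """List the registration steps that would be built for one subject."""
    if not run_coregistration:
        # The toggle gates the whole chain.
        return []
    steps = ["dwi_to_t1"]  # always built when the chain is enabled
    if has_t2:
        steps.append("t2_to_t1")
    if fixed_image:
        # A template is configured via registration.fixed_image.
        steps.append("t1_to_template")
    return steps


# Subject with a T2 image and a configured template (placeholder path).
steps = planned_steps(True, has_t2=True, fixed_image="template.nii.gz")
```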

`preprocess.label_transform` — `LabelTransformConfig`#

Warps MNI/atlas labels into subject space via the dwi_to_t1 composite transform (built only when transform_mni_labels is true and mni_labels_list is non-empty):

Field	Type	Description
`transform_mni_labels`	`bool`	Enable the per-label `warp_labels` MapNode.
`mni_labels_list`	`List[str]`	Label images to warp.
`interpolation`	`str`	ANTs interpolation (`NearestNeighbor` default for discrete labels; also `Linear`, `BSpline`, `MultiLabel`).
`use_inverse_warp`	`bool`	Invert the composite transform (mapped to ANTs `invert_transform_flags`).
`use_hcp_template`	`bool`	When true, warp labels onto the configured template (`registration.fixed_image`) grid; otherwise onto the DWI b0 grid.
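How these fields combine can be sketched as a function that either returns a warp plan or nothing. The function name, the returned dict layout, and the `"template"` / `"dwi_b0"` labels are assumptions for illustration; only the build condition and the field semantics come from the table.

```python
from typing import Any, Dict, List, Optional


def label_warp_plan(
    transform_mni_labels: bool,
    mni_labels_list: List[str],
    interpolation: str = "NearestNeighbor",
    use_inverse_warp: bool = False,
    use_hcp_template: bool = False,
) -> Optional[Dict[str, Any]]:
    """Describe the label warp, or return None when it would not be built."""
    # Built only when enabled AND at least one label image is listed.
    if not (transform_mni_labels and mni_labels_list):
        return None
    return {
        "labels": list(mni_labels_list),
        "interpolation": interpolation,
        # Mapped to ANTs invert_transform_flags for the composite transform.
        "invert_transform_flags": [use_inverse_warp],
        # Target grid: configured template or the DWI b0 image.
        "reference": "template" if use_hcp_template else "dwi_b0",
    }


# One label image (placeholder filename), inverse warp, default b0 grid.
plan = label_warp_plan(True, ["atlas_labels.nii.gz"], use_inverse_warp=True)
```

Keeping `NearestNeighbor` as the default avoids blending neighbouring label values, which is why it suits discrete label images.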

`preprocess.dtifit` — `DTIFitConfig`#

FSL dtifit tensor-estimation options. Authoritative defaults live in src/thesis/workflows/preprocess/config.py (DTIFitConfig).

Field	Type	Default	Description
`use_wls`	`bool`	`True`	Use weighted least-squares fitting.
`compute_kurt`	`bool`	`False`	Compute kurtosis (DKI) parameters.
`save_tensor`	`bool`	`True`	Save the full tensor components.
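Each of these fields corresponds to an on/off switch of the `dtifit` command line (`--wls`, `--kurt`, `--save_tensor`). A minimal sketch of that mapping, assuming the project passes them through unchanged (`dtifit_flags` is a hypothetical helper):

```python
from typing import List


def dtifit_flags(
    use_wls: bool = True,
    compute_kurt: bool = False,
    save_tensor: bool = True,
) -> List[str]:
    """Translate DTIFitConfig-style booleans into dtifit switches."""
    flags: List[str] = []
    if use_wls:
        flags.append("--wls")  # weighted least-squares fit
    if compute_kurt:
        flags.append("--kurt")  # kurtosis output
    if save_tensor:
        flags.append("--save_tensor")  # write the tensor elements
    return flags


default_flags = dtifit_flags()  # documented defaults
```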

Example#

preprocessing:
  denoise: true
  bias_correction: true
  brain_extraction: true
  brain_extraction_method: synthstrip
  motion_correction: true
  eddy_correction: true
  topup: true

preprocess:
  acq_params:
    bandwidth: 1923.077
    phase_encoding_dirs: 96
  bet:
    frac_t1: 0.5
    robust: true
    radius: 80
  bedpostx:
    n_fibres: 3
    model: 2
    burn_in: 1000
    n_jumps: 1250
    sample_every: 25
    use_gpu: true
  synthseg:
    parc: true
    robust: true
  dtifit:
    use_wls: true
    save_tensor: true
  run_coregistration: true

See config/preprocess.yaml for the shipped reference protocol.
