preprocessing and preprocess — preprocessing knobs

The framework has two related top-level keys:

  • preprocessing: PreprocessingConfig (generic toggles, validated)

  • preprocess: Dict[str, Any] (workflow-specific config for the preprocess workflow, loosely typed to avoid a model-import cycle)

The preprocess workflow reads both: high-level toggles from preprocessing and detailed step config from preprocess. The detailed nested models live in src/thesis/workflows/preprocess/config.py.

preprocessing — generic toggles

Schema: PreprocessingConfig in src/thesis/core/config/validators.py.

| Field | Type | Default | Constraints | Description |
|---|---|---|---|---|
| denoise | bool | True | | Apply denoising before further preprocessing. |
| bias_correction | bool | True | | Apply N4-style bias-field correction. |
| brain_extraction | bool | True | | Perform brain extraction on structural images. |
| brain_extraction_method | str | "synthstrip" | one of bet / synthstrip / ants | Brain-extraction backend. |
| motion_correction | bool | True | | Apply motion correction. |
| eddy_correction | bool | True | | Apply FSL eddy-current correction. |
| topup | bool | True | | Run FSL TOPUP for susceptibility-distortion correction. |
| acq_params | Dict[str, Any] \| None | None | | Optional acquisition parameters passed to TOPUP (see workflow-specific preprocess.acq_params below for the typed equivalent). |
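
As a rough illustration of the defaults and the one documented constraint, the toggles can be mirrored with a stdlib dataclass. This is only a sketch: the real PreprocessingConfig is a Pydantic model in validators.py, and the class name PreprocessingToggles and the constant ALLOWED_METHODS are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

# Allowed backends, taken from the Constraints column of the table.
ALLOWED_METHODS = ("bet", "synthstrip", "ants")


@dataclass
class PreprocessingToggles:
    """Hypothetical stdlib stand-in for PreprocessingConfig (not the real model)."""

    denoise: bool = True
    bias_correction: bool = True
    brain_extraction: bool = True
    brain_extraction_method: str = "synthstrip"
    motion_correction: bool = True
    eddy_correction: bool = True
    topup: bool = True
    acq_params: Optional[Dict[str, Any]] = None

    def __post_init__(self) -> None:
        # Reject any backend outside the documented set.
        if self.brain_extraction_method not in ALLOWED_METHODS:
            raise ValueError(
                f"brain_extraction_method must be one of {ALLOWED_METHODS}, "
                f"got {self.brain_extraction_method!r}"
            )


# Override two toggles; everything else keeps its documented default.
toggles = PreprocessingToggles(denoise=False, brain_extraction_method="bet")
```

Unset fields fall back to the defaults listed in the table, and an unknown backend fails at construction time rather than partway through a run.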

preprocess — workflow-specific blocks

The preprocess key holds a free-form dict. Each subkey corresponds to a typed Pydantic model in src/thesis/workflows/preprocess/config.py. Use these models as a reference when authoring config/preprocess.yaml.

preprocess.acq_params: AcqParamsConfig

| Field | Type | Description |
|---|---|---|
| bandwidth | float | Effective readout bandwidth (Hz/Px). |
| phase_encoding_dirs | int | Number of phase-encoding lines; used to derive the readout time. |

create_acqparams_file derives the readout time from bandwidth and phase_encoding_dirs and emits fixed AP/PA encoding rows, so the previous readout_time, echo_spacing, phase_encoding_direction_ap, and phase_encoding_direction_pa fields have been removed.
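
The derivation can be sketched as follows. The exact formula used by create_acqparams_file is not shown here, so this sketch assumes the common approximation readout time = (lines - 1) / bandwidth, and assumes the FSL-style row layout "x y z readout" with AP as "0 -1 0" and PA as "0 1 0". The function names sketch_readout_time and sketch_acqparams_rows are hypothetical.

```python
from typing import List


def sketch_readout_time(bandwidth: float, phase_encoding_dirs: int) -> float:
    """Approximate total readout time in seconds.

    ASSUMPTION: echo spacing is roughly 1 / bandwidth, and the readout spans
    (phase_encoding_dirs - 1) echo spacings. The project's own formula may differ.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if phase_encoding_dirs < 2:
        raise ValueError("phase_encoding_dirs must be at least 2")
    return (phase_encoding_dirs - 1) / bandwidth


def sketch_acqparams_rows(bandwidth: float, phase_encoding_dirs: int) -> List[str]:
    """Return one AP row and one PA row in the assumed FSL acqparams layout."""
    readout = sketch_readout_time(bandwidth, phase_encoding_dirs)
    return [
        f"0 -1 0 {readout:.6f}",  # AP (assumed sign convention)
        f"0 1 0 {readout:.6f}",   # PA (assumed sign convention)
    ]


# Values from the example config: bandwidth 1923.077, 96 phase-encoding lines.
rows = sketch_acqparams_rows(1923.077, 96)
```

Under these assumptions, the two fields in the table are enough to produce both rows.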

preprocess.bet: BetConfig

Brain-extraction (FSL BET) options. These apply only when preprocessing.brain_extraction_method is bet; the SynthStrip path ignores them.

| Field | Type | Description |
|---|---|---|
| frac_dwi / frac_t1 / frac_t2 | float | Fractional intensity threshold per modality (0–1). |
| robust | bool | Robust brain-center estimation (BET -R). |
| padding | bool | Pad end slices to improve BET on small FOVs (BET -Z). Mutually exclusive with robust; when both are set, robust is used and padding is ignored. |
| radius | int \| None | Head radius in mm (BET -r); None lets BET estimate it. |
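
The precedence rule between robust and padding can be illustrated with a small sketch that turns these fields into BET command-line flags. The helper sketch_bet_args is hypothetical and is not the workflow's actual code; it uses BET's -f flag for the fractional threshold alongside the -R, -Z, and -r flags named in the table.

```python
from typing import List, Optional


def sketch_bet_args(
    frac: float,
    robust: bool = False,
    padding: bool = False,
    radius: Optional[int] = None,
) -> List[str]:
    """Build a BET flag list from one modality's options (illustrative only)."""
    if not 0.0 <= frac <= 1.0:
        raise ValueError("frac must lie in [0, 1]")
    args = ["-f", str(frac)]
    if robust:
        # robust wins when both are set; padding is silently ignored.
        args.append("-R")
    elif padding:
        args.append("-Z")
    if radius is not None:
        # Only pass -r when a radius is configured; otherwise BET estimates it.
        args += ["-r", str(radius)]
    return args


# Both robust and padding requested: only -R is emitted.
t1_args = sketch_bet_args(0.5, robust=True, padding=True, radius=80)
```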

preprocess.bedpostx: BedpostXConfig

BedpostX fibre-orientation estimation options. Authoritative defaults live in src/thesis/workflows/preprocess/config.py (BedpostXConfig).

| Field | Type | Default | Constraints | Description |
|---|---|---|---|---|
| n_fibres | int | 2 | 1 ≤ x ≤ 3 | Number of fibre populations to model per voxel. |
| model | int | 1 | 1 ≤ x ≤ 3 | Deconvolution model (1 = single-shell, 2 = multi-shell, 3 = ball-and-sticks). |
| burn_in | int | 1000 | ≥ 0 | MCMC burn-in jumps. |
| n_jumps | int | 1250 | ≥ 1 | Total MCMC jumps. |
| sample_every | int | 25 | ≥ 1 | Record a sample every Nth jump. |
| use_gpu | bool | False | | Use CUDA-enabled BedpostX. |
| weight | float | 1.0 | ≥ 0.0 | ARD weight (only used by model: 3). |
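
A minimal sketch of how the numeric constraints could be checked before launching BedpostX. The bounds are read from the Constraints column (interpreted as inclusive ranges and lower bounds), and the helper sketch_check_bedpostx is hypothetical; the authoritative validation lives in BedpostXConfig. The returned sample count assumes, as in FSL's bedpostx, that n_jumps counts jumps after burn-in.

```python
from typing import Any, Dict

# Defaults copied from the table above.
BEDPOSTX_DEFAULTS: Dict[str, Any] = {
    "n_fibres": 2,
    "model": 1,
    "burn_in": 1000,
    "n_jumps": 1250,
    "sample_every": 25,
    "use_gpu": False,
    "weight": 1.0,
}


def sketch_check_bedpostx(cfg: Dict[str, Any]) -> int:
    """Validate the numeric fields and return the number of recorded samples."""
    merged = {**BEDPOSTX_DEFAULTS, **cfg}
    if not 1 <= merged["n_fibres"] <= 3:
        raise ValueError("n_fibres must be between 1 and 3")
    if not 1 <= merged["model"] <= 3:
        raise ValueError("model must be between 1 and 3")
    if merged["burn_in"] < 0:
        raise ValueError("burn_in must be >= 0")
    if merged["n_jumps"] < 1 or merged["sample_every"] < 1:
        raise ValueError("n_jumps and sample_every must be >= 1")
    if merged["weight"] < 0.0:
        raise ValueError("weight must be >= 0.0")
    # ASSUMPTION: one sample is kept every `sample_every` jumps after burn-in.
    return merged["n_jumps"] // merged["sample_every"]


n_samples = sketch_check_bedpostx({"n_fibres": 3, "model": 2})
```

With the default n_jumps and sample_every, this gives 1250 // 25 = 50 samples per voxel.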

preprocess.synthseg: PreprocessSynthSegConfig

SynthSeg-specific options as run inside the preprocess workflow. These are separate from the top-level synthseg block, which configures the standalone workflow.

preprocess workflow-control toggles

The preprocess dict also carries top-level boolean toggles (defined on PreprocessConfig in src/thesis/workflows/preprocess/config.py) that gate whole stages of the pipeline:

| Key path | Type | Default | Description |
|---|---|---|---|
| preprocess.run_topup | bool | True | Run FSL TOPUP distortion correction. |
| preprocess.run_eddy | bool | True | Run FSL eddy-current correction. |
| preprocess.run_dtifit | bool | True | Run DTIFit tensor estimation. |
| preprocess.run_bedpostx | bool | True | Run BedpostX fibre-orientation estimation. |
| preprocess.run_synthseg | bool | True | Run SynthSeg segmentation. |
| preprocess.run_coregistration | bool | True | Run the intra-subject coregistration chain (dwi_to_t1, t2_to_t1, and MNI-label warps). This is the single authoritative toggle referenced by the registration callout; it is distinct from the top-level patient→template registration block. (Previously named run_registration.) |
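
To make the gating concrete, here is a sketch of how the enabled stages could be read out of the free-form preprocess dict, with missing keys falling back to the documented default of True. The helper sketch_enabled_stages is hypothetical, and the tuple follows the order of the table, not necessarily the execution order of the pipeline.

```python
from typing import Any, Dict, List

# Toggle keys in table order (not necessarily pipeline order).
STAGE_TOGGLES = (
    "run_topup",
    "run_eddy",
    "run_dtifit",
    "run_bedpostx",
    "run_synthseg",
    "run_coregistration",
)


def sketch_enabled_stages(preprocess: Dict[str, Any]) -> List[str]:
    """Return the toggle keys that are on; absent keys default to True."""
    return [key for key in STAGE_TOGGLES if bool(preprocess.get(key, True))]


# Disable BedpostX and SynthSeg; every other stage stays on by default.
stages = sketch_enabled_stages({"run_bedpostx": False, "run_synthseg": False})
```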

preprocess.registration: RegistrationChainConfig

ANTs registration steps embedded in the preprocess pipeline. Each entry is a RegistrationStepConfig (transform type, metric, interpolation, etc.). The whole chain is gated by preprocess.run_coregistration (see the toggles table above):

  • dwi_to_t1 — intra-subject DWI→T1 (Rigid). Always built when preprocess.run_coregistration is enabled; its composite transform is exported as the t1_to_dwi_transform output.

  • t2_to_t1 — intra-subject T2→T1 (Rigid). Built when a T2 image is present.

  • t1_to_template — patient T1→template (SyN). Built when a template fixed image is configured via the top-level registration.fixed_image (reused from the registration workflow). This lets a patient that was not used to build the cohort template still be registered to it. Its outputs are exported on the preprocess outputnode as t1_to_template_transform (composite transform) and t1_to_template_warped (T1 resampled onto the template grid).
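
The build conditions for the three steps can be summarised in a short sketch. The function sketch_chain_steps and its parameters are hypothetical; has_t2 stands in for however the workflow detects a T2 image, and fixed_image stands in for the value of registration.fixed_image.

```python
from typing import List, Optional


def sketch_chain_steps(
    run_coregistration: bool,
    has_t2: bool,
    fixed_image: Optional[str],
) -> List[str]:
    """List which registration steps would be built (illustrative only)."""
    if not run_coregistration:
        # The toggle gates the whole chain.
        return []
    steps = ["dwi_to_t1"]  # always built when the chain is enabled
    if has_t2:
        steps.append("t2_to_t1")
    if fixed_image:
        # Only built when a template fixed image is configured.
        steps.append("t1_to_template")
    return steps


# T2 available and a template configured (the path is a made-up placeholder).
steps = sketch_chain_steps(True, has_t2=True, fixed_image="template.nii.gz")
```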

preprocess.label_transform: LabelTransformConfig

Warps MNI/atlas labels into subject space via the dwi_to_t1 composite transform (built only when transform_mni_labels is true and mni_labels_list is non-empty):

| Field | Type | Description |
|---|---|---|
| transform_mni_labels | bool | Enable the per-label warp_labels MapNode. |
| mni_labels_list | List[str] | Label images to warp. |
| interpolation | str | ANTs interpolation (NearestNeighbor default for discrete labels; also Linear, BSpline, MultiLabel). |
| use_inverse_warp | bool | Invert the composite transform (mapped to ANTs invert_transform_flags). |
| use_hcp_template | bool | When true, warp labels onto the configured template (registration.fixed_image) grid; otherwise onto the DWI b0 grid. |
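
A sketch of the decision logic described above: whether the warp is built at all, and which reference grid it targets. The helper sketch_label_warp_plan and the returned dict keys are hypothetical. Treating missing boolean keys as False is also an assumption, since this table does not list defaults other than the NearestNeighbor interpolation.

```python
from typing import Any, Dict, Optional


def sketch_label_warp_plan(cfg: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return a plan for the label warp, or None when it would not be built."""
    labels = list(cfg.get("mni_labels_list") or [])
    # Built only when the toggle is on AND there is at least one label image.
    if not cfg.get("transform_mni_labels", False) or not labels:
        return None
    return {
        "labels": labels,
        "interpolation": cfg.get("interpolation", "NearestNeighbor"),
        "invert": bool(cfg.get("use_inverse_warp", False)),
        # Template grid when use_hcp_template is true, otherwise the DWI b0 grid.
        "reference_grid": "template" if cfg.get("use_hcp_template", False) else "dwi_b0",
    }


# The label file name is a made-up placeholder.
plan = sketch_label_warp_plan(
    {"transform_mni_labels": True, "mni_labels_list": ["atlas_labels.nii.gz"]}
)
```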

preprocess.dtifit: DTIFitConfig

FSL dtifit tensor-estimation options. Authoritative defaults live in src/thesis/workflows/preprocess/config.py (DTIFitConfig).

| Field | Type | Default | Description |
|---|---|---|---|
| use_wls | bool | True | Use weighted least-squares fitting. |
| compute_kurt | bool | False | Compute kurtosis (DKI) parameters. |
| save_tensor | bool | True | Save the full tensor components. |
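
For orientation, these three options plausibly correspond to the dtifit command-line flags --wls, --kurt, and --save_tensor. The mapping below is an assumption about how the workflow wires them up, and sketch_dtifit_flags is a hypothetical helper, not part of the codebase.

```python
from typing import List


def sketch_dtifit_flags(
    use_wls: bool = True,
    compute_kurt: bool = False,
    save_tensor: bool = True,
) -> List[str]:
    """Translate the DTIFitConfig booleans into dtifit flags (assumed mapping)."""
    flags: List[str] = []
    if use_wls:
        flags.append("--wls")          # weighted least-squares fit
    if compute_kurt:
        flags.append("--kurt")         # kurtosis output
    if save_tensor:
        flags.append("--save_tensor")  # write the tensor components
    return flags


# With the documented defaults, only --wls and --save_tensor are emitted.
default_flags = sketch_dtifit_flags()
```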

Example

preprocessing:
  denoise: true
  bias_correction: true
  brain_extraction: true
  brain_extraction_method: synthstrip
  motion_correction: true
  eddy_correction: true
  topup: true

preprocess:
  acq_params:
    bandwidth: 1923.077
    phase_encoding_dirs: 96
  bet:  # BET-only options; ignored here because brain_extraction_method is synthstrip
    frac_t1: 0.5
    robust: true
    radius: 80
  bedpostx:
    n_fibres: 3
    model: 2
    burn_in: 1000
    n_jumps: 1250
    sample_every: 25
    use_gpu: true
  synthseg:
    parc: true
    robust: true
  dtifit:
    use_wls: true
    save_tensor: true
  run_coregistration: true

See config/preprocess.yaml for the shipped reference protocol.