paths: filesystem layout

Schema: PathConfig in src/thesis/core/config/validators.py.

All values are validated through thesis.core.utils.to_path(), which expands ~ and $VAR/${VAR} and converts strings to pathlib.Path.
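
As a rough illustration of the expansion described above, the following sketch approximates the behaviour with the standard library only. The name expand_to_path is hypothetical and this is not the actual thesis.core.utils.to_path() implementation, which may differ, for example in how it treats unset variables or in any extra validation it performs.

```python
import os
from pathlib import Path
from typing import Union


def expand_to_path(value: Union[str, "os.PathLike[str]"]) -> Path:
    """Hypothetical stand-in for to_path(): expand $VAR/${VAR} and ~, return a Path."""
    text = os.fspath(value)
    # Expand environment variables first, so a variable may itself start with "~".
    text = os.path.expandvars(text)
    # Then expand a leading "~" to the user's home directory.
    text = os.path.expanduser(text)
    return Path(text)
```

With HCP_DATA set to /mnt/hcp in the environment, expand_to_path("$HCP_DATA/raw") would yield a Path for /mnt/hcp/raw. Note that os.path.expandvars leaves references to unset variables unchanged rather than raising.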

The three filesystem roots — inputs_dir, assets_dir, and output_dir — are independent: none is a parent of the others. A relative value is resolved against the current working directory at context-creation time, not nested under another root.

Note

input_dir is not a paths key. It is the per-subject input directory resolved at runtime by ProcessingContext as inputs_dir/<patient_id>. The hcp.* paths such as t1_image are resolved relative to it — see the hcp page.
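
The relationship between the roots and the per-subject input_dir can be sketched as follows. The helper names are hypothetical and ProcessingContext itself is not reproduced here; the sketch assumes only what is stated above, namely that each root resolves on its own against the working directory and that input_dir is inputs_dir/<patient_id>. The patient ID in the usage note is an arbitrary example.

```python
from pathlib import Path


def resolve_root(value: str, cwd: Path) -> Path:
    """Resolve a single root independently of the other roots.

    Absolute values are kept unchanged; relative values are joined to cwd,
    never to another root.
    """
    path = Path(value)
    return path if path.is_absolute() else cwd / path


def subject_input_dir(inputs_dir: Path, patient_id: str) -> Path:
    """Per-subject input directory: inputs_dir/<patient_id>."""
    return inputs_dir / patient_id
```

For a working directory of /work/project and the default inputs_dir of data/raw, subject_input_dir(resolve_root("data/raw", cwd), "100307") gives /work/project/data/raw/100307, regardless of what assets_dir is set to.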

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| inputs_dir | Path | data/raw | Per-subject source data root. The patient input directory is inputs_dir/<patient_id>. Used by --all discovery and the preprocess workflow. CLI override: --raw-data-dir. |
| assets_dir | Path | data | Shared, read-only assets (templates, atlases, ROIs, reference images). Anchors DataFile/DataDir lookups. CLI override: --data-dir. |
| output_dir | Path | outputs | Root for all workflow outputs/derivatives. The per-patient output directory is output_dir/<patient_id>; cohort outputs live under output_dir/cohort/. CLI override: --output-dir. |
| scratch_dir | Path \| None | None | Optional temporary/scratch directory override. When unset, scratch defaults to output_dir/<patient_id>/temp. (Distinct from nipype.working_dir, the Nipype engine base_dir.) |
| log_dir | Path | logs | Directory for runtime log files. CLI override: --log-dir (top-level group flag). |
| scripts_dir | Path \| None | None | Optional directory of user-supplied workflow scripts (*.py). When set, thesis list-workflows scans it and lists any scripts whose @workflow decorator registers successfully. See Custom workflows. |
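
The output-side defaults in the table can be summarised with a small sketch. The function and dataclass below are hypothetical and not part of the package; they assume only the documented layout (output_dir/<patient_id>, output_dir/<patient_id>/temp when scratch_dir is unset, and output_dir/cohort). How an explicit scratch_dir is subdivided per patient is not stated here, so the sketch returns it unchanged.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class PatientDirs:
    """Hypothetical container for the derived output-side directories."""
    output: Path   # output_dir/<patient_id>
    scratch: Path  # scratch_dir if given, else output_dir/<patient_id>/temp
    cohort: Path   # output_dir/cohort


def patient_dirs(
    output_dir: Path,
    patient_id: str,
    scratch_dir: Optional[Path] = None,
) -> PatientDirs:
    """Derive the per-patient output, scratch and cohort directories."""
    patient_out = output_dir / patient_id
    # Assumption: an explicit scratch_dir is used as given (not documented in detail).
    scratch = scratch_dir if scratch_dir is not None else patient_out / "temp"
    return PatientDirs(output=patient_out, scratch=scratch, cohort=output_dir / "cohort")
```

With the default output_dir of outputs and no scratch_dir, a patient sub01 maps to outputs/sub01 for outputs and outputs/sub01/temp for scratch.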

Example

paths:
  inputs_dir: $HCP_DATA/raw      # per-subject source data (env var expansion)
  assets_dir: data              # shared templates / atlases / ROIs
  output_dir: outputs           # all derivatives + results
  log_dir: logs

Notes

  • The three roots are independent. To relocate where the framework reads per-subject inputs, set inputs_dir (or --raw-data-dir) to a full path — it is no longer resolved underneath assets_dir/--data-dir.

  • The CLI’s --data-dir flag defaults to paths.assets_dir, falling back to data only if neither the flag nor paths.assets_dir is set. It carries the shared-assets base only; it does not anchor input or output resolution.

  • Absolute config paths are honoured as-is. A path value that is absolute (or expands to one via ~ / $VAR) is used verbatim and is not anchored under input_dir / assets_dir — useful for pointing inputs (e.g. hcp.t1_image, registration.fixed_image, transforms.*) at a prior trial’s outputs or a shared template. The path-traversal guard only blocks relative ../ escapes.

  • When output_dir is unset in config it defaults to outputs, so ProcessingContext resolves the per-patient output to outputs/<patient_id> (relative to the working directory).

  • Cohort-level workflows (atlas, learned_atlas, tract_similarity_cohort, tract_similarity_hcp_loo, tract_similarity_sweep) scan under the resolved output base for per-subject outputs.
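
The absolute-path rule and the traversal guard from the notes above could look roughly like the following. This is a sketch under assumptions: resolve_config_path is a hypothetical name, the file names in the usage note are made up, and the real guard may be implemented differently (for instance, it may raise a different exception type or resolve symlinks).

```python
import os
from pathlib import Path


def resolve_config_path(value: str, base: Path) -> Path:
    """Resolve a config path value against base (hypothetical helper).

    Absolute values (including those produced by ~ or $VAR expansion) are
    returned verbatim. Relative values are anchored under base, and a
    relative value that escapes base via "../" is rejected.
    """
    path = Path(os.path.expanduser(os.path.expandvars(value)))
    if path.is_absolute():
        return path  # honoured as-is, not anchored under base

    # Normalise lexically (collapsing ".." segments) without touching the filesystem.
    base_norm = Path(os.path.normpath(base))
    candidate = Path(os.path.normpath(base / path))

    if candidate != base_norm and base_norm not in candidate.parents:
        raise ValueError(f"relative path {value!r} escapes {base_norm}")
    return candidate
```

Given a base of /data/raw/sub01, the value anat/t1.nii.gz resolves underneath the base, /outputs/trial1/t1.nii.gz is returned unchanged, and ../sub02/t1.nii.gz raises ValueError.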