Best Practices for This Framework#

Code Organization#

1. Module Structure#

module_name/
├── __init__.py        # Public API
├── module.py          # Main implementation
├── utils.py           # Helper functions
└── exceptions.py      # Custom exceptions

2. Naming Conventions#

Classes: PascalCase (e.g., PreprocessingStep)
Functions: snake_case (e.g., execute_step)
Constants: UPPER_SNAKE_CASE (e.g., MAX_THREADS)
Private: Leading underscore (e.g., _internal_method)
Modules: snake_case (e.g., preprocessing.py)
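
As a quick illustration, the conventions above might combine in a single module as follows. This is only a sketch: apart from the names quoted above (`PreprocessingStep`, `execute_step`, `MAX_THREADS`), every name, value, and behaviour here is made up for the example.

```python
# preprocessing.py (module name: snake_case)

MAX_THREADS = 8  # constant: UPPER_SNAKE_CASE (the value is an arbitrary placeholder)


class PreprocessingStep:  # class: PascalCase
    """A hypothetical pipeline step, used only to illustrate naming."""

    def __init__(self, name: str, n_threads: int = MAX_THREADS) -> None:
        self.name = name
        self.n_threads = self._clamp_threads(n_threads)

    @staticmethod
    def _clamp_threads(n_threads: int) -> int:  # private helper: leading underscore
        """Keep the thread count within 1..MAX_THREADS."""
        return max(1, min(n_threads, MAX_THREADS))


def execute_step(step: PreprocessingStep) -> str:  # function: snake_case
    """Return a short description of what would run."""
    return f"{step.name} on {step.n_threads} thread(s)"
```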

3. Imports#

# Standard library
import sys
import json
from pathlib import Path
from typing import Any

# Third-party
import numpy as np

# Local
from thesis.core.logging import get_logger
from thesis.core.config import ConfigManager, PipelineConfig

Type Hints and Documentation#

Always Include Type Hints#

def process_image(
    image_path: Path,
    parameters: dict[str, Any],
    output_dir: Path | None = None,
) -> np.ndarray:
    """Process medical image with specified parameters.
    
    Args:
        image_path: Path to input image file
        parameters: Dictionary of processing parameters
        output_dir: Optional output directory
        
    Returns:
        Processed image array
        
    Raises:
        FileNotFoundError: If image file doesn't exist
        ValueError: If parameters are invalid
    """
    if not image_path.exists():
        raise FileNotFoundError(f"Image not found: {image_path}")
    
    logger.info(f"Processing {image_path}")
    # Implementation...
    return result

Docstring Format#

Use Google-style docstrings for all public functions and classes:

Summary line (one-line description)
Blank line
Longer description (if needed)
Args section
Returns section
Raises section (if applicable)

Logging#

Use Logging, Not Print#

from thesis.core.logging import get_logger

logger = get_logger(__name__)

# ✅ Good
logger.info(f"Processing patient {patient_id}")
logger.debug(f"Image shape: {image.shape}")
logger.error(f"Failed to load image: {e}")

# ❌ Avoid
print(f"Processing patient {patient_id}")
print(f"Error: {e}")

Log Levels#

DEBUG: Detailed info for developers
INFO: Confirmation that things work as expected
WARNING: Something unexpected but not critical
ERROR: Something failed, but application continues
CRITICAL: Application cannot continue
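
To make the level choice concrete, here is a minimal sketch of one check that logs at different levels depending on severity. It uses the standard library `logging` module as a stand-in so that it runs on its own; in project code the logger would come from `get_logger(__name__)`. The function `check_volume_count` and its rules are hypothetical.

```python
import logging

# Stand-in logger; project code would call get_logger(__name__) instead.
logger = logging.getLogger("thesis.example")


def check_volume_count(n_volumes: int, expected: int) -> bool:
    """Hypothetical check that picks a log level by severity."""
    # DEBUG: detail that only matters while developing or diagnosing.
    logger.debug("n_volumes=%d expected=%d", n_volumes, expected)
    if n_volumes == 0:
        # ERROR: this input cannot be processed, but the caller can move on.
        logger.error("No volumes found")
        return False
    if n_volumes != expected:
        # WARNING: unexpected, yet processing can continue.
        logger.warning("Expected %d volumes, found %d", expected, n_volumes)
    else:
        # INFO: confirmation that things are as expected.
        logger.info("Found all %d volumes", n_volumes)
    return True
```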

Testing#

Write Tests for Public APIs#

Use classes to group related tests. When a test expects a failure, assert on the specific exception type (ValidationError for Pydantic models, custom exceptions for domain errors), never on the bare Exception base class.

# tests/unit/test_config_validators.py
import pytest
from pydantic import ValidationError

from thesis.core.config.validators import TractographyConfig

class TestTractographyConfig:
    def test_defaults(self):
        cfg = TractographyConfig()
        assert cfg.n_samples == 5000
        assert cfg.method == "probtrackx2"

    def test_invalid_method_raises(self):
        with pytest.raises(ValidationError):
            TractographyConfig(method="invalid")

    def test_custom_values(self):
        cfg = TractographyConfig(n_samples=1000, n_steps=500)
        assert cfg.n_samples == 1000

Test Organisation#

One test file per module (e.g. test_config_validators.py covers validators.py)
One test class per class or logical group of functions
Descriptive test names: test_<what>_<condition> (e.g. test_invalid_method_raises)
Use fixtures for repeated setup — prefer pytest built-ins over manual teardown
Mark slow tests: @pytest.mark.slow

Fixtures and Isolation#

Prefer pytest built-in fixtures over manual resource management:

# ✅ Good — tmp_path and monkeypatch handle cleanup automatically
def test_finds_samples_in_primary_dir(self, hcp_config, tmp_path, monkeypatch):
    monkeypatch.chdir(tmp_path)          # isolate working directory
    input_dir = tmp_path / "input"
    input_dir.mkdir()
    result = prepare_hcp_paths(hcp_config, ctx)
    assert result["thsamples"] != []

# ❌ Avoid — manual cleanup is fragile
def test_finds_samples_in_primary_dir(self, hcp_config):
    import tempfile, os
    with tempfile.TemporaryDirectory() as tmpdir:
        orig = os.getcwd()
        try:
            os.chdir(tmpdir)
            ...
        finally:
            os.chdir(orig)

Shared fixtures live in tests/conftest.py:

Fixture	Type	Description
`temp_dir`	`Path`	Temporary directory, auto-cleaned
`mock_config`	`PipelineConfig`	Default test pipeline config
`mock_nifti_data`	`np.ndarray`	3-D float32 array `(64, 64, 64)`
`mock_dwi_data`	`np.ndarray`	4-D float32 array `(64, 64, 64, 32)`
`mock_nifti_file`	`Path`	Saved `.nii.gz` file (requires nibabel)
`mock_patient_data`	`dict`	T1/T2/DWI files + bvals/bvecs
`mock_config_file`	`Path`	YAML file from `mock_config`
`mock_logger`	logger	Loguru logger writing to temp dir

Isolating Global State#

The workflow registry (WORKFLOW_REGISTRY) is a global singleton. Use monkeypatch.setattr to inject a temporary copy instead of mutating it directly:

# ✅ Good — change is automatically reverted after the test
def test_known_workflow_returns_entry(self, monkeypatch):
    entry = WorkflowEntry(name="_test_wf", factory=lambda: None, description="x")
    patched = {**WORKFLOW_REGISTRY._entries, "_test_wf": entry}
    monkeypatch.setattr(WORKFLOW_REGISTRY, "_entries", patched)
    result = _resolve_workflow("_test_wf")
    assert result.name == "_test_wf"

# ❌ Avoid — mutates global state; leaves registry dirty if test fails
def test_known_workflow_returns_entry(self):
    WORKFLOW_REGISTRY.register(entry)
    try:
        result = _resolve_workflow("_test_wf")
        assert result.name == "_test_wf"
    finally:
        WORKFLOW_REGISTRY._entries.pop("_test_wf", None)

Logging in Tests#

The project uses loguru, which is not captured by pytest’s built-in caplog fixture. Do not write tests that assert on log output. Test observable behaviour (return values, raised exceptions, file side-effects) instead.
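
As a sketch of what testing observable behaviour can look like, the tests below check a return value and a file side-effect and never inspect log output. The function `write_mean` is hypothetical and exists only for this example; `tmp_path` is pytest's built-in temporary-directory fixture.

```python
from pathlib import Path


def write_mean(values: list[float], out_file: Path) -> float:
    """Hypothetical function under test: writes the mean to a file and returns it."""
    if not values:
        raise ValueError("values must not be empty")
    mean = sum(values) / len(values)
    out_file.write_text(f"{mean:.3f}\n")
    return mean


def test_write_mean_returns_value(tmp_path: Path) -> None:
    # Observable behaviour 1: the return value.
    assert write_mean([1.0, 2.0, 3.0], tmp_path / "mean.txt") == 2.0


def test_write_mean_creates_file(tmp_path: Path) -> None:
    # Observable behaviour 2: the file side-effect.
    out_file = tmp_path / "mean.txt"
    write_mean([1.0, 2.0, 3.0], out_file)
    assert out_file.read_text() == "2.000\n"
```

The empty-input case can be covered the same way with `pytest.raises(ValueError)`, following the pattern used for `TractographyConfig` earlier.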

Error Handling#

Use the Project Exception Hierarchy#

All custom exceptions live in thesis.core.exceptions and inherit from ThesisError. Reuse the existing types (ConfigurationError, ValidationError, ProcessingError and its subclasses RegistrationError, SegmentationError, TractographyError, PipelineError, plus FileIOError and DependencyError) rather than defining new base classes:

from thesis.core.exceptions import ProcessingError, TractographyError

# Raise the most specific type that fits the failure.
if streamline_count == 0:
    raise TractographyError(f"tckgen produced 0 streamlines for {patient_id}")
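
One reason to raise the most specific type is that callers can still catch at whatever level they are able to handle. The sketch below illustrates this with local stand-in classes so that it runs on its own; the real classes in `thesis.core.exceptions` may carry extra fields or behaviour, and `run_step` and `run_safely` are hypothetical.

```python
# Stand-ins for thesis.core.exceptions, defined locally for this sketch only.
class ThesisError(Exception):
    """Base class for all project errors."""


class ProcessingError(ThesisError):
    """A processing step failed."""


class TractographyError(ProcessingError):
    """Tractography failed."""


def run_step(streamline_count: int) -> str:
    """Hypothetical step that fails when no streamlines were produced."""
    if streamline_count == 0:
        raise TractographyError("0 streamlines produced")
    return "ok"


def run_safely(streamline_count: int) -> str:
    """Catch at the ProcessingError level; its subclasses are handled too."""
    try:
        return run_step(streamline_count)
    except ProcessingError as exc:
        return f"failed: {type(exc).__name__}"
```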

Add a new subclass only when an existing one does not fit — and inherit from the closest ThesisError descendant (usually ProcessingError):

# thesis/core/exceptions.py
class InvalidImageError(ProcessingError):
    """Raised when image format is invalid."""

Use Specific Exceptions#

# ✅ Good - specific
if not image_path.exists():
    raise FileNotFoundError(f"Image not found: {image_path}")

if image.shape != expected_shape:
    raise InvalidImageError(f"Unexpected shape: {image.shape}")

# ❌ Avoid - too generic
if not image_path.exists():
    raise Exception("Error")

try:
    process_image(image)
except Exception as e:  # Too broad
    logger.error(str(e))
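
The broad `except Exception` above has a narrower counterpart: catch only the failures you expect and wrap the rest in a domain error. The following is a minimal sketch, assuming a hypothetical `read_header` helper and a local stand-in for `FileIOError` (the real class lives in `thesis.core.exceptions`).

```python
from pathlib import Path


class FileIOError(Exception):
    """Stand-in for thesis.core.exceptions.FileIOError, for this sketch only."""


def read_header(image_path: Path) -> bytes:
    """Hypothetical loader: returns the first 4 bytes of a file."""
    try:
        with image_path.open("rb") as fh:
            return fh.read(4)
    except FileNotFoundError:
        # Expected, specific failure: let it propagate unchanged.
        raise
    except OSError as exc:
        # Any other I/O problem becomes a domain error; `from exc`
        # keeps the original exception attached as the cause.
        raise FileIOError(f"Could not read {image_path}") from exc
```

`FileNotFoundError` is itself a subclass of `OSError`, so its handler has to come first.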

Configuration Best Practices#

1. Externalize Parameters#

# ✅ Good - parameters from config
def process(self, n_threads: int, max_memory: int):
    ...  # Uses the provided parameters

# ❌ Avoid - hardcoded values
def process(self):
    n_threads = 8
    max_memory = 16 * 1024
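
One way to wire this up is to parse the values once at the edge of the program and pass them down explicitly. The sketch below assumes a hypothetical `ResourceLimits` container and reads JSON to stay within the standard library; the project itself may load its configuration differently (for example through `ConfigManager`).

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceLimits:
    """Hypothetical container for externally supplied limits."""

    n_threads: int
    max_memory: int  # assumed to be in MB


def load_limits(config_text: str) -> ResourceLimits:
    """Parse limits from a JSON string (a stand-in for the project config)."""
    raw = json.loads(config_text)
    return ResourceLimits(
        n_threads=int(raw["n_threads"]),
        max_memory=int(raw["max_memory"]),
    )


def process(n_threads: int, max_memory: int) -> str:
    """Stand-in for the real work: it only reports what it was given."""
    return f"threads={n_threads}, memory={max_memory} MB"


limits = load_limits('{"n_threads": 4, "max_memory": 8192}')
summary = process(limits.n_threads, limits.max_memory)
```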

2. Validate Configuration Early#

from pathlib import Path

from pydantic import BaseModel, Field

class ProcessingConfig(BaseModel):
    """Processing configuration with validation."""

    n_threads: int = Field(gt=0, le=128)
    timeout_seconds: int = Field(gt=0)
    output_dir: Path

# Validation happens on construction
config = ProcessingConfig(n_threads=-1)  # Raises pydantic.ValidationError (a ValueError subclass)

Code Quality#

Use Type Checking#

# Check types
mypy src/thesis

# Format code
black src/thesis tests/

# Sort imports
isort src/thesis tests/

# Lint
flake8 src/thesis tests/

Pre-commit Hooks#

Enable automatic checks:

pre-commit install

Performance Considerations#

1. Use Logging Efficiently#

# ❌ Avoid - always evaluates, even when DEBUG is hidden
logger.debug(f"Result: {expensive_function()}")

# ✅ Better - compute only when you actually need the value
if should_log_debug_details:
    logger.debug(f"Result: {expensive_function()}")
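
Instead of maintaining a separate flag such as `should_log_debug_details`, the logger itself can be asked whether DEBUG output is enabled. The sketch below shows the idea with the standard library `logging` module as a stand-in; `expensive_function` is a placeholder that records each call so the effect is visible. With loguru, `logger.opt(lazy=True)` together with callable arguments serves a similar purpose.

```python
import logging

logger = logging.getLogger("thesis.example")  # stand-in for get_logger(__name__)
calls: list[int] = []  # records every run of expensive_function


def expensive_function() -> int:
    """Placeholder for a costly computation."""
    calls.append(1)
    return sum(i * i for i in range(10_000))


def log_result() -> None:
    # The expensive call only happens when DEBUG records would be processed.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("Result: %s", expensive_function())
```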

2. Profile Before Optimizing#

import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()
# Run code
profiler.disable()
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative').print_stats(10)
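
The same steps can be wrapped in a small helper that returns the report as text, which makes it easier to save or compare runs. This is a sketch only: `slow_sum` is a hypothetical hot spot and `profile_call` is not part of the project.

```python
import cProfile
import io
import pstats


def slow_sum(n: int) -> int:
    """Hypothetical hot spot to profile."""
    return sum(i * i for i in range(n))


def profile_call(n: int, top: int = 10) -> str:
    """Profile slow_sum(n) and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        slow_sum(n)
    finally:
        profiler.disable()  # always stop profiling, even on error
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_call(50_000)
```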

Documentation#

Docstring Requirements#

Write docstrings for:

All public functions and classes
Complex private methods
Non-obvious algorithms or parameters
Example usage for important functions

Example with Usage#

def apply_affine_transform(
    image: np.ndarray,
    affine_matrix: np.ndarray,
) -> np.ndarray:
    """Apply affine transformation to image.
    
    Applies the given 3D affine transformation matrix to the input image
    using trilinear interpolation.
    
    Args:
        image: Input image array (3D)
        affine_matrix: 4x4 affine transformation matrix
        
    Returns:
        Transformed image array
        
    Example:
        >>> image = load_image("T1.nii.gz")
        >>> transform = np.eye(4)
        >>> transform[:3, 3] = [5, 10, 2]  # Translation
        >>> transformed = apply_affine_transform(image, transform)
    """
    # Implementation...
    pass

Git Workflow#

Commit Messages#

Use Conventional Commits:

<type>(<scope>): brief description

Longer explanation (72 chars per line max)

- Specific change 1
- Specific change 2

Closes #123

Types: feat, fix, docs, test, refactor, perf, chore, ci. See docs/guides/contributing.md for the full commit and version-bump workflow (make bump).
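
The subject-line format can be checked mechanically. The following is a rough sketch of such a check, not the project's actual tooling: it treats the scope as optional and restricts it to lowercase letters, digits, hyphens, and underscores, both of which are assumptions.

```python
import re

# Commit types listed above.
COMMIT_TYPES = ("feat", "fix", "docs", "test", "refactor", "perf", "chore", "ci")

# <type>(<scope>): brief description, with the scope assumed to be optional.
SUBJECT_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(\((?P<scope>[a-z0-9_-]+)\))?"
    r": (?P<description>\S.*)$"
)


def is_valid_subject(subject: str) -> bool:
    """Return True if the first line of a commit message matches the format."""
    return SUBJECT_PATTERN.match(subject) is not None
```

For example, `is_valid_subject("feat(registration): add new registration step")` accepts the commit used at the end of this guide, while a subject without the colon is rejected.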

Before Committing#

# Run tests
pytest tests/

# Format and lint
black src/thesis tests/
isort src/thesis tests/
flake8 src/thesis tests/

# Type check
mypy src/thesis

# Or run all quality checks at once
make check

# Commit (stage explicit paths, not `git add .`)
git add src/thesis/workflows/registration/workflow.py tests/unit/test_registration.py
git commit -m "feat(registration): add new registration step"