Best Practices for This Framework#

Code Organization#

1. Module Structure#

module_name/
├── __init__.py        # Public API
├── module.py          # Main implementation
├── utils.py           # Helper functions
└── exceptions.py      # Custom exceptions

2. Naming Conventions#

  • Classes: PascalCase (e.g., PreprocessingStep)

  • Functions: snake_case (e.g., execute_step)

  • Constants: UPPER_SNAKE_CASE (e.g., MAX_THREADS)

  • Private: Leading underscore (e.g., _internal_method)

  • Modules: snake_case (e.g., preprocessing.py)
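
The conventions above can be seen side by side in one small module. This is only a sketch: the body of `PreprocessingStep` and the value of `MAX_THREADS` are made up for illustration, and only the naming pattern is the point.

```python
"""preprocessing.py: module file names use snake_case."""

MAX_THREADS = 8  # constant: UPPER_SNAKE_CASE (illustrative value)


class PreprocessingStep:  # class: PascalCase
    """Hypothetical step used only to illustrate naming."""

    def __init__(self, n_threads: int = MAX_THREADS) -> None:
        self.n_threads = n_threads

    def execute_step(self, values: list[float]) -> list[float]:  # snake_case
        """Public method: divide every value by the maximum."""
        return self._internal_method(values)

    def _internal_method(self, values: list[float]) -> list[float]:
        """Private helper: leading underscore, not part of the public API."""
        peak = max(values, default=0.0)
        if peak == 0:
            return [0.0 for _ in values]
        return [v / peak for v in values]
```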

3. Imports#

# Standard library
import sys
import json
from pathlib import Path
from typing import Any

# Third-party
import numpy as np

# Local
from thesis.core.logging import get_logger
from thesis.core.config import ConfigManager, PipelineConfig

Type Hints and Documentation#

Always Include Type Hints#

def process_image(
    image_path: Path,
    parameters: dict[str, Any],
    output_dir: Path | None = None,
) -> np.ndarray:
    """Process medical image with specified parameters.
    
    Args:
        image_path: Path to input image file
        parameters: Dictionary of processing parameters
        output_dir: Optional output directory
        
    Returns:
        Processed image array
        
    Raises:
        FileNotFoundError: If image file doesn't exist
        ValueError: If parameters are invalid
    """
    if not image_path.exists():
        raise FileNotFoundError(f"Image not found: {image_path}")
    
    logger.info(f"Processing {image_path}")
    # Implementation...
    return result

Docstring Format#

Use Google-style docstrings for all public functions and classes:

  • Summary line (one-line description)

  • Blank line

  • Longer description (if needed)

  • Args section

  • Returns section

  • Raises section (if applicable)

Logging#

Use Logging, Not Print#

from thesis.core.logging import get_logger

logger = get_logger(__name__)

# ✅ Good
logger.info(f"Processing patient {patient_id}")
logger.debug(f"Image shape: {image.shape}")
logger.error(f"Failed to load image: {e}")

# ❌ Avoid
print(f"Processing patient {patient_id}")
print(f"Error: {e}")

Log Levels#

  • DEBUG: Detailed info for developers

  • INFO: Confirmation that things work as expected

  • WARNING: Something unexpected but not critical

  • ERROR: Something failed, but application continues

  • CRITICAL: Application cannot continue
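
As a rough illustration of how the levels map to situations, the sketch below logs one check at different severities. It uses the standard-library `logging` module as a stand-in for the project's `get_logger` so that it runs without extra dependencies; `check_volume_count` is a hypothetical function.

```python
import logging

# Stand-in for get_logger(__name__); stdlib logging keeps the sketch
# dependency-free. The level names match those listed above.
logger = logging.getLogger("thesis.example")


def check_volume_count(n_found: int, n_expected: int) -> bool:
    """Hypothetical check that logs at a level matching the severity."""
    logger.debug("Found %d of %d volumes", n_found, n_expected)
    if n_found == n_expected:
        logger.info("All %d volumes present", n_expected)
        return True
    if n_found > 0:
        # Unexpected, but the caller can still work with partial data.
        logger.warning("Only %d of %d volumes present", n_found, n_expected)
        return True
    # Nothing usable: this check fails, but the application keeps running.
    logger.error("No volumes found (expected %d)", n_expected)
    return False
```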

Testing#

Write Tests for Public APIs#

Group related tests in classes. When testing failure paths, assert on specific exception types (pydantic's ValidationError for Pydantic models, custom exceptions for domain errors), never on the bare Exception base class.

# tests/unit/test_config_validators.py
import pytest
from pydantic import ValidationError

from thesis.core.config.validators import TractographyConfig

class TestTractographyConfig:
    def test_defaults(self):
        cfg = TractographyConfig()
        assert cfg.n_samples == 5000
        assert cfg.method == "probtrackx2"

    def test_invalid_method_raises(self):
        with pytest.raises(ValidationError):
            TractographyConfig(method="invalid")

    def test_custom_values(self):
        cfg = TractographyConfig(n_samples=1000, n_steps=500)
        assert cfg.n_samples == 1000

Test Organisation#

  • One test file per module (e.g. test_config_validators.py covers validators.py)

  • One test class per class or logical group of functions

  • Descriptive test names: test_<what>_<condition> (e.g. test_invalid_method_raises)

  • Use fixtures for repeated setup — prefer pytest built-ins over manual teardown

  • Mark slow tests: @pytest.mark.slow

Fixtures and Isolation#

Prefer pytest built-in fixtures over manual resource management:

# ✅ Good — tmp_path and monkeypatch handle cleanup automatically
def test_finds_samples_in_primary_dir(self, hcp_config, tmp_path, monkeypatch):
    monkeypatch.chdir(tmp_path)          # isolate working directory
    input_dir = tmp_path / "input"
    input_dir.mkdir()
    result = prepare_hcp_paths(hcp_config, ctx)
    assert result["thsamples"] != []

# ❌ Avoid — manual cleanup is fragile
def test_finds_samples_in_primary_dir(self, hcp_config):
    import tempfile, os
    with tempfile.TemporaryDirectory() as tmpdir:
        orig = os.getcwd()
        try:
            os.chdir(tmpdir)
            ...
        finally:
            os.chdir(orig)

Shared fixtures live in tests/conftest.py:

Fixture            Type            Description
-----------------  --------------  -------------------------------------
temp_dir           Path            Temporary directory, auto-cleaned
mock_config        PipelineConfig  Default test pipeline config
mock_nifti_data    np.ndarray      3-D float32 array (64, 64, 64)
mock_dwi_data      np.ndarray      4-D float32 array (64, 64, 64, 32)
mock_nifti_file    Path            Saved .nii.gz file (requires nibabel)
mock_patient_data  dict            T1/T2/DWI files + bvals/bvecs
mock_config_file   Path            YAML file from mock_config
mock_logger        logger          Loguru logger writing to temp dir

Isolating Global State#

The workflow registry (WORKFLOW_REGISTRY) is a global singleton. Use monkeypatch.setattr to inject a temporary copy instead of mutating it directly:

# ✅ Good — change is automatically reverted after the test
def test_known_workflow_returns_entry(self, monkeypatch):
    entry = WorkflowEntry(name="_test_wf", factory=lambda: None, description="x")
    patched = {**WORKFLOW_REGISTRY._entries, "_test_wf": entry}
    monkeypatch.setattr(WORKFLOW_REGISTRY, "_entries", patched)
    result = _resolve_workflow("_test_wf")
    assert result.name == "_test_wf"

# ❌ Avoid — mutates global state; leaves registry dirty if test fails
def test_known_workflow_returns_entry(self):
    WORKFLOW_REGISTRY.register(entry)
    try:
        result = _resolve_workflow("_test_wf")
        assert result.name == "_test_wf"
    finally:
        WORKFLOW_REGISTRY._entries.pop("_test_wf", None)

Logging in Tests#

The project uses loguru, which is not captured by pytest’s built-in caplog fixture. Do not write tests that assert on log output. Test observable behaviour (return values, raised exceptions, file side-effects) instead.
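
A test that checks observable behaviour might look like the sketch below. `write_summary` is a hypothetical function under test, not part of the project; the test takes a `tmp_path` argument in the style of the pytest fixture and asserts on the return value and the written file rather than on any log line.

```python
import json
from pathlib import Path


def write_summary(output_dir: Path, streamline_count: int) -> Path:
    """Hypothetical function under test: writes a small JSON summary."""
    if streamline_count < 0:
        raise ValueError(f"streamline_count must be >= 0, got {streamline_count}")
    summary_path = output_dir / "summary.json"
    summary_path.write_text(json.dumps({"streamlines": streamline_count}))
    return summary_path


def test_write_summary_creates_file(tmp_path: Path) -> None:
    # Assert on the return value and the file side-effect, not on log output.
    result = write_summary(tmp_path, 42)
    assert result == tmp_path / "summary.json"
    assert json.loads(result.read_text()) == {"streamlines": 42}
```

The failure path (a negative count) would be covered the same way, with `pytest.raises(ValueError)` around the call.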

Error Handling#

Use the Project Exception Hierarchy#

All custom exceptions live in thesis.core.exceptions and inherit from ThesisError. Reuse the existing types (ConfigurationError, ValidationError, ProcessingError and its subclasses RegistrationError, SegmentationError, TractographyError, PipelineError, plus FileIOError and DependencyError) rather than defining new base classes:

from thesis.core.exceptions import ProcessingError, TractographyError

# Raise the most specific type that fits the failure.
if streamline_count == 0:
    raise TractographyError(f"tckgen produced 0 streamlines for {patient_id}")
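
One benefit of a shared base class is that a pipeline boundary can handle every project error with a single handler while unrelated bugs still surface. The sketch below redefines three of the classes locally as stand-ins for those in `thesis.core.exceptions`, whose real definitions may differ; `run_tractography` and `run_patient` are hypothetical.

```python
class ThesisError(Exception):
    """Stand-in for the project base class (sketched here, not imported)."""


class ProcessingError(ThesisError):
    """A processing step failed."""


class TractographyError(ProcessingError):
    """Tractography produced no usable result."""


def run_tractography(patient_id: str, streamline_count: int) -> int:
    """Hypothetical step that raises the most specific error type."""
    if streamline_count == 0:
        raise TractographyError(f"0 streamlines for {patient_id}")
    return streamline_count


def run_patient(patient_id: str, streamline_count: int) -> str:
    """Hypothetical pipeline boundary: one handler covers every project error."""
    try:
        run_tractography(patient_id, streamline_count)
    except ThesisError as exc:
        # Catches TractographyError, ProcessingError, and any other subclass,
        # but lets unrelated bugs (TypeError, KeyError, ...) propagate.
        return f"failed: {type(exc).__name__}: {exc}"
    return "ok"
```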

Add a new subclass only when an existing one does not fit — and inherit from the closest ThesisError descendant (usually ProcessingError):

# thesis/core/exceptions.py
class InvalidImageError(ProcessingError):
    """Raised when image format is invalid."""

Use Specific Exceptions#

# ✅ Good - specific
if not image_path.exists():
    raise FileNotFoundError(f"Image not found: {image_path}")

if image.shape != expected_shape:
    raise InvalidImageError(f"Unexpected shape: {image.shape}")

# ❌ Avoid - too generic
if not image_path.exists():
    raise Exception("Error")

try:
    process_image(image)
except Exception as e:  # Too broad
    logger.error(str(e))
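
The narrower alternative to `except Exception` is to catch only the errors a call is expected to raise and, where useful, translate them into a domain error with `raise ... from ...` so the original cause stays in the traceback. This is a minimal sketch: `read_bvals` is a hypothetical loader, and `ProcessingError` is redefined locally as a stand-in for the project class.

```python
from pathlib import Path


class ProcessingError(Exception):
    """Stand-in for thesis.core.exceptions.ProcessingError."""


def read_bvals(bval_path: Path) -> list[float]:
    """Hypothetical loader that parses whitespace-separated b-values."""
    try:
        text = bval_path.read_text()
        return [float(token) for token in text.split()]
    except FileNotFoundError:
        # Already specific and informative: let it propagate unchanged.
        raise
    except ValueError as exc:
        # Translate a low-level parse error into a domain error, keeping
        # the original as __cause__ so the traceback shows both.
        raise ProcessingError(f"Malformed b-values in {bval_path}") from exc
```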

Configuration Best Practices#

1. Externalize Parameters#

# ✅ Good - parameters from config
def process(self, n_threads: int, max_memory: int):
    # Uses the provided parameters
    ...

# ❌ Avoid - hardcoded values
def process(self):
    n_threads = 8
    max_memory = 16 * 1024
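
One way to keep such values out of function bodies is to pass them in as a small settings object built from the loaded configuration. The sketch below is an assumption about how that could look, not the project's actual API: `ResourceLimits` and `plan_chunks` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceLimits:
    """Hypothetical settings object; in practice the value comes from config."""

    n_threads: int


def plan_chunks(n_items: int, limits: ResourceLimits) -> list[int]:
    """Split n_items into at most limits.n_threads chunks of near-equal size."""
    n_chunks = max(1, min(limits.n_threads, n_items))
    base, extra = divmod(n_items, n_chunks)
    # The first `extra` chunks take one additional item each.
    return [base + 1 if i < extra else base for i in range(n_chunks)]
```

The same function then serves a laptop run (`ResourceLimits(n_threads=2)`) and a cluster run (`ResourceLimits(n_threads=64)`) without code changes.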

2. Validate Configuration Early#

from pathlib import Path

from pydantic import BaseModel, Field

class ProcessingConfig(BaseModel):
    """Processing configuration with validation."""
    
    n_threads: int = Field(gt=0, le=128)
    timeout_seconds: int = Field(gt=0)
    output_dir: Path

# Validation happens on construction.
# Raises pydantic.ValidationError (a ValueError subclass): n_threads must be > 0
config = ProcessingConfig(n_threads=-1, timeout_seconds=60, output_dir=Path("out"))

Code Quality#

Use Type Checking#

# Check types
mypy src/thesis

# Format code
black src/thesis tests/

# Sort imports
isort src/thesis tests/

# Lint
flake8 src/thesis tests/

Pre-commit Hooks#

Enable automatic checks:

pre-commit install

Performance Considerations#

1. Use Logging Efficiently#

# ❌ Avoid - always evaluates, even when DEBUG is hidden
logger.debug(f"Result: {expensive_function()}")

# ✅ Better - compute only when you actually need the value
if should_log_debug_details:
    logger.debug(f"Result: {expensive_function()}")
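
The guard can be tied to the logger's actual level rather than a separate flag. The sketch below uses the standard-library `logging` module as a stand-in so that it runs without dependencies; with loguru, `logger.opt(lazy=True)` serves a similar purpose by accepting callables that are evaluated only if the message is emitted. `expensive_function` and `log_result` are hypothetical.

```python
import logging

logger = logging.getLogger("thesis.perf_example")  # stand-in for get_logger()
calls: list[int] = []  # records each run of the expensive function


def expensive_function() -> int:
    """Hypothetical costly computation; notes every invocation in `calls`."""
    calls.append(1)
    return sum(i * i for i in range(10_000))


def log_result() -> None:
    # The level check is cheap; the expensive call runs only if DEBUG is on.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("Result: %s", expensive_function())
```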

2. Profile Before Optimizing#

import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()
# Run code
profiler.disable()
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative').print_stats(10)
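
The same steps can be wrapped in a small helper so that profiling one call takes a single line and the report comes back as a string instead of being printed. This is a sketch under the assumption that a text report is enough; `profile_call` and `sum_of_squares` are hypothetical names.

```python
import cProfile
import io
import pstats
from typing import Any, Callable


def profile_call(
    func: Callable[..., Any], *args: Any, top_n: int = 10
) -> tuple[Any, str]:
    """Run func(*args) under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()  # always stop profiling, even if func raises

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top_n)
    return result, buffer.getvalue()


def sum_of_squares(n: int) -> int:
    """Hypothetical workload to profile."""
    return sum(i * i for i in range(n))
```

A call such as `value, report = profile_call(sum_of_squares, 1000)` returns the function's result together with the top entries sorted by cumulative time.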

Documentation#

Docstring Requirements#

  • All public functions and classes

  • Complex private methods

  • Non-obvious algorithms or parameters

  • Example usage for important functions

Example with Usage#

def apply_affine_transform(
    image: np.ndarray,
    affine_matrix: np.ndarray,
) -> np.ndarray:
    """Apply affine transformation to image.
    
    Applies the given 3D affine transformation matrix to the input image
    using trilinear interpolation.
    
    Args:
        image: Input image array (3D)
        affine_matrix: 4x4 affine transformation matrix
        
    Returns:
        Transformed image array
        
    Example:
        >>> image = load_image("T1.nii.gz")
        >>> transform = np.eye(4)
        >>> transform[:3, 3] = [5, 10, 2]  # Translation
        >>> transformed = apply_affine_transform(image, transform)
    """
    # Implementation...
    pass

Git Workflow#

Commit Messages#

Use Conventional Commits:

<type>(<scope>): brief description

Longer explanation (72 chars per line max)

- Specific change 1
- Specific change 2

Closes #123

Types: feat, fix, docs, test, refactor, perf, chore, ci. See docs/guides/contributing.md for the full commit and version-bump workflow (make bump).
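
The header format can be checked mechanically. The sketch below is an illustrative assumption, not a hook that ships with the project: the regular expression covers only the `<type>(<scope>): description` line with the types listed in this guide, not the full Conventional Commits specification.

```python
import re

# Types listed in this guide; the pattern itself is an illustrative assumption.
COMMIT_TYPES = ("feat", "fix", "docs", "test", "refactor", "perf", "chore", "ci")

HEADER_PATTERN = re.compile(
    rf"^(?P<type>{'|'.join(COMMIT_TYPES)})"  # one of the allowed types
    r"(\((?P<scope>[a-z0-9_-]+)\))?"         # optional (scope)
    r": (?P<description>\S.*)$"              # colon, space, description
)


def is_valid_header(header: str) -> bool:
    """Return True if the first line of a commit message fits the pattern."""
    return HEADER_PATTERN.match(header) is not None
```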

Before Committing#

# Run tests
pytest tests/

# Format and lint
black src/thesis tests/
isort src/thesis tests/
flake8 src/thesis tests/

# Type check
mypy src/thesis

# Or run all quality checks at once
make check

# Commit (stage explicit paths, not `git add .`)
git add src/thesis/workflows/registration/workflow.py tests/unit/test_registration.py
git commit -m "feat(registration): add new registration step"