Best Practices for This Framework#
Code Organization#
1. Module Structure#
module_name/
├── __init__.py # Public API
├── module.py # Main implementation
├── utils.py # Helper functions
└── exceptions.py # Custom exceptions
2. Naming Conventions#
Classes: PascalCase (e.g., PreprocessingStep)
Functions: snake_case (e.g., execute_step)
Constants: UPPER_SNAKE_CASE (e.g., MAX_THREADS)
Private: leading underscore (e.g., _internal_method)
Modules: snake_case (e.g., preprocessing.py)
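As a rough illustration, the conventions above might combine in a single module like this. The class body, the value of MAX_THREADS, and the rescaling logic are hypothetical and exist only to show the naming; they are not taken from the project.

```python
"""preprocessing.py: module file names use snake_case."""

MAX_THREADS = 8  # constant: UPPER_SNAKE_CASE (value is illustrative)


class PreprocessingStep:  # class: PascalCase
    """Hypothetical pipeline step used only to illustrate naming."""

    def __init__(self, n_threads: int = MAX_THREADS) -> None:
        self.n_threads = n_threads

    def execute_step(self, values: list[float]) -> list[float]:  # public: snake_case
        """Return the values rescaled to the range [0, 1]."""
        return self._internal_method(values)

    def _internal_method(self, values: list[float]) -> list[float]:  # private: leading underscore
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid division by zero for constant input
        return [(v - lo) / span for v in values]
```

Callers would only touch `PreprocessingStep` and `execute_step`; the leading underscore signals that `_internal_method` may change without notice.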
3. Imports#
# Standard library
import sys
import json
from pathlib import Path
from typing import Any
# Third-party
import numpy as np
# Local
from thesis.core.logging import get_logger
from thesis.core.config import ConfigManager, PipelineConfig
Type Hints and Documentation#
Always Include Type Hints#
def process_image(
    image_path: Path,
    parameters: dict[str, Any],
    output_dir: Path | None = None,
) -> np.ndarray:
    """Process medical image with specified parameters.

    Args:
        image_path: Path to input image file
        parameters: Dictionary of processing parameters
        output_dir: Optional output directory

    Returns:
        Processed image array

    Raises:
        FileNotFoundError: If image file doesn't exist
        ValueError: If parameters are invalid
    """
    if not image_path.exists():
        raise FileNotFoundError(f"Image not found: {image_path}")
    logger.info(f"Processing {image_path}")
    # Implementation...
    return result
Docstring Format#
Use Google-style docstrings for all public functions and classes:
Summary line (one-line description)
Blank line
Longer description (if needed)
Args section
Returns section
Raises section (if applicable)
Logging#
Use Logging, Not Print#
from thesis.core.logging import get_logger
logger = get_logger(__name__)
# ✅ Good
logger.info(f"Processing patient {patient_id}")
logger.debug(f"Image shape: {image.shape}")
logger.error(f"Failed to load image: {e}")
# ❌ Avoid
print(f"Processing patient {patient_id}")
print(f"Error: {e}")
Log Levels#
DEBUG: Detailed info for developers
INFO: Confirmation that things work as expected
WARNING: Something unexpected but not critical
ERROR: Something failed, but application continues
CRITICAL: Application cannot continue
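The effect of these levels can be seen in a small, self-contained sketch. It uses the standard library `logging` module as a stand-in for the project's `get_logger` so that it runs without dependencies; the logger name, the `ListHandler` class, and the messages are all made up for illustration.

```python
import logging

# Stand-in for thesis.core.logging.get_logger, so the sketch runs on its own.
logger = logging.getLogger("thesis.levels_example")
logger.setLevel(logging.INFO)  # anything below INFO (i.e. DEBUG) is dropped
logger.propagate = False       # keep the example output out of the root logger


class ListHandler(logging.Handler):
    """Collect emitted records in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(f"{record.levelname}: {record.getMessage()}")


handler = ListHandler()
logger.addHandler(handler)

logger.debug("Image shape: (256, 256, 180)")         # developer detail, filtered here
logger.info("Registration finished")                 # things work as expected
logger.warning("Missing bvals file, using default")  # unexpected but not critical
logger.error("Failed to load image")                 # a step failed, run continues
logger.critical("Output directory is not writable")  # the run cannot continue
```

With the threshold at INFO, `handler.lines` holds four entries; the DEBUG message never reaches the handler.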
Testing#
Write Tests for Public APIs#
Use classes to group related tests. Assert on specific exception types (ValidationError for
Pydantic models, custom exceptions for domain errors), never the bare Exception base class.
# tests/unit/test_config_validators.py
import pytest
from pydantic import ValidationError

from thesis.core.config.validators import TractographyConfig


class TestTractographyConfig:
    def test_defaults(self):
        cfg = TractographyConfig()
        assert cfg.n_samples == 5000
        assert cfg.method == "probtrackx2"

    def test_invalid_method_raises(self):
        with pytest.raises(ValidationError):
            TractographyConfig(method="invalid")

    def test_custom_values(self):
        cfg = TractographyConfig(n_samples=1000, n_steps=500)
        assert cfg.n_samples == 1000
Test Organisation#
One test file per module (e.g. test_config_validators.py covers validators.py)
One test class per class or logical group of functions
Descriptive test names: test_<what>_<condition> (e.g. test_invalid_method_raises)
Use fixtures for repeated setup; prefer pytest built-ins over manual teardown
Mark slow tests: @pytest.mark.slow
Fixtures and Isolation#
Prefer pytest built-in fixtures over manual resource management:
# ✅ Good - tmp_path and monkeypatch handle cleanup automatically
def test_finds_samples_in_primary_dir(self, hcp_config, tmp_path, monkeypatch):
    monkeypatch.chdir(tmp_path)  # isolate working directory
    input_dir = tmp_path / "input"
    input_dir.mkdir()
    result = prepare_hcp_paths(hcp_config, ctx)
    assert result["thsamples"] != []

# ❌ Avoid - manual cleanup is fragile
def test_finds_samples_in_primary_dir(self, hcp_config):
    import tempfile, os
    with tempfile.TemporaryDirectory() as tmpdir:
        orig = os.getcwd()
        try:
            os.chdir(tmpdir)
            ...
        finally:
            os.chdir(orig)
Shared fixtures live in tests/conftest.py:
| Fixture | Type | Description |
|---|---|---|
| | | Temporary directory, auto-cleaned |
| | | Default test pipeline config |
| | | 3-D float32 array |
| | | 4-D float32 array |
| | | Saved |
| | | T1/T2/DWI files + bvals/bvecs |
| | | YAML file from |
| | logger | Loguru logger writing to temp dir |
Isolating Global State#
The workflow registry (WORKFLOW_REGISTRY) is a global singleton. Use monkeypatch.setattr
to inject a temporary copy instead of mutating it directly:
# ✅ Good - change is automatically reverted after the test
def test_known_workflow_returns_entry(self, monkeypatch):
    entry = WorkflowEntry(name="_test_wf", factory=lambda: None, description="x")
    patched = {**WORKFLOW_REGISTRY._entries, "_test_wf": entry}
    monkeypatch.setattr(WORKFLOW_REGISTRY, "_entries", patched)
    result = _resolve_workflow("_test_wf")
    assert result.name == "_test_wf"

# ❌ Avoid - mutates global state; leaves registry dirty if test fails
def test_known_workflow_returns_entry(self):
    WORKFLOW_REGISTRY.register(entry)
    try:
        result = _resolve_workflow("_test_wf")
        assert result.name == "_test_wf"
    finally:
        WORKFLOW_REGISTRY._entries.pop("_test_wf", None)
Logging in Tests#
The project uses loguru, which is not captured by pytest’s built-in caplog fixture.
Do not write tests that assert on log output. Test observable behaviour (return values,
raised exceptions, file side-effects) instead.
Error Handling#
Use the Project Exception Hierarchy#
All custom exceptions live in thesis.core.exceptions and inherit from
ThesisError. Reuse the existing types (ConfigurationError, ValidationError,
ProcessingError and its subclasses RegistrationError, SegmentationError,
TractographyError, PipelineError, plus FileIOError and DependencyError)
rather than defining new base classes:
from thesis.core.exceptions import ProcessingError, TractographyError
# Raise the most specific type that fits the failure.
if streamline_count == 0:
    raise TractographyError(f"tckgen produced 0 streamlines for {patient_id}")
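The hierarchy described above could look roughly like the sketch below. It is built only from the class names listed in the text. Two things are assumptions and are not taken from thesis.core.exceptions: that ConfigurationError, ValidationError, FileIOError and DependencyError derive directly from ThesisError, and the helper names `run_tractography` and `safe_run`.

```python
class ThesisError(Exception):
    """Base class for all project errors."""


# Assumed to derive directly from ThesisError (the text does not say).
class ConfigurationError(ThesisError): ...
class ValidationError(ThesisError): ...
class FileIOError(ThesisError): ...
class DependencyError(ThesisError): ...


class ProcessingError(ThesisError):
    """Base class for failures inside a processing step."""


# Listed in the text as subclasses of ProcessingError.
class RegistrationError(ProcessingError): ...
class SegmentationError(ProcessingError): ...
class TractographyError(ProcessingError): ...
class PipelineError(ProcessingError): ...


def run_tractography(streamline_count: int) -> str:
    """Hypothetical step: raise the most specific error that fits."""
    if streamline_count == 0:
        raise TractographyError("tckgen produced 0 streamlines")
    return "ok"


def safe_run(streamline_count: int) -> str:
    """Hypothetical caller: catch at the level it can actually handle."""
    try:
        return run_tractography(streamline_count)
    except ProcessingError as exc:  # also catches TractographyError
        return f"processing failed: {exc}"
```

Raising the narrow type while catching a broader ancestor lets callers choose how much detail they care about, which is the main reason to reuse one hierarchy instead of defining unrelated base classes.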
Add a new subclass only when an existing one does not fit — and inherit from the
closest ThesisError descendant (usually ProcessingError):
# thesis/core/exceptions.py
class InvalidImageError(ProcessingError):
    """Raised when image format is invalid."""
Use Specific Exceptions#
# ✅ Good - specific
if not image_path.exists():
    raise FileNotFoundError(f"Image not found: {image_path}")
if image.shape != expected_shape:
    raise InvalidImageError(f"Unexpected shape: {image.shape}")

# ❌ Avoid - too generic
if not image_path.exists():
    raise Exception("Error")
try:
    process_image(image)
except Exception as e:  # Too broad
    logger.error(str(e))
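For contrast with the broad `except Exception` above, a narrower handler might look like the following sketch. The functions `read_header` and `try_read_header` are hypothetical, and the standard library `logging` module stands in for the project's logger so the example runs on its own.

```python
import logging
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)  # stand-in for the project's get_logger


def read_header(image_path: Path) -> bytes:
    """Return the first 4 bytes of a file, raising specific errors on failure."""
    if not image_path.exists():
        raise FileNotFoundError(f"Image not found: {image_path}")
    header = image_path.read_bytes()[:4]
    if len(header) < 4:
        raise ValueError(f"File too short to contain a header: {image_path}")
    return header


def try_read_header(image_path: Path) -> Optional[bytes]:
    """Handle only the failures this caller knows how to recover from."""
    try:
        return read_header(image_path)
    except FileNotFoundError:
        logger.warning("Skipping missing image: %s", image_path)
        return None
    except ValueError as exc:
        logger.error("Skipping unreadable image: %s", exc)
        return None
```

Any other exception (a permissions problem, for instance) still propagates, so unexpected failures are not silently swallowed.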
Configuration Best Practices#
1. Externalize Parameters#
# ✅ Good - parameters from config
def process(self, n_threads: int, max_memory: int):
    # Uses provided parameters
    ...

# ❌ Avoid - hardcoded values
def process(self):
    n_threads = 8
    max_memory = 16 * 1024
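One way to keep such values out of function bodies is to pass them in as a small, immutable settings object. The sketch below is only an illustration: `ResourceLimits`, `plan_chunks`, and the MiB unit are assumptions, not part of the project's ConfigManager or PipelineConfig.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceLimits:
    """Hypothetical container for values that would come from the config."""

    n_threads: int
    max_memory: int  # assumed to be in MiB


def plan_chunks(n_items: int, limits: ResourceLimits) -> list[int]:
    """Split n_items into at most limits.n_threads chunks of near-equal size."""
    n_chunks = min(limits.n_threads, n_items) or 1  # at least one chunk
    base, extra = divmod(n_items, n_chunks)
    # The first `extra` chunks take one additional item each.
    return [base + (1 if i < extra else 0) for i in range(n_chunks)]
```

Because the limits arrive as an argument, a test can call `plan_chunks(10, ResourceLimits(n_threads=4, max_memory=16 * 1024))` with any values it likes, without editing the function.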
2. Validate Configuration Early#
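from pathlib import Path

from pydantic import BaseModel, Field


class ProcessingConfig(BaseModel):
    """Processing configuration with validation."""

    n_threads: int = Field(gt=0, le=128)
    timeout_seconds: int = Field(gt=0)
    output_dir: Path


# Validation happens on construction. This call fails because n_threads is
# out of range and the required fields timeout_seconds and output_dir are missing.
config = ProcessingConfig(n_threads=-1)  # Raises pydantic.ValidationError (a ValueError subclass)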
Code Quality#
Use Type Checking#
# Check types
mypy src/thesis
# Format code
black src/thesis tests/
# Sort imports
isort src/thesis tests/
# Lint
flake8 src/thesis tests/
Pre-commit Hooks#
Enable automatic checks:
pre-commit install
Performance Considerations#
1. Use Logging Efficiently#
# ❌ Avoid - always evaluates, even when DEBUG is hidden
logger.debug(f"Result: {expensive_function()}")
# ✅ Better - compute only when you actually need the value
if should_log_debug_details:
    logger.debug(f"Result: {expensive_function()}")
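The guard can also be derived from the logger itself instead of a separate flag. The sketch below uses the standard library `logging` module and its `isEnabledFor` check as a stand-in so that it runs without dependencies. The logger name and `expensive_function` are hypothetical. For the project's loguru logger, the loguru documentation describes a comparable `logger.opt(lazy=True)` mode that takes callables and evaluates them only when the message is emitted.

```python
import logging

logger = logging.getLogger("thesis.perf_example")  # stand-in logger
logger.addHandler(logging.NullHandler())            # keep the example silent
logger.propagate = False
logger.setLevel(logging.INFO)

calls = 0


def expensive_function() -> int:
    """Pretend to be costly; count how often it is actually invoked."""
    global calls
    calls += 1
    return sum(range(1_000))


# At INFO the guard is False, so the f-string is never built.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug(f"Result: {expensive_function()}")
calls_when_debug_hidden = calls

# At DEBUG the guard is True and the function runs exactly once.
logger.setLevel(logging.DEBUG)
if logger.isEnabledFor(logging.DEBUG):
    logger.debug(f"Result: {expensive_function()}")
calls_when_debug_shown = calls
```

The counter makes the saving visible: no call while DEBUG is hidden, one call once it is enabled.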
2. Profile Before Optimizing#
import cProfile
import pstats
profiler = cProfile.Profile()
profiler.enable()
# Run code
profiler.disable()
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative').print_stats(10)
Documentation#
Docstring Requirements#
All public functions and classes
Complex private methods
Non-obvious algorithms or parameters
Example usage for important functions
Example with Usage#
def apply_affine_transform(
    image: np.ndarray,
    affine_matrix: np.ndarray,
) -> np.ndarray:
    """Apply affine transformation to image.

    Applies the given 3D affine transformation matrix to the input image
    using trilinear interpolation.

    Args:
        image: Input image array (3D)
        affine_matrix: 4x4 affine transformation matrix

    Returns:
        Transformed image array

    Example:
        >>> image = load_image("T1.nii.gz")
        >>> transform = np.eye(4)
        >>> transform[:3, 3] = [5, 10, 2]  # Translation
        >>> transformed = apply_affine_transform(image, transform)
    """
    # Implementation...
    pass
Git Workflow#
Commit Messages#
Use Conventional Commits:
<type>(<scope>): brief description

Longer explanation (72 chars per line max)
- Specific change 1
- Specific change 2

Closes #123
Types: feat, fix, docs, test, refactor, perf, chore, ci. See
docs/guides/contributing.md for the full commit and version-bump workflow
(make bump).
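The header format above lends itself to a simple check, for example in a local hook. The following is a minimal sketch, not the project's tooling: `parse_commit_header` is a hypothetical helper, the scope is assumed to be lowercase letters, digits, underscores or hyphens, and the optional `!` breaking-change marker from the Conventional Commits specification is not handled.

```python
import re
from typing import Optional

# The commit types listed in this guide.
COMMIT_TYPES = ("feat", "fix", "docs", "test", "refactor", "perf", "chore", "ci")

HEADER_RE = re.compile(
    rf"^(?P<type>{'|'.join(COMMIT_TYPES)})"  # one of the allowed types
    r"(?:\((?P<scope>[a-z0-9_-]+)\))?"       # optional "(scope)", assumed charset
    r": (?P<description>\S.*)$"              # colon, space, non-empty description
)


def parse_commit_header(header: str) -> Optional[dict]:
    """Return the parts of a commit header, or None if it does not match."""
    match = HEADER_RE.match(header)
    return match.groupdict() if match else None
```

For instance, `parse_commit_header("feat(registration): add new registration step")` yields the type, scope and description as a dictionary, while a header such as `"update stuff"` returns `None`.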
Before Committing#
# Run tests
pytest tests/
# Format and lint
black src/thesis tests/
isort src/thesis tests/
flake8 src/thesis tests/
# Type check
mypy src/thesis
# Or run all quality checks at once
make check
# Commit (stage explicit paths, not `git add .`)
git add src/thesis/workflows/registration/workflow.py tests/unit/test_registration.py
git commit -m "feat(registration): add new registration step"