Pipeline Logging¶
This document describes the logging system for the pipeline.
Logger Design¶
- Implemented in
processors/logger.py. - Supports:
- Console output
- Unified run log file
- Optional per-module log files
Unified Run Log¶
- Named as
<timestamp>_<pipeline>.log - Example:
2025-08-28_12-22-40_au_policy_run.log
Module Logs¶
- Each module (scraper, processor) may also have its own log file.
- Controlled by
--no-module-logsflag inpipeline_runner.py.
Levels¶
INFO→ normal progress (scraping, processing, tagging)WARNING→ recoverable issues (year not found, missing tag matches)ERROR→ non-recoverable (download failure, unreadable PDF)
Best Practices¶
- Always run from project root so log paths resolve correctly.
- Logs are stored in
logs/. - Do not commit logs to Git (ignored via
.gitignore).