First Run Errors Guide
This file documents common issues when running the pipeline for the first time.
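🟢 Missing Dependencies
Error: ModuleNotFoundError: No module named 'requests'
Cause: The project's dependencies are not installed in the active Python environment.
Fix: Install the requirements with pip install -r requirements.txt.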
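To see at a glance which modules are still missing, a small check along these lines can help. This is a minimal sketch, not part of the pipeline: the helper name missing_modules is hypothetical, and the module list is an assumption (only requests is named above). Import names can also differ from the package names listed in requirements.txt.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be imported."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed list; extend it with the modules your requirements.txt provides.
    missing = missing_modules(["requests"])
    if missing:
        print("Missing modules:", ", ".join(missing))
        print("Try: pip install -r requirements.txt")
    else:
        print("All checked modules are importable.")
```

If the list comes back empty but the pipeline still fails to import something, the install may have gone into a different environment than the one running the pipeline.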
🟢 Pandoc / PDF Generation
Error: pdflatex not found
Cause: Pandoc calls a LaTeX engine (pdflatex by default) to generate PDFs, and none is installed.
Fix: The pipeline does not require Pandoc. Use ReportLab to generate test PDFs instead.
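As an illustration of the ReportLab route, the following sketch writes a one-page test PDF. It assumes ReportLab is installed (pip install reportlab); the helper name make_test_pdf, the output path, and the sample text are all hypothetical and not taken from the pipeline.

```python
from pathlib import Path


def make_test_pdf(path, lines):
    """Write a one-page PDF containing the given lines of text."""
    # Imported here so the error is raised only when a PDF is actually requested.
    from reportlab.lib.pagesizes import A4
    from reportlab.pdfgen import canvas

    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)

    pdf = canvas.Canvas(str(path), pagesize=A4)
    _, height = A4
    y = height - 72  # start one inch (72 points) below the top edge
    for line in lines:
        pdf.drawString(72, y, line)
        y -= 18  # move down one line
    pdf.showPage()
    pdf.save()
    return path


if __name__ == "__main__":
    # Hypothetical output location and content.
    print(make_test_pdf("sample_test.pdf", ["Sample document", "Generated for testing."]))
```

No LaTeX installation is involved, so the pdflatex error cannot occur on this path.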
🟢 Metadata Errors
Error: FileNotFoundError: data/metadata/metadata.json
Fix: Run init_project.py to create scaffolding.
🟢 Logging Issues
Symptom: No logs appear in logs/
Fix: Ensure pipeline_runner.py is invoked from the project root.
🟢 Import Errors
Error: ModuleNotFoundError: No module named 'processors'
Fix: Run commands from the project root so that the processors package can be imported.
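Both this error and the missing logs come down to the working directory, so a quick check of where you are can save time. The sketch below walks upward from the current directory looking for a marker file. It assumes pipeline_runner.py sits in the project root; the helper name find_project_root is hypothetical.

```python
from pathlib import Path


def find_project_root(start, marker="pipeline_runner.py"):
    """Walk upward from `start`; return the first directory containing `marker`, or None."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        if (directory / marker).is_file():
            return directory
    return None


if __name__ == "__main__":
    cwd = Path.cwd().resolve()
    root = find_project_root(cwd)
    if root is None:
        print("Could not find pipeline_runner.py in this directory or any parent.")
    elif root != cwd:
        print(f"Not in the project root. Try: cd {root}")
    else:
        print("Running from the project root.")
```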
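🟢 Virtual Environment Not Activated
Symptom: Dependencies are not found even after running pip install -r requirements.txt
Cause: The virtual environment is not activated, so packages are installed into, or looked up in, a different Python environment.
Fix: Activate the virtual environment, then reinstall the requirements if needed.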
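To confirm which interpreter is actually running, a check like the following may help. It is a minimal sketch with a hypothetical helper name, and it only recognizes venv-style environments; environments managed by other tools such as conda are not detected this way.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Virtual environment active:", in_virtualenv())
```

If the printed interpreter path is outside your project's environment directory, the environment is not active in that shell.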
🟢 Python Version Mismatch
Error: SyntaxError, or language features reported as unavailable
Cause: The project requires Python 3.12, and a different interpreter version is in use.
Fix: Check the version with python --version and upgrade if needed.
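The same check can be done from inside Python, which is useful when several interpreters are installed. This sketch treats 3.12 as a minimum version, which is an assumption; the text above does not say whether newer releases are supported. The names REQUIRED and meets_requirement are hypothetical.

```python
import sys

REQUIRED = (3, 12)  # taken from the Cause line above; treated here as a minimum


def meets_requirement(version=None, required=REQUIRED):
    """Return True if `version` (default: the running interpreter) is at least `required`."""
    if version is None:
        version = sys.version_info
    # Compare only major and minor, e.g. (3, 12).
    return tuple(version[:2]) >= tuple(required)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_requirement() else "too old"
    print(f"Python {current}: {status}")
```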
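🟢 Pre-commit Hook Failures
Error: black or flake8 failures during commit
Cause: The code does not meet the formatting or lint standards.
Fix: Run pre-commit run --all-files. black reformats files in place; flake8 only reports problems, so fix those by hand. Then stage the changes and commit again.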
🟢 Test Failures
Error: Tests fail with AssertionError or other exceptions
Fix: Run the tests with verbose output to see details.
# Run the full suite with verbose output and long tracebacks
pytest tests/ -v --tb=long
# Run a specific failing test
pytest tests/test_validators.py::TestURLValidation::test_validate_url_valid_https -vv
🟢 Scorecard File Not Found
Error: FileNotFoundError: data/scorecard/scorecard_main_presentation.xlsx
Cause: The scorecard Excel file is missing or in the wrong location.
Fix: Ensure the canonical scorecard file exists at data/scorecard/scorecard_main_presentation.xlsx.
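A quick way to catch this before the pipeline starts is to check for the expected files up front. The sketch below uses the two paths named in this guide; the helper name missing_files is hypothetical, and the paths are assumed to be relative to the project root.

```python
from pathlib import Path

# Paths mentioned in this guide, relative to the project root.
REQUIRED_FILES = [
    "data/metadata/metadata.json",
    "data/scorecard/scorecard_main_presentation.xlsx",
]


def missing_files(root, required=REQUIRED_FILES):
    """Return the entries of `required` that do not exist as files under `root`."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).is_file()]


if __name__ == "__main__":
    for rel in missing_files(Path.cwd()):
        print("Missing:", rel)
```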
🟢 Selenium/Browser Driver Issues
Error: WebDriverException: chromedriver not found
Cause: The Selenium scrapers (_sel variants) need a browser driver.
Fix: Use the non-Selenium scrapers or install ChromeDriver.
# Use regular scraper (no Selenium)
python pipeline_runner.py --source au_policy
# Or install ChromeDriver for Selenium scrapers
# See: https://chromedriver.chromium.org/downloads
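To check whether a driver is already reachable before choosing a scraper, something like this may help. It is a sketch with a hypothetical helper name, and it only checks whether an executable called chromedriver is on PATH, not whether its version matches the installed browser.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver executable on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver not found on PATH; use a non-Selenium scraper or install it.")
    else:
        print("chromedriver found at", path)
```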
🟢 Permission Denied Errors
Error: PermissionError: [Errno 13] Permission denied
Cause: Insufficient permissions to write files or create directories.
Fix: Check the permissions on the target directory and make sure the current user can write to it.
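To find out which directory is causing the error, you can test write access directly. The sketch below tries to create a temporary file in each directory. The helper name can_write is hypothetical, and the directories checked (logs and data, as named in this guide) are an assumption about where the pipeline writes.

```python
import tempfile


def can_write(directory):
    """Return True if a file can actually be created in `directory`."""
    try:
        # Creating (and automatically deleting) a temp file tests real write access.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers PermissionError as well as a missing directory.
        return False


if __name__ == "__main__":
    for directory in ("logs", "data"):
        state = "writable" if can_write(directory) else "NOT writable or missing"
        print(f"{directory}/: {state}")
```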