File Naming Standards¶
Consistent file naming helps downstream processing and filtering.
General Rules¶
- Use underscores
_not spaces. - Include year when available:
AU_AI_Strategy_2024.pdf. - Use PascalCase or underscores for multi-word titles.
- Avoid ambiguous suffixes like
final2.
Examples¶
✅ Good:
AU_Digital_Compact_2024.pdfKenya_UPR_Report_2020.pdfNigeria_CRC_Observation_2019.docx
❌ Bad:
digital compact final.pdfKenya report.docdoc1.pdf
Processing Notes¶
- Year extraction depends on
\d{4}patterns. - If no year in filename, pipeline scans document text.
_rawfield in metadata preserves original names.