
DigitalChild Documentation

Welcome to the complete documentation for DigitalChild (LittleRainbowRights), a Python pipeline for analyzing human rights documents, with a focus on digital protections for children and LGBTQ+ people.

Documentation Sections

Getting Started

Core Guides

API Documentation

Scorecard

Standards & Specifications

Technical Architecture

Project Information

Project Structure

DigitalChild/
├── pipeline_runner.py      # Main entry point
├── scrapers/               # Web scrapers for document sources
├── processors/             # Text extraction and tagging
├── api/                    # Flask REST API (Phase 4)
├── data/
│   ├── raw/               # Downloaded documents
│   ├── processed/         # Extracted text
│   ├── metadata/          # Document metadata with tags
│   └── exports/           # CSV exports for analysis
├── configs/               # Tag configurations and URL dictionaries
├── docs/                  # This documentation
└── tests/                 # Test suite (209 tests)
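
The layout above suggests that each document moves through the data/ subdirectories as it is processed. As a minimal sketch, assuming a hypothetical helper named document_paths and hypothetical file extensions (neither is taken from the project), the four locations for a single document could be derived like this:

```python
from pathlib import Path

# Subdirectory names come from the tree above; everything else is illustrative.
DATA_SUBDIRS = ("raw", "processed", "metadata", "exports")


def document_paths(root: Path, doc_id: str) -> dict[str, Path]:
    """Return one hypothetical path per data/ subdirectory for a document."""
    # Assumed extensions: the real pipeline may use other file types or naming.
    extensions = {
        "raw": ".pdf",
        "processed": ".txt",
        "metadata": ".json",
        "exports": ".csv",
    }
    data_dir = root / "data"
    return {
        name: data_dir / name / f"{doc_id}{extensions[name]}"
        for name in DATA_SUBDIRS
    }


paths = document_paths(Path("DigitalChild"), "example_doc")
```

Keeping the subdirectory names in one tuple means a renamed folder only has to be changed in one place.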

Key Features

  • Document Pipeline: Scrape → Process → Tag → Enrich → Export
  • REST API: 14 production endpoints with authentication and rate limiting
  • Scorecard System: 10 indicators × 194 countries for digital rights analysis
  • Flexible Tagging: Regex-based tagging with version control
  • Data Quality: Automated validation of 2,543 source URLs
  • Open Source: MIT license for code, CC BY 4.0 for data
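
To illustrate the regex-based tagging with version control mentioned above, here is a minimal sketch. The rule set, tag names, patterns, and the tag_text function are all hypothetical, chosen only for illustration; the project's actual tag configurations live under configs/ and may be structured differently.

```python
import re

# Hypothetical versioned rule set; not the project's real configuration.
TAG_RULES = {
    "version": "example-v1",
    "tags": {
        "child_rights": [r"\bchild(ren)?\b", r"\bminors?\b"],
        "lgbtq": [r"\blgbtq\+?", r"\bsexual orientation\b"],
        "privacy": [r"\bprivacy\b", r"\bdata protection\b"],
    },
}


def tag_text(text: str, rules: dict) -> dict:
    """Return the sorted tags whose patterns match, plus the rule-set version."""
    matched = sorted(
        tag
        for tag, patterns in rules["tags"].items()
        if any(re.search(p, text, flags=re.IGNORECASE) for p in patterns)
    )
    # Recording the version makes it possible to tell which rules produced the tags.
    return {"tags": matched, "tags_version": rules["version"]}


result = tag_text("States must safeguard children's privacy online.", TAG_RULES)
```

Storing the rule-set version next to the tags lets documents tagged under an older rule set be identified and re-tagged after the patterns change.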

Support

Citation

@software{littlerainbowrights2025,
  title = {DigitalChild / LittleRainbowRights: Child and LGBTQ+ Digital Rights Analysis Pipeline},
  author = {Vollmer, D.T. and Vollmer, S.C.},
  year = {2025},
  version = {2.0.0},
  url = {https://github.com/MissCrispenCakes/DigitalChild},
  doi = {10.5281/zenodo.18318098},
  license = {MIT}
}

Version: 2.0.0 | Last Updated: January 2026 | License: MIT (code) / CC BY 4.0 (data)