Data.gov Harvester (datagov-harvester)
The Data.gov harvester service ingests dataset metadata on a schedule from federal, state, local, and tribal sources, including data.json endpoints, CKAN instances, CSW (Catalog Service for the Web) services, and WAFs (web-accessible folders), and writes the resulting records into the central catalog. It is a publisher-facing service, not a public API for consumers. Its source code is open at GSA/datagov-harvester, and the repository documents the harvest source types, job schedules, and validation rules used by Data.gov.
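To make the data.json case concrete, here is a minimal sketch of the kind of check a harvester might run on a catalog before writing records. It is not taken from GSA/datagov-harvester. The function names, the sample catalog, and the short list of required fields are assumptions for illustration; the field names follow the DCAT-US (Project Open Data) v1.1 schema, which requires more fields than the subset shown here.

```python
import json

# Assumed, illustrative subset of required dataset fields (DCAT-US v1.1 names).
# This is NOT the harvester's actual rule set.
REQUIRED_FIELDS = ("title", "description", "identifier", "modified", "accessLevel")


def missing_fields(dataset: dict) -> list[str]:
    """Return the required fields that are absent or empty in one dataset record."""
    return [field for field in REQUIRED_FIELDS if not dataset.get(field)]


def check_catalog(raw: str) -> dict[str, list[str]]:
    """Map each problematic dataset's identifier to its missing required fields."""
    catalog = json.loads(raw)
    problems: dict[str, list[str]] = {}
    for index, dataset in enumerate(catalog.get("dataset", [])):
        missing = missing_fields(dataset)
        if missing:
            # Fall back to a positional label when the identifier itself is missing.
            key = dataset.get("identifier") or f"<dataset #{index}>"
            problems[key] = missing
    return problems


# Hypothetical data.json payload with one complete and one incomplete record.
SAMPLE = json.dumps({
    "conformsTo": "https://project-open-data.cio.gov/v1.1/schema",
    "dataset": [
        {
            "title": "Example dataset",
            "description": "A complete record.",
            "identifier": "example-1",
            "modified": "2024-01-01",
            "accessLevel": "public",
        },
        {
            "title": "Incomplete dataset",
            "identifier": "example-2",
            "modified": "2024-01-02",
        },
    ],
})

if __name__ == "__main__":
    print(check_catalog(SAMPLE))
```

The sketch reads the catalog from a string so it runs without network access. In practice the payload would be fetched from a publisher's data.json URL, and CKAN, CSW, and WAF sources would each need their own parsing step.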