Data.gov Harvester (datagov-harvester)

The Data.gov harvester service ingests dataset metadata on a scheduled cadence from federal, state, local, and tribal data.json endpoints, CKAN sources, and CSW/WAF sources, and writes the records into the central catalog. It is a publisher-facing service, not a public API for consumers, but its source code is open at GSA/datagov-harvester, where the harvest source types, job schedules, and validation rules used by Data.gov are documented.

Data.gov Harvester (datagov-harvester) is one of four APIs that Data.gov publishes on the APIs.io network.

Tagged areas include Harvesting, Catalog, and Ingestion.

API entry from apis.yml

aid: data-gov:datajson-harvester
name: Data.gov Harvester (datagov-harvester)
tags:
- Harvesting
- Catalog
- Ingestion
humanURL: https://harvest.data.gov/
properties:
- url: https://github.com/GSA/datagov-harvester
  type: SourceCode
- url: https://harvest.data.gov/
  type: Portal
description: The Data.gov harvester service ingests dataset metadata from federal, state, local, and tribal
  data.json endpoints, CKAN sources, and other CSW/WAF sources on a scheduled cadence and writes records
  into the central catalog. It is a publisher-facing service (not a public API for consumers) but its
  source code is open at GSA/datagov-harvester and it documents the harvest source types, job schedules,
  and validation rules used by Data.gov.
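
For readers who want to consume this entry programmatically, the following is a minimal sketch of looking up a property URL by its type. It assumes the entry has already been loaded into a Python dictionary; the `ENTRY` literal below is a hand-transcribed copy of the fields above (description omitted), and the helper name `property_url` is hypothetical, not part of any APIs.io or Data.gov tooling.

```python
from typing import Optional

# Hand-transcribed copy of the apis.yml entry above (description omitted).
# In practice this dictionary would come from a YAML parser of your choice.
ENTRY = {
    "aid": "data-gov:datajson-harvester",
    "name": "Data.gov Harvester (datagov-harvester)",
    "tags": ["Harvesting", "Catalog", "Ingestion"],
    "humanURL": "https://harvest.data.gov/",
    "properties": [
        {"url": "https://github.com/GSA/datagov-harvester", "type": "SourceCode"},
        {"url": "https://harvest.data.gov/", "type": "Portal"},
    ],
}


def property_url(entry: dict, prop_type: str) -> Optional[str]:
    """Return the URL of the first property whose type matches, or None."""
    for prop in entry.get("properties", []):
        if prop.get("type") == prop_type:
            return prop.get("url")
    return None


if __name__ == "__main__":
    # Look up the two property types listed in the entry.
    print(property_url(ENTRY, "SourceCode"))
    print(property_url(ENTRY, "Portal"))
```

A type that is not listed in the entry, such as `OpenAPI`, yields `None`, which is consistent with the note that the harvester is a publisher-facing service rather than a public consumer API.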