Opera Ingester Task
The OperaIngesterTask downloads raw files for the configured properties, either from the Opera API or from other storage. It typically runs as a step within OperaRawTask.
Responsibilities
- Authenticates against Opera endpoints when necessary.
- Downloads daily snapshots and raw export files into the job’s input cache.
- Writes the downloaded files to a location accessible by the OperaCrawlerTask.
Configuration
- sync_dates: Optional list of dates to download. When omitted, the task downloads full or incremental sets according to job configuration.
- write_to_catalog: Write the raw files to the catalog if configured.
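As an illustration only, the two options might be supplied and interpreted as in the sketch below. The dictionary layout, the ISO 8601 date strings, and the helper name parse_sync_dates are assumptions made for this example; they are not taken from the task's actual implementation, so check the job configuration schema for the real format.

```python
from datetime import date
from typing import List, Optional

# Hypothetical configuration fragment. The option names come from this page;
# the dict layout and the ISO date strings are assumptions.
example_config = {
    "sync_dates": ["2024-03-02", "2024-03-01"],
    "write_to_catalog": True,
}


def parse_sync_dates(config: dict) -> Optional[List[date]]:
    """Return the explicit dates to download, or None when sync_dates is omitted.

    None signals that the caller should fall back to the full or incremental
    download chosen by the job configuration.
    """
    raw = config.get("sync_dates")
    if raw is None:
        return None
    # De-duplicate and sort so downloads happen in chronological order.
    return sorted({date.fromisoformat(value) for value in raw})


dates = parse_sync_dates(example_config)
```

With this sketch, a configuration without sync_dates yields None, which a caller could treat as "use the job's full or incremental mode".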