Data Replication

Simple Data Cube Replication Tool

This tool provides a simple way to download data and metadata from a remote Data Cube onto a local PC. It connects to the remote Data Cube via SSH and downloads database records and files.

A configuration file defines which portions of which Products should be downloaded. If a Dataset is already available locally, it will not be downloaded again, so the tool can be run repeatedly to keep the local system up to date with new datasets on the remote server.
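
The skip-if-present behaviour described above can be pictured as a set difference on dataset identifiers. The following is only a minimal sketch of that idea, not the tool's actual code; `datasets_to_fetch` and the sample IDs are hypothetical.

```python
def datasets_to_fetch(remote_ids, local_ids):
    """Return the remote dataset IDs that are not yet present locally.

    The order of `remote_ids` is preserved so downloads happen predictably.
    """
    already_local = set(local_ids)
    return [ds_id for ds_id in remote_ids if ds_id not in already_local]


# Hypothetical IDs: the first run fetches everything,
# a later run fetches only what was added remotely in the meantime.
remote = ["a1", "b2", "c3"]
first_run = datasets_to_fetch(remote, local_ids=[])
second_run = datasets_to_fetch(remote + ["d4"], local_ids=first_run)
print(first_run)   # ['a1', 'b2', 'c3']
print(second_run)  # ['d4']
```

Because the comparison is by identifier, re-running with an unchanged remote yields an empty list and nothing is downloaded.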

Run it from the command line as datacube-simple-replica, optionally passing the path to a configuration file.

Provide a configuration file in ~/.datacube.replication.conf in YAML format, or specify an alternate location on the command line.
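
The choice between the default location and an alternate path might be resolved along these lines. This is a sketch under assumptions: `resolve_config_path` is a hypothetical helper, and only the default path `~/.datacube.replication.conf` comes from the text above.

```python
from pathlib import Path

# Default location named in the text above.
DEFAULT_CONFIG = Path("~/.datacube.replication.conf")


def resolve_config_path(cli_path=None):
    """Pick the config file: an explicit command-line path wins over the default."""
    chosen = Path(cli_path) if cli_path else DEFAULT_CONFIG
    # expanduser() replaces the leading "~" with the current user's home directory.
    return chosen.expanduser()


print(resolve_config_path())                     # default under the home directory
print(resolve_config_path("/tmp/replica.yaml"))  # explicit alternate location
```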

Command line documentation

datacube-simple-replica

A Simple Data Cube Replication Tool

Connects to a remote Data Cube via SSH, and downloads database records and files to a local file system and database.

Provide a configuration file in ~/.datacube.replication.conf in YAML format, or specify an alternate location on the command line.

For example, the following config will download 3 PQ products for the specified time and space range. Queries are specified the same way as when using the API to search for datasets.

remote_host: raijin.nci.org.au
remote_user: dra547
db_password: xxxxxxxxxxxx
remote_dir: /g/data/
local_dir: C:/datacube/

replicated_data:
- product: ls5_pq_albers
  crs: EPSG:3577
  x: [1200000, 1300000]
  y: [-4200000, -4300000]
  time: [2008-01-01, 2010-01-01]

- product: ls7_pq_albers
  crs: EPSG:3577
  x: [1200000, 1300000]
  y: [-4200000, -4300000]
  time: [2008-01-01, 2010-01-01]

- product: ls8_pq_albers
  crs: EPSG:3577
  x: [1200000, 1300000]
  y: [-4200000, -4300000]
  time: [2008-01-01, 2010-01-01]
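
Because each `replicated_data` entry uses the same keys as a dataset search, one plausible way to consume an entry is to turn it into keyword arguments. This is a minimal sketch, assuming the YAML has already been parsed into a dict. `entry_to_query` is a hypothetical helper, and passing the result to `Datacube.find_datasets(**query)` is an assumption about how the search could be issued, not a description of the tool's internals.

```python
from datetime import date

# One entry as it might look after YAML parsing (values copied from the example above).
entry = {
    "product": "ls5_pq_albers",
    "crs": "EPSG:3577",
    "x": [1200000, 1300000],
    "y": [-4200000, -4300000],
    "time": [date(2008, 1, 1), date(2010, 1, 1)],
}


def entry_to_query(entry):
    """Convert list-valued ranges to tuples, leaving scalar fields unchanged."""
    return {
        key: tuple(value) if isinstance(value, list) else value
        for key, value in entry.items()
    }


query = entry_to_query(entry)
print(query["x"])  # (1200000, 1300000)
# With the datacube package installed, the search might then be issued as
# (assumption): datasets = Datacube().find_datasets(**query)
```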

datacube-simple-replica [OPTIONS] [CONFIG_PATH]

Options

--version
-v, --verbose

Use multiple times for more verbosity

--log-file <log_file>

Specify log file

-E, --env <env>
-C, --config, --config_file <config>
--log-queries

Print database queries.

Arguments

CONFIG_PATH

Optional argument

Caveats and limitations

  • Remote datacube files and the remote database are accessed via an SSH host that must accept logins without a password, i.e. by using a local SSH key agent.

  • The remote datacube index must be the same version as the local datacube code.
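
The first caveat above can be checked before running the tool by attempting a non-interactive login. The sketch below is an illustration rather than part of the tool: the helper names are hypothetical, the host and user are placeholders, and it assumes an OpenSSH `ssh` client is on the PATH.

```python
import subprocess


def ssh_check_command(host, user):
    """Build an ssh command that fails instead of prompting for a password."""
    # BatchMode=yes is a standard OpenSSH client option that disables
    # interactive prompts, so success implies key-based or agent login works.
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", "true"]


def can_login_without_password(host, user, timeout=15):
    """Return True if the non-interactive login succeeds."""
    try:
        result = subprocess.run(
            ssh_check_command(host, user),
            capture_output=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is not installed, or the connection attempt hung.
        return False
    return result.returncode == 0


# Placeholder host and user; substitute your own remote_host and remote_user.
print(" ".join(ssh_check_command("remote.example.org", "someuser")))
```

Calling `can_login_without_password` with the `remote_host` and `remote_user` values from the configuration file gives a quick yes or no before any replication is attempted.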