The Data Cube is a system designed to:
- Catalogue large amounts of Earth Observation data
- Provide a Python-based API for high-performance querying and data access
- Give scientists and other users an easy way to perform exploratory data analysis
- Allow scalable, continent-scale processing of the stored data
- Track the provenance of all the contained data to allow for quality control and updates
If you’re reading this, hopefully someone has already set up and loaded data into a Data Cube for you.
See the Installation section for instructions on configuring and setting up a Data Cube.
Types of Datasets in a Data Cube
A Data Cube contains records about three different types of products and datasets.
|Type of dataset||In Index||Data available||Typical data|
|Referenced||Yes||No||Historic or provenance record|
|Indexed||Yes||Yes||Created externally|
|Ingested||Yes||Yes||Created within the Data Cube|
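The distinction between the types comes down to two questions: does the record have a location, and did the Data Cube create the stored data itself? The sketch below models that decision in plain Python. It is illustrative only: `DatasetRecord`, its fields, and `dataset_type` are hypothetical names and not part of the Data Cube API, and the label "Indexed" is used here for data that is available but was created outside the Data Cube.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical, simplified stand-in for a record in the index."""
    name: str
    location: Optional[str]        # file path or URI; None when the data is not accessible
    created_by_cube: bool = False  # True when the Data Cube wrote the stored files itself


def dataset_type(record: DatasetRecord) -> str:
    """Map a record to one of the three dataset types."""
    if record.location is None:
        # Known to the index, but the data itself cannot be read.
        return "Referenced"
    if record.created_by_cube:
        # Data was copied or re-tiled by the Data Cube into its own storage.
        return "Ingested"
    # Data is available, but it was created outside the Data Cube.
    return "Indexed"


# Made-up example records; the URIs are placeholders.
telemetry = DatasetRecord("raw_telemetry", location=None)
scene = DatasetRecord("landsat_scene", location="file:///data/scene.tif")
tile = DatasetRecord("landsat_tile", location="file:///data/tile.nc", created_by_cube=True)

for record in (telemetry, scene, tile):
    print(record.name, "->", dataset_type(record))
```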
Referenced Datasets
The existence and metadata of these datasets are known, but the data itself is not accessible to the Data Cube, i.e. a dataset without a location.
These usually come from the provenance / source information of other datasets. For example:
- Raw Landsat Telemetry
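To illustrate how such records can arise from provenance information, the sketch below walks the sources of a dataset and collects every source that has no location. The nested dictionary layout (`name`, `location`, `sources`) is an assumption made for this example and is not the Data Cube's actual metadata format.

```python
def referenced_sources(dataset: dict) -> list:
    """Collect the names of source datasets, at any depth, that have no location."""
    found = []
    for source in dataset.get("sources", []):
        if source.get("location") is None:
            found.append(source["name"])
        # A source may itself have sources, so recurse into it.
        found.extend(referenced_sources(source))
    return found


# Hypothetical provenance record: a scene derived from raw telemetry.
scene = {
    "name": "landsat_scene",
    "location": "file:///data/scene.tif",  # placeholder URI
    "sources": [
        {"name": "raw_telemetry", "location": None, "sources": []},
    ],
}

print(referenced_sources(scene))
```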
Indexed Datasets
Data is available (it has a file location or URI), and the associated metadata is available in a format understood by the Data Cube. For example:
- USGS Landsat Scenes with prepared
- GA Landsat Scenes
Ingested Datasets
Data has been created by, and is managed by, the Data Cube. The data has typically been copied, compressed, tiled and possibly re-projected into a shape suitable for analysis, and stored in NetCDF4 files. For example:
- Tiled GA Landsat Data, ingested into Australian Albers Equal Area Projection (EPSG:3577) and stored in 100 km tiles in NetCDF4
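As a rough illustration of what 100 km tiling means, the sketch below maps a projected coordinate to a tile index and back to tile bounds. It assumes coordinates are in metres (the unit of EPSG:3577) and that the tile grid is anchored at the projection origin (0, 0); the actual grid origin depends on the ingestion configuration. The function names are hypothetical and not part of the Data Cube API.

```python
import math

TILE_SIZE_M = 100_000  # 100 km expressed in metres


def tile_index(x: float, y: float, tile_size: float = TILE_SIZE_M) -> tuple:
    """Return the (column, row) index of the tile containing point (x, y)."""
    # floor() keeps negative coordinates in the correct tile.
    return (math.floor(x / tile_size), math.floor(y / tile_size))


def tile_bounds(ix: int, iy: int, tile_size: float = TILE_SIZE_M) -> tuple:
    """Return (left, bottom, right, top) of a tile in projected metres."""
    left = ix * tile_size
    bottom = iy * tile_size
    return (left, bottom, left + tile_size, bottom + tile_size)


# A made-up point in projected coordinates.
index = tile_index(1_550_000, -3_950_000)
print(index)                # (15, -40)
print(tile_bounds(*index))  # (1500000, -4000000, 1600000, -3900000)
```

Using `math.floor` rather than integer truncation matters here because southing values in this projection are negative over Australia, and truncation would round them towards zero into the wrong tile.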