API reference

For examples of how to use the API, see the example Jupyter notebooks.

Datacube Class

Datacube([index, config, app, env, …]) Interface to search, read and write a datacube.

Data Discovery

Datacube.list_products List products in the datacube
Datacube.list_measurements List measurements for each product
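
A minimal sketch of connecting to a datacube and listing its contents (the application name is arbitrary, and the exact DataFrame columns vary slightly between releases):

from datacube import Datacube

dc = Datacube(app="api-reference-examples")

products = dc.list_products()           # pandas DataFrame of product definitions
measurements = dc.list_measurements()   # pandas DataFrame indexed by product and measurement
print(products[["name", "description"]].head())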

Data Loading

Datacube.load Load data as an xarray object.
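
A minimal sketch of a load call; the product name, measurement names, extents and output grid below are placeholders to substitute with your own:

from datacube import Datacube

dc = Datacube(app="load-example")

data = dc.load(
    product="ls8_example_product",        # hypothetical product name
    x=(148.0, 148.2), y=(-35.4, -35.2),   # longitude/latitude extents
    time=("2020-01-01", "2020-03-31"),
    measurements=["red", "nir"],          # hypothetical measurement names
    output_crs="EPSG:3577",
    resolution=(-30, 30),
)
print(data)  # xarray.Dataset with time, y, x dimensions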

Internal Loading Functions

These operations can be useful if you need to customise the loading process, for example to pre-filter the available datasets before loading; a sketch of this pipeline follows the list below.

Datacube.find_datasets(**search_terms) Search the index and return all datasets for a product matching the search terms.
Datacube.group_datasets(datasets, group_by) Group datasets along defined non-spatial dimensions (e.g. time).
Datacube.load_data(sources, geobox, measurements) Load data from group_datasets() into an xarray.Dataset.
Datacube.measurement_data(sources, geobox, …) Retrieve a single measurement variable as an xarray.DataArray.
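
A sketch of that pipeline, using a hypothetical product name and an illustrative pre-filter:

from datacube import Datacube
from datacube.api.query import query_group_by
from datacube.utils import geometry

dc = Datacube(app="custom-load")

# Find matching datasets, then filter them before loading.
datasets = dc.find_datasets(product="ls8_example_product",
                            time=("2020-01-01", "2020-03-31"))
datasets = [d for d in datasets if d.uris]  # e.g. keep only datasets with a known location

# Group along the time dimension.
grouped = dc.group_datasets(datasets, query_group_by(group_by="time"))

# Define the output grid and load a single measurement.
geobox = geometry.GeoBox.from_geopolygon(
    geometry.box(148.0, -35.4, 148.2, -35.2, crs=geometry.CRS("EPSG:4326")),
    resolution=(-0.00025, 0.00025))
product = dc.index.products.get_by_name("ls8_example_product")
data = dc.load_data(grouped, geobox, [product.measurements["red"]])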

Grid Processing API

Tile(sources, geobox) The Tile object holds a lightweight representation of a datacube result.
GridWorkflow(index[, grid_spec, product]) GridWorkflow deals with cell- and tile-based processing using a grid defining a projection and resolution.
GridWorkflow.list_cells([cell_index]) List cells that match the query.
GridWorkflow.list_tiles([cell_index]) List tiles of data, sorted by cell.
GridWorkflow.load(tile[, measurements, …]) Load data for a cell/tile.
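
A sketch of cell-based processing; the product name and query are placeholders, and the product is assumed to define (or be given) a grid specification:

from datacube import Datacube
from datacube.api import GridWorkflow

dc = Datacube(app="grid-example")
gw = GridWorkflow(dc.index, product="ls8_example_product")

# Cells are keyed by (x, y) grid index; each value is a Tile.
cells = gw.list_cells(product="ls8_example_product", time=("2020-01-01", "2020-03-31"))
for cell_index, tile in cells.items():
    data = GridWorkflow.load(tile, measurements=["red"])
    break  # process just the first cell for illustration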

Grid Processing API Internals

GridWorkflow.cell_observations([cell_index, …]) List datasets, grouped by cell.
GridWorkflow.group_into_cells(observations, …) Group observations into a stack of source tiles.
GridWorkflow.tile_sources(observations, group_by) Split observations into tiles and group into source tiles

Internal Data Model

Dataset(type_, metadata_doc, local_uri, …) A Dataset.
Measurement(**kwargs) Describes a single data variable of a Product or Dataset.
MetadataType(definition, …) Metadata Type definition
DatasetType(metadata_type, definition[, id_]) Product definition
GridSpec(crs, tile_size, resolution, …) Definition for a regular spatial grid
Range(begin, end)
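
As an illustration, a GridSpec describing a regular 100 km tile / 25 m pixel grid in Australian Albers (the values are purely illustrative):

from datacube.model import GridSpec
from datacube.utils.geometry import CRS

grid = GridSpec(crs=CRS("EPSG:3577"),
                tile_size=(100000.0, 100000.0),
                resolution=(-25, 25))
print(grid.tile_geobox((15, -40)))  # GeoBox of a single tile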

Database Index API

Dataset Querying

When connected to an ODC Database, these methods are available for searching and querying:

dc = Datacube()
dc.index.datasets.{method}
get Get dataset by id
search Perform a search, returning results as Dataset objects.
search_by_metadata Perform a search using arbitrary metadata, returning results as Dataset objects.
search_by_product Perform a search, returning datasets grouped by product type.
search_eager Perform a search, returning results as Dataset objects.
search_product_duplicates Find dataset ids which have duplicates of the given set of field names.
search_returning Perform a search, returning only the specified fields.
search_summaries Perform a search, returning just the search fields of each dataset.
has Have we already indexed this dataset?
bulk_has Like has but operates on a list of ids.
can_update Check if dataset can be updated.
count Perform a search, returning count of results.
count_by_product Perform a search, returning a count for each matching product type.
count_by_product_through_time Perform a search, returning counts for each product grouped in time slices of the given period.
count_product_through_time Perform a search, returning counts for a single product grouped in time slices of the given period.
get_derived Get all derived datasets
get_field_names Get the list of possible search fields for a Product
get_locations Get the list of storage locations for the given dataset id
get_archived_locations Find locations which have been archived for a dataset
get_datasets_for_location Find datasets that exist at the given URI
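
A sketch of typical index queries; the product name is a placeholder:

from datacube import Datacube

dc = Datacube(app="index-query")

for ds in dc.index.datasets.search(product="ls8_example_product",
                                   time=("2020-01-01", "2020-02-01")):
    print(ds.id, ds.uris)

n = dc.index.datasets.count(product="ls8_example_product")
fields = dc.index.datasets.get_field_names("ls8_example_product")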

Dataset Writing

When connected to an ODC Database, these methods are available for adding, updating and archiving datasets:

dc = Datacube()
dc.index.datasets.{method}
add Add dataset to the index.
add_location Add a location to the dataset if it doesn’t already exist.
archive Mark datasets as archived
archive_location Archive a location of the dataset if it exists.
remove_location Remove a location from the dataset if it exists.
restore Mark datasets as not archived
restore_location Un-archive a location of the dataset if it exists.
update Update dataset metadata and location.
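
A sketch of managing the locations and archive state of an existing dataset; the product name and URI are placeholders:

from datacube import Datacube

dc = Datacube(app="index-write")

ds = next(iter(dc.index.datasets.search(product="ls8_example_product")))

dc.index.datasets.add_location(ds.id, "file:///data/mirror/scene1/metadata.yaml")
dc.index.datasets.archive([ds.id])   # archive/restore take an iterable of dataset ids
dc.index.datasets.restore([ds.id])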

Product Querying

When connected to an ODC Database, these methods are available for discovering information about Products:

dc = Datacube()
dc.index.products.{method}
from_doc Create a Product from its definitions
add Add a Product.
can_update Check if product can be updated.
add_document Add a Product using its definition
get Retrieve Product by id
get_by_name Retrieve Product by name
get_unsafe
get_by_name_unsafe
get_with_fields Return dataset types that have all the given fields.
search Return dataset types that have all the given fields.
search_robust Return dataset types that match the matchable fields, plus a dict of the remaining un-matchable fields.
get_all Retrieve all Products
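
A sketch of typical product lookups; the product name is a placeholder:

from datacube import Datacube

dc = Datacube(app="product-query")

product = dc.index.products.get_by_name("ls8_example_product")
if product is not None:
    print(product.name, list(product.measurements))

for p in dc.index.products.get_all():
    print(p.name)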

Product Addition/Modification

When connected to an ODC Database, these methods are available for adding and updating Products:

dc = Datacube()
dc.index.products.{method}
from_doc Create a Product from its definitions
add Add a Product.
update Update a product.
update_document Update a Product using its definition
add_document Add a Product using its definition
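
A sketch of adding a product from its YAML definition document; the file path is a placeholder:

import yaml
from datacube import Datacube

dc = Datacube(app="product-add")

with open("my_product_definition.yaml") as f:
    definition = yaml.safe_load(f)

product = dc.index.products.from_doc(definition)  # validate and construct, not yet persisted
dc.index.products.add(product)                    # write it to the index
# dc.index.products.add_document(definition) combines the two steps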

Database Index Connections

index.index_connect Create a Data Cube Index that can connect to a PostgreSQL server
index.Index Access to the datacube index.
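
A sketch of creating an Index directly, without going through the Datacube class (the application name is arbitrary):

from datacube.index import index_connect

index = index_connect(application_name="standalone-index")
for product in index.products.get_all():
    print(product.name)
index.close()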

Dataset to Product Matching

Doc2Dataset Used for constructing Dataset objects from plain metadata documents.
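
A sketch of resolving a plain metadata document into a Dataset ready for indexing; the file path and URI are placeholders:

import yaml
from datacube import Datacube
from datacube.index.hl import Doc2Dataset

dc = Datacube(app="doc2dataset-example")
resolver = Doc2Dataset(dc.index)

with open("scene1/metadata.yaml") as f:
    doc = yaml.safe_load(f)

dataset, err = resolver(doc, "file:///data/scene1/metadata.yaml")
if err is not None:
    raise ValueError(err)
dc.index.datasets.add(dataset)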

Geometry Utilities

Open Data Cube includes a set of CRS aware geometry utilities.

Geometry Classes

utils.geometry.BoundingBox Bounding box, defining extent in cartesian coordinates.
utils.geometry.CRS Wrapper around osr.SpatialReference providing a more pythonic interface
utils.geometry.Geometry 2D Geometry with CRS
utils.geometry.GeoBox Defines the location and resolution of a rectangular grid of data, including its CRS.
utils.geometry.gbox.GeoboxTiles Partition GeoBox into sub geoboxes
model.GridSpec Definition for a regular spatial grid

Creating Geometries

point(x, y, crs) Create a 2D Point
multipoint(coords, crs) Create a 2D MultiPoint Geometry
line(coords, crs) Create a 2D LineString (Connected set of lines)
multiline(coords, crs) Create a 2D MultiLineString (Multiple disconnected sets of lines)
polygon(outer, crs, *inners) Create a 2D Polygon
multipolygon(coords, crs) Create a 2D MultiPolygon
box(left, bottom, right, top, crs) Create a 2D Box (Polygon)
polygon_from_transform(width, height, …) Create a 2D Polygon from an affine transform
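
A sketch of creating and reprojecting geometries (the coordinates are arbitrary):

from datacube.utils import geometry

crs = geometry.CRS("EPSG:4326")
pt = geometry.point(148.1, -35.3, crs=crs)
poly = geometry.box(148.0, -35.4, 148.2, -35.2, crs=crs)

poly_albers = poly.to_crs(geometry.CRS("EPSG:3577"))  # reproject
print(poly.contains(pt), poly_albers.area)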

Multi-geometry ops

unary_union(geoms) Compute the union of multiple (multi)polygons efficiently.
unary_intersection(geoms) Compute the intersection of multiple (multi)polygons.
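
A sketch combining two boxes (the coordinates are arbitrary):

from datacube.utils import geometry

crs = geometry.CRS("EPSG:4326")
tiles = [geometry.box(0, 0, 2, 2, crs=crs), geometry.box(1, 1, 3, 3, crs=crs)]

footprint = geometry.unary_union(tiles)        # combined outline
overlap = geometry.unary_intersection(tiles)   # common area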

Masking

masking.mask_invalid_data(data[, keep_attrs]) Sets all nodata values to nan.
masking.describe_variable_flags(variable[, …]) Return either a pandas DataFrame (with_pandas=True, the default) or a string (with_pandas=False) describing the available flags for a masking variable.
masking.make_mask(variable, **flags) Returns a mask array, based on provided flags
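
A sketch of masking with a pixel-quality measurement; the product, measurement and flag names are placeholders that vary between products:

from datacube import Datacube
from datacube.utils import masking   # older releases expose this as datacube.storage.masking

dc = Datacube(app="masking-example")
ds = dc.load(product="ls8_example_product",
             x=(148.0, 148.2), y=(-35.4, -35.2),
             time=("2020-01-01", "2020-01-31"),
             measurements=["red", "pixel_quality"])

print(masking.describe_variable_flags(ds.pixel_quality))
clear = masking.make_mask(ds.pixel_quality, cloud="no_cloud")  # flag values depend on the product
red = masking.mask_invalid_data(ds.red).where(clear)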

Query Class

Query([index, product, geopolygon, like])

User Configuration

LocalConfig(config[, files_loaded, env]) System configuration for the user.
DEFAULT_CONF_PATHS Config locations in order.

Everything Else

For Exploratory Data Analysis, see the Datacube Class section for more details.

For Writing Large Scale Workflows, see the Grid Processing API section for more details.