API reference

For examples of how to use the API, see the example Jupyter notebooks.

Datacube Class

Datacube([index, config, app, env, …])

Interface to search, read and write a datacube.

Data Discovery

Datacube.list_products

List products in the datacube

Datacube.list_measurements

List measurements for each product

Data Loading

Datacube.load

Load data as an xarray object.
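A minimal sketch of a typical load call follows. It assumes a configured ODC database; the product name `example_product`, the measurement name, and the helper names `build_load_query` and `load_area` are hypothetical and used only for illustration. The spatial ranges are given as longitude/latitude, which is how `x` and `y` are interpreted when no `crs` is passed. Products that do not define a default storage CRS and resolution will probably also need `output_crs` and `resolution`.

```python
def build_load_query(product, lon_range, lat_range, time_range, measurements=None):
    """Assemble keyword arguments for Datacube.load().

    lon_range / lat_range are (min, max) pairs in degrees and time_range
    is a (start, end) pair of date strings.
    """
    query = {
        "product": product,
        "x": tuple(lon_range),
        "y": tuple(lat_range),
        "time": tuple(time_range),
    }
    if measurements is not None:
        # Restrict the load to the named measurements (bands)
        query["measurements"] = list(measurements)
    return query


def load_area(dc, **kwargs):
    """Load the requested area using `dc`, assumed to be a datacube.Datacube."""
    return dc.load(**build_load_query(**kwargs))


# Usage, assuming a configured ODC database and a hypothetical product:
#   import datacube
#   dc = datacube.Datacube(app="api-reference-example")
#   data = load_area(dc, product="example_product",
#                    lon_range=(149.0, 149.2), lat_range=(-35.4, -35.2),
#                    time_range=("2020-01", "2020-03"), measurements=["red"])
```

The result is an xarray object, so the usual xarray selection and plotting methods apply to it.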

Internal Loading Functions

These operations can be useful if you need to customise the loading process, for example, to pre-filter the available datasets before loading.

Datacube.find_datasets(**search_terms)

Search the index and return all datasets for a product matching the search terms.

Datacube.group_datasets(datasets, group_by)

Group datasets along defined non-spatial dimensions.

Datacube.load_data(sources, geobox, measurements)

Load data from group_datasets() into an xarray.Dataset.
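The pre-filtering use case mentioned above can be sketched as follows. This is one possible approach rather than the only one: it searches with `find_datasets`, filters the results in Python, and hands the survivors back to `Datacube.load` through its `datasets` argument instead of calling `group_datasets` and `load_data` by hand. The helper names `load_prefiltered` and `has_local_file` are hypothetical, and the predicate (keeping datasets with a `file://` location) is only an illustration.

```python
def has_local_file(ds):
    """Example predicate: keep datasets with at least one file:// location."""
    return any(uri.startswith("file://") for uri in ds.uris)


def load_prefiltered(dc, keep, **query):
    """Search, drop datasets rejected by `keep`, then load the remainder.

    `dc` is assumed to be a datacube.Datacube and `keep` a function that
    takes a Dataset and returns True for datasets that should be loaded.
    """
    candidates = dc.find_datasets(**query)
    selected = [ds for ds in candidates if keep(ds)]
    # When `datasets` is supplied, load() uses them instead of searching again
    return dc.load(datasets=selected, **query)


# Usage, assuming a configured ODC database and a hypothetical product:
#   import datacube
#   dc = datacube.Datacube(app="prefilter-example")
#   data = load_prefiltered(dc, has_local_file,
#                           product="example_product", time=("2020-01", "2020-03"))
```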

Grid Processing API

Tile(sources, geobox)

The Tile object holds a lightweight representation of a datacube result.

GridWorkflow(index[, grid_spec, product])

GridWorkflow deals with cell- and tile-based processing using a grid defining a projection and resolution.

GridWorkflow.list_cells([cell_index])

List cells that match the query.

GridWorkflow.list_tiles([cell_index])

List tiles of data, sorted by cell.

GridWorkflow.load(tile[, measurements, …])

Load data for a cell/tile.
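As a rough illustration of how these pieces fit together, the sketch below lists the tiles that match a query and loads them one at a time. It assumes a product whose definition includes a grid specification; the helper name `load_tiles` and the product name are hypothetical.

```python
def load_tiles(gw, measurements=None, **query):
    """Yield (cell key, loaded data) for every tile matching the query.

    `gw` is assumed to be a GridWorkflow. list_tiles() returns a mapping
    from cell index keys to Tile objects, and each Tile is loaded lazily,
    one per iteration, so only one tile needs to be in memory at a time.
    """
    tiles = gw.list_tiles(**query)
    for key, tile in tiles.items():
        yield key, gw.load(tile, measurements=measurements)


# Usage, assuming a configured ODC database and a hypothetical product:
#   import datacube
#   from datacube.api import GridWorkflow
#   dc = datacube.Datacube(app="grid-example")
#   gw = GridWorkflow(dc.index, product="example_product")
#   for key, data in load_tiles(gw, product="example_product",
#                               time=("2020-01", "2020-03")):
#       print(key, data)
```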

Grid Processing API Internals

GridWorkflow.cell_observations([cell_index, …])

List datasets, grouped by cell.

GridWorkflow.group_into_cells(observations, …)

Group observations into a stack of source tiles.

GridWorkflow.tile_sources(observations, group_by)

Split observations into tiles and group into source tiles

Internal Data Model

Dataset(type_, metadata_doc[, uris, …])

A Dataset.

Measurement(**kwargs)

Describes a single data variable of a Product or Dataset.

MetadataType(definition, dataset_search_fields)

Metadata Type definition

DatasetType(metadata_type, definition[, id_])

Product definition

GridSpec(crs, tile_size, resolution[, origin])

Definition for a regular spatial grid

Range(begin, end)

Database Index API

Dataset Querying

When connected to an ODC Database, these methods are available for searching and querying:

dc = Datacube()
dc.index.datasets.{method}

get

Get dataset by id

search

Perform a search, returning results as Dataset objects.

search_by_metadata

Perform a search using arbitrary metadata, returning results as Dataset objects.

search_by_product

Perform a search, returning datasets grouped by product type.

search_eager

Perform a search, returning results as Dataset objects.

search_product_duplicates

Find dataset ids that have duplicates of the given set of field names.

search_returning

Perform a search, returning only the specified fields.

search_summaries

Perform a search, returning just the search fields of each dataset.

has

Have we already indexed this dataset?

bulk_has

Like has but operates on a list of ids.

can_update

Check if dataset can be updated.

count

Perform a search, returning count of results.

count_by_product

Perform a search, returning a count for each matching product type.

count_by_product_through_time

Perform a search, returning counts for each product grouped in time slices of the given period.

count_product_through_time

Perform a search, returning counts for a single product grouped in time slices of the given period.

get_derived

Get all derived datasets

get_field_names

Get the list of possible search fields for a Product

get_locations

Get the list of storage locations for the given dataset id

get_archived_locations

Find locations which have been archived for a dataset

get_datasets_for_location

Find datasets that exist at the given URI
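A short sketch of the querying methods in use: it counts the datasets that match a query with `count` and then collects the ids of the first few with `search`. The helper name `dataset_ids` and the product name in the usage comment are hypothetical.

```python
from itertools import islice


def dataset_ids(dc, limit=10, **query):
    """Return (total matches, ids of up to `limit` matching datasets).

    `dc` is assumed to be a datacube.Datacube connected to an index.
    """
    total = dc.index.datasets.count(**query)
    # search() yields Dataset objects; stop after `limit` of them
    first = islice(dc.index.datasets.search(**query), limit)
    return total, [str(ds.id) for ds in first]


# Usage, assuming a configured ODC database and a hypothetical product:
#   import datacube
#   dc = datacube.Datacube(app="search-example")
#   total, ids = dataset_ids(dc, limit=5, product="example_product",
#                            time=("2020-01", "2020-03"))
```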

Dataset Writing

When connected to an ODC Database, these methods are available for adding, updating and archiving datasets:

dc = Datacube()
dc.index.datasets.{method}

add

Add dataset to the index.

add_location

Add a location to the dataset if it doesn’t already exist.

archive

Mark datasets as archived

archive_location

Archive a location of the dataset if it exists.

remove_location

Remove a location from the dataset if it exists.

restore

Mark datasets as not archived

restore_location

Un-archive a location of the dataset if it exists.

update

Update dataset metadata and location. Takes the Dataset to update (dataset) and the allowed updates (updates_allowed); returns a Dataset.
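The sketch below shows one way `archive` might be combined with `search`: it collects the ids of the matching datasets and archives them only when `dry_run` is switched off. The helper name `archive_matching` is hypothetical, and the dry-run default is a design choice of this example, not something the API requires. Archived datasets can be brought back by passing the same ids to `restore`.

```python
def archive_matching(dc, dry_run=True, **query):
    """Archive every dataset matching the query and return their ids.

    `dc` is assumed to be a datacube.Datacube connected to an index.
    With dry_run=True (the default) nothing is changed, which makes it
    possible to review the ids before archiving them.
    """
    ids = [ds.id for ds in dc.index.datasets.search(**query)]
    if ids and not dry_run:
        dc.index.datasets.archive(ids)
    return ids


# Usage, assuming a configured ODC database and a hypothetical product:
#   import datacube
#   dc = datacube.Datacube(app="archive-example")
#   ids = archive_matching(dc, product="example_product", time=("2019-01", "2019-02"))
#   ids = archive_matching(dc, dry_run=False, product="example_product",
#                          time=("2019-01", "2019-02"))
#   dc.index.datasets.restore(ids)  # undo the archive
```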

Product Querying

When connected to an ODC Database, these methods are available for discovering information about Products:

dc = Datacube()
dc.index.products.{method}

from_doc

Create a Product from its definitions

add

Add a Product.

can_update

Check if product can be updated.

add_document

Add a Product using its definition

get

Retrieve Product by id

get_by_name

Retrieve Product by name

get_unsafe

get_by_name_unsafe

get_with_fields

Return dataset types that have all the given fields.

search

Return dataset types that have all the given fields.

search_robust

Return dataset types that match match-able fields and dict of remaining un-matchable fields.

get_all

Retrieve all Products
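As an illustration of the product querying methods, the sketch below lists the names of all indexed products with `get_all` and looks up the measurement names of a single product with `get_by_name`. The helper names are hypothetical. The example relies on `get_by_name` returning `None` for an unknown product.

```python
def product_names(dc):
    """Return the sorted names of all products in the index."""
    return sorted(product.name for product in dc.index.products.get_all())


def measurement_names(dc, product_name):
    """Return the measurement names of a product, or None if it is unknown.

    `dc` is assumed to be a datacube.Datacube connected to an index.
    """
    product = dc.index.products.get_by_name(product_name)
    if product is None:
        return None
    # product.measurements maps measurement names to Measurement objects
    return list(product.measurements)


# Usage, assuming a configured ODC database and a hypothetical product:
#   import datacube
#   dc = datacube.Datacube(app="product-example")
#   print(product_names(dc))
#   print(measurement_names(dc, "example_product"))
```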

Product Addition/Modification

When connected to an ODC Database, these methods are available for adding and modifying Products:

dc = Datacube()
dc.index.products.{method}

from_doc

Create a Product from its definitions

add

Add a Product.

update

Update a product.

update_document

Update a Product using its definition

add_document

Add a Product using its definition
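A product definition is a plain document (a dictionary once parsed), so it can be assembled in Python and passed to `add_document`. The sketch below builds a small definition. The field layout (`name`, `description`, `metadata_type`, `metadata`, `measurements`) is an assumption based on the usual eo3 product format, and the helper name, product name and band names are hypothetical; check the definition against the product documentation for your ODC version before indexing it.

```python
def minimal_product_definition(name, band_names, dtype="int16", nodata=-999):
    """Build a small product definition document (assumed eo3 layout)."""
    return {
        "name": name,
        "description": f"Example product {name}",
        "metadata_type": "eo3",
        # Datasets are matched to this product through these metadata fields
        "metadata": {"product": {"name": name}},
        "measurements": [
            {"name": band, "dtype": dtype, "nodata": nodata, "units": "1"}
            for band in band_names
        ],
    }


# Usage, assuming a configured ODC database with write access:
#   import datacube
#   dc = datacube.Datacube(app="add-product-example")
#   definition = minimal_product_definition("example_product", ["red", "nir"])
#   product = dc.index.products.add_document(definition)
```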

Database Index Connections

index.index_connect

Create a Data Cube Index that can connect to a PostgreSQL server

index.Index

Access to the datacube index.

Dataset to Product Matching

Doc2Dataset

Used for constructing Dataset objects from plain metadata documents.
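A minimal sketch of indexing a single metadata document with `Doc2Dataset`: the resolver is called with the document and its URI and returns a `(dataset, error)` pair, where the error is `None` on success. The helper name `index_document` is hypothetical, and raising `ValueError` on failure is a choice made for this example.

```python
def index_document(index, resolver, doc, uri):
    """Resolve a metadata document to a Dataset and add it to the index.

    `index` is assumed to be a datacube index (for example dc.index) and
    `resolver` a Doc2Dataset instance created from that index.
    """
    dataset, err = resolver(doc, uri)
    if err is not None:
        raise ValueError(f"Could not build a dataset from {uri}: {err}")
    return index.datasets.add(dataset)


# Usage, assuming a configured ODC database and a parsed metadata document `doc`:
#   import datacube
#   from datacube.index.hl import Doc2Dataset
#   dc = datacube.Datacube(app="index-example")
#   resolver = Doc2Dataset(dc.index)
#   index_document(dc.index, resolver, doc, "file:///data/example/metadata.yaml")
```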

Geometry Utilities

Open Data Cube includes a set of CRS aware geometry utilities.

Geometry Classes

utils.geometry.BoundingBox

Bounding box, defining extent in cartesian coordinates.

utils.geometry.CRS

Wrapper around pyproj.CRS for backwards compatibility.

utils.geometry.Geometry

2D Geometry with CRS

utils.geometry.GeoBox

Defines the location and resolution of a rectangular grid of data, including its CRS.

utils.geometry.gbox.GeoboxTiles

Partition GeoBox into sub geoboxes

model.GridSpec

Definition for a regular spatial grid

Creating Geometries

point(x, y, crs)

Create a 2D Point

multipoint(coords, crs)

Create a 2D MultiPoint Geometry

line(coords, crs)

Create a 2D LineString (Connected set of lines)

multiline(coords, crs)

Create a 2D MultiLineString (Multiple disconnected sets of lines)

polygon(outer, crs, *inners)

Create a 2D Polygon

multipolygon(coords, crs)

Create a 2D MultiPolygon

box(left, bottom, right, top, crs)

Create a 2D Box (Polygon)

polygon_from_transform(width, height, …)

Create a 2D Polygon from an affine transform

Multi-geometry ops

unary_union(geoms)

Compute union of multiple (multi)polygons efficiently

unary_intersection(geoms)

Compute intersection of multiple (multi)polygons

Masking

masking.mask_invalid_data(data[, keep_attrs])

Sets all nodata values to nan.

masking.describe_variable_flags(variable[, …])

Returns either a pandas DataFrame (with_pandas=True, the default) or a string (with_pandas=False) describing the available flags for a masking variable.

masking.make_mask(variable, **flags)

Returns a mask array, based on provided flags

Query Class

Query([index, product, geopolygon, like])

User Configuration

LocalConfig(config[, files_loaded, env])

System configuration for the user.

DEFAULT_CONF_PATHS

Config locations in order.

Everything Else

For exploratory data analysis, see Datacube Class.

For writing large-scale workflows, see Grid Processing API.