Grid Workflow
- class datacube.api.GridWorkflow(index, grid_spec=None, product=None)
GridWorkflow deals with cell- and tile-based processing using a grid defining a projection and resolution.
Use GridWorkflow to specify your desired output grid. The methods list_cells() and list_tiles() query the index and return a dictionary of cell or tile keys, each mapping to a Tile object.
The Tile object can then be used to load the data without needing the index, and can be serialized for use with the distributed package.
Create a grid workflow tool.
Either grid_spec or product must be supplied.
- Parameters
- Members
Methods:
cell_observations([cell_index, geopolygon, ...]) – List datasets, grouped by cell.
group_into_cells(observations, group_by) – Group observations into a stack of source tiles.
list_cells([cell_index]) – List cells that match the query.
list_tiles([cell_index]) – List tiles of data, sorted by cell.
load(tile[, measurements, dask_chunks, ...]) – Load data for a cell/tile.
tile_sources(observations, group_by) – Split observations into tiles and group into source tiles.
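The list-then-load pattern described in the class summary can be sketched roughly as follows. This is a minimal illustration, not a definitive recipe: load_all_tiles is a hypothetical helper that is not part of datacube, and gw is assumed to be an already constructed GridWorkflow.

```python
def load_all_tiles(gw, **query):
    """Load every tile matching ``query``, keyed by its tile key.

    Hypothetical helper (not part of datacube). ``gw`` is assumed to be a
    datacube.api.GridWorkflow instance; ``query`` is passed to list_tiles().
    """
    # Step 1: query the index once. Keys are (int, int, numpy.datetime64)
    # tuples and values are Tile objects.
    tiles = gw.list_tiles(**query)

    # Step 2: load each tile. load() is static and does not use the index,
    # so this step could also run on a worker that only receives the Tile.
    return {key: gw.load(tile) for key, tile in tiles.items()}
```

The keyword arguments are whatever list_tiles() accepts, for example a product name and a time range.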
- cell_observations(cell_index=None, geopolygon=None, tile_buffer=None, **indexers)
List datasets, grouped by cell.
- Parameters
geopolygon (datacube.utils.Geometry) – Only return observations with data inside polygon.
tile_buffer ((float,float)) – Buffer tiles by (y, x) in CRS units.
indexers – Query to match the datasets, see datacube.api.query.Query
- Returns
Datasets grouped by cell index
- Return type
dict[(int,int), list[datacube.model.Dataset]]
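As a small illustration of the return value, the sketch below lists which cells have any matching datasets. It relies only on the keys of the returned dictionary being cell indexes; cells_with_observations is a hypothetical helper and gw is assumed to be a GridWorkflow instance.

```python
def cells_with_observations(gw, cell_index=None, tile_buffer=None, **indexers):
    """Return the sorted cell indexes that have matching datasets.

    Hypothetical helper (not part of datacube). Only the dictionary keys
    returned by cell_observations() are used here.
    """
    observations = gw.cell_observations(
        cell_index=cell_index,
        tile_buffer=tile_buffer,  # (y, x) buffer in CRS units, or None
        **indexers
    )
    # Keys are (int, int) cell indexes, so plain tuple ordering applies.
    return sorted(observations)
```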
- static group_into_cells(observations, group_by)
Group observations into a stack of source tiles.
- Parameters
observations – datasets grouped by cell index, like from cell_observations()
group_by (datacube.api.query.GroupBy) – grouping method, as returned by datacube.api.query.query_group_by()
- Returns
tiles grouped by cell index
- Return type
dict[(int,int), Tile]
- list_cells(cell_index=None, **query)
List cells that match the query.
Returns a dictionary of cell indexes to Tile objects.
Cells are included if they contain any datasets that match the query using the same format as datacube.Datacube.load().
E.g.:
gw.list_cells(product='ls5_nbar_albers', time=('2001-1-1 00:00:00', '2001-3-31 23:59:59'))
- Parameters
query – see datacube.api.query.Query
- Return type
dict[(int, int), Tile]
- list_tiles(cell_index=None, **query)
List tiles of data, sorted by cell.
tiles = gw.list_tiles(product='ls5_nbar_albers', time=('2001-1-1 00:00:00', '2001-3-31 23:59:59'))
The values can be passed to load().
- Parameters
cell_index ((int,int)) – The cell index (optional). E.g. (14, -40)
query – see datacube.api.query.Query
- Return type
dict[(int, int, numpy.datetime64), Tile]
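Because each key returned by list_tiles() combines a cell index with a timestamp, it is sometimes convenient to regroup the result by cell. The following is a minimal sketch in plain Python; tiles_by_cell is a hypothetical helper, and it assumes only that each key is a 3-tuple whose first two entries are the cell index and whose third entry is a sortable time value.

```python
from collections import defaultdict


def tiles_by_cell(tiles):
    """Regroup a list_tiles()-style mapping by cell index.

    ``tiles`` maps (int, int, time) keys to Tile objects. The result maps each
    (int, int) cell index to a list of (time, tile) pairs in time order.
    Hypothetical helper (not part of datacube).
    """
    grouped = defaultdict(list)
    for key, tile in tiles.items():
        cell, time = key[:2], key[2]  # split the key into cell index and time
        grouped[cell].append((time, tile))

    # Sort each cell's entries chronologically, comparing timestamps only.
    for entries in grouped.values():
        entries.sort(key=lambda pair: pair[0])
    return dict(grouped)
```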
- static load(tile, measurements=None, dask_chunks=None, fuse_func=None, resampling=None, skip_broken_datasets=False)
Load data for a cell/tile.
The data to be loaded is defined by the output of list_tiles().
This is a static function and does not use the index. This can be useful when running as a worker in a distributed environment and you wish to minimize database connections.
See the documentation on using xarray with dask for more information.
- Parameters
tile (Tile) – The tile to load.
measurements (list(str)) – The names of measurements to load
dask_chunks (dict) – If the data should be loaded as needed using dask.array.Array, specify the chunk size in each output direction. See the documentation on using xarray with dask for more information.
fuse_func – Function to fuse together a tile that has been pre-grouped by calling list_cells() with a group_by parameter.
resampling – The resampling method to use if re-projection is required. It can be configured per band using a dictionary (see load_data). Valid values are: 'nearest', 'cubic', 'bilinear', 'cubic_spline', 'lanczos', 'average'. Defaults to 'nearest'.
skip_broken_datasets (bool) – If True, ignore broken datasets and continue processing with the data that can be loaded. If False, an exception will be raised on a broken dataset. Defaults to False.
- Return type
xarray.Dataset
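Since load() does not use the index, a Tile can be serialized on the client and loaded on a worker. The sketch below shows one way this might look; it assumes that pickle is an acceptable serialization format for Tile objects, and pack_tile and load_packed_tile are hypothetical helpers. The load argument is assumed to be GridWorkflow.load.

```python
import pickle


def pack_tile(tile):
    """Serialize a Tile so it can be shipped to a worker (assumes pickling works)."""
    return pickle.dumps(tile)


def load_packed_tile(load, payload, measurements=None, dask_chunks=None,
                     skip_broken_datasets=False):
    """Worker-side step: restore the Tile and load its data.

    ``load`` is assumed to be GridWorkflow.load; no index or database
    connection is needed here. Hypothetical helper (not part of datacube).
    """
    tile = pickle.loads(payload)
    return load(
        tile,
        measurements=measurements,  # list of measurement names, or None for all
        dask_chunks=dask_chunks,    # chunk sizes per output dimension, or None
        skip_broken_datasets=skip_broken_datasets,
    )
```

A worker might call load_packed_tile(GridWorkflow.load, payload, dask_chunks={'x': 1000, 'y': 1000}), where the dimension names and chunk sizes are illustrative assumptions that depend on the output grid.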
- static tile_sources(observations, group_by)
Split observations into tiles and group into source tiles.
- Parameters
observations – datasets grouped by cell index, like from cell_observations()
group_by (datacube.api.query.GroupBy) – grouping method, as returned by datacube.api.query.query_group_by()
- Returns
tiles grouped by cell index and time
- Return type
dict[tuple(int, int, numpy.datetime64), Tile]
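The two static helpers, group_into_cells() and tile_sources(), take the same inputs and differ in how the result is keyed. A rough sketch of chaining them by hand is shown below; cells_and_tiles is a hypothetical helper, gw is assumed to be a GridWorkflow instance, and group_by is assumed to come from datacube.api.query.query_group_by() as the parameter descriptions state.

```python
def cells_and_tiles(gw, group_by, **indexers):
    """Query the index once, then build both groupings from the same result.

    Hypothetical helper (not part of datacube).
    Returns (cells, tiles): ``cells`` is keyed by cell index, ``tiles`` is
    keyed by cell index and time.
    """
    # One index query; the result is reused by both static helpers.
    observations = gw.cell_observations(**indexers)

    cells = gw.group_into_cells(observations, group_by)  # dict[(int, int), Tile]
    tiles = gw.tile_sources(observations, group_by)      # dict[(int, int, time), Tile]
    return cells, tiles
```

Reusing a single cell_observations() result avoids a second index query when both the per-cell and per-tile views are needed.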