datacube.Datacube.load
Datacube.load(product=None, measurements=None, output_crs=None, resolution=None, resampling=None, skip_broken_datasets=False, dask_chunks=None, like=None, fuse_func=None, align=None, datasets=None, progress_cbk=None, **query)
Load data as an xarray object. Each measurement will be a data variable in the xarray.Dataset.
See the xarray documentation for usage of the xarray.Dataset and xarray.DataArray objects.
- Product and Measurements
A product can be specified using the product name, or by search fields that uniquely describe a single product.
product='ls5_ndvi_albers'
See list_products() for the list of products with their names and properties.
A product can also be selected by searching using fields, but the search must match exactly one product. For example:
platform='LANDSAT_5', product_type='ndvi'
The measurements argument is a list of measurement names, as listed in list_measurements(). If not provided, all measurements for the product will be returned.
measurements=['red', 'nir', 'swir2']
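The two ways of selecting a product can be sketched as thin wrappers around load(). This is a minimal sketch: `load_bands` and `load_by_fields` are hypothetical helper names, and `dc` is assumed to be an already configured datacube.Datacube instance. Only keyword arguments described on this page are passed through.

```python
def load_bands(dc, product, bands, **extent):
    """Load only the named measurements of one product.

    `dc` is assumed to be a configured datacube.Datacube instance.
    `extent` carries any dimension ranges (time, latitude, ...).
    """
    return dc.load(product=product, measurements=list(bands), **extent)


def load_by_fields(dc, **fields):
    """Select the product through search fields instead of its name.

    The fields must match exactly one product, for example
    platform='LANDSAT_5', product_type='ndvi'.
    """
    return dc.load(**fields)
```

With a real instance this might be called as `load_bands(dc, 'ls5_nbar_albers', ['red', 'nir', 'swir2'], time=('2001-04', '2001-07'))`.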
- Dimensions
Spatial dimensions can be specified using the longitude/latitude and x/y fields.
The CRS of this query is assumed to be WGS84/EPSG:4326 unless the crs field is supplied, even if the stored data is in another projection or the output_crs is specified. The dimensions longitude/latitude and x/y can be used interchangeably.
latitude=(-34.5, -35.2), longitude=(148.3, 148.7)
or
x=(1516200, 1541300), y=(-3867375, -3867350), crs='EPSG:3577'
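One way to keep the CRS rule explicit is to build the spatial part of the query in a single place. The helper below is a hypothetical illustration, not part of datacube: it always uses the x/y field names and adds crs only when the coordinates are not in the default WGS84/EPSG:4326.

```python
def bounding_box_query(x_range, y_range, crs=None):
    """Build the spatial keyword arguments for Datacube.load().

    Without `crs`, the ranges are interpreted as EPSG:4326
    (x is longitude, y is latitude), as described above.
    """
    query = {'x': tuple(x_range), 'y': tuple(y_range)}
    if crs is not None:
        query['crs'] = crs
    return query


# Geographic coordinates: no crs needed.
geographic = bounding_box_query((148.3, 148.7), (-34.5, -35.2))

# Projected coordinates: the crs of the query must be given.
projected = bounding_box_query((1516200, 1541300), (-3867375, -3867350),
                               crs='EPSG:3577')

# With a configured datacube.Datacube instance `dc` (assumed):
# data = dc.load(product='ls5_nbar_albers', **projected)
```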
The time dimension can be specified using a tuple of datetime objects or strings in YYYY-MM-DD hh:mm:ss format. E.g.:
time=('2001-04', '2001-07')
For EO-specific datasets that are based around scenes, the time dimension can be reduced to the day level, using solar day to keep scenes together.
group_by='solar_day'
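To see why solar day keeps scenes together, consider two acquisitions taken minutes apart on either side of UTC midnight. The sketch below approximates local solar time by shifting UTC by 240 seconds per degree of longitude (24 hours over 360 degrees). This offset rule and the function names are assumptions made for illustration; the library's own grouping may differ in detail.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SECONDS_PER_DEGREE = 240  # 24 h * 3600 s / 360 degrees of longitude


def approx_solar_day(utc_time, longitude):
    """Return the date of a UTC timestamp shifted to mean solar time."""
    offset = timedelta(seconds=int(longitude * SECONDS_PER_DEGREE))
    return (utc_time + offset).date()


def group_by_solar_day(scenes):
    """Group (utc_time, longitude) pairs by approximate solar day."""
    groups = defaultdict(list)
    for utc_time, longitude in scenes:
        groups[approx_solar_day(utc_time, longitude)].append(utc_time)
    return dict(groups)


# Two scenes from the same overpass, straddling UTC midnight.
scenes = [
    (datetime(2001, 4, 1, 23, 50), 148.3),
    (datetime(2001, 4, 2, 0, 10), 148.3),
]
groups = group_by_solar_day(scenes)
utc_dates = {utc_time.date() for utc_time, _ in scenes}
```

Grouping by UTC date would split these scenes across two days, while the solar-day grouping places both in a single group.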
For data that has different values in the scene overlap and requires more complex rules for combining data, such as GA’s Pixel Quality dataset, a function can be provided (as fuse_func) to merge the data into a single time slice.
See datacube.helpers.ga_pq_fuser() for an example implementation.
- Output
To reproject or resample the data, supply the output_crs, resolution, resampling and align fields.
By default, the resampling method is ‘nearest’. However, any stored overview layers may be used when down-sampling, which may override (or hybridise) the choice of resampling method.
To reproject data to 25m resolution for EPSG:3577:
dc.load(product='ls5_nbar_albers', x=(148.15, 148.2), y=(-35.15, -35.2), time=('1990', '1991'), output_crs='EPSG:3577', resolution=(-25, 25), resampling='cubic')
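The resampling argument may also be a dict keyed by band name, with '*' standing for all other bands (see Parameters). The lookup rule can be sketched as below. `resampling_for` is a hypothetical helper that only mirrors the documented behaviour; falling back to 'nearest' when a dict has neither the band nor '*' is an assumption based on the stated default.

```python
def resampling_for(band, resampling=None):
    """Return the resampling method that applies to `band`.

    `resampling` may be None, a single method name, or a dict
    mapping band names (and optionally '*') to method names.
    """
    if resampling is None:
        return 'nearest'              # documented default
    if isinstance(resampling, str):
        return resampling             # one method for every band
    if band in resampling:
        return resampling[band]       # explicit per-band choice
    # '*' covers all other bands; 'nearest' if absent (assumption).
    return resampling.get('*', 'nearest')


spec = {'*': 'cubic', 'fmask': 'nearest'}
methods = {band: resampling_for(band, spec) for band in ('red', 'nir', 'fmask')}
```

Here the reflectance bands resolve to cubic, while the categorical fmask band keeps nearest so that its class codes are not interpolated.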
- Parameters
product (str) – the product to be included.
measurements (list(str), optional) –
Measurement name or list of names to be included, as listed in list_measurements().
If a list is specified, the measurements will be returned in the order requested. By default all available measurements are included.
query – Search parameters for products and dimension ranges as described above.
output_crs (str) – The CRS of the returned data. If no CRS is supplied, the CRS of the stored data is used.
resolution ((float, float)) –
A tuple of the spatial resolution of the returned data. This includes the direction (as indicated by a positive or negative number).
Typically when using most CRSs, the first number would be negative.
resampling (str|dict) –
The resampling method to use if re-projection is required. This could be a string or a dictionary mapping band name to resampling mode. When using a dict use '*' to indicate “apply to all other bands”, for example {'*': 'cubic', 'fmask': 'nearest'} would use cubic for all bands except fmask for which nearest will be used.
Valid values are:
'nearest', 'cubic', 'bilinear', 'cubic_spline', 'lanczos', 'average', 'mode', 'gauss', 'max', 'min', 'med', 'q1', 'q3'
Default is to use nearest for all bands.
See also: load_data()
align ((float, float), optional) –
Load data such that point ‘align’ lies on the pixel boundary. Units are in the co-ordinate space of the output CRS.
Default is (0,0)
dask_chunks (dict) –
If the data should be lazily loaded using dask.array.Array, specify the chunking size in each output dimension.
See the documentation on using xarray with dask for more information.
like (xarray.Dataset) –
Uses the output of a previous load() to form the basis of a request for another product. E.g.:
pq = dc.load(product='ls5_pq_albers', like=nbar_dataset)
group_by (str) – When specified, perform basic combining/reducing of the data.
fuse_func – Function used to fuse/combine/reduce data with the group_by parameter. By default, data is simply copied over the top of each other, in a relatively undefined manner. This function can perform a specific combining step, e.g. for combining GA PQ data. This can be a dictionary if different fusers are needed per band.
datasets – Optional. If this is a non-empty list of datacube.model.Dataset objects, these will be loaded instead of performing a database lookup.
limit (int) – Optional. If provided, limit the maximum number of datasets returned. Useful for testing and debugging.
progress_cbk – Int, Int -> None. If supplied, it will be called for every file read with files_processed_so_far, total_files. This is only applicable to non-lazy loads and is ignored when using dask.
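A progress_cbk only needs to accept the two integers described above. The sketch below builds a callback that reports once per percentage step instead of once per file. `make_progress_cbk` and its arguments are hypothetical names, and the loop simulates the calls that load() would make during a non-lazy load; the exact sequence of calls made by the library is an assumption.

```python
def make_progress_cbk(report, step_percent=10):
    """Return a callback reporting each time `step_percent` more is done."""
    last_step = -1

    def progress_cbk(files_processed_so_far, total_files):
        nonlocal last_step
        if total_files <= 0:
            return
        step = (100 * files_processed_so_far // total_files) // step_percent
        if step > last_step:
            last_step = step
            report(f"{files_processed_so_far}/{total_files} files read")

    return progress_cbk


messages = []
cbk = make_progress_cbk(messages.append, step_percent=50)
for done in range(1, 5):  # simulate four file reads
    cbk(done, 4)

# With a configured datacube.Datacube instance `dc` (assumed):
# dc.load(product='ls5_nbar_albers', progress_cbk=make_progress_cbk(print), ...)
```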
- Returns
Requested data in an xarray.Dataset
- Return type