Dataset Searching & Querying
Finding Datasets
Individual datasets belonging to a product can be found with the find_datasets method of a Datacube instance. For example, we can retrieve a single dataset from the ls9_sr product by setting limit=1:
[1]:
import datacube
dc = datacube.Datacube(app="my_analysis")
datasets = dc.find_datasets(product="ls9_sr", limit=1)
datasets
[1]:
[Dataset <id=d853931f-f37d-5ed0-98a9-20753caf97f8 product=ls9_sr location=s3://deafrica-landsat/collection02/level-2/standard/oli-tirs/2022/177/042/LC09_L2SP_177042_20220304_20220306_02_T1/LC09_L2SP_177042_20220304_20220306_02_T1_SR_stac.json>]
We can also search for datasets within a specific spatial extent and time period. To do this, we supply a spatiotemporal query: a range of x and y coordinates defining the area to search (given here as longitude and latitude in degrees) and a range of times. dc.find_datasets() then returns only the datasets that match this query:
[2]:
datasets = dc.find_datasets(
    product="ls9_sr",
    x=(29.0, 29.01),
    y=(25.0, 25.01),
    time=("2022-01-01", "2022-02-01"),
)
datasets
[2]:
[Dataset <id=8a7ae87d-2032-527f-93af-bb6a59c4f972 product=ls9_sr location=s3://deafrica-landsat/collection02/level-2/standard/oli-tirs/2022/177/043/LC09_L2SP_177043_20220131_20220202_02_T1/LC09_L2SP_177043_20220131_20220202_02_T1_SR_stac.json>,
Dataset <id=e83c49c0-a10a-57e4-846b-e07e2ebe1a74 product=ls9_sr location=s3://deafrica-landsat/collection02/level-2/standard/oli-tirs/2022/177/043/LC09_L2SP_177043_20220115_20220118_02_T1/LC09_L2SP_177043_20220115_20220118_02_T1_SR_stac.json>]
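When the same search is reused in several places, it can help to assemble the query once as a dictionary and unpack it into the call. The sketch below is only an illustration: build_query is a hypothetical helper, not part of the datacube API, and the final call is left as a comment because it needs a configured datacube.

```python
def build_query(product, lon_range, lat_range, time_range):
    """Assemble keyword arguments for a spatiotemporal dataset search.

    Hypothetical helper for illustration; not part of datacube itself.
    """
    return {
        "product": product,
        # Sort each range so the (min, max) order does not depend on the caller.
        "x": tuple(sorted(lon_range)),
        "y": tuple(sorted(lat_range)),
        "time": tuple(time_range),
    }


query = build_query(
    "ls9_sr",
    lon_range=(29.01, 29.0),
    lat_range=(25.0, 25.01),
    time_range=("2022-01-01", "2022-02-01"),
)
print(query)

# With a configured datacube, the dictionary could be unpacked into the search:
# datasets = dc.find_datasets(**query)
```

Keeping the query in one place means the product, extent and time range only need to be changed once.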
Inspecting Datasets
Dataset objects contain important metadata that are required for loading and interpreting datacube data. These include the dataset’s URIs:
[3]:
datasets[0].uris
[3]:
['s3://deafrica-landsat/collection02/level-2/standard/oli-tirs/2022/177/043/LC09_L2SP_177043_20220131_20220202_02_T1/LC09_L2SP_177043_20220131_20220202_02_T1_SR_stac.json']
The measurements available within the dataset, returned as a dictionary that maps each measurement name to the path of its file:
[4]:
datasets[0].measurements
[4]:
{'SR_B1': {'path': 'LC09_L2SP_177043_20220131_20220202_02_T1_SR_B1.TIF'},
'SR_B2': {'path': 'LC09_L2SP_177043_20220131_20220202_02_T1_SR_B2.TIF'},
'SR_B3': {'path': 'LC09_L2SP_177043_20220131_20220202_02_T1_SR_B3.TIF'},
'SR_B4': {'path': 'LC09_L2SP_177043_20220131_20220202_02_T1_SR_B4.TIF'},
'SR_B5': {'path': 'LC09_L2SP_177043_20220131_20220202_02_T1_SR_B5.TIF'},
'SR_B6': {'path': 'LC09_L2SP_177043_20220131_20220202_02_T1_SR_B6.TIF'},
'SR_B7': {'path': 'LC09_L2SP_177043_20220131_20220202_02_T1_SR_B7.TIF'},
'QA_PIXEL': {'path': 'LC09_L2SP_177043_20220131_20220202_02_T1_QA_PIXEL.TIF'},
'QA_RADSAT': {'path': 'LC09_L2SP_177043_20220131_20220202_02_T1_QA_RADSAT.TIF'},
'SR_QA_AEROSOL': {'path': 'LC09_L2SP_177043_20220131_20220202_02_T1_SR_QA_AEROSOL.TIF'}}
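The measurement paths above are file names rather than full locations. As a rough sketch, and assuming each path is relative to the directory that holds the dataset's metadata document, a full location can be built by combining the path with the dataset URI. The helper name measurement_urls is hypothetical, and the inputs are copied from the outputs shown above.

```python
def measurement_urls(dataset_uri, measurements):
    """Join each relative measurement path onto the dataset URI's directory.

    Assumes the paths are relative to the metadata document's location.
    """
    base = dataset_uri.rsplit("/", 1)[0]  # drop the metadata file name
    return {name: f"{base}/{info['path']}" for name, info in measurements.items()}


# Values copied from datasets[0].uris and datasets[0].measurements above.
uri = (
    "s3://deafrica-landsat/collection02/level-2/standard/oli-tirs/2022/177/043/"
    "LC09_L2SP_177043_20220131_20220202_02_T1/"
    "LC09_L2SP_177043_20220131_20220202_02_T1_SR_stac.json"
)
measurements = {
    "SR_B1": {"path": "LC09_L2SP_177043_20220131_20220202_02_T1_SR_B1.TIF"},
    "QA_PIXEL": {"path": "LC09_L2SP_177043_20220131_20220202_02_T1_QA_PIXEL.TIF"},
}

urls = measurement_urls(uri, measurements)
for name, url in urls.items():
    print(name, url)
```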
The dataset’s native coordinate reference system (CRS) and geotransform:
[5]:
datasets[0].crs
[5]:
CRS('epsg:32635')
[6]:
datasets[0].transform
[6]:
Affine(226830.0, 0.0, 581385.0,
0.0, -231030.0, 2831715.0)
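The six coefficients follow the usual affine layout (a, b, c, d, e, f), where x = a*col + b*row + c and y = d*col + e*row + f. The plain-Python sketch below applies that formula to the values printed above, without needing the affine package; apply_affine is a hypothetical helper written for this illustration.

```python
def apply_affine(coeffs, col, row):
    """Map a (col, row) position to (x, y) using six affine coefficients."""
    a, b, c, d, e, f = coeffs
    return (a * col + b * row + c, d * col + e * row + f)


# Coefficients copied from the datasets[0].transform output above.
coeffs = (226830.0, 0.0, 581385.0, 0.0, -231030.0, 2831715.0)

upper_left = apply_affine(coeffs, 0, 0)
lower_right = apply_affine(coeffs, 1, 1)
print(upper_left)   # (581385.0, 2831715.0)
print(lower_right)  # (808215.0, 2600685.0)
```

The scale terms are roughly 227 km and 231 km in the units of EPSG:32635 (metres). This suggests the transform spans the dataset's whole footprint rather than describing a single 30 m pixel. Treat that reading as an interpretation of the numbers rather than a documented guarantee.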
Other metadata fields that can be used to search for and filter data are accessible through the metadata property:
[7]:
dir(datasets[0].metadata)
[7]:
['cloud_cover',
'collection_category',
'creation_dt',
'creation_time',
'crs_raw',
'data_coverage',
'eo_gsd',
'eo_sun_azimuth',
'eo_sun_elevation',
'format',
'grid_spatial',
'id',
'instrument',
'label',
'lat',
'lon',
'measurements',
'platform',
'product_family',
'region_code',
'rmse',
'rmse_x',
'rmse_y',
'sat_orbit_state',
'sat_relative_orbit',
'sources',
'time']
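Fields such as cloud_cover can be read as attributes of the metadata property and used to filter a list of datasets after a search. The sketch below shows the idea with stand-in objects and made-up cloud cover values so that it runs without a datacube; filter_by_cloud_cover is a hypothetical helper, and in practice the list would come from dc.find_datasets().

```python
from types import SimpleNamespace


def filter_by_cloud_cover(datasets, max_cloud_cover):
    """Keep datasets whose metadata.cloud_cover is at or below the threshold."""
    kept = []
    for ds in datasets:
        cloud = getattr(ds.metadata, "cloud_cover", None)
        # Skip datasets that do not report a cloud cover value.
        if cloud is not None and cloud <= max_cloud_cover:
            kept.append(ds)
    return kept


# Stand-in objects with made-up values, used only to exercise the helper.
fake_datasets = [
    SimpleNamespace(id="a", metadata=SimpleNamespace(cloud_cover=3.2)),
    SimpleNamespace(id="b", metadata=SimpleNamespace(cloud_cover=45.0)),
    SimpleNamespace(id="c", metadata=SimpleNamespace(cloud_cover=None)),
]

clear = filter_by_cloud_cover(fake_datasets, max_cloud_cover=10)
print([ds.id for ds in clear])  # ['a']
```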