datacube.index.abstract#
- class datacube.index.abstract.AbstractDatasetResource(index)[source]#
Abstract base class for the Dataset portion of an index api.
All DatasetResource implementations should inherit from this base class and implement all abstract methods.
(If a particular abstract method is not applicable for a particular implementation, raise a NotImplementedError)
- abstract add(dataset, with_lineage=True, archive_less_mature=None)[source]#
Add dataset to the index. No-op if it is already present.
- Parameters:
dataset (Dataset) – Unpersisted dataset model
with_lineage (bool) – True (default): attempt adding lineage datasets if missing. False: record lineage relations, but do not attempt adding lineage datasets to the db.
archive_less_mature (int | None) – if integer, search for less mature versions of the dataset with the int value as a millisecond delta in timestamp comparison
- Return type:
Dataset
- Returns:
Persisted Dataset model
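A minimal usage sketch (assumes a configured ODC environment reachable via datacube.Datacube(); the dataset model ds and the 500 ms delta are illustrative):

```python
# Sketch: index a dataset, archiving any less-mature duplicates within
# 500 ms. `ds` is a hypothetical unpersisted datacube.model.Dataset.
from datacube import Datacube

dc = Datacube()  # connects using the default ODC environment
persisted = dc.index.datasets.add(ds, with_lineage=True,
                                  archive_less_mature=500)
print(persisted.id)
```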
- archive_less_mature(ds, delta=500)[source]#
Archive less mature versions of a dataset
- Parameters:
ds (Dataset) – dataset whose less mature versions are to be archived
delta (int | bool) – millisecond delta for timestamp comparison. If True, default to 500ms. If False, do not find or archive less mature datasets. Bool value accepted only for improving backwards compatibility, int preferred.
- Return type:
- abstract archive_location(id_, uri)[source]#
Archive a location of the dataset if it exists and is active.
- bulk_add(datasets, batch_size=1000)[source]#
Add a group of Dataset documents in bulk.
API Note: This API method is not finalised and may be subject to change.
- Parameters:
datasets (Iterable[DatasetTuple]) – An Iterable of DatasetTuples (i.e. as returned by get_all_docs)
batch_size (int) – Number of datasets to add per batch (default 1000)
- Return type:
BatchStatus
- Returns:
BatchStatus named tuple, with safe set to None.
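get_all_docs and bulk_add can be combined to stream datasets between indexes. A sketch, assuming src and dst are hypothetical, already-connected Index objects and the product name is illustrative:

```python
# Sketch: bulk-copy datasets from one index into another.
docs = src.datasets.get_all_docs(products=["ls8_level1_scene"])
status = dst.datasets.bulk_add(docs, batch_size=500)
print(f"{status.completed} added, {status.skipped} skipped "
      f"in {status.seconds_elapsed:.1f}s")
```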
- abstract bulk_has(ids_)[source]#
Like has but operates on multiple ids.
For every supplied id check if database contains a dataset with that id.
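A sketch of a batched presence check (the UUID is illustrative; results are assumed to come back in the same order as the supplied ids):

```python
# Sketch: check which of a batch of dataset ids are indexed.
from uuid import UUID
from datacube import Datacube

dc = Datacube()
ids = [UUID("f7018d80-8807-11ec-a8a3-0242ac120002")]  # illustrative id
present = dict(zip(ids, dc.index.datasets.bulk_has(ids)))
print(present)
```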
- abstract can_update(dataset, updates_allowed=None)[source]#
Check if dataset can be updated. Return bool, safe_changes, unsafe_changes
- Parameters:
dataset (Dataset) – Dataset to update
updates_allowed (Optional[Mapping[Tuple[Union[str, int], ...], Callable[[Tuple[Union[str, int], ...], Tuple[Union[str, int], ...], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]]], bool]]]) – Allowed updates
- Return type:
tuple[bool, Iterable[Tuple[Tuple[Union[str, int], ...], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]]]], Iterable[Tuple[Tuple[Union[str, int], ...], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]]]]]
- Returns:
Tuple of: boolean (can/can’t update); safe changes; unsafe changes
- abstract count(archived=False, **query)[source]#
Perform a search, returning count of results.
- Parameters:
archived (bool | None) – False (default): Count active datasets only. None: Count archived and active datasets. True: Count archived datasets only.
geopolygon – Spatial search polygon (only supported if index supports_spatial_indexes)
query (str | float | int | Range | datetime | Not) – search query parameters
- Return type:
int
- Returns:
Count of matching datasets in index
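A sketch of a simple count (product and time values are illustrative):

```python
# Sketch: count active datasets for a product within a time range.
from datacube import Datacube

dc = Datacube()
n = dc.index.datasets.count(product="ls8_level1_scene",
                            time=("2020-01-01", "2021-01-01"))
print(f"{n} matching datasets")
```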
- abstract count_by_product(archived=False, **query)[source]#
Perform a search, returning a count for each matching product type.
- Parameters:
geopolygon – Spatial search polygon (only supported if index supports_spatial_indexes)
archived (bool | None) – False (default): Count active datasets only. None: Count archived and active datasets. True: Count archived datasets only.
query (str | float | int | Range | datetime | Not) – search query parameters
- Return type:
Iterable[tuple[Product, int]]
- Returns:
Counts of matching datasets in index, grouped by product.
- abstract count_by_product_through_time(period, archived=False, **query)[source]#
Perform a search, returning counts for each product grouped in time slices of the given period.
- Parameters:
period (str) – Time range for each slice: ‘1 month’, ‘1 day’ etc.
archived (bool | None) – False (default): Count active datasets only. None: Count archived and active datasets. True: Count archived datasets only.
query (str | float | int | Range | datetime | Not) – search query parameters
- Return type:
- Returns:
For each matching product type, a list of time ranges and their count.
- abstract count_product_through_time(period, archived=False, **query)[source]#
Perform a search, returning counts for a single product grouped in time slices of the given period.
Will raise an error if the search terms match more than one product.
- Parameters:
period (str) – Time range for each slice: ‘1 month’, ‘1 day’ etc.
archived (bool | None) – False (default): Count active datasets only. None: Count archived and active datasets. True: Count archived datasets only.
geopolygon – Spatial search polygon (only supported if index supports_spatial_indexes)
query (str | float | int | Range | datetime | Not) – search query parameters
- Return type:
- Returns:
The product, a list of time ranges and the count of matching datasets.
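A sketch of monthly counts for one product, assuming the return shape described above (the product, plus (time-range, count) pairs, with Range exposing begin and end):

```python
# Sketch: monthly dataset counts for a single product.
from datacube import Datacube

dc = Datacube()
product, series = dc.index.datasets.count_product_through_time(
    "1 month", product="ls8_level1_scene",   # illustrative query values
    time=("2021-01-01", "2022-01-01"))
for time_range, n in series:
    print(time_range.begin, n)
```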
- find_less_mature(ds, delta=500)[source]#
Find less mature versions of a dataset
- Parameters:
ds (Dataset) – dataset whose less mature versions are to be found
delta (int | bool) – millisecond delta for timestamp comparison. If True, default to 500ms. If None or False, do not find or archive less mature datasets. Bool value accepted only for improving backwards compatibility, int preferred.
- Return type:
Iterable[Dataset]
- Returns:
Iterable of less mature datasets
- get(id_, include_sources=False, include_deriveds=False, max_depth=0)[source]#
Get dataset by id (Return None if id_ does not exist).
Index drivers supporting the legacy lineage API:
- Parameters:
id – id of the dataset to retrieve
include_sources (bool) – include the full provenance tree of the dataset.
Index drivers supporting the external lineage API:
- Parameters:
include_deriveds (bool) – include a tree of datasets derived from the requested dataset
max_depth (int) – maximum depth of the returned lineage tree(s) (0 = unlimited)
- Returns:
Dataset model (None if not found)
- abstract get_all_dataset_ids(archived)[source]#
Get all dataset IDs based only on archived status
This will be very slow and inefficient for large databases, and is really only intended for small and/or experimental databases.
- get_all_docs(products=None, batch_size=1000)[source]#
Return all datasets in bulk, filtering by product names only. Do not instantiate models. Archived datasets and locations are excluded.
API Note: This API method is not finalised and may be subject to change.
- abstract get_archived_location_times(id_)[source]#
Get each archived location along with the time it was archived.
- abstract get_datasets_for_location(uri, mode=None)[source]#
Find datasets that exist at the given URI
- get_field_names(product_name=None)[source]#
Get the list of possible search fields for a Product (or all products)
- get_product_time_bounds(product)[source]#
Returns the minimum and maximum acquisition time of the product.
- abstract get_unsafe(id_, include_sources=False, include_deriveds=False, max_depth=0)[source]#
Get dataset by id (Raises KeyError if id_ does not exist)
Index drivers supporting the legacy lineage API:
- Parameters:
id – id of the dataset to retrieve
include_sources (bool) – include the full provenance tree of the dataset.
Index drivers supporting the external lineage API:
- Parameters:
include_deriveds (bool) – include a tree of datasets derived from the requested dataset
max_depth (int) – maximum depth of the returned lineage tree(s) (0 = unlimited)
- Returns:
Dataset model
- Raises:
KeyError – if not found
- abstract has(id_)[source]#
Is this dataset in this index?
- Parameters:
id – dataset id
- Return type:
bool
- Returns:
True if the dataset exists in this index
- abstract search(limit=None, source_filter=None, archived=False, order_by=None, **query)[source]#
Perform a search, returning results as Dataset objects.
Prior to datacube-1.9.0, search always returned datasets sorted by product. From 1.9 onwards, ordering of results is unspecified and may vary between index drivers.
- Parameters:
limit (int | None) – Limit number of datasets per product (None/default = unlimited)
archived (bool | None) – False (default): Return active datasets only. None: Include archived and active datasets. True: Return archived datasets only.
order_by (Optional[Iterable[Any]]) – field or expression by which to order results
geopolygon – Spatial search polygon (only supported if index supports_spatial_indexes)
query (str | float | int | Range | datetime | Not) – search query parameters
- Return type:
Iterable[Dataset]
- Returns:
Matching datasets
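A sketch of a lazy dataset search (query values are illustrative):

```python
# Sketch: iterate matching datasets without materialising them all.
from datacube import Datacube

dc = Datacube()
for ds in dc.index.datasets.search(product="ls8_level1_scene",
                                   time=("2021-01-01", "2021-02-01"),
                                   limit=10):
    print(ds.id, ds.time)
```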
- abstract search_by_metadata(metadata, archived=False)[source]#
Perform a search using arbitrary metadata, returning results as Dataset objects.
Caution – slow! This will usually not use indexes.
- Parameters:
metadata (dict[str, None | bool | str | float | int | list[JsonLike] | dict[str, JsonLike]]) – metadata dictionary representing arbitrary search query
archived (bool | None) – False (default): Return active datasets only. None: Include archived and active datasets. True: Return archived datasets only.
- Return type:
Iterable[Dataset]
- Returns:
Matching dataset models
- abstract search_by_product(archived=False, **query)[source]#
Perform a search, returning datasets grouped by product type.
- Parameters:
archived (bool | None) – False (default): Return active datasets only. None: Include archived and active datasets. True: Return archived datasets only.
geopolygon – Spatial search polygon (only supported if index supports_spatial_indexes)
query (str | float | int | Range | datetime | Not) – search query parameters
- Return type:
- Returns:
Matching datasets, grouped by Product
- abstract search_product_duplicates(product, *args)[source]#
Find dataset ids which have duplicates of the given set of field names.
(Search is always restricted by Product)
Returns a generator yielding tuples, each containing a namedtuple of the values of the supplied fields and the datasets that match those values.
- abstract search_returning(field_names=None, custom_offsets=None, limit=None, archived=False, order_by=None, **query)[source]#
Perform a search, returning only the specified fields.
This method can be faster than normal search() if you don’t need all fields of each dataset.
It also allows for returning rows other than datasets, such as a row per uri when requesting field ‘uri’.
- Parameters:
field_names (Optional[Iterable[str]]) – Names of desired fields (default = all known search fields, unless custom_offsets is set, see below)
custom_offsets (Optional[Mapping[str, Tuple[Union[str, int], ...]]]) – A dictionary of offsets in the metadata doc for custom fields. Custom offsets are returned in addition to fields named in field_names. Default is None, field_names only. If field_names is None and custom_offsets are provided, only the custom offsets are included, overriding the normal field_names default.
limit (int | None) – Limit number of datasets (None/default = unlimited)
archived (bool | None) – False (default): Return active datasets only. None: Include archived and active datasets. True: Return archived datasets only.
order_by (Optional[Iterable[Any]]) – a field name, field, function or clause by which to sort output. None is unsorted and may allow faster return of first result depending on the index driver’s implementation.
geopolygon – Spatial search polygon (only supported if index supports_spatial_indexes)
query (str | float | int | Range | datetime | Not) – search query parameters
- Return type:
- Returns:
Namedtuple of requested fields, for each matching dataset.
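A sketch that fetches just two search fields per matching dataset (field and query names are illustrative):

```python
# Sketch: lightweight search returning namedtuples, not Dataset models.
from datacube import Datacube

dc = Datacube()
for row in dc.index.datasets.search_returning(
        field_names=("id", "time"), product="ls8_level1_scene"):
    print(row.id, row.time)
```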
- abstract search_returning_datasets_light(field_names, custom_offsets=None, limit=None, archived=False, **query)[source]#
This is a dataset search function that returns the results as objects of a dynamically generated Dataset class that is a subclass of tuple.
Only the requested fields will be returned, together with related derived attributes as property functions similar to the datacube.model.Dataset class. For example, if ‘extent’ is requested then all of ‘crs’, ‘extent’, ‘transform’, and ‘bounds’ are available as property functions.
The field_names can be custom fields in addition to those specified in metadata_type, fixed fields, or native fields. The field_names can also be derived fields like ‘extent’, ‘crs’, ‘transform’, and ‘bounds’. The custom fields require custom offsets of the metadata doc to be provided.
The datasets can be selected based on values of custom fields as long as relevant custom offsets are provided. However, custom field values are not transformed, so they must match what is stored in the database.
- Parameters:
field_names (tuple[str, ...]) – A tuple of field names to be returned, including derived fields such as extent, crs
custom_offsets (Optional[Mapping[str, Tuple[Union[str, int], ...]]]) – A dictionary of offsets in the metadata doc for custom fields
limit (int | None) – Number of datasets returned per product.
archived (bool | None) – False (default): Return active datasets only. None: Return archived and active datasets. True: Return archived datasets only.
query (str | float | int | Range | datetime | Not) – query parameters that will be processed against metadata_types, product definitions and/or dataset table.
- Return type:
- Returns:
A dynamically generated DatasetLight (a subclass of namedtuple, possibly with property functions).
- abstract search_summaries(**query)[source]#
Perform a search, returning just the search fields of each dataset.
- abstract spatial_extent(ids, crs=CRS('EPSG:4326'))[source]#
Return the combined spatial extent of the nominated datasets
Uses spatial index.
Returns None if no index for the CRS, or if no identified datasets are indexed in the relevant spatial index. Result will not include extents of datasets that cannot be validly projected into the CRS.
- abstract temporal_extent(ids)[source]#
Returns the minimum and maximum acquisition time of an iterable of dataset ids.
Raises KeyError if none of the datasets are in the index
- abstract update(dataset, updates_allowed=None, archive_less_mature=None)[source]#
Update dataset metadata and location
- Parameters:
dataset (Dataset) – Dataset model with unpersisted updates
updates_allowed (Optional[Mapping[Tuple[Union[str, int], ...], Callable[[Tuple[Union[str, int], ...], Tuple[Union[str, int], ...], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]]], bool]]]) – Allowed updates
archive_less_mature (int | None) – Find and archive less mature datasets with ms delta
- Return type:
Dataset
- Returns:
Persisted dataset model
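can_update and update are typically paired. A sketch, where updated is a hypothetical Dataset model carrying unpersisted metadata edits:

```python
# Sketch: only persist changes the index considers safe.
from datacube import Datacube

dc = Datacube()
ok, safe, unsafe = dc.index.datasets.can_update(updated)
if ok:
    dc.index.datasets.update(updated)
else:
    print("unsafe changes:", list(unsafe))
```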
- class datacube.index.abstract.AbstractIndex[source]#
Abstract base class for an Index. All Index implementations should inherit from this base class, and implement all abstract methods (and override other methods and contract flags as required).
- clone(origin_index, batch_size=1000, skip_lineage=False, lineage_only=False)[source]#
Clone an existing index into this one.
Steps are:
1. Clone all metadata types compatible with this index driver. Products and datasets with incompatible metadata types are excluded from subsequent steps. Existing metadata types are skipped, but products and datasets associated with them are only excluded if the existing metadata type does not match the one from the origin index.
2. Clone all products with “safe” metadata types. Products are included or excluded by metadata type as discussed above. Existing products are skipped, but datasets associated with them are only excluded if the existing product definition does not match the one from the origin index.
3. Clone all datasets with “safe” products. Datasets are included or excluded by product and metadata type, as discussed above. Archived datasets and locations are not cloned.
4. Clone all lineage relations that can be cloned. All lineage relations are skipped if either index driver does not support lineage, or if skip_lineage is True. If this index does not support external lineage, then lineage relations that reference datasets that do not exist in this index after step 3 above are skipped.
API Note: This API method is not finalised and may be subject to change.
- Parameters:
origin_index (AbstractIndex) – Index whose contents we wish to clone.
batch_size (int) – Maximum number of objects to write to the database in one go.
skip_lineage (bool) – If True, do not clone lineage relations (step 4 is skipped).
lineage_only (bool) – If True, clone lineage relations only (steps 1–3 are skipped).
- Return type:
- Returns:
Dictionary containing a BatchStatus named tuple for “metadata_types”, “products” and “datasets”, and optionally “lineage”.
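A sketch of a whole-index clone, assuming src_index and dst_index are hypothetical connected Index objects:

```python
# Sketch: clone metadata types, products, datasets (and maybe lineage).
results = dst_index.clone(src_index, batch_size=1000)
for section, status in results.items():
    print(f"{section}: {status.completed} added, {status.skipped} skipped")
```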
- create_spatial_index(crs)[source]#
Create a spatial index for a CRS.
Note that a newly created spatial index is empty. If there are already datasets in the index whose extents can be safely projected into the CRS, then it is necessary to also call update_spatial_index otherwise they will not be found by queries against that CRS.
Only implemented by index drivers with supports_spatial_indexes set to True.
- Parameters:
crs (CRS) – The coordinate reference system to create a spatial index for.
- Return type:
bool
- Returns:
True if the spatial index was successfully created (or already exists)
- abstract property datasets: AbstractDatasetResource#
A Dataset Resource instance for the index
- drop_spatial_index(crs)[source]#
Remove a spatial index from the database.
Note that creating spatial indexes on an existing index is a slow and expensive operation. Do not delete spatial indexes unless you are absolutely certain they are no longer required by any users of this ODC index.
- Parameters:
crs (CRS) – The CRS whose spatial index is to be deleted.
- Return type:
bool
- Returns:
True if the spatial index was successfully dropped. False if the spatial index could not be dropped.
- abstract property environment: ODCEnvironment#
The cfg.ODCEnvironment object this Index was initialised from.
- abstract classmethod from_config(cfg_env, application_name=None, validate_connection=True)[source]#
Instantiate a new index from an ODCEnvironment configuration object
- Return type:
- abstract classmethod get_dataset_fields(doc)[source]#
Return dataset search fields from a metadata type document
- abstract property index_id: str#
- Returns:
Unique ID for this index (e.g. same database/dataset storage + same index driver implementation = same id)
- abstract init_db(with_default_types=True, with_permissions=True)[source]#
Initialise an empty database.
- abstract property lineage: AbstractLineageResource#
A Lineage Resource instance for the index
- abstract property metadata_types: AbstractMetadataTypeResource#
A MetadataType Resource instance for the index
- abstract property products: AbstractProductResource#
A Product Resource instance for the index
- spatial_indexes(refresh=False)[source]#
Return the CRSs for which spatial indexes have been created.
- Parameters:
refresh – If true, query the backend for the list of current spatial indexes. If false (the default) a cached list of spatial index CRSs may be returned.
- Return type:
Iterable[CRS]
- Returns:
An iterable of CRSs for which spatial indexes exist in the index
- thread_transaction()[source]#
- Return type:
- Returns:
The existing Transaction object cached in thread-local storage for this index, if there is one.
- update_spatial_index(crses=[], product_names=[], dataset_ids=[])[source]#
Populate a newly created spatial index (or indexes).
Spatial indexes are automatically populated with new datasets as they are indexed, but if there were datasets already in the index when a new spatial index was created, or if geometries have been added or modified outside of the ODC in a populated index (e.g. with SQL), then the spatial indexes must be updated manually with this method.
This is a very slow operation. The product_names and dataset_ids lists can be used to break the operation up into chunks or allow faster updating when the spatial index is only relevant to a small portion of the entire index.
- Parameters:
crses (Sequence[CRS]) – A list of CRSes whose spatial indexes are to be updated. Default is to update all spatial indexes.
product_names (Sequence[str]) – A list of product names to update the spatial indexes for. Default is to update for all products.
dataset_ids (Sequence[str | UUID]) – A list of ids of specific datasets to update in the spatial index. Default is to update for all datasets (or all datasets in the products in the product_names list).
- Return type:
int
- Returns:
The number of dataset extents processed – i.e. the number of datasets updated multiplied by the number of spatial indexes updated.
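A sketch of the create-then-populate workflow (assumes an index driver with supports_spatial_indexes; the CRS import reflects datacube 1.9+, where geometry types come from odc-geo):

```python
# Sketch: create a spatial index for an equal-area CRS, then populate it.
from odc.geo import CRS
from datacube import Datacube

dc = Datacube()
crs = CRS("EPSG:3577")
if dc.index.create_spatial_index(crs):
    n = dc.index.update_spatial_index(crses=[crs])
    print(f"processed {n} dataset extents")
```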
- abstract property users: AbstractUserResource#
A User Resource instance for the index
- class datacube.index.abstract.AbstractIndexDriver[source]#
Abstract base class for an IndexDriver. All IndexDrivers should inherit from this base class and implement all abstract methods.
- class datacube.index.abstract.AbstractLineageResource(index)[source]#
Abstract base class for the Lineage portion of an index api.
All LineageResource implementations should inherit from this base class.
Note that this is a “new” resource only supported by new index drivers with supports_external_lineage set to True. If a driver does NOT support external lineage, it can extend the NoLineageResource class below, which is a minimal implementation of this resource that raises a NotImplementedError for all methods.
However, any index driver that supports lineage must implement at least the get_all_lineage() and _add_batch() methods.
- abstract add(tree, max_depth=0, allow_updates=False)[source]#
Add or update a LineageTree into the Index.
If the provided tree is inconsistent with lineage data already recorded in the database, by default an InconsistentLineageException is raised. If allow_updates is True, the provided tree is treated as authoritative and the database is updated to match.
- Parameters:
tree (LineageTree) – The LineageTree to add to the index
max_depth (int) – Maximum recursion depth. Default/Zero = unlimited depth
allow_updates (bool) – If False and the tree would require index updates to fully add, then raise an InconsistentLineageException.
- Return type:
- bulk_add(relations, batch_size=1000)[source]#
Add a group of LineageRelation objects in bulk.
API Note: This API method is not finalised and may be subject to change.
- Parameters:
relations (Iterable[LineageRelation]) – An Iterable of LineageRelation objects (i.e. as returned by get_all_lineage)
batch_size (int) – Number of relations to add per batch (default 1000)
- Return type:
BatchStatus
- Returns:
BatchStatus named tuple, with safe set to None.
- abstract clear_home(*args, home=None)[source]#
Clear the home for one or more dataset ids, or all dataset ids that currently have a particular home value.
- abstract get_all_lineage(batch_size=1000)[source]#
Perform a batch-read of all lineage relations (as used by index clone operation) and return as an iterable stream of LineageRelation objects.
API Note: This API method is not finalised and may be subject to change.
- abstract get_derived_tree(id_, max_depth=0)[source]#
- Extract a LineageTree from the index, with:
“id” at the root of the tree.
“derived” direction (i.e. datasets derived from id, datasets derived from datasets derived from id, etc.)
maximum depth as requested (default 0 = unlimited depth)
Tree may be empty (i.e. just the root node) if no lineage for id is stored.
- Parameters:
id – the id of the dataset at the root of the returned tree
max_depth (int) – Maximum recursion depth. Default/Zero = unlimited depth
- Return type:
LineageTree
- Returns:
A derived-direction Lineage tree with id at the root.
- abstract get_homes(*args)[source]#
Obtain a dictionary mapping UUIDs to home strings for the passed in DSIDs.
If a passed in DSID does not have a home set in the database, it will not be included in the returned mapping. i.e. a database index with no homes recorded will always return an empty mapping.
- abstract get_source_tree(id_, max_depth=0)[source]#
- Extract a LineageTree from the index, with:
“id” at the root of the tree.
“source” direction (i.e. datasets id was derived from, the dataset ids THEY were derived from, etc.)
maximum depth as requested (default 0 = unlimited depth)
Tree may be empty (i.e. just the root node) if no lineage for id is stored.
- Parameters:
id – the id of the dataset at the root of the returned tree
max_depth (int) – Maximum recursion depth. Default/Zero = unlimited depth
- Return type:
LineageTree
- Returns:
A source-direction Lineage tree with id at the root.
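A sketch of reading two levels of source lineage (the UUID is illustrative, and dataset_id/direction are assumed LineageTree attributes):

```python
# Sketch: fetch a source-direction lineage tree for one dataset.
from uuid import UUID
from datacube import Datacube

dc = Datacube()
tree = dc.index.lineage.get_source_tree(
    UUID("f7018d80-8807-11ec-a8a3-0242ac120002"), max_depth=2)
print(tree.dataset_id, tree.direction)
```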
- abstract merge(rels, allow_updates=False, validate_only=False)[source]#
Merge an entire LineageRelations collection into the database.
- Parameters:
rels (LineageRelations) – The LineageRelations collection to merge.
allow_updates (bool) – If False and merging rels would require index updates, then raise an InconsistentLineageException.
validate_only (bool) – If True, do not actually merge the LineageRelations, just check for inconsistency. allow_updates and validate_only cannot both be True.
- Return type:
- abstract remove(id_, direction, max_depth=0)[source]#
Remove lineage information from the Index.
Removes lineage relation data only. Home values not affected.
- Parameters:
id – The Dataset ID to start removing lineage from.
direction (LineageDirection) – The direction in which to remove lineage (from id_)
max_depth (int) – The maximum depth to which to remove lineage (0/default = no limit)
- Return type:
- class datacube.index.abstract.AbstractMetadataTypeResource[source]#
Abstract base class for the MetadataType portion of an index api.
All MetadataTypeResource implementations should inherit from this base class and implement all abstract methods.
(If a particular abstract method is not applicable for a particular implementation, raise a NotImplementedError)
- abstract add(metadata_type, allow_table_lock=False)[source]#
Add a metadata type to the index.
- Parameters:
metadata_type (MetadataType) – Unpersisted Metadatatype model
allow_table_lock (bool) – Allow an exclusive lock to be taken on the table while creating the indexes. This will halt other users’ requests until completed. If false, creation will be slightly slower and cannot be done in a transaction. Raises NotImplementedError if set to True and this behaviour is not applicable for the implementing driver.
- Return type:
MetadataType
- Returns:
Persisted Metadatatype model.
- bulk_add(metadata_docs, batch_size=1000)[source]#
Add a group of Metadata Type documents in bulk.
API Note: This API method is not finalised and may be subject to change.
- Parameters:
metadata_docs (Iterable[dict[str, JsonLike]]) – An iterable of metadata type metadata docs.
batch_size (int) – Number of metadata types to add per batch (default 1000)
- Return type:
BatchStatus
- Returns:
BatchStatus named tuple, with safe containing a list of metadata type names that are safe to include in a subsequent product bulk add.
- abstract can_update(metadata_type, allow_unsafe_updates=False)[source]#
Check if metadata type can be updated. Return bool, safe_changes, unsafe_changes
Safe updates currently allow new search fields to be added, description to be changed.
- Parameters:
metadata_type (MetadataType) – updated MetadataType
allow_unsafe_updates (bool) – Allow unsafe changes. Use with caution.
- Return type:
tuple[bool, Iterable[Tuple[Tuple[Union[str, int], ...], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]]]], Iterable[Tuple[Tuple[Union[str, int], ...], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]]]]]
- Returns:
Tuple of: boolean (can/can’t update); safe changes; unsafe changes
- abstract check_field_indexes(allow_table_lock=False, rebuild_views=False, rebuild_indexes=False)[source]#
Create or replace per-field indexes and views.
May have no effect if not relevant for this index implementation
- Parameters:
allow_table_lock (bool) – Allow an exclusive lock to be taken on the table while creating the indexes. This will halt other users’ requests until completed. If false, creation will be slightly slower and cannot be done in a transaction.
rebuild_views (bool) – whether or not views should be rebuilt
rebuild_indexes (bool) – whether or not indexes should be rebuilt
- Return type:
- get(id_)[source]#
Fetch metadata type by id.
- Return type:
MetadataType | None
- Returns:
MetadataType model or None if not found
- abstract get_all()[source]#
Retrieve all Metadata Types
- Return type:
Iterable[MetadataType]
- Returns:
All available MetadataType models
- get_all_docs()[source]#
Retrieve all Metadata Types as documents only (e.g. for an index clone)
Default implementation calls self.get_all()
API Note: This API method is not finalised and may be subject to change.
- get_by_name(name)[source]#
Fetch metadata type by name.
- Return type:
MetadataType | None
- Returns:
MetadataType model or None if not found
- abstract get_by_name_unsafe(name)[source]#
Fetch metadata type by name
- abstract get_unsafe(id_)[source]#
Fetch metadata type by id
- Parameters:
id
- Return type:
MetadataType
- Returns:
metadata type model
- Raises:
KeyError – if not found
- get_with_fields(field_names)[source]#
Return all metadata types that have all of the named search fields.
- abstract update(metadata_type, allow_unsafe_updates=False, allow_table_lock=False)[source]#
Update a metadata type from the document. Unsafe changes will throw a ValueError by default.
Safe updates currently allow new search fields to be added, description to be changed.
- Parameters:
metadata_type (MetadataType) – MetadataType model with unpersisted updates
allow_unsafe_updates (bool) – Allow unsafe changes. Use with caution.
allow_table_lock (bool) – Allow an exclusive lock to be taken on the table while creating the indexes. This will halt other users’ requests until completed. If false, creation will be slower and cannot be done in a transaction.
- Return type:
MetadataType
- Returns:
Persisted updated MetadataType model
- class datacube.index.abstract.AbstractProductResource(index)[source]#
Abstract base class for the Product portion of an index api.
All ProductResource implementations should inherit from this base class and implement all abstract methods.
(If a particular abstract method is not applicable for a particular implementation, raise a NotImplementedError)
- abstract add(product, allow_table_lock=False)[source]#
Add a product to the index.
- Parameters:
product (Product) – Unpersisted Product model
allow_table_lock (bool) – Allow an exclusive lock to be taken on the table while creating the indexes. This will halt other users’ requests until completed. If false, creation will be slightly slower and cannot be done in a transaction. Raises NotImplementedError if set to True and this behaviour is not applicable for the implementing driver.
- Return type:
Product
- Returns:
Persisted Product model.
- bulk_add(product_docs, metadata_types=None, batch_size=1000)[source]#
Add a group of product documents in bulk.
API Note: This API method is not finalised and may be subject to change.
- Parameters:
product_docs (Iterable[dict[str, None | bool | str | float | int | list[JsonLike] | dict[str, JsonLike]]]) – An iterable of product metadata docs.
batch_size (int) – Number of products to add per batch (default 1000)
metadata_types (dict[str, MetadataType] | None) – Optional dictionary cache of MetadataType objects. Used for product metadata validation, and for filtering. (Metadata types not in this list are skipped.)
- Return type:
BatchStatus
- Returns:
BatchStatus named tuple, with safe containing a list of product names that are safe to include in a subsequent dataset bulk add.
- abstract can_update(product, allow_unsafe_updates=False, allow_table_lock=False)[source]#
Check if product can be updated. Return bool, safe_changes, unsafe_changes
(An unsafe change is anything that may potentially make the product incompatible with existing datasets of that type)
- Parameters:
product (Product) – product to update
allow_unsafe_updates (bool) – Allow unsafe changes. Use with caution.
allow_table_lock (bool) – Allow an exclusive lock to be taken on the table while creating the indexes. This will halt other users’ requests until completed. If false, creation will be slower and cannot be done in a transaction.
- Return type:
tuple[bool, Iterable[Tuple[Tuple[Union[str, int], ...], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]]]], Iterable[Tuple[Tuple[Union[str, int], ...], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]], Union[MissingSentinel, str, int, None, Sequence[Any], Mapping[str, Any]]]]]
- Returns:
Tuple of: boolean (can/can’t update); safe changes; unsafe changes
- abstract delete(products, allow_delete_active=False)[source]#
Delete the specified products.
- Parameters:
products (Iterable[Product]) – the Products to delete
allow_delete_active (bool) – if False (the default), do not delete products with active datasets
- Return type:
- Returns:
list of deleted Products
- from_doc(definition, metadata_type_cache=None)[source]#
Construct unpersisted Product model from product metadata dictionary
- Parameters:
definition (dict[str, None | bool | str | float | int | list[JsonLike] | dict[str, JsonLike]]) – a Product metadata dictionary
metadata_type_cache (dict[str, MetadataType] | None) – a dict cache of MetaDataTypes to use in constructing a Product. MetaDataTypes may come from a different index.
- Return type:
Product
- Returns:
Unpersisted product model
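A sketch of defining a product from a document (“product.yaml” is an illustrative path):

```python
# Sketch: build an unpersisted Product from a definition, then add it.
import yaml
from datacube import Datacube

dc = Datacube()
with open("product.yaml") as f:
    definition = yaml.safe_load(f)
product = dc.index.products.from_doc(definition)
dc.index.products.add(product)
```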
- get_all_docs()[source]#
Retrieve all Product metadata documents. Default implementation calls get_all().
API Note: This API method is not finalised and may be subject to change.
- get_field_names(product=None)[source]#
Get the list of possible search fields for a Product (or all products)
- get_with_types(types)[source]#
Return all products for given metadata types
- Parameters:
types (Iterable[MetadataType]) – An iterable of MetadataType models
- Return type:
Iterable[Product]
- Returns:
An iterable of Product models
- abstract most_recent_change(product)[source]#
Finds the time of the latest change to a dataset belonging to the product. Raises KeyError if the product is not in the index. Returns None if the product has no datasets in the index.
- abstract search_by_metadata(metadata)[source]#
Perform a search using arbitrary metadata, returning results as Product objects.
Caution – slow! This will usually not use indexes.
- Parameters:
metadata (dict[str, None | bool | str | float | int | list[JsonLike] | dict[str, JsonLike]]) – metadata dictionary representing arbitrary search query
- Return type:
Iterable[Product]
- Returns:
Matching product models
- abstract search_robust(**query)[source]#
Return dataset types that match the matchable fields, and a dict of the remaining un-matchable fields.
- abstract spatial_extent(product, crs=CRS('EPSG:4326'))[source]#
Return the combined spatial extent of the nominated product
Uses spatial index.
Returns None if no index for the CRS, or if no datasets for the product in the relevant spatial index, or if the driver does not support the spatial index api.
Result will not include extents of datasets that cannot be validly projected into the CRS.
- abstract temporal_extent(product)[source]#
Returns the minimum and maximum acquisition time of a product. Raises KeyError if the product is not found; RuntimeError if the product has no datasets in the index.
- abstract update(product, allow_unsafe_updates=False, allow_table_lock=False)[source]#
Persist updates to a product. Unsafe changes will throw a ValueError by default.
(An unsafe change is anything that may potentially make the product incompatible with existing datasets of that type)
- Parameters:
product (Product) – Product model with unpersisted updates
allow_unsafe_updates (bool) – Allow unsafe changes. Use with caution.
allow_table_lock (bool) – Allow an exclusive lock to be taken on the table while creating the indexes. This will halt other users’ requests until completed. If false, creation will be slower and cannot be done in a transaction.
- Return type:
Product
- Returns:
Persisted updated Product model
- update_document(definition, allow_unsafe_updates=False, allow_table_lock=False)[source]#
Update a product from a document. Unsafe changes will throw a ValueError by default.
Safe updates currently allow new search fields to be added, description to be changed.
- Parameters:
definition (dict[str, None | bool | str | float | int | list[JsonLike] | dict[str, JsonLike]]) – Updated definition
allow_unsafe_updates (bool) – Allow unsafe changes. Use with caution.
allow_table_lock (bool) – Allow an exclusive lock to be taken on the table while creating the indexes. This will halt other users’ requests until completed. If false, creation will be slower and cannot be done in a transaction.
- Return type:
Product
- Returns:
Persisted updated Product model
- class datacube.index.abstract.AbstractTransaction(index_id)[source]#
Abstract base class for a Transaction Manager. All index implementations should extend this base class.
Thread-local storage and locks ensure one active transaction per index per thread.
- property active#
- Returns:
True if the transaction is active.
- begin()[source]#
Start a new transaction.
Raises an error if a transaction is already active for this thread.
Calls implementation-specific _new_connection() method and manages thread local storage and locks.
- Return type:
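Transactions are normally used via the index’s transaction() context manager rather than by calling begin()/commit() directly. A sketch, assuming a driver that supports transactions and hypothetical Dataset models ds_a and ds_b:

```python
# Sketch: group index writes so they commit or roll back together.
from datacube import Datacube

dc = Datacube()
with dc.index.transaction():
    dc.index.datasets.add(ds_a)
    dc.index.datasets.add(ds_b)
# exiting the block commits; an exception inside rolls back
```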
- class datacube.index.abstract.AbstractUserResource[source]#
Abstract base class for the User portion of an index api.
All UserResource implementations should inherit from this base class and implement all abstract methods.
(If a particular abstract method is not applicable for a particular implementation, raise a NotImplementedError)
- abstract create_user(username, password, role, description=None)[source]#
Create a new user
- Parameters:
username (str) – username of the new user
password (str) – password of the new user
role (str) – default role of the new user
description (str | None) – optional description for the new user
- Return type:
- abstract delete_user(*usernames)[source]#
Delete database users
- Parameters:
usernames (str) – usernames of users to be deleted
- Return type:
- class datacube.index.abstract.BatchStatus(completed: int, skipped: int, seconds_elapsed: float, safe: Iterable[str] | None = None)[source]#
A named tuple representing the results of a batch add operation:
completed: Number of objects added to the index.
skipped: Number of objects skipped, either because they already exist or the documents are invalid for this driver.
seconds_elapsed: seconds elapsed during the bulk add operation.
safe: an optional list of names of bulk added objects that are safe to be used for lower level bulk adds. Includes objects added, and objects skipped because they already exist in the index and are identical to the version being added. May be None for internal functions and for datasets.
Create new instance of BatchStatus(completed, skipped, seconds_elapsed, safe)
- count(value, /)#
Return number of occurrences of value.
- index(value, start=0, stop=9223372036854775807, /)#
Return first index of value.
Raises ValueError if the value is not present.
- class datacube.index.abstract.DatasetTuple(product: Product, metadata: dict[str, None | bool | str | float | int | list[JsonLike] | dict[str, JsonLike]], uri_: str | list[str])[source]#
A named tuple representing a complete dataset:
product: A Product model.
metadata: The dataset metadata document.
uri_: The dataset location or list of locations.
Create new instance of DatasetTuple(product, metadata, uri_)
- count(value, /)#
Return number of occurrences of value.
- index(value, start=0, stop=9223372036854775807, /)#
Return first index of value.
Raises ValueError if the value is not present.
- class datacube.index.abstract.NoLineageResource(index)[source]#
- Minimal implementation of AbstractLineageResource that raises “not implemented” for all methods.
Index drivers that do not support lineage at all may use this implementation as is.
Index drivers that support legacy lineage should extend this implementation and provide implementations of the get_all_lineage() and _add_batch() methods.
- add(tree, max_depth=0, allow_updates=False)[source]#
Add or update a LineageTree into the Index.
If the provided tree is inconsistent with lineage data already recorded in the database, by default an InconsistentLineageException is raised. If allow_updates is True, the provided tree is treated as authoritative and the database is updated to match.
- Parameters:
tree (LineageTree) – The LineageTree to add to the index
max_depth (int) – Maximum recursion depth. Default/Zero = unlimited depth
allow_updates (bool) – If False and the tree would require index updates to fully add, then raise an InconsistentLineageException.
- Return type:
- bulk_add(relations, batch_size=1000)#
Add a group of LineageRelation objects in bulk.
API Note: This API method is not finalised and may be subject to change.
- Parameters:
relations (Iterable[LineageRelation]) – An Iterable of LineageRelation objects (i.e. as returned by get_all_lineage)
batch_size (int) – Number of relations to add per batch (default 1000)
- Return type:
BatchStatus
- Returns:
BatchStatus named tuple, with safe set to None.
- clear_home(*args, home=None)[source]#
Clear the home for one or more dataset ids, or all dataset ids that currently have a particular home value.
- get_all_lineage(batch_size=1000)[source]#
Perform a batch-read of all lineage relations (as used by index clone operation) and return as an iterable stream of LineageRelation objects.
API Note: This API method is not finalised and may be subject to change.
- get_derived_tree(id, max_depth=0)[source]#
- Extract a LineageTree from the index, with:
“id” at the root of the tree.
“derived” direction (i.e. datasets derived from id, datasets derived from datasets derived from id, etc.)
maximum depth as requested (default 0 = unlimited depth)
Tree may be empty (i.e. just the root node) if no lineage for id is stored.
- Parameters:
id – the id of the dataset at the root of the returned tree
max_depth (int) – Maximum recursion depth. Default/Zero = unlimited depth
- Return type:
LineageTree
- Returns:
A derived-direction Lineage tree with id at the root.
- get_homes(*args)[source]#
Obtain a dictionary mapping UUIDs to home strings for the passed in DSIDs.
If a passed in DSID does not have a home set in the database, it will not be included in the returned mapping. i.e. a database index with no homes recorded will always return an empty mapping.
- get_source_tree(id, max_depth=0)[source]#
- Extract a LineageTree from the index, with:
“id” at the root of the tree.
“source” direction (i.e. datasets id was derived from, the dataset ids THEY were derived from, etc.)
maximum depth as requested (default 0 = unlimited depth)
Tree may be empty (i.e. just the root node) if no lineage for id is stored.
- Parameters:
id – the id of the dataset at the root of the returned tree
max_depth (int) – Maximum recursion depth. Default/Zero = unlimited depth
- Return type:
LineageTree
- Returns:
A source-direction Lineage tree with id at the root.
- merge(rels, allow_updates=False, validate_only=False)[source]#
Merge an entire LineageRelations collection into the database.
- Parameters:
rels (LineageRelations) – The LineageRelations collection to merge.
allow_updates (bool) – If False and merging rels would require index updates, then raise an InconsistentLineageException.
validate_only (bool) – If True, do not actually merge the LineageRelations, just check for inconsistency. allow_updates and validate_only cannot both be True.
- Return type:
- remove(id_, direction, max_depth=0)[source]#
Remove lineage information from the Index.
Removes lineage relation data only. Home values not affected.
- Parameters:
id – The Dataset ID to start removing lineage from.
direction (LineageDirection) – The direction in which to remove lineage (from id_)
max_depth (int) – The maximum depth to which to remove lineage (0/default = no limit)
- Return type:
- class datacube.index.abstract.UnhandledTransaction(index_id)[source]#
- property active#
- Returns:
True if the transaction is active.
- begin()#
Start a new transaction.
Raises an error if a transaction is already active for this thread.
Calls implementation-specific _new_connection() method and manages thread local storage and locks.
- Return type:
- commit()#
Commit the transaction.
Raises an error if transaction is not active.
Calls implementation-specific _commit() method, and manages thread local storage and locks.
- Return type:
- datacube.index.abstract.default_metadata_type_docs(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/datacube-core/checkouts/latest/datacube/index/abstract/default-metadata-types.yaml'))[source]#
A list of the bare dictionary format of default datacube.model.MetadataType.
- Return type: