Data Model

Dataset

“The smallest aggregation of data independently described, inventoried, and managed.”

—Definition of “Granule” from NASA EarthData Unified Metadata Model

Examples of ODC Datasets:

  • a Landsat Scene
  • an Albers Equal Area tile portion of a Landsat Scene

Product

Products are collections of datasets that share the same set of measurements and some subset of metadata.

digraph product {
    graph [rankdir=TB];
    node [shape=record,style=filled,fillcolor=gray95];
    edge [dir=back, arrowhead=normal];

    Product -> Measurements [arrowhead=diamond,style=dashed,label="conceptual "];
    GridSpec -> CRS;
    Dataset -> Measurements;
    Product -> Dataset [arrowhead=diamond];
    Product -> GridSpec [label="optional\nshould exist for managed products", style=dashed];
    Dataset -> CRS;

    Dataset [label="{Dataset|+ dataset_type\l+ local_path\l+ bounds\l+ crs\l+ measurements\l+ time\l...|...}"];
    Product [label="{Product/DatasetType|+ name\l+ managed\l+ grid_spec (optional)\l+ dimensions\l...|...}"];
}
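
The relationships in the diagram can be sketched with plain Python dataclasses. This is an illustrative model only, not the classes ODC itself uses: the attribute names follow the diagram, while `tile_size`, the `add()` method and all example values are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class GridSpec:
    """Optional tiling definition for a product (hypothetical fields)."""
    crs: str
    tile_size: tuple


@dataclass
class Dataset:
    """A single independently managed unit of data, e.g. one Landsat scene."""
    dataset_type: str   # name of the product this dataset belongs to
    local_path: str
    crs: str
    measurements: tuple
    time: str


@dataclass
class Product:
    """A collection of datasets sharing the same set of measurements."""
    name: str
    managed: bool
    measurements: tuple
    grid_spec: Optional[GridSpec] = None   # optional, as in the diagram
    datasets: list = field(default_factory=list)

    def add(self, dataset):
        # Enforce the defining rule of a product: every dataset in it
        # carries the same set of measurements.
        if set(dataset.measurements) != set(self.measurements):
            raise ValueError("dataset measurements do not match the product")
        self.datasets.append(dataset)


# Example values are invented for illustration.
scenes = Product(name="ls8_scene", managed=False, measurements=("red", "nir"))
scenes.add(Dataset("ls8_scene", "/data/scene_001", "EPSG:32655",
                   ("red", "nir"), "2015-01-01"))
```

A dataset whose measurements differ from the product's is rejected by `add()`, which is the one constraint the definition above states explicitly.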

Metadata Types

Metadata Types define custom index search fields across products. The default eo metadata type defines fields such as ‘platform’, ‘instrument’ and the spatial bounds.
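
As a rough illustration of the idea, the sketch below treats a metadata type as a mapping from search field names to paths inside a dataset's metadata document. The field offsets, helper functions and sample document are all assumptions made for this example and are not taken from the real eo definition.

```python
# Hypothetical search-field definitions: field name -> path into the document.
EO_LIKE_SEARCH_FIELDS = {
    "platform": ["platform", "code"],
    "instrument": ["instrument", "name"],
}


def resolve_offset(document, offset):
    """Walk a nested dict along `offset`; return None if any key is missing."""
    value = document
    for key in offset:
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def search_fields(document, field_offsets):
    """Extract every defined search field from one metadata document."""
    return {name: resolve_offset(document, offset)
            for name, offset in field_offsets.items()}


def matches(document, field_offsets, **query):
    """True if the document's search fields equal all queried values."""
    values = search_fields(document, field_offsets)
    return all(values.get(name) == wanted for name, wanted in query.items())


# Invented sample document for illustration.
scene_doc = {
    "platform": {"code": "LANDSAT_8"},
    "instrument": {"name": "OLI_TIRS"},
}
```

Because the field names are defined once per metadata type, the same query (for example `platform="LANDSAT_8"`) can be applied to datasets from any product that uses that metadata type.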

How the Index Works

@startuml
title Initialise Database
participant Test
participant Index
participant PostgresDb
participant PostgresDbAPI


note over Index: Entry point is ""connect()""\nin ""index/_api.py""
note over PostgresDb: In ""postgres/_connections.py""
note over PostgresDbAPI: In ""postgres/_api.py""
note over PostgresDbInTransaction,PostgresDbConnection: In ""postgres/_connections.py"""

== Create the Database Connection ==

Test -> Index: index_connect()
create PostgresDb
Index -> PostgresDb: from_config()
activate PostgresDb
    PostgresDb -> PostgresDb: create()
    activate PostgresDb

        PostgresDb -> PostgresDb: _create_engine()
        activate PostgresDb

            PostgresDb -> SQLAlchemy: create_engine()
            SQLAlchemy --> PostgresDb: returns an engine

        deactivate PostgresDb
    deactivate PostgresDb

    PostgresDb --> Index: database ready for use
deactivate PostgresDb
activate Index

    Index --> Test: here, have an Index
deactivate Index

== Use the Database Connection ==

note over Test: Using begin() starts a transaction
Test -> PostgresDb: begin()
activate PostgresDb
    PostgresDb -> PostgresDbInTransaction: _enter__()
    PostgresDbInTransaction -> PostgresDbAPI: Construct
    PostgresDbAPI --> PostgresDbInTransaction: self
    PostgresDbInTransaction --> PostgresDb: return PostgresDbAPI

    PostgresDb --> Test: a PostgresDbAPI

deactivate PostgresDb


note over Test: Using connect() does **NOT** start a transaction
Test -> PostgresDb: connect()
activate PostgresDb
    PostgresDb -> PostgresDbConnection: _enter__()
    PostgresDbConnection -> PostgresDbAPI: Construct
    PostgresDbAPI --> PostgresDbConnection: self
    PostgresDbConnection --> PostgresDb: return PostgresDbAPI

    PostgresDb --> Test: a PostgresDbAPI

deactivate PostgresDb

== Create an Index using the Connection ==

Test -> Index: dunder_init()

Test -> Index: init_db()
activate Index
    Index -> PostgresDb: init()
    activate PostgresDb
        PostgresDb -> _core: ensure_db(self._engine)
        activate _core
            _core -> SQLAlchemy: engine.connect()
            SQLAlchemy --> _core: returns a connection
            database Server
            _core -> Server: create roles
            _core -> Server: create schema
            _core -> Server: create types
            _core -> Server: create tables
            _core -> Server: create grants

            _core --> PostgresDb: True if schema was initialised
        
        deactivate _core

    deactivate PostgresDb
deactivate Index


== Initialise the Database Using an Index ==

'Test -> PostgresDb: init()



@enduml

Fig. 6 Sequence of steps when creating an index
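
The difference between begin() and connect() in the sequence above can be illustrated with a toy model. The sketch uses an in-memory SQLite database from the Python standard library rather than PostgreSQL, and the `ToyDb` and `ToyDbAPI` classes are hypothetical stand-ins, not the real `PostgresDb` and `PostgresDbAPI`. It also assumes the usual commit-on-success, roll-back-on-error behaviour for a transaction, which the diagram itself does not spell out.

```python
import sqlite3
from contextlib import contextmanager


class ToyDbAPI:
    """Hypothetical stand-in for PostgresDbAPI: wraps one live connection."""

    def __init__(self, connection):
        self._connection = connection

    def add_product(self, name):
        self._connection.execute("INSERT INTO product (name) VALUES (?)", (name,))

    def product_names(self):
        rows = self._connection.execute("SELECT name FROM product ORDER BY name")
        return [row[0] for row in rows]


class ToyDb:
    """Hypothetical stand-in for PostgresDb, backed by in-memory SQLite."""

    def __init__(self):
        # isolation_level=None selects autocommit mode, so a transaction
        # exists only where BEGIN is issued explicitly.
        self._connection = sqlite3.connect(":memory:", isolation_level=None)
        self._connection.execute("CREATE TABLE product (name TEXT PRIMARY KEY)")

    @contextmanager
    def connect(self):
        # No transaction: each statement takes effect immediately.
        yield ToyDbAPI(self._connection)

    @contextmanager
    def begin(self):
        # One transaction for the whole block (assumed semantics):
        # commit on success, roll back if the block raises.
        self._connection.execute("BEGIN")
        try:
            yield ToyDbAPI(self._connection)
        except Exception:
            self._connection.execute("ROLLBACK")
            raise
        else:
            self._connection.execute("COMMIT")


def demo():
    db = ToyDb()
    with db.connect() as api:
        api.add_product("ls8_scene")       # persisted straight away
    try:
        with db.begin() as api:
            api.add_product("ls8_albers")  # pending inside the transaction
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass                               # the insert above was rolled back
    with db.connect() as api:
        return api.product_names()
```

In both cases the caller receives the same kind of API object, as in the diagram; only the surrounding context manager decides whether the work is grouped into a transaction.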