Python venv Installation¶
Ubuntu 20.04 includes fairly recent geospatial packages, so it is much more practical to create “native” Python virtual environments for running Datacube. One no longer needs to rely on conda.
Many Python modules are now shipped with pre-compiled binaries, but some still
require compilation during installation. The library for parsing YAML documents
libyaml-dev), and the library for to talking to a PostgreSQL database (
apt-get install -y \ build-essential \ python3-dev \ python3-pip \ python3-venv \ libyaml-dev \ libpq-dev
The Datacube uses the
pyproj geospatial libraries.
Those can be installed in binary form, however it is possible that binary
versions of those libraries are incompatible with each other as they might ship
slightly different versions of
GDAL or other libraries. It is safest to
compile those libraries during installation instead. For that we need to install
geospatial and netcdf libraries and tools. Include fortran, chances are some
numeric lib will need it.
apt-get install -y \ libproj-dev \ proj-bin \ libgdal-dev \ libgeos-dev \ libgeos++-dev \ libudunits2-dev \ libnetcdf-dev \ libhdf4-alt-dev \ libhdf5-serial-dev \ gfortran
Optional packages (useful utilities, docs)
apt-get install postgresql-doc libhdf5-doc netcdf-doc libgdal-doc apt-get install hdf5-tools netcdf-bin gdal-bin pgadmin3
Creating Python Virtual Environment¶
This example uses a virtual environment, as installation into system python is definitely not
recommended. First we create a new virtual environment called
odc and update
some foundational packages.
python3 -m venv odc ./odc/bin/python3 -m pip install -U pip setuptools ./odc/bin/python3 -m pip install -U wheel 'setuptools_scm[toml]' cython
datacube, making sure that important dependencies are compiled
locally to ensure binary compatibility. Version 3 of
pyproj requires a more
recent version of the
PROJ C library than what is available in Ubuntu
repositories, so we limit
pyproj to 2.x.x series.
./odc/bin/python3 -m pip install -U \ 'pyproj==2.*' \ 'datacube[all]' \ --no-binary=rasterio,pyproj,shapely,fiona,psycopg2,netCDF4,h5py
If you omit the
--no-binary=... flag you will get a pre-compiled version of
geospatial libs. Installation will be quicker, but the Python environment will be
somewhat larger due to duplicate copies of some C libraries. More importantly
you might get random segfaults if
incompatible binary dependencies.
Run some basic checks:
./odc/bin/datacube --help ./odc/bin/rio --help
Datacube no longer depends on GDAL Python bindings, but if your code needs them, they can be easily installed like so
./odc/bin/python -m pip install GDAL==$(gdal-config --version)
It is important to install exactly the right version of python bindings, it must
match the version of the system GDAL, hence
Datacube is also available via the
conda-forge channel for installation in a
Conda environment. So if you prefer or need to use Conda rather than system
Python, follow the instructions below:
Download and install Miniconda using the following instructions https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html
Open your favourite terminal to execute the following commands.
Add the conda-forge channel
conda config --add channels conda-forge
The conda-forge channel provides multitude of community maintained packages. Find out more about it here https://conda-forge.org/
Create a virtual environment in conda
conda create --name cubeenv python=3.8 datacube
Activate the virtual environment
source activate cubeenv
Find out more about managing virtual environments here https://conda.io/docs/using/envs.html
Install other packages
conda install jupyter matplotlib scipy
Find out more about managing packages here https://conda.io/docs/using/pkgs.html
Datacube is now installed and can be used in a terminal by activating the cubeenv environment.