Ubuntu
Python venv Installation
Ubuntu 20.04 includes fairly recent geospatial packages, so it is much more practical to create “native” Python virtual environments for running Datacube. One no longer needs to rely on conda.
Required Software
Many Python modules are now shipped with pre-compiled binaries, but some still
require compilation during installation. The library for parsing YAML documents
(libyaml-dev) and the library for talking to a PostgreSQL database (libpq-dev)
are two such examples.
apt-get install -y \
    build-essential \
    python3-dev \
    python3-pip \
    python3-venv \
    libyaml-dev \
    libpq-dev
Datacube uses the rasterio, shapely and pyproj geospatial libraries.
These can be installed in binary form; however, the binary versions may be
incompatible with each other, as they might ship slightly different versions of
GDAL or other libraries. It is safest to compile these libraries during
installation instead. For that we need to install the geospatial and netCDF
libraries and tools. Include a Fortran compiler as well, since some numeric
library is likely to need it.
apt-get install -y \
    libproj-dev \
    proj-bin \
    libgdal-dev \
    libgeos-dev \
    libgeos++-dev \
    libudunits2-dev \
    libnetcdf-dev \
    libhdf4-alt-dev \
    libhdf5-serial-dev \
    gfortran
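Before compiling the Python packages, it can be useful to confirm which versions of the system libraries the build will see. The following is a minimal sketch, assuming the gdal-config and geos-config helper scripts (normally shipped with libgdal-dev and libgeos-dev) are on PATH. report_version is a hypothetical helper name used for illustration only; it is not part of GDAL or GEOS.

```shell
# report_version is a hypothetical helper, not part of any package.
# It prints "<tool>: <version>", or "<tool>: not found" when the tool
# is missing from PATH (for example, before the -dev package is installed).
report_version() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: %s\n' "$1" "$("$1" --version)"
    else
        printf '%s: not found\n' "$1"
    fi
}

report_version gdal-config
report_version geos-config
```

A "not found" line usually means the corresponding development package has not been installed yet.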
Optional packages (useful utilities, docs)
apt-get install postgresql-doc libhdf5-doc netcdf-doc libgdal-doc
apt-get install hdf5-tools netcdf-bin gdal-bin pgadmin3
Creating Python Virtual Environment
This example uses a virtual environment; installing into the system Python is
not recommended. First, create a new virtual environment called odc and update
some foundational packages.
python3 -m venv odc
./odc/bin/python3 -m pip install -U pip setuptools
./odc/bin/python3 -m pip install -U wheel 'setuptools_scm[toml]' cython
Install datacube, making sure that important dependencies are compiled
locally to ensure binary compatibility. Version 3 of pyproj requires a more
recent version of the PROJ C library than is available in the Ubuntu
repositories, so we limit pyproj to the 2.x.x series.
./odc/bin/python3 -m pip install -U \
    'pyproj==2.*' \
    'datacube[all]' \
    --no-binary=rasterio,pyproj,shapely,fiona,psycopg2,netCDF4,h5py
If you omit the --no-binary=... flag, you will get pre-compiled versions of the
geospatial libraries. Installation will be quicker, but the Python environment
will be somewhat larger due to duplicate copies of some C libraries. More
importantly, you might get random segfaults if rasterio and pyproj include
incompatible binary dependencies.
Run some basic checks:
./odc/bin/datacube --help
./odc/bin/rio --help
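The same idea extends to checking that the key Python modules import cleanly inside the new environment, which is where an incompatible binary dependency would typically show up. This is a minimal sketch: check_import is a hypothetical helper, and the module list is an assumption based on the libraries named above.

```shell
# check_import is a hypothetical helper: it runs the given interpreter and
# reports whether the named module can be imported. Any failure, including
# a crash of the interpreter, is reported as FAILED.
check_import() {
    if "$1" -c "import $2" >/dev/null 2>&1; then
        printf '%s: ok\n' "$2"
    else
        printf '%s: FAILED\n' "$2"
    fi
}

# The interpreter path assumes the odc environment created earlier.
for module in datacube rasterio pyproj shapely; do
    check_import ./odc/bin/python3 "$module"
done
```

To see the actual error behind a FAILED line, run the import directly, for example ./odc/bin/python3 -c "import rasterio".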
Datacube no longer depends on the GDAL Python bindings, but if your code needs them, they can be installed like so:
./odc/bin/python -m pip install GDAL==$(gdal-config --version)
It is important to install exactly the right version of the Python bindings: it
must match the version of the system GDAL, hence GDAL==$(gdal-config --version).
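After installing, one way to confirm that the two versions agree is to compare them directly. The sketch below assumes the odc environment from above and Python 3.8 or later (for importlib.metadata); versions_match is a hypothetical helper name.

```shell
# versions_match is a hypothetical helper: it succeeds only when both
# version strings are non-empty and identical.
versions_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Version of the system GDAL library (empty if gdal-config is missing).
system_gdal="$(gdal-config --version 2>/dev/null || true)"
# Version of the GDAL distribution installed in the odc environment
# (empty if the environment or the GDAL package is missing).
python_gdal="$(./odc/bin/python -c 'from importlib.metadata import version; print(version("GDAL"))' 2>/dev/null || true)"

if versions_match "$system_gdal" "$python_gdal"; then
    echo "GDAL bindings match system GDAL $system_gdal"
else
    echo "Mismatch or missing: system='$system_gdal' python='$python_gdal'"
fi
```

If the two differ, reinstalling the bindings with the pinned command above is the likely remedy.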
Miniconda
Datacube is also available via the conda-forge channel for installation in a
Conda environment. If you prefer or need to use Conda rather than the system
Python, follow the instructions below:
Download and install Miniconda by following the instructions at https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html
Open your favourite terminal to execute the following commands.
Add the conda-forge channel
conda config --add channels conda-forge
The conda-forge channel provides a multitude of community-maintained packages. Find out more about it here https://conda-forge.org/
Create a virtual environment in conda
conda create --name cubeenv python=3.6 datacube
Activate the virtual environment
source activate cubeenv
Find out more about managing virtual environments here https://conda.io/docs/using/envs.html
Install other packages
conda install jupyter matplotlib scipy
Find out more about managing packages here https://conda.io/docs/using/pkgs.html
Datacube is now installed and can be used in a terminal by activating the cubeenv environment.