9. Detailed Design

This chapter provides the CCI Toolbox detailed design documentation. It is generated from the docstrings that are used extensively throughout the Python code.

The documentation is generated for individual modules. Note that this modularisation reflects the effective, internal (and physical) structure of the Python code. It is not the official API, which comprises a relatively stable subset of the components, types, interfaces, and variables described here and is covered in the chapter API Reference.

For each top-level module, the documentation in the following sections provides a sub-section Description stating the module’s purpose, contents, and possibly its usage. Module descriptions may link to Operation Specifications for further explanation and traceability of the detailed design. An optional sub-section Technical Requirements provides a mapping from URD requirements to the technical requirements and software features that drove the design of a module. If available, links to the verifying unit-tests are given in sub-sections called Verification. The sub-section Components lists all documented, non-private components of a module, including variables, functions, and classes.

9.1. Module cate.core.ds

9.1.1. Description

This module provides Cate’s data access API.

9.1.2. Technical Requirements

Query data store

Description:

Allow querying registered ECV data stores using a simple function that takes a set of query parameters and returns data source identifiers that can be used to open the respective ECV datasets in Cate.

URD-Source:
  • CCIT-UR-DM0006: Data access to ESA CCI
  • CCIT-UR-DM0010: The data module shall have the means to attain meta-level status information per ECV type
  • CCIT-UR-DM0013: The CCI Toolbox shall allow filtering
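
A minimal sketch of such a query, using the find_data_sources() function documented in the Components section below; the query expression is purely illustrative:

from cate.core.ds import find_data_sources

# Query all registered data stores for data sources matching a simple
# search string (see find_data_sources() in the Components section).
data_sources = find_data_sources(query_expr='ozone')

for data_source in data_sources:
    # Each result is a DataSource whose identifier can later be passed
    # to open_dataset() to open the corresponding ECV dataset.
    print(data_source.id, '-', data_source.title)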

Add data store

Description:

Allow adding user-defined data stores that specify the access protocol and the layout of the data. These data stores can be used to access datasets.

URD-Source:
  • CCIT-UR-DM0011: Data access to non-CCI data
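
A sketch of how a user-defined data store instance could be registered. Here my_data_store stands for any concrete DataStore implementation, and the add_data_store() method name is an assumption about the DataStoreRegistry API, which is not spelled out in this section:

from cate.core.ds import DATA_STORE_REGISTRY, DataStore

# my_data_store is assumed to be an instance of a concrete DataStore
# subclass that knows the access protocol and layout of the user's data
# (see class DataStore in the Components section).
my_data_store: DataStore = ...

# Hypothetical registration call; the exact DataStoreRegistry method name
# is an assumption made for illustration only.
DATA_STORE_REGISTRY.add_data_store(my_data_store)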

Open dataset

Description:

Allow opening an ECV dataset given an identifier returned by the data store query. The dataset returned complies with the Cate common data model. The dataset to be returned can optionally be constrained in time and space.

URD-Source:
  • CCIT-UR-DM0001: Data access and input
  • CCIT-UR-DM0004: Open multiple inputs
  • CCIT-UR-DM0005: Data access using different protocols
  • CCIT-UR-DM0007: Open single ECV
  • CCIT-UR-DM0008: Open multiple ECV
  • CCIT-UR-DM0009: Open any ECV
  • CCIT-UR-DM0012: Open different formats
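
A sketch of opening a constrained dataset with the open_dataset() function documented in the Components section below; the data source identifier and variable name are placeholders:

from cate.core.ds import open_dataset

# ds_id is a data source identifier as returned by a prior data store
# query; the value below is a placeholder, not a real identifier.
ds_id = 'some.ecv.data.source.id'

# Open the dataset, optionally constrained in time, space, and variables.
# The region tuple is assumed to be (lon_min, lat_min, lon_max, lat_max),
# one of the accepted PolygonLike forms.
dataset = open_dataset(ds_id,
                       time_range=('2007-01-01', '2007-12-31'),
                       region=(-20.0, 30.0, 20.0, 60.0),
                       var_names=['some_variable'])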

9.1.3. Verification

The module’s unit-tests are located in test/test_ds.py and may be executed using $ py.test test/test_ds.py --cov=cate/core/ds.py for extra code coverage information.

9.1.4. Components

cate.core.ds.DATA_STORE_REGISTRY = {'esa_cci_odp': EsaCciOdpDataStore (esa_cci_odp), 'local': LocalFilePatternDataStore('local')}

The data store registry of type DataStoreRegistry. Use it to add new data stores to Cate.

exception cate.core.ds.DataAccessError(source, cause, *args, **kwargs)[source]

Exceptions produced by Cate’s data store and data source instances, used to report any problems handling data.

exception cate.core.ds.DataAccessWarning[source]

Warnings produced by Cate’s data store and data source instances, used to report any problems handling data.

class cate.core.ds.DataSource[source]

An abstract data source from which datasets can be retrieved.

cache_info

Return information about cached, locally available data sets. The returned dict, if any, is JSON-serializable.

data_store

The data store to which this data source belongs.

id

Data source identifier.

info_string

Return a textual representation of the meta-information about this data source. Useful for CLI / REPL applications.

make_local(local_name: str, local_id: str = None, time_range: typing.Union[typing.Tuple[str, str], typing.Tuple[datetime.datetime, datetime.datetime], typing.Tuple[datetime.date, datetime.date], str] = None, region: typing.Union[Polygon, typing.List[typing.Tuple[float, float]], str, typing.Tuple[float, float, float, float]] = None, var_names: typing.Union[typing.List[str], str] = None, monitor: cate.util.monitor.Monitor = Monitor.NONE) → typing.Union[_ForwardRef('DataSource'), NoneType][source]

Turns this (likely remote) data source into a local data source given a name and a number of optional constraints.

If this is a remote data source, data will be downloaded and turned into a local data source which will be added to the data store named “local”.

If this is already a local data source, a new local data source will be created by copying required data or data subsets.

The method returns the newly created local data source.

Parameters:
  • local_name (str) – A human readable name for the new local data source.
  • local_id (str) – A unique ID to be used for the new local data source. If not given, a new ID will be generated.
  • time_range – An optional time constraint comprising start and end date. If given, it must be a TimeRangeLike.
  • region – An optional region constraint. If given, it must be a PolygonLike.
  • var_names – Optional names of variables to be included. If given, it must be a VarNamesLike.
  • monitor (Monitor) – a progress monitor.
Returns:

the new local data source
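
A short usage sketch for make_local(), assuming a remote data source found via find_data_sources(); all names and constraints are illustrative:

from cate.core.ds import find_data_sources

# Pick some (remote) data source; the query expression is a placeholder.
data_source = find_data_sources(query_expr='ozone')[0]

# Download a time and variable subset and register it as a new data source
# in the data store named "local".
local_data_source = data_source.make_local(
    local_name='ozone_2007_subset',
    time_range=('2007-01-01', '2007-12-31'),
    var_names=['some_variable'])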

matches(ds_id: str = None, query_expr: str = None) → bool[source]

Test whether this data source matches the given ds_id or query_expr. If neither ds_id nor query_expr is given, the method returns True.

Return type:

bool

Parameters:
  • ds_id (str) – A data source identifier.
  • query_expr (str) – A query expression. Currently, only simple search strings are supported.
Returns:

True, if this data source matches the given ds_id or query_expr.

meta_info

Return meta-information about this data source. The returned dict, if any, is JSON-serializable.

open_dataset(time_range: typing.Union[typing.Tuple[str, str], typing.Tuple[datetime.datetime, datetime.datetime], typing.Tuple[datetime.date, datetime.date], str] = None, region: typing.Union[Polygon, typing.List[typing.Tuple[float, float]], str, typing.Tuple[float, float, float, float]] = None, var_names: typing.Union[typing.List[str], str] = None, protocol: str = None) → typing.Any[source]

Open a dataset from this data source.

Parameters:
  • time_range – An optional time constraint comprising start and end date. If given, it must be a TimeRangeLike.
  • region – An optional region constraint. If given, it must be a PolygonLike.
  • var_names – Optional names of variables to be included. If given, it must be a VarNamesLike.
  • protocol (str) – Deprecated. Protocol name; if None, the default protocol will be used to access the data.
Returns:

A dataset instance or None if no data is available for the given constraints.

schema

The data Schema for any dataset provided by this data source or None if unknown. Currently unused in cate.

status

Return information about data source accessibility

temporal_coverage(monitor: cate.util.monitor.Monitor = Monitor.NONE) → typing.Union[typing.Tuple[datetime.datetime, datetime.datetime], NoneType][source]

The temporal coverage as tuple (start, end) where start and end are UTC datetime instances.

Parameters:monitor (Monitor) – a progress monitor.
Returns:A tuple of (start, end) UTC datetime instances or None if the temporal coverage is unknown.
title

Human-readable data source title. The default implementation tries to retrieve the title from meta_info['title'].

variables_info

Return meta-information about the variables contained in this data source. The returned dict, if any, is JSON-serializable.

class cate.core.ds.DataSourceStatus[source]
Enum stating the current state of data source accessibility.
  • READY - data is complete and ready to use
  • ERROR - the data initialization process has been interrupted, leaving the data source incomplete and/or corrupted
  • PROCESSING - the data source initialization process is in progress
  • CANCELLED - the data initialization process has been intentionally interrupted by the user
class cate.core.ds.DataStore(ds_id: str, title: str = None, is_local: bool = False)[source]

Represents a data store of data sources.

Parameters:
  • ds_id – Unique data store identifier.
  • title – A human-readable title.
id

Return the unique identifier for this data store.

is_local

Whether this is a local data store, i.e. one that does not require an internet connection when its query() method is called or when the open_dataset() and make_local() methods are called on one of its data sources.

query(ds_id: str = None, query_expr: str = None, monitor: cate.util.monitor.Monitor = Monitor.NONE) → typing.Sequence[cate.core.ds.DataSource][source]

Retrieve data sources in this data store using the given constraints.

Return type:

Sequence

Parameters:
  • ds_id (str) – Data source identifier.
  • query_expr (str) – Query expression which may be used if ds_id is unknown.
  • monitor (Monitor) – A progress monitor.
Returns:

Sequence of data sources.

title

Return a human-readable title for this data store.

class cate.core.ds.DataStoreRegistry[source]

Registry of DataStore objects.

cate.core.ds.find_data_sources(data_stores: typing.Union[cate.core.ds.DataStore, typing.Sequence[cate.core.ds.DataStore]] = None, ds_id: str = None, query_expr: str = None) → typing.Sequence[cate.core.ds.DataSource][source]

Find data sources in the given data store(s) matching the given id or query_expr.

See also open_dataset().

Return type:

Sequence

Parameters:
  • data_stores – If given, these data stores will be queried; otherwise all registered data stores will be used.
  • ds_id (str) – A data source identifier.
  • query_expr (str) – A query expression.
Returns:

All data sources matching the given constraints.

cate.core.ds.format_cached_datasets_coverage_string(cache_coverage: dict) → str[source]

Return a textual representation of information about cached, locally available data sets. Useful for CLI / REPL applications.

Return type:str
Parameters:cache_coverage (dict)

cate.core.ds.format_variables_info_string(variables: dict)[source]

Return some textual information about the variables contained in this data source. Useful for CLI / REPL applications.

Parameters:variables (dict)

cate.core.ds.open_dataset(data_source: typing.Union[cate.core.ds.DataSource, str], time_range: typing.Union[typing.Tuple[str, str], typing.Tuple[datetime.datetime, datetime.datetime], typing.Tuple[datetime.date, datetime.date], str] = None, region: typing.Union[Polygon, typing.List[typing.Tuple[float, float]], str, typing.Tuple[float, float, float, float]] = None, var_names: typing.Union[typing.List[str], str] = None, force_local: bool = False, local_ds_id: str = None, monitor: cate.util.monitor.Monitor = Monitor.NONE) → typing.Any[source]

Open a dataset from a data source.

Parameters:
  • data_source – A DataSource object or a string. Strings are interpreted as the identifier of an ECV dataset and must not be empty.
  • time_range – An optional time constraint comprising start and end date. If given, it must be a TimeRangeLike.
  • region – An optional region constraint. If given, it must be a PolygonLike.
  • var_names – Optional names of variables to be included. If given, it must be a VarNamesLike.
  • force_local (bool) – Optional flag, for remote data sources only. Whether to make a local copy of the data source if it is not already present.
  • local_ds_id (str) – Optional, for remote data sources only. The local data source ID for the newly created copy of the remote data source.
  • monitor (Monitor) – A progress monitor
Returns:

A new dataset instance.

cate.core.ds.open_xarray_dataset(paths, concat_dim='time', **kwargs) → xarray.Dataset[source]

Open multiple files as a single dataset. This uses dask. If each individual file of the dataset is small, one dask chunk will coincide with one temporal slice, e.g. the whole array in the file. Otherwise smaller dask chunks will be used to split the dataset.

Parameters:
  • paths – Either a string glob in the form “path/to/my/files/*.nc” or an explicit list of files to open.
  • concat_dim (str) – Dimension to concatenate files along. You only need to provide this argument if the dimension along which you want to concatenate is not a dimension in the original datasets, e.g., if you want to stack a collection of 2D arrays along a third dimension.
  • kwargs – Keyword arguments directly passed to xarray.open_mfdataset()
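
A minimal usage sketch; the glob pattern is a placeholder and any extra keyword arguments would be forwarded to xarray.open_mfdataset():

from cate.core.ds import open_xarray_dataset

# Open all netCDF files matching the glob as a single dataset concatenated
# along the "time" dimension.
ds = open_xarray_dataset('path/to/my/files/*.nc', concat_dim='time')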

9.2. Module cate.core.op

9.2.1. Description

This module provides classes and functions for maintaining operations. Operations can be called from the Cate command-line interface, may be referenced from within processing workflows, or may be called remotely, e.g. from a graphical user interface or web frontend. An operation (Operation) comprises a Python callable and some additional meta-information (OpMetaInfo) that allows for automatic input validation, input value conversion, monitoring, and inter-connection of multiple operations using processing workflows and steps.

Operations are registered in operation registries (OpRegistry), the default operation registry is accessible via the global, read-only OP_REGISTRY variable.

9.2.2. Technical Requirements

Operation registration, lookup, and invocation

Description:

Maintain a central place in the software that manages the available operations such as data processors, data converters, analysis functions, etc. Operations can be added, removed and retrieved. Operations are designed to be executed by the framework in a controlled way, i.e. an operation’s task can be monitored and cancelled, and its input and output values can be validated w.r.t. the operation’s meta-information.

URD-Sources:
  • CCIT-UR-CR0001: Extensibility.
  • CCIT-UR-E0002: dynamic extension of all modules at runtime, c) The Logic Module to introduce new processors
  • CCIT-UR-LM0001: processor management allowing easy selection of tools and functionalities

Exploit Python language features

Description: Exploit the Python language to let API users express an operation in an intuitive form. For the framework API, stay with Python base types as far as possible instead of introducing a number of new data structures. Let the framework derive meta-information such as names, types and documentation for the operation, its inputs, and its outputs from the user’s Python code. It shall be possible to register any Python callable of the form f(*args, **kwargs) as an operation.

Add extra meta-information to operations

Description:

Initial operation meta-information will be derived from Python code introspection. It shall include the user function’s docstring and information about its arguments and return values, exploiting any type annotations. For example, the following properties can be associated with input arguments: data type, default value, value set, valid range, whether it is mandatory or optional, and the expected dataset schema, so that operations can be ECV-specific. Meta-information is required to let an operation explain itself when used in an (IPython) REPL or when a web service is requested to respond with an operation’s capabilities. API users shall be able to extend the initial meta-information derived from the Python code.

URD-Source:
  • CCIT-UR-LM0006: offer default values for lower level users as well as selectable options for higher level users.
  • CCIT-UR-LM0002: accommodating ECV-specific processors in cases where the processing is specific to an ECV.

Static annotation vs. dynamic, programmatic registration

Description: Operation registration and meta-information extension shall also be possible via operation class / function decorators. The API shall provide a simple set of dedicated decorators that API users attach to their operations. The decorators automatically register the user function as an operation and add any extra meta-information.
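
A minimal sketch of the decorator-based registration described above; the function, tags, and version used here are illustrative only:

from cate.core.op import op, op_input, op_return

@op(tags=['example'], version='1.0')
@op_input('factor', data_type=float, value_range=[0.0, 10.0])
@op_return(data_type=float)
def scale(value: float, factor: float = 1.0) -> float:
    """Multiply value by factor."""
    return value * factor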

Operation monitoring

Description: Operation registration should recognise an optional monitor argument of a user function: f(*args, monitor=Monitor.NONE, **kwargs). In this case, a monitor (of type Monitor) will be passed by the framework to the user function in order to observe progress and to cancel the operation.
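
A sketch of a monitor-aware operation, assuming the Monitor protocol consists of start(), progress(), and done() calls; check cate.util.monitor for the exact interface:

from cate.core.op import op
from cate.util.monitor import Monitor

@op(version='1.0')
def long_running(num_steps: int = 100, monitor: Monitor = Monitor.NONE) -> int:
    # The framework passes a Monitor so progress can be observed and the
    # operation cancelled; start/progress/done is the assumed protocol.
    monitor.start('long_running', total_work=num_steps)
    try:
        for _ in range(num_steps):
            # ... perform one unit of work here ...
            monitor.progress(work=1)
    finally:
        monitor.done()
    return num_steps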

9.2.3. Verification

The module’s unit-tests are located in test/test_op.py and may be executed using $ py.test test/test_op.py --cov=cate/core/op.py for extra code coverage information.

9.2.4. Components

cate.core.op.OP_REGISTRY = OP_REGISTRY

The default operation registry of type cate.core.op.OpRegistry.

class cate.core.op.OpRegistry[source]

An operation registry allows for addition, removal, and retrieval of operations.

add_op(operation: typing.Callable, fail_if_exists=True, replace_if_exists=False) → cate.core.op.Operation[source]

Add a new operation registration.

Return type:

Operation

Parameters:
  • operation (Callable) – An operation object such as a class or any callable.
  • fail_if_exists (bool) – raise ValueError if the operation was already registered
  • replace_if_exists (bool) – replaces an existing operation if fail_if_exists is False
Returns:

a new or existing cate.core.op.Operation

get_op(operation, fail_if_not_exists=False) → cate.core.op.Operation[source]

Get an operation registration.

Return type:

Operation

Parameters:
  • operation – A fully qualified operation name or operation object such as a class or any callable.
  • fail_if_not_exists (bool) – raise ValueError if no such operation was found
Returns:

a cate.core.op.Operation object or None if fail_if_not_exists is False.

get_op_key(operation: typing.Union[str, typing.Callable])[source]

Get a key under which the given operation will be registered.

Parameters:operation – A fully qualified operation name or a callable object
Returns:The operation key
op_registrations

Get all operation registrations of type cate.core.op.Operation.

Returns:a mapping of fully qualified operation names to operation registrations
remove_op(operation: typing.Callable, fail_if_not_exists=False) → typing.Union[cate.core.op.Operation, NoneType][source]

Remove an operation registration.

Parameters:
  • operation (Callable) – A fully qualified operation name or operation object such as a class or any callable.
  • fail_if_not_exists (bool) – raise ValueError if no such operation was found
Returns:

the removed cate.core.op.Operation object or None if fail_if_not_exists is False.

class cate.core.op.Operation(wrapped_op: typing.Callable, op_meta_info=None)[source]

An Operation comprises a wrapped callable (e.g. function, constructor, lambda form) and additional meta-information about the wrapped operation itself and its inputs and outputs.

Parameters:
  • wrapped_op – some callable object that will be wrapped.
  • op_meta_info – operation meta information.
op_meta_info
Returns:Meta-information about the operation, see cate.core.op.OpMetaInfo.
wrapped_op
Returns:The actual operation object which may be any callable.
cate.core.op.new_expression_op(op_meta_info: cate.util.opmetainf.OpMetaInfo, expression: str) → cate.core.op.Operation[source]

Create an operation that wraps a Python expression.

Return type:

Operation

Parameters:
  • op_meta_info (OpMetaInfo) – Meta-information about the resulting operation and the operation’s inputs and outputs.
  • expression (str) – The Python expression. May refer to any name given in op_meta_info.input.
Returns:

The Python expression wrapped into an operation.

cate.core.op.new_subprocess_op(op_meta_info: cate.util.opmetainf.OpMetaInfo, command_pattern: str, run_python: bool = False, cwd: typing.Union[str, NoneType] = None, env: typing.Dict[str, str] = None, shell: bool = False, started: typing.Union[str, typing.Callable] = None, progress: typing.Union[str, typing.Callable] = None, done: typing.Union[str, typing.Callable] = None) → cate.core.op.Operation[source]

Create an operation for a child program run in a new process.

Return type:

Operation

Parameters:
  • op_meta_info (OpMetaInfo) – Meta-information about the resulting operation and the operation’s inputs and outputs.
  • command_pattern (str) – A pattern that will be interpolated to obtain the actual command to be executed. May contain “{input_name}” fields which will be replaced by the actual input value converted to text. input_name must refer to a valid operation input name in op_meta_info.input or it must be the value of either the “write_to” or “read_from” property of another input’s property map.
  • run_python (bool) – If True, command_pattern refers to a Python script which will be executed with the Python interpreter that Cate uses.
  • cwd – Current working directory to run the command line in.
  • env (Dict) – Environment variables passed to the shell that executes the command line.
  • shell (bool) – Whether to use the shell as the program to execute.
  • started – Either a callable that receives a text line from the executable’s stdout and returns a tuple (label, total_work), or a regex that must match in order to signal the start of progress monitoring. The regex must provide the group names “label” or “total_work” or both, e.g. “(?P<label>\w+)” or “(?P<total_work>\d+)”
  • progress – Either a callable that receives a text line from the executable’s stdout and returns a tuple (work, msg), or a regex that must match in order to signal progress. The regex must provide the group names “work” or “msg” or both, e.g. “(?P<msg>\w+)” or “(?P<work>\d+)”
  • done – Either a callable that receives a text line from the executable’s stdout and returns True or False, or a regex that must match in order to signal the end of progress monitoring.
Returns:

The executable wrapped into an operation.

cate.core.op.op(tags=UNDEFINED, version=UNDEFINED, res_pattern=UNDEFINED, deprecated=UNDEFINED, registry=OP_REGISTRY, **properties)[source]

op is a decorator function that registers a Python function or class in the default operation registry or the one given by registry, if any. Any other keyword arguments in properties are added to the operation’s meta-information header. Classes annotated by this decorator must have callable instances.

When a function is registered, an introspection is performed. During this process, the initial operation meta-information header property description is derived from the function’s docstring.

If any output of this operation is to have its history information updated automatically, version information must be present in the operation header. Thus it is always a good idea to add it to all operations:

@op(version='X.x')
Parameters:
  • tags – An optional list of string tags.
  • version – An optional version string.
  • res_pattern – An optional pattern that will be used to generate the names for data resources that are used to hold a reference to the objects returned by the operation and that are cached in a Cate workspace. Currently, the only pattern variable that is supported and that must be present is {index} which will be replaced by an integer number that is guaranteed to produce a unique resource name.
  • deprecated – An optional boolean or a string. If a string is used, it should explain why the operation has been deprecated and which new operation to use instead. If set to True, the operation’s doc-string should explain the deprecation.
  • registry – The operation registry.
  • properties – Other properties (keyword arguments) that will be added to the meta-information of the operation.
cate.core.op.op_input(input_name: str, default_value=UNDEFINED, units=UNDEFINED, data_type=UNDEFINED, nullable=UNDEFINED, value_set_source=UNDEFINED, value_set=UNDEFINED, value_range=UNDEFINED, deprecated=UNDEFINED, position=UNDEFINED, context=UNDEFINED, registry=OP_REGISTRY, **properties)[source]

op_input is a decorator function that provides meta-information for an operation input identified by input_name. If the decorated function or class is not registered as an operation yet, it is added to the default operation registry or the one given by registry, if any.

When a function is registered, an introspection is performed. During this process, initial operation meta-information input properties are derived for each positional and keyword argument named input_name:

Derived properties and their sources:
  • position – The position of a positional argument, e.g. 2 for input z in def f(x, y, z, c=2).
  • default_value – The value of a keyword argument, e.g. 52.3 for input latitude from the argument definition latitude: float = 52.3.
  • data_type – The type annotation, e.g. float for input latitude from the argument definition latitude: float.

The derived properties listed above plus any of value_set, value_range, and any key-value pairs in properties are added to the input’s meta-information. A key-value pair in properties will always overwrite the derived properties listed above.
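
The following sketch illustrates the derivation: position, default_value, and data_type for the input latitude come from the argument definition latitude: float = 52.3, while the decorator only adds a value_range on top of the derived properties. Names are illustrative:

from cate.core.op import op, op_input

@op(version='1.0')
@op_input('latitude', value_range=[-90.0, 90.0])
def locate(name: str, latitude: float = 52.3) -> str:
    # data_type=float and default_value=52.3 are derived automatically.
    return '%s at latitude %s' % (name, latitude)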

Parameters:
  • input_name (str) – The name of an input.
  • default_value – A default value.
  • units – The geo-physical units of the input value.
  • data_type – The data type of the input values. If not given, the type of any given, non-None default_value is used.
  • nullable – If True, the value of the input may be None. If not given, it will be set to True if the default_value is None.
  • value_set_source – The name of an input, which can be used to generate a dynamic value set.
  • value_set – A sequence of the valid values. Note that all values in this sequence must be compatible with data_type.
  • value_range – A sequence specifying the possible range of valid values.
  • deprecated – An optional boolean or a string. If a string is used, it should explain why the input has been deprecated and which new input to use instead. If set to True, the input’s doc-string should explain the deprecation.
  • position – The zero-based position of an input.
  • context – If True, the value of the operation input will be a dictionary representing the current execution context. For example, when the operation is executed from a workflow, the dictionary will hold at least three entries: workflow provides the current workflow, step is the currently executed step, and value_cache which is a mapping from step identifiers to step outputs. If context is a string, the value of the operation input will be the result of evaluating the string as Python expression with the current execution context as local environment. This means, context may be an expression such as ‘workspace’, ‘workspace.base_dir’, ‘step’, ‘step.id’.
  • properties – Other properties (keyword arguments) that will be added to the meta-information of the named output.
  • registry – Optional operation registry.
cate.core.op.op_output(output_name: str, data_type=UNDEFINED, deprecated=UNDEFINED, registry=OP_REGISTRY, **properties)[source]

op_output is a decorator function that provides meta-information for an operation output identified by output_name. If the decorated function or class is not registered as an operation yet, it is added to the default operation registry or the one given by registry, if any.

If your function does not return multiple named outputs, use the op_return() decorator function. Note that:

@op_return(...)
def my_func(...):
    ...

is equivalent to:

@op_output('return', ...)
def my_func(...):
    ...

To automatically add information about cate, its version, this operation and its inputs, to this output, set ‘add_history’ to True:

@op_output('name', add_history=True)

Note that the operation should have version information added to it when add_history is True:

@op(version='X.x')
Parameters:
  • output_name (str) – The name of the output.
  • data_type – The data type of the output value.
  • deprecated – An optional boolean or a string. If a string is used, it should explain why the output has been deprecated and which new output to use instead. If set to True, the output’s doc-string should explain the deprecation.
  • properties – Other properties (keyword arguments) that will be added to the meta-information of the named output.
  • registry – Optional operation registry.
cate.core.op.op_return(data_type=UNDEFINED, registry=OP_REGISTRY, **properties)[source]

op_return is a decorator function that provides meta-information for a single, anonymous operation return value (whose output name is "return"). If the decorated function or class is not registered as an operation yet, it is added to the default operation registry or the one given by registry, if any. Any other keyword arguments in properties are added to the output’s meta-information.

When a function is registered, an introspection is performed. During this process, initial operation meta-information output properties are derived from the function’s return type annotation, that is data_type will be e.g. float if a function is annotated as def f(x, y) -> float: ....

The derived data_type property and any key-value pairs in properties are added to the output’s meta-information. A key-value pair in properties will always overwrite a derived data_type.

If your function returns multiple named outputs, use the op_output() decorator function. Note that:

@op_return(...)
def my_func(...):
    ...

is equivalent to:

@op_output('return', ...)
def my_func(...):
    ...

To automatically add information about cate, its version, this operation and its inputs, to this output, set ‘add_history’ to True:

@op_return(add_history=True)

Note that the operation should have version information added to it when add_history is True:

@op(version='X.x')
Parameters:
  • data_type – The data type of the return value.
  • properties – Other properties (keyword arguments) that will be added to the meta-information of the return value.
  • registry – The operation registry.

9.3. Module cate.core.workflow

9.3.1. Description

Provides classes that are used to construct processing workflows (networks, directed acyclic graphs) from processing steps including Python callables, Python expressions, external processes, and other workflows.

This module provides the following data types:

  • A Node has zero or more inputs and zero or more outputs and can be invoked.
  • A Workflow is a Node that is composed of Step objects.
  • A Step is a Node that is part of a Workflow and performs some kind of data processing.
  • An OpStep is a Step that invokes a Python operation (any callable).
  • An ExpressionStep is a Step that executes a Python expression string.
  • A WorkflowStep is a Step that executes a Workflow loaded from an external (JSON) resource.
  • A NodePort belongs to exactly one Node. Node ports represent both the named inputs and outputs of a node. A node port has a name, a property source, and a property value. If source is set, it must be another NodePort that provides the actual port’s value. The value of the value property can be basically anything that has an external (JSON) representation.

Workflow input ports are usually unspecified, but their value may be set. Workflow output ports and a step’s input ports are usually connected with the output ports of other contained steps or with inputs of the workflow via the source attribute. A step’s output ports are usually unconnected because their value attribute is set by the step’s concrete implementation.

Step node inputs and workflow outputs are indicated in the input specification of a node’s external JSON representation:

  • {"source": "NODE_ID.PORT_NAME" }: the output (or input) named PORT_NAME of another node given by NODE_ID.
  • {"source": ".PORT_NAME" }: current step’s output (or input) named PORT_NAME or of any of its parents.
  • {"source": "NODE_ID" }: the one and only output of a workflow or of one of its nodes given by NODE_ID.
  • {"value": NUM|STR|LIST|DICT|null }: a constant (JSON) value.

Workflows are callable by the CLI in the same way as single operations. The command line form for calling an operation is currently:

cate run OP|WORKFLOW [ARGS]

where OP is the name of a registered operation and WORKFLOW is a JSON file containing a workflow representation.
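
Workflows can equally be loaded and invoked from Python; a minimal sketch using the Workflow.load() and call() methods documented below, with a placeholder file name and input name:

from cate.core.workflow import Workflow

# Load a workflow from its JSON representation; the file name is a placeholder.
workflow = Workflow.load('my_workflow.json')

# Invoke the workflow with concrete input values; named outputs are returned
# as a dictionary (see Node.call() in the Components section). The input name
# 'ds_id' is hypothetical and depends on the loaded workflow.
result = workflow.call(input_values=dict(ds_id='some.ecv.data.source.id'))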

9.3.2. Technical Requirements

Combine processors and other operations to create operation chains or processing graphs

Description:

Provide the means to connect multiple processing steps, which may be registered operations, operating system calls, remote service invocations.

URD-Sources:
  • CCIT-UR-LM0001: processor management allowing easy selection of tools and functionalities.
  • CCIT-UR-LM0003: easy construction of graphs without any knowledge of a programming language (Graph Builder).
  • CCIT-UR-LM0004: selection of a number of predefined standard processing chains.
  • CCIT-UR-LM0005: means to configure a processor chain comprised of one processor only from the library to execute on data from the Common Data Model.

Integration of external, ECV-specific programs

Description:

Some processing step might only be solved by executing an external tool. Therefore, a special workflow step shall allow for invocation of external programs hereby mapping input values to program arguments, and program outputs to step outputs. It shall also be possible to monitor the state of the running sub-process.

URD-Source:
  • CCIT-UR-LM0002: accommodating ECV-specific processors in cases where the processing is specific to an ECV.

Programming language neutral representation

Description:

Processing graphs must be representable in a programming language neutral representation such as XML, JSON, YAML, so they can be designed by non-programmers and can be easily serialised, e.g. for communication with a web service.

URD-Source:
  • CCIT-UR-LM0003: easy construction of graphs without any knowledge of a programming language
  • CCIT-UR-CL0001: reading and executing script files written in XML or similar

9.3.3. Verification

The module’s unit-tests are located in test/test_workflow.py and may be executed using $ py.test test/test_workflow.py --cov=cate/core/workflow.py for extra code coverage information.

9.3.4. Components

class cate.core.workflow.ExpressionStep(expression: str, inputs=None, outputs=None, node_id=None)[source]

An ExpressionStep is a step node that computes its output from a simple (Python) expression string.

Parameters:
  • expression – A simple (Python) expression string.
  • inputs – input name to input properties mapping.
  • outputs – output name to output properties mapping.
  • node_id – A node ID. If None, an ID will be generated.
class cate.core.workflow.NoOpStep(inputs: dict = None, outputs: dict = None, node_id: str = None)[source]

A NoOpStep “performs” a no-op, which basically means it does nothing. However, it might still be useful to define a step that duplicates or renames output values by connecting its own output ports with any of its own input ports. In other cases it might be useful to have a NoOpStep as a placeholder or black box for some other real operation that will be put into place at a later point in time.

Parameters:
  • inputs – input name to input properties mapping.
  • outputs – output name to output properties mapping.
  • node_id – A node ID. If None, an ID will be generated.
class cate.core.workflow.Node(op_meta_info: cate.util.opmetainf.OpMetaInfo, node_id: str = None)[source]

Base class for all nodes including parent nodes (e.g. Workflow) and child nodes (e.g. Step).

All nodes have inputs and outputs, and can be invoked to perform some operation.

Inputs and outputs are exposed as attributes of the input and output properties and are both of type NodePort.

Parameters:node_id – A node ID. If None, a name will be generated.
call(context: typing.Dict = None, monitor=Monitor.NONE, input_values: typing.Dict = None)[source]

Calls this workflow with given input_values and returns the result.

The method does the following:
  1. Set default_value where input values are missing in input_values.
  2. Validate the input_values using this workflow’s meta-info.
  3. Set this workflow’s input port values.
  4. Invoke this workflow with the given context and monitor.
  5. Get this workflow’s output port values. Named outputs will be returned as a dictionary.

Parameters:
  • context (Dict) – An optional execution context. It will be used to automatically set the value of any node input which has a “context” property set to either True or a context expression string.
  • monitor – An optional progress monitor.
  • input_values (Dict) – The input values.
Returns:

The output values.

collect_predecessors(predecessors: typing.List[_ForwardRef('Node')], excludes: typing.List[_ForwardRef('Node')] = None)[source]

Collect this node (self) and preceding nodes in predecessors.

find_node(node_id) → typing.Union[_ForwardRef('Node'), NoneType][source]

Find a (child) node with the given node_id.

find_port(name) → typing.Union[_ForwardRef('NodePort'), NoneType][source]

Find port with given name. Output ports are searched first, then input ports.

Parameters:name – The port name
Returns:The port, or None if it couldn’t be found

id

The node’s identifier.

inputs

The node’s inputs.

invoke(context: typing.Dict = None, monitor: cate.util.monitor.Monitor = Monitor.NONE) → None[source]

Invoke this node’s underlying operation with input values from input. Output values in output will be set from the underlying operation’s return value(s).

Parameters:
  • context (Dict) – An optional execution context.
  • monitor (Monitor) – An optional progress monitor.
max_distance_to(other_node: cate.core.workflow.Node) → int[source]

If other_node is a source of this node, then return the number of connections from this node to other_node. If it is a direct source return 1, if it is a source of the source of this node return 2, etc. If other_node is this node, return 0. If other_node is not a source of this node, return -1.

Return type:int
Parameters:other_node – The other node.
Returns:The distance to other_node
op_meta_info

The node’s operation meta-information.

outputs

The node’s outputs.

parent_node

The node’s parent node or None if this node has no parent.

requires(other_node: cate.core.workflow.Node) → bool[source]

Does this node require other_node for its computation? Is other_node a source of this node?

Return type:bool
Parameters:other_node – The other node.
Returns:True if this node is a target of other_node
root_node

The root node of this node.

set_id(node_id: str) → None[source]

Set the node’s identifier.

Parameters:node_id (str) – The new node identifier. Must be unique within a workflow.
to_json_dict()[source]

Return a JSON-serializable dictionary representation of this object.

Returns:A JSON-serializable dictionary
update_sources()[source]

Resolve unresolved source references in inputs and outputs.

update_sources_node_id(changed_node: cate.core.workflow.Node, old_id: str)[source]

Update the source references of input and output ports from old_id to the ID of changed_node.

class cate.core.workflow.NodePort(node: cate.core.workflow.Node, name: str)[source]

Represents a named input or output port of a Node.

to_json(force_dict=False)[source]

Return a JSON-serializable dictionary representation of this object.

Returns:A JSON-serializable dictionary
update_source()[source]

Resolve this node port’s source reference, if any.

If the source reference has the form node-id.port-name then node-id must be the ID of the workflow or any contained step and port-name must be a name either of one of its input or output ports.

If the source reference has the form .port-name then node-id will refer to either the current step or any of its parent nodes that contains an input or output named port-name.

If the source reference has the form node-id then node-id must be the ID of the workflow or any contained step which has exactly one output.

If node-id refers to a workflow, then port-name is resolved first against the workflow’s inputs followed by its outputs. If node-id refers to a workflow’s step, then port-name is resolved first against the step’s outputs followed by its inputs.

Raises:ValueError – if the source reference is invalid.
update_source_node_id(node: cate.core.workflow.Node, old_node_id: str) → None[source]

A node identifier has changed, so we update the source references and clear the source of input and output ports, changing old_node_id to node.id.

Parameters:
  • node (Node) – The node whose identifier changed.
  • old_node_id (str) – The former node identifier.
class cate.core.workflow.OpStep(operation, node_id: str = None, registry=OP_REGISTRY)[source]

An OpStep is a step node that invokes a registered operation of type Operation.

Parameters:
  • operation – A fully qualified operation name or operation object such as a class or callable.
  • registry – An operation registry to be used to lookup the operation, if given by name.
  • node_id – A node ID. If None, a unique ID will be generated.
class cate.core.workflow.OpStepBase(op: cate.core.op.Operation, node_id: str = None)[source]

Base class for concrete steps based on an Operation.

Parameters:
  • op – An Operation object.
  • node_id – A node ID. If None, a unique ID will be generated.
op

The operation registration. See cate.core.op.Operation

class cate.core.workflow.SourceRef(node_id, port_name)
node_id

Alias for field number 0

port_name

Alias for field number 1

class cate.core.workflow.Step(op_meta_info: cate.util.opmetainf.OpMetaInfo, node_id: str = None)[source]

A step is an inner node of a workflow.

Parameters:node_id – A node ID. If None, a name will be generated.
enhance_json_dict(node_dict: collections.OrderedDict)[source]

Enhance the given JSON-compatible node_dict with step-specific elements.

classmethod new_step_from_json_dict(json_dict, registry=OP_REGISTRY) → typing.Union[_ForwardRef('Step'), NoneType][source]

Create a new step node instance from the given json_dict

parent_node

The node’s parent node.

persistent

Return whether this step is persistent. That is, if the current workspace is saved, the result(s) of a persistent step may be written to a “resource” file in the workspace directory using this step’s ID as filename. The file format and filename extension will be chosen according to each result’s data type. On the next attempt to execute the step, e.g. if a workspace is opened again, persistent steps may read the “resource” file to produce the result rather than performing an expensive re-computation.

Returns:True, if so, False otherwise

to_json_dict()[source]

Return a JSON-serializable dictionary representation of this object.

Returns:A JSON-serializable dictionary
class cate.core.workflow.SubProcessStep(command: str, run_python: bool = False, env: typing.Dict[str, str] = None, cwd: str = None, shell: bool = False, started_re: str = None, progress_re: str = None, done_re: str = None, inputs: typing.Dict[str, typing.Dict] = None, outputs: typing.Dict[str, typing.Dict] = None, node_id: str = None)[source]

A SubProcessStep is a step node that computes its output by a sub-process created from the given program.

Parameters:
  • command – A pattern that will be interpolated by input values to obtain the actual command (program with arguments) to be executed. May contain “{input_name}” fields which will be replaced by the actual input value converted to text. input_name must refer to a valid operation input name in op_meta_info.input or it must be the value of either the “write_to” or “read_from” property of another input’s property map.
  • run_python – If True, command refers to a Python script which will be executed with the Python interpreter that Cate uses.
  • cwd – Current working directory to run the command line in.
  • env – Environment variables passed to the shell that executes the command line.
  • shell – Whether to use the shell as the program to execute.
  • started_re – A regex that must match a text line from the process’ stdout in order to signal the start of progress monitoring. The regex must provide the group names “label” or “total_work” or both, e.g. “(?P<label>\w+)” or “(?P<total_work>\d+)”
  • progress_re – A regex that must match a text line from the process’ stdout in order to signal progress. The regex must provide the group names “work” or “msg” or both, e.g. “(?P<msg>\w+)” or “(?P<work>\d+)”
  • done_re – A regex that must match a text line from the process’ stdout in order to signal the end of progress monitoring.
  • inputs – input name to input properties mapping.
  • outputs – output name to output properties mapping.
  • node_id – A node ID. If None, an ID will be generated.
class cate.core.workflow.ValueCache[source]

ValueCache is a closable dictionary that maintains unique IDs for its keys. If a ValueCache is closed, all closable values are also closed. A value is closable if it has a close attribute whose value is a callable.
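
A small sketch of the behaviour described above, using only the methods documented below; the keys and values are arbitrary:

from cate.core.workflow import ValueCache

cache = ValueCache()
cache['step_1'] = dict(answer=42)           # behaves like a dict
step_1_id = cache.get_id('step_1')          # stable integer ID for the key
cache.rename_key('step_1', 'renamed_step')  # renaming keeps the same ID
assert cache.get_key(step_1_id) == 'renamed_step'
cache.close()                               # closes all closable values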

child(key: str) → cate.core.workflow.ValueCache[source]

Return the child ValueCache for given key.

clear() → None[source]

Override the dict method to close values and remove all IDs.

close() → None[source]

Close all values and remove all IDs.

get_id(key: str)[source]

Return the integer ID for given key or None.

get_key(id: int)[source]

Return the key for given integer id or None.

get_update_count(key: str)[source]

Return the integer update count for given key or None.

get_value_by_id(id: int, default=UNDEFINED)[source]

Return the value for the given integer id or return default.

pop(key, default=None)[source]

Override the dict method to close the value and remove its ID.

rename_key(key: str, new_key: str) → None[source]

Rename the given key to new_key without changing the value or the ID.

Parameters:
  • key (str) – The old key.
  • new_key (str) – The new key.
cate.core.workflow.WORKFLOW_SCHEMA_VERSION = 1

Version number of Workflow JSON schema. Will be incremented with the first schema change after public release.

class cate.core.workflow.Workflow(op_meta_info: cate.util.opmetainf.OpMetaInfo, node_id: str = None)[source]

A workflow of (connected) steps.

Parameters:
  • op_meta_info – Meta-information object of type OpMetaInfo.
  • node_id – A node ID. If None, an ID will be generated.
find_steps_to_compute(step_id: str) → typing.List[_ForwardRef('Step')][source]

Compute the list of steps required to compute the output of the step with the given step_id. The order of the returned list is its execution order, with the step given by step_id being the last one.

Return type:List
Parameters:step_id (str) – The step to be computed last and whose output value is requested.
Returns:a list of steps, which is never empty
invoke_steps(steps: typing.List[_ForwardRef('Step')], context: typing.Dict = None, monitor_label: str = None, monitor=Monitor.NONE) → None[source]

Invoke just the given steps.

Parameters:
  • steps (List) – Selected steps of this workflow.
  • context (Dict) – An optional execution context
  • monitor_label (str) – An optional label for the progress monitor.
  • monitor – The progress monitor.
classmethod load(file_path_or_fp: typing.Union[str, io.IOBase], registry=OP_REGISTRY) → cate.core.workflow.Workflow[source]

Load a workflow from a file or file pointer. The format is expected to be “Workflow JSON”.

Parameters:
  • file_path_or_fp – file path or file pointer
  • registry – Operation registry
Returns:

a workflow

remove_orphaned_sources(removed_node: cate.core.workflow.Node)[source]

Remove all input/output ports whose source still refers to removed_node.

Parameters:removed_node (Node) – A removed node.

classmethod sort_steps(steps: typing.List[_ForwardRef('Step')])[source]

Sorts the list of workflow steps in the order in which they can be executed.

sorted_steps

The workflow steps in the order in which they can be executed.

steps

The workflow steps in the order in which they were added.

store(file_path_or_fp: typing.Union[str, io.IOBase]) → None[source]

Store a workflow to a file or file pointer. The format is “Workflow JSON”.

Parameters:file_path_or_fp – file path or file pointer
to_json_dict() → dict[source]

Return a JSON-serializable dictionary representation of this object.

Returns:A JSON-serializable dictionary
update_sources() → None[source]

Resolve unresolved source references in inputs and outputs.

update_sources_node_id(changed_node: cate.core.workflow.Node, old_id: str)[source]

Update the source references of input and output ports from old_id to the ID of changed_node.

class cate.core.workflow.WorkflowStep(workflow: cate.core.workflow.Workflow, resource: str, node_id: str = None)[source]

A WorkflowStep is a step node that invokes an externally stored Workflow.

Parameters:
  • workflow – The referenced workflow.
  • resource – A resource (e.g. file path, URL) from which the workflow was loaded.
  • node_id – A node ID. If None, an ID will be generated.
resource

The workflow’s resource path (file path, URL).

workflow

The workflow.

cate.core.workflow.new_workflow_op(workflow_or_path: typing.Union[str, cate.core.workflow.Workflow]) → cate.core.op.Operation[source]

Create an operation from a workflow read from the given path.

Return type:Operation
Parameters:workflow_or_path – Either a path to Workflow JSON file or Workflow object.
Returns:The workflow operation.

9.4. Module cate.core.plugin

9.4.1. Description

The cate.core.plugin module exposes Cate’s plugin registry PLUGIN_REGISTRY, which is a mapping from Cate entry point names to plugin meta-information. A Cate plugin is any callable in an internal/extension module registered with the cate_plugins entry point.

Clients register a Cate plugin in the setup() call of their setup.py script. The following plugin example comprises a main module cate_wavelet_gapfill which provides the entry point function cate_init:

setup(
    name="cate-gapfill-wavelet",
    version="0.5",
    description='A wavelet-based gap-filling algorithm for the ESA CCI Toolbox',
    license='GPL 3',
    author='John Doe',
    packages=['cate_wavelet_gapfill'],
    entry_points={
        'cate_plugins': [
            'cate_wavelet_gapfill = cate_wavelet_gapfill:cate_init',
        ],
    },
    install_requires=['pywavelets >= 2.1'],
)

The entry point callable should have the following signature:

def cate_init(*args, **kwargs):
    pass

or:

class EctInit:
    def __init__(self, *args, **kwargs):
        pass

The return values are ignored.
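
A sketch of a plugin module body (e.g. cate_wavelet_gapfill/__init__.py, reusing the hypothetical package name from the setup() example above): operations are registered as a side effect of importing the module via the decorators described in section 9.2, and cate_init() only needs to exist as the entry point callable:

from cate.core.op import op, op_input

@op(tags=['gapfill'], version='0.5')
@op_input('threshold', value_range=[0.0, 1.0])
def wavelet_gapfill(threshold: float = 0.1) -> float:
    """Illustrative operation; a real plugin would perform gap filling here."""
    return threshold

def cate_init(*args, **kwargs):
    # Nothing to do: the @op decorator above already registered the
    # plugin's operation when this module was imported.
    pass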

9.4.2. Verification

The module’s unit-tests are located in test/test_plugin.py and may be executed using $ py.test test/test_plugin.py --cov=cate/core/plugin.py for extra code coverage information.

9.4.3. Components

cate.core.plugin.PLUGIN_REGISTRY = OrderedDict([('cate_ds', {'entry_point': 'cate_ds'}), ('cate_ops', {'entry_point': 'cate_ops'})])

Mapping of Cate entry point names to JSON-serializable plugin meta-information.

cate.core.plugin.cate_init(*arg, **kwargs)[source]

No actual use, just demonstrates the signature of a Cate entry point callable.

Parameters:
  • arg – any arguments (not used)
  • kwargs – any keyword arguments (not used)
Returns:

any or void (not used)

9.5. Module cate.conf

9.6. Module cate.ds

9.6.1. Description

The ds package comprises all specific data source implementations.

This is a plugin package automatically imported by the installation script’s entry point cate_ds (see the project’s setup.py file).

9.6.2. Verification

The module’s unit-tests are located in test/ds and may be executed using $ py.test test/ds/test_<MODULE>.py --cov=cate/ds/<MODULE>.py for extra code coverage information.

9.6.3. Components

9.7. Module cate.ops

9.7.1. Description

The ops package comprises all specific operation and processor implementations.

This is a plugin package automatically imported by the installation script’s entry point cate_ops (see the project’s setup.py file).

9.7.2. Verification

The module’s unit-tests are located in test/ops and may be executed using $ py.test test/ops/test_<MODULE>.py --cov=cate/ops/<MODULE>.py for extra code coverage information.

9.7.3. Functions

cate.ops.resample_2d(src, w, h, ds_method=54, us_method=11, fill_value=None, mode_rank=1, out=None)[source]

Resample a 2-D grid to a new resolution.

Parameters:
  • src – 2-D ndarray
  • w (int) – New grid width
  • h (int) – New grid height
  • ds_method (int) – one of the DS_ constants, optional. Grid cell aggregation method for a possible downsampling
  • us_method (int) – one of the US_ constants, optional. Grid cell interpolation method for a possible upsampling
  • fill_value – scalar, optional. If None, it is taken from src if it is a masked array, otherwise from out if it is a masked array, otherwise numpy’s default value is used.
  • mode_rank (int) – scalar, optional. The rank of the frequency determined by the ds_method DS_MODE. One (the default) means the most frequent value, two means the second most frequent value, and so forth.
  • out – 2-D ndarray, optional. Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output.
Returns:

A resampled version of the src array.

cate.ops.downsample_2d(src, w, h, method=54, fill_value=None, mode_rank=1, out=None)[source]

Downsample a 2-D grid to a lower resolution by aggregating original grid cells.

Parameters:
  • src – 2-D ndarray
  • w (int) – Grid width, which must be less than or equal to src.shape[-1]
  • h (int) – Grid height, which must be less than or equal to src.shape[-2]
  • method (int) – one of the DS_ constants, optional. Grid cell aggregation method
  • fill_value – scalar, optional. If None, it is taken from src if it is a masked array, otherwise from out if it is a masked array, otherwise numpy’s default value is used.
  • mode_rank (int) – scalar, optional. The rank of the frequency determined by the method DS_MODE. One (the default) means the most frequent value, two means the second most frequent value, and so forth.
  • out – 2-D ndarray, optional. Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output.
Returns:

A downsampled version of the src array.

cate.ops.upsample_2d(src, w, h, method=11, fill_value=None, out=None)[source]

Upsample a 2-D grid to a higher resolution by interpolating original grid cells.

Parameters:
  • src – 2-D ndarray
  • w (int) – Grid width, which must be greater than or equal to src.shape[-1]
  • h (int) – Grid height, which must be greater than or equal to src.shape[-2]
  • method (int) – one of the US_ constants, optional. Grid cell interpolation method
  • fill_value – scalar, optional. If None, it is taken from src if it is a masked array, otherwise from out if it is a masked array, otherwise numpy’s default value is used.
  • out – 2-D ndarray, optional. Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output.
Returns:

An upsampled version of the src array.
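
A small sketch of the resampling functions using their documented default methods; the test grid is arbitrary:

import numpy as np

from cate.ops import downsample_2d, upsample_2d

# A 4x4 test grid.
src = np.arange(16, dtype=np.float64).reshape(4, 4)

# Aggregate to 2x2 with the default downsampling method, then interpolate
# back to 4x4 with the default upsampling method.
low = downsample_2d(src, 2, 2)
high = upsample_2d(low, 4, 4)

print(low.shape, high.shape)  # (2, 2) (4, 4)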

9.8. Module cate.cli.main

9.8.1. Description

This module provides Cate’s CLI executable.

To use the CLI executable, invoke the module file as a script: python3 cate/cli/main.py [ARGS] [OPTIONS]. Type python3 cate/cli/main.py --help for usage help.

The CLI operates on sub-commands. New sub-commands can be added by inheriting from the Command class and extending the Command.REGISTRY list of known command classes.
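
A sketch of how a plugin might register a new sub-command. The import of the Command base class from cate.cli.main is an assumption based on the description above, and the required Command interface (command name, argument parser setup, execution hook) is not documented in this section:

from cate.cli.main import COMMAND_REGISTRY, Command

class HelloCommand(Command):
    """Illustrative sub-command; implement the Command interface here."""

# Plugins extend the CLI by appending their Command subclass to the registry
# (see COMMAND_REGISTRY in section 9.8.4).
COMMAND_REGISTRY.append(HelloCommand)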

9.8.2. Technical Requirements

Extensible CLI with multiple sub-commands

Description:

The CCI Toolbox should only have a single CLI executable that comes with multiple sub-commands instead of maintaining a number of different executables for each purpose. Plugins shall be able to add new CLI sub-commands.

URD-Source:
  • CCIT-UR-CR0001: Extensibility.
  • CCIT-UR-A0002: Offer a Command Line Interface (CLI).

Run operations and workflows

Description:

Allow for executing registered operations and workflows composed of operations.

URD-Source:
  • CCIT-UR-CL0001: Reading and executing script files written in XML or similar

List available data, operations and extensions

Description:

Allow for listing dynamic content including available data, operations and plugin extensions.

URD-Source:
  • CCIT-UR-E0001: Dynamic extension by the use of plug-ins

Display information about available climate data sources

Description:

Before downloading ECV datasets to the local computer, users shall be able to display information about them, e.g. included variables, total size, spatial and temporal resolution.

URD-Source:
  • CCIT-UR-DM0009: Holding information of any CCI ECV type
  • CCIT-UR-DM0010: Attain meta-level status information per ECV type

Synchronize locally cached climate data

Description:

Allow for synchronizing locally cached climate data with the corresponding remote ECV data sources.

URD-Source:
  • CCIT-UR-DM0006: Access to and ingestion of ESA CCI datasets

9.8.3. Verification

The module’s unit-tests are located in test/cli/test_main.py and may be executed using $ py.test test/cli/test_main.py --cov=cate/cli/main.py for extra code coverage information.

9.8.4. Components

cate.cli.main.CLI_NAME = 'cate'

Name of the Cate CLI executable (= cate).

cate.cli.main.COMMAND_REGISTRY = [<class 'cate.cli.main.DataSourceCommand'>, <class 'cate.cli.main.OperationCommand'>, <class 'cate.cli.main.WorkspaceCommand'>, <class 'cate.cli.main.ResourceCommand'>, <class 'cate.cli.main.RunCommand'>]

List of sub-commands supported by the CLI. Entries are classes derived from the Command class. Cate plugins may extend this list with their own commands during plugin initialisation.
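As a hedged illustration (not part of the Cate code base), a plugin could register an additional sub-command roughly as follows. The DemoCommand class, its name, and its --text argument are hypothetical; the Command interface used here is the one documented in section 9.12 below.

import argparse

from cate.cli.main import COMMAND_REGISTRY
from cate.util.cli import Command


class DemoCommand(Command):
    """Hypothetical demo sub-command contributed by a plugin."""

    @classmethod
    def name(cls) -> str:
        # Unique sub-command name, i.e. invoked as 'cate demo'.
        return 'demo'

    @classmethod
    def parser_kwargs(cls) -> dict:
        # Keyword arguments for the sub-command's argparse.ArgumentParser.
        return dict(description='Print a demo message.')

    @classmethod
    def configure_parser(cls, parser: argparse.ArgumentParser) -> None:
        parser.add_argument('--text', default='hello', help='Text to print.')

    def execute(self, command_args: argparse.Namespace) -> None:
        print(command_args.text)


# Typically performed during plugin initialisation.
COMMAND_REGISTRY.append(DemoCommand)

Once appended, the new class is picked up in the same way as the built-in commands listed above.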

class cate.cli.main.DataSourceCommand[source]

The ds command implements various operations w.r.t. data sources.

class cate.cli.main.OperationCommand[source]

The op command implements various operations w.r.t. operations.

class cate.cli.main.PluginCommand[source]

The pi command lists the contents of various plugin registries.

class cate.cli.main.ResourceCommand[source]

The res command implements various operations w.r.t. workspace resources.

class cate.cli.main.RunCommand[source]

The run command is used to invoke registered operations and JSON workflows.

class cate.cli.main.WorkspaceCommand[source]

The ws command implements various operations w.r.t. workspaces.

9.9. Module cate.webapi

9.10. Module cate.util

9.11. Module cate.util.cache

class cate.util.cache.Cache(store=<cate.util.cache.MemoryCacheStore object>, capacity=1000, threshold=0.75, policy=<function _policy_lru>, parent_cache=None)[source]

An implementation of a cache. See https://en.wikipedia.org/wiki/Cache_algorithms

class Item[source]

Cache-private class representing an item in the cache.

class cate.util.cache.CacheStore[source]

Represents a store into which cached values can be stored and from which they can be restored.

can_load_from_key(key) → bool[source]

Test whether a stored value representation can be loaded from the given key.

Parameters:key – the key
Return type:bool
Returns:True, if so

discard_value(key, stored_value)[source]

Discard a value from its storage.

Parameters:
  • key – the key
  • stored_value – the stored representation of the value

load_from_key(key)[source]

Load the stored representation of a value and its size from the given key.

Parameters:key – the key
Returns:a 2-element sequence containing the stored representation of the value and its size

restore_value(key, stored_value)[source]

Restore a value from its stored representation.

Parameters:
  • key – the key
  • stored_value – the stored representation of the value
Returns:the restored value

store_value(key, value)[source]

Store a value and return its stored representation and size in any unit, e.g. in bytes.

Parameters:
  • key – the key
  • value – the value
Returns:a 2-element sequence containing the stored representation of the value and its size
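To make the store contract concrete, the sketch below implements a deliberately trivial CacheStore whose stored representation is the value itself with a fixed size of 1, and plugs it into a Cache using the LRU policy. Only the documented cate.util.cache names are real; TrivialCacheStore and the capacity chosen here are illustrative.

from cate.util.cache import Cache, CacheStore, POLICY_LRU


class TrivialCacheStore(CacheStore):
    """Illustrative store: the stored representation is the value itself, size 1."""

    def can_load_from_key(self, key) -> bool:
        # This store keeps no index of its own, so nothing can be reloaded from a key alone.
        return False

    def load_from_key(self, key):
        raise KeyError(key)

    def store_value(self, key, value):
        # Return the stored representation and its size (here an arbitrary unit of 1).
        return value, 1

    def restore_value(self, key, stored_value):
        return stored_value

    def discard_value(self, key, stored_value):
        # Nothing to release for a plain in-memory representation.
        pass


# A cache of at most 100 items that discards least recently used entries first.
cache = Cache(store=TrivialCacheStore(), capacity=100, policy=POLICY_LRU)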

class cate.util.cache.FileCacheStore(cache_dir: str, ext: str)[source]

Simple file store for values which can be written and read as bytes, e.g. encoded PNG images.

class cate.util.cache.MemoryCacheStore[source]

Simple memory store.

discard_value(key, stored_value)[source]

Clears the value in the given stored_value.

Parameters:
  • key – the key
  • stored_value – the stored representation of the value

restore_value(key, stored_value)[source]
Parameters:
  • key – the key
  • stored_value – the stored representation of the value
Returns:

the original value.

store_value(key, value)[source]

Return (value, 1).

Parameters:
  • key – the key
  • value – the original value
Returns:the tuple (stored value, size), where the stored value is the sequence [key, value]

cate.util.cache.POLICY_LFU(item)

Discard Least Frequently Used first

cate.util.cache.POLICY_LRU(item)

Discard Least Recently Used items first

cate.util.cache.POLICY_MRU(item)

Discard Most Recently Used first

cate.util.cache.POLICY_RR(item)

Discard items by Random Replacement

9.12. Module cate.util.cli

class cate.util.cli.Command[source]

Represents a (sub-)command of a command-line interface.

classmethod configure_parser(parser: argparse.ArgumentParser) → None[source]

Configure parser, i.e. make any required parser.add_argument(*args, **kwargs) calls. See https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.add_argument

Parameters:parser (ArgumentParser) – The command parser to configure.
execute(command_args: argparse.Namespace) → None[source]

Execute this command.

The command’s arguments in command_args are attributes of the namespace returned by argparse.ArgumentParser.parse_args(). Also refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.parse_args

execute implementations shall raise a CommandError instance on failure.

Parameters:command_args (Namespace) – The command’s arguments.
classmethod name() → str[source]
Returns:A unique command name
classmethod new_monitor() → cate.util.monitor.Monitor[source]

Create a new console progress monitor.

Returns:a new Monitor instance.
classmethod parser_kwargs() → dict[source]

Return the parser keyword arguments dictionary passed to an argparse.ArgumentParser(**parser_kwargs) call.

For the possible keywords in the returned dictionary, refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.

Returns:A keyword arguments dictionary.
exception cate.util.cli.CommandError(cause, *args, **kwargs)[source]

An exception type signaling command-line errors.

Parameters:cause – The cause, which may be an Exception or a str.
class cate.util.cli.NoExitArgumentParser(*args, **kwargs)[source]

Special argparse.ArgumentParser that never directly exits the current process. It raises an ExitException instead.

exception ExitException(status, message)[source]

Raised instead of exiting the current process.

exit(status=0, message=None)[source]

Overrides the base class method in order to raise an ExitException.
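A short sketch of how this behaviour can be exploited in code that must not terminate the hosting process; the program name and the argument used here are illustrative.

from cate.util.cli import NoExitArgumentParser

parser = NoExitArgumentParser(prog='demo')
parser.add_argument('value', type=int)

try:
    # An invalid value makes argparse call exit(), which raises instead of terminating.
    parser.parse_args(['not-a-number'])
except NoExitArgumentParser.ExitException as error:
    print('parser requested exit:', error)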

cate.util.cli.run_main(name: str, description: str, version: str, command_classes: typing.Sequence[cate.util.cli.Command], license_text: str = None, docs_url: str = None, error_message_trimmer=None, args: typing.Sequence[str] = None) → int[source]

A CLI’s entry point function.

To be used in your own code as follows:

>>> if __name__ == '__main__':
>>>    sys.exit(run_main(...))
Return type:

int

Parameters:
  • name (str) – The program’s name.
  • description (str) – The program’s description.
  • version (str) – The program’s version string.
  • command_classes (Sequence) – The CLI commands.
  • license_text (str) – An optional license text.
  • docs_url (str) – An optional documentation URL.
  • error_message_trimmer – An optional callable (str)->str that trims error message strings.
  • args (Sequence) – list of command-line arguments. If not passed, sys.argv[1:] is used.
Returns:

An exit code where 0 stands for success.
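For completeness, a minimal entry point built on run_main could look as follows; the program name, description, and version are placeholders, and Cate’s own built-in commands are reused via COMMAND_REGISTRY.

import sys

from cate.cli.main import COMMAND_REGISTRY
from cate.util.cli import run_main

if __name__ == '__main__':
    # run_main() parses sys.argv[1:] (since no explicit args are passed) and
    # returns an exit code, 0 meaning success.
    sys.exit(run_main(name='demo-cli',
                      description='A hypothetical CLI reusing Cate’s built-in commands.',
                      version='0.1.0',
                      command_classes=COMMAND_REGISTRY))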

9.13. Module cate.util.im

9.13.1. Description

The cate.util.im package provides application-independent utility functions for working with tiled image pyramids.

The Cate project uses this package for implementing a RESTful web service that provides image tiles from image pyramids.

This package is independent of other cate.* packages, but it depends on the following external packages:

  • numpy
  • pillow (for PIL)
  • matplotlib

9.13.2. Verification

The module’s unit-tests are located in test/util/im and may be executed using $ py.test test/util/im --cov=cate/util/im for extra code coverage information.

9.13.3. Components

9.14. Module cate.util.web