9. Detailed Design
This chapter provides the detailed design documentation of the CCI Toolbox. It is generated from the docstrings that are used extensively throughout the Python code.
The documentation is generated for individual modules. Note that this modularisation reflects the effective, internal (and physical) structure of the Python code. It is not the official API, which comprises a relatively stable subset of the components, types, interfaces, and variables described here and which is documented in chapter API Reference.
Each top-level module documentation in the following sections contains a sub-section Description that states the module’s purpose, contents, and possibly its usage. Module descriptions may link into Operation Specifications for further explanation and traceability of the detailed design. An optional sub-section Technical Requirements provides a mapping from URD requirements to technical requirements and software features that drove the design of a module. If available, links to verifying unit-tests are given in sub-sections called Verification. The sub-section Components lists all documented, non-private components of a module, including variables, functions, and classes.
9.1. Module cate.core.ds
9.1.1. Description
This module provides Cate’s data access API.
9.1.2. Technical Requirements
Query data store
Description: | Allow querying registered ECV data stores using a simple function that takes a set of query parameters and returns data source identifiers that can be used to open the respective ECV datasets in Cate. |
---|---|
URD-Source: |
|
Add data store
Description: | Allow adding user-defined data stores by specifying the access protocol and the layout of the data. These data stores can be used to access datasets. |
---|---|
URD-Source: |
|
Open dataset
Description: | Allow opening an ECV dataset given an identifier returned by the data store query. The dataset returned complies with the Cate common data model. The dataset to be returned can optionally be constrained in time and space. |
---|---|
URD-Source: |
|
9.1.3. Verification
The module’s unit-tests are located in
test/test_ds.py and may be executed using
$ py.test test/test_ds.py --cov=cate/core/ds.py
for extra code coverage information.
9.1.4. Components
-
cate.core.ds.
DATA_STORE_REGISTRY
= {'esa_cci_odp': EsaCciOdpDataStore (esa_cci_odp), 'local': LocalFilePatternDataStore('local')} The data store registry of type
DataStoreRegistry
. Use it to add new data stores to Cate.
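The idea of a registry that maps identifiers to data stores can be sketched as follows. This is a self-contained stand-in and not Cate's implementation: `SketchDataStore`, `SketchRegistry`, and the method names `add`, `get`, and `ids` are hypothetical names chosen for illustration, as is the store identifier `my_store`.

```python
from typing import Dict, List, Optional


class SketchDataStore:
    """Hypothetical stand-in for a data store; only id and title are modelled."""

    def __init__(self, ds_id: str, title: Optional[str] = None):
        self.id = ds_id
        self.title = title or ds_id


class SketchRegistry:
    """Hypothetical registry that maps data store identifiers to data stores."""

    def __init__(self):
        self._stores: Dict[str, SketchDataStore] = {}

    def add(self, store: SketchDataStore) -> None:
        # A store registered under an existing identifier replaces the old one.
        self._stores[store.id] = store

    def get(self, ds_id: str) -> Optional[SketchDataStore]:
        return self._stores.get(ds_id)

    def ids(self) -> List[str]:
        return sorted(self._stores)


registry = SketchRegistry()
registry.add(SketchDataStore('local'))
registry.add(SketchDataStore('my_store', title='My user-defined store'))
```

Keeping the registry a plain mapping keyed by identifier means that lookups by `ds_id` stay trivial and that user-defined stores sit next to the built-in ones.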
-
exception
cate.core.ds.
DataAccessError
[source] Exceptions produced by Cate’s data store and data source instances, used to report any problems handling data.
-
exception
cate.core.ds.
DataAccessWarning
[source] Warnings produced by Cate’s data store and data source instances, used to report any problems handling data.
-
class
cate.core.ds.
DataSource
[source]¶ An abstract data source from which datasets can be retrieved.
-
cache_info
¶ Return information about cached, locally available data sets. The returned dict, if any, is JSON-serializable.
-
data_store
¶ The data store to which this data source belongs.
-
id
¶ Data source identifier.
-
info_string
¶ Return a textual representation of the meta-information about this data source. Useful for CLI / REPL applications.
-
make_local
(local_name: str, local_id: str = None, time_range: Union[Tuple[str, str], Tuple[datetime.datetime, datetime.datetime], Tuple[datetime.date, datetime.date], str] = None, region: Union[shapely.geometry.Polygon, List[Tuple[float, float]], str, Tuple[float, float, float, float]] = None, var_names: Union[List[str], str] = None, monitor: cate.util.monitor.Monitor = Monitor.NONE) → Optional[cate.core.ds.DataSource][source] Turns this (likely remote) data source into a local data source given a name and a number of optional constraints.
If this is a remote data source, data will be downloaded and turned into a local data source which will be added to the data store named “local”.
If this is already a local data source, a new local data source will be created by copying required data or data subsets.
The method returns the newly created local data source.
Parameters:
- local_name (str) – A human-readable name for the new local data source.
- local_id (str) – A unique ID to be used for the new local data source. If not given, a new ID will be generated.
- time_range – An optional time constraint comprising start and end date. If given, it must be a TimeRangeLike.
- region – An optional region constraint. If given, it must be a PolygonLike.
- var_names – Optional names of variables to be included. If given, it must be a VarNamesLike.
- monitor (Monitor) – A progress monitor.
Returns: the new local data source
-
matches
(ds_id: str = None, query_expr: str = None) → bool[source] Test if this data source matches the given ds_id or query_expr. If neither ds_id nor query_expr is given, the method returns True.
Return type: bool
Parameters:
- ds_id (str) – A data source identifier.
- query_expr (str) – A query expression. Currently, only simple search strings are supported.
Returns: True if this data source matches the given ds_id or query_expr.
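The matching rule described for matches can be sketched as a free function. This is an illustration only: how Cate interprets a "simple search string" is not stated here, so the case-insensitive substring test below is an assumption, and the function name `source_matches` and the identifier `esacci.OZONE.mon` are made up for the example.

```python
from typing import Optional


def source_matches(source_id: str, title: str,
                   ds_id: Optional[str] = None,
                   query_expr: Optional[str] = None) -> bool:
    """Return True if the source matches ds_id or query_expr, or if neither is given."""
    if ds_id is None and query_expr is None:
        return True
    if ds_id is not None and ds_id == source_id:
        return True
    if query_expr is not None:
        # Assumption: a simple search string is a case-insensitive substring test
        # against the identifier and the title.
        needle = query_expr.lower()
        return needle in source_id.lower() or needle in title.lower()
    return False


no_constraints = source_matches('esacci.OZONE.mon', 'Monthly ozone')
by_id = source_matches('esacci.OZONE.mon', 'Monthly ozone', ds_id='esacci.OZONE.mon')
by_query = source_matches('esacci.OZONE.mon', 'Monthly ozone', query_expr='ozone')
```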
-
meta_info
¶ Return meta-information about this data source. The returned dict, if any, is JSON-serializable.
-
open_dataset
(time_range: Union[Tuple[str, str], Tuple[datetime.datetime, datetime.datetime], Tuple[datetime.date, datetime.date], str] = None, region: Union[shapely.geometry.Polygon, List[Tuple[float, float]], str, Tuple[float, float, float, float]] = None, var_names: Union[List[str], str] = None, protocol: str = None, monitor: cate.util.monitor.Monitor = Monitor.NONE) → Any[source] Open a dataset from this data source.
Parameters:
- time_range – An optional time constraint comprising start and end date. If given, it must be a TimeRangeLike.
- region – An optional region constraint. If given, it must be a PolygonLike.
- var_names – Optional names of variables to be included. If given, it must be a VarNamesLike.
- protocol (str) – Deprecated. Protocol name. If None, the default protocol will be used to access data.
- monitor (Monitor) – A progress monitor.
Returns: A dataset instance or None if no data is available for the given constraints.
-
schema
¶ The data
Schema
for any dataset provided by this data source, or None if unknown. Currently unused in Cate.
-
status
Return information about the accessibility of this data source.
-
temporal_coverage
(monitor: cate.util.monitor.Monitor = Monitor.NONE) → Optional[Tuple[datetime.datetime, datetime.datetime]][source] The temporal coverage as a tuple (start, end), where start and end are UTC datetime instances.
Parameters: monitor (Monitor) – A progress monitor.
Returns: A tuple of (start, end) UTC datetime instances, or None if the temporal coverage is unknown.
-
title
¶ Human-readable data source title. The default implementation tries to retrieve the title from
meta_info['title']
.
-
variables_info
¶ Return meta-information about the variables contained in this data source. The returned dict, if any, is JSON-serializable.
-
-
class
cate.core.ds.
DataSourceStatus
[source] Enum stating the current state of data source accessibility.
- READY - data is complete and ready to use
- ERROR - the data initialization process has been interrupted, leaving the data source incomplete and/or corrupted
- PROCESSING - the data source initialization process is in progress
- CANCELLED - the data initialization process has been intentionally interrupted by the user
-
class
cate.core.ds.
DataStore
(ds_id: str, title: str = None, is_local: bool = False)[source]¶ Represents a data store of data sources.
Parameters: - ds_id – Unique data store identifier.
- title – A human-readable title.
-
description
¶ Return an optional, human-readable description for this data store as plain text.
The text may use Markdown formatting.
-
get_updates
(reset=False) → Dict[source] Ask the data store for the differences between a previous data store status and the current one. The implementation returns a dictionary with the new ('new') and removed ('del') datasets, together with the reference time of the data store status taken as the previous one. The reset flag is used to clean up the support files (freeze and diff).
Return type: Dict
Parameters: reset (bool) – Defaults to False. Set this flag to True to clean up all support files, forcing a synchronization with the remote catalog.
Returns: A dictionary with the keys 'generated', 'source_ref_time', 'new', and 'del':
- generated: the generation time, i.e. when the check was executed
- source_ref_time: when the local copy of the remote dataset was made. The system also uses it to refresh the current images when the copy is older than one day.
- new: a list of new dataset entries
- del: a list of removed datasets
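The shape of the dictionary returned by get_updates can be illustrated by comparing two snapshots of dataset identifiers. This is a minimal sketch, not the Cate implementation: the function name `diff_snapshots`, the use of ISO 8601 strings for the two time fields, and the sorted lists are assumptions made for the example.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable


def diff_snapshots(previous_ids: Iterable[str],
                   current_ids: Iterable[str],
                   source_ref_time: datetime) -> Dict:
    """Compare a previous snapshot of dataset identifiers with the current one."""
    previous, current = set(previous_ids), set(current_ids)
    return {
        # Time at which this comparison was executed.
        'generated': datetime.now(timezone.utc).isoformat(),
        # Time at which the previous snapshot was taken.
        'source_ref_time': source_ref_time.isoformat(),
        'new': sorted(current - previous),
        'del': sorted(previous - current),
    }


snapshot_time = datetime(2020, 1, 1, tzinfo=timezone.utc)
updates = diff_snapshots(['a', 'b'], ['b', 'c'], snapshot_time)
```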
-
id
¶ Return the unique identifier for this data store.
-
invalidate
()[source] A data store might use a cached list of available datasets, which can change over time. The resources managed by a data store are external, so they may be updated by other processes. This method asks the data store to invalidate its internal structure and synchronize it with the current status.
-
is_local
Whether this is a local data store that does not require an internet connection when its query() method is called or when the open_dataset() and make_local() methods are called on one of its data sources.
-
notices
¶ Return an optional list of notices for this data store that can be used to inform users about the conventions, standards, and data extent used in this data store or upcoming service outages.
-
query
(ds_id: str = None, query_expr: str = None, monitor: cate.util.monitor.Monitor = Monitor.NONE) → Sequence[cate.core.ds.DataSource][source] Retrieve data sources in this data store using the given constraints.
Return type: Sequence[cate.core.ds.DataSource]
Parameters:
- ds_id (str) – Data source identifier.
- query_expr (str) – Query expression which may be used if ds_id is unknown.
- monitor (Monitor) – A progress monitor.
Returns: Sequence of data sources.
-
title
Return a human-readable title for this data store.
-
class
cate.core.ds.
DataStoreNotice
(id: str, title: str, content: str, intent: str = None, icon: str = None)[source]¶ A short notice that can be exposed to users by data stores.
-
exception
cate.core.ds.
NetworkError
[source] Exceptions produced by Cate’s data store and data source instances, used to report any problems with the network or cases in which an endpoint could not be found or reached.
-
cate.core.ds.
find_data_sources
(data_stores: Union[cate.core.ds.DataStore, Sequence[cate.core.ds.DataStore]] = None, ds_id: str = None, query_expr: str = None) → Sequence[cate.core.ds.DataSource][source] Find data sources in the given data store(s) matching the given ds_id or query_expr.
See also open_dataset().
Return type: Sequence[cate.core.ds.DataSource]
Parameters:
- data_stores – If given, these data stores will be queried. Otherwise all registered data stores will be used.
- ds_id (str) – A data source identifier.
- query_expr (str) – A query expression.
Returns: All data sources matching the given constraints.
-
cate.core.ds.
find_data_sources_update
(data_stores: Union[cate.core.ds.DataStore, Sequence[cate.core.ds.DataStore]] = None) → Dict[source] Find differences in the list of data sources of the given data store(s), or of all registered data stores if None is given. The updates are returned as a dictionary whose keys are data store IDs. Each value is itself a dictionary containing the lists of 'new' and 'del' (removed) datasets.
Return type: Dict
Parameters: data_stores – The data store(s) to be checked. If None, all registered data stores are checked.
Returns: A dictionary indexed by data store ID. Each value is a second dictionary with the updates sorted into 'new' and 'del' data sources, plus 'source_ref_time', which is the time of the snapshot used to compare the data source lists.
-
cate.core.ds.
format_cached_datasets_coverage_string
(cache_coverage: dict) → str[source] Return a textual representation of information about cached, locally available data sets. Useful for CLI / REPL applications.
Return type: str
Parameters: cache_coverage (dict)
-
cate.core.ds.
format_variables_info_string
(variables: dict)[source] Return some textual information about the variables contained in this data source. Useful for CLI / REPL applications.
Parameters: variables (dict)
-
cate.core.ds.
get_ext_chunk_sizes
(ds: xarray.Dataset, dim_names: Set[str] = None, init_value=0, map_fn=<built-in function max>, reduce_fn=None) → Dict[str, int][source] Get the external chunk sizes for each dimension of a dataset as provided in a variable’s encoding object.
Return type: Dict[str, int]
Parameters:
- ds – The dataset.
- dim_names (Set) – The names of dimensions of data variables whose external chunking should be collected.
- init_value (int) – The initial value (not necessarily a chunk size) for mapping multiple different chunk sizes.
- map_fn – The mapper function that maps a chunk size from a previous (initial) value.
- reduce_fn – The reducer function that reduces multiple mapped chunk sizes to a single one.
Returns: A mapping from dimension name to external chunk sizes.
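One possible reading of the init_value, map_fn, and reduce_fn parameters is sketched below on plain Python data instead of an xarray dataset. The exact semantics in Cate are not spelled out above, so this is an assumption: map_fn folds each chunk size into the value collected so far for a dimension (starting from init_value), and reduce_fn, if given, post-processes the collected value per dimension. The name `collect_chunk_sizes` and the tuple-based input are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence, Set, Tuple

# Each entry stands in for one variable: (dimension names, external chunk sizes).
VarChunks = Tuple[Sequence[str], Sequence[int]]


def collect_chunk_sizes(variables: List[VarChunks],
                        dim_names: Optional[Set[str]] = None,
                        init_value=0,
                        map_fn: Callable = max,
                        reduce_fn: Optional[Callable] = None) -> Dict[str, int]:
    sizes: Dict[str, int] = {}
    for dims, chunks in variables:
        if len(dims) != len(chunks):
            continue  # the chunking does not describe every dimension
        if dim_names and not dim_names.issubset(dims):
            continue  # the variable lacks one of the requested dimensions
        for dim, size in zip(dims, chunks):
            # Fold the chunk size into the value collected so far.
            sizes[dim] = map_fn(size, sizes.get(dim, init_value))
    if reduce_fn is not None:
        sizes = {dim: reduce_fn(value) for dim, value in sizes.items()}
    return sizes


variables = [
    (('time', 'lat', 'lon'), (1, 180, 360)),
    (('time', 'lat', 'lon'), (1, 90, 720)),
]
chunk_sizes = collect_chunk_sizes(variables, dim_names={'lat', 'lon'})
```

With the defaults (`init_value=0`, `map_fn=max`), the sketch keeps the largest chunk size seen per dimension.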
-
cate.core.ds.
get_spatial_ext_chunk_sizes
(ds_or_path: Union[xarray.Dataset, str]) → Dict[str, int][source] Get the spatial, external chunk sizes for the latitude and longitude dimensions of a dataset as provided in a variable’s encoding object.
Return type: Dict
Parameters: ds_or_path – An xarray dataset or a path to a file that can be opened by xarray.
Returns: A mapping from dimension name to external chunk sizes.
-
cate.core.ds.
open_dataset
(data_source: Union[cate.core.ds.DataSource, str], time_range: Union[Tuple[str, str], Tuple[datetime.datetime, datetime.datetime], Tuple[datetime.date, datetime.date], str] = None, region: Union[shapely.geometry.Polygon, List[Tuple[float, float]], str, Tuple[float, float, float, float]] = None, var_names: Union[List[str], str] = None, force_local: bool = False, local_ds_id: str = None, monitor: cate.util.monitor.Monitor = Monitor.NONE) → Any[source] Open a dataset from a data source.
Parameters:
- data_source – A DataSource object or a string. Strings are interpreted as the identifier of an ECV dataset and must not be empty.
- time_range – An optional time constraint comprising start and end date. If given, it must be a TimeRangeLike.
- region – An optional region constraint. If given, it must be a PolygonLike.
- var_names – Optional names of variables to be included. If given, it must be a VarNamesLike.
- force_local (bool) – Optional, for remote data sources only. Whether to make a local copy of the data source if it is not present.
- local_ds_id (str) – Optional, for remote data sources only. The local data source ID for the newly created copy of the remote data source.
- monitor (Monitor) – A progress monitor.
Returns: A new dataset instance
-
cate.core.ds.
open_xarray_dataset
(paths, region: Union[shapely.geometry.Polygon, List[Tuple[float, float]], str, Tuple[float, float, float, float]] = None, var_names: Union[List[str], str] = None, monitor: cate.util.monitor.Monitor = Monitor.NONE, **kwargs) → xarray.Dataset[source] Open multiple files as a single dataset. This uses dask. If each individual file of the dataset is small, one dask chunk will coincide with one temporal slice, e.g. the whole array in the file. Otherwise smaller dask chunks will be used to split the dataset.
Parameters:
- paths – Either a string glob in the form “path/to/my/files/*.nc” or an explicit list of files to open.
- region – Optional region constraint.
- var_names – Optional variable names constraint.
- monitor (Monitor) – Optional progress monitor.
- kwargs – Keyword arguments directly passed to xarray.open_mfdataset()
9.2. Module cate.core.op
9.2.1. Description
This module provides classes and functions for maintaining operations. Operations can be called from the Cate command-line interface, may be referenced from within processing workflows, or may be called remotely, e.g. from a graphical user interface or web frontend. An operation (Operation) comprises a Python callable and some additional meta-information (OpMetaInfo) that allows for automatic input validation, input value conversion, monitoring, and inter-connection of multiple operations using processing workflows and steps.
Operations are registered in operation registries (OpRegistry). The default operation registry is accessible via the global, read-only OP_REGISTRY variable.
9.2.2. Technical Requirements
Operation registration, lookup, and invocation
Description: | Maintain a central place in the software that manages the available operations such as data processors, data converters, analysis functions, etc. Operations can be added, removed and retrieved. Operations are designed to be executed by the framework in a controlled way, i.e. an operation’s task can be monitored and cancelled, and its input and output values can be validated w.r.t. the operation’s meta-information. |
---|---|
URD-Sources: |
|
Exploit Python language features
Description: | Exploit Python language to let API users express an operation in an intuitive form. For the framework API,
stay with Python base types as far as possible instead of introducing a number of new data structures.
Let the framework derive meta information such as names, types and documentation for the operation, its inputs,
and its outputs from the user’s Python code.
It shall be possible to register any Python-callable of the form f(*args, **kwargs) as an operation. |
---|
Add extra meta-information to operations
Description: | Initial operation meta-information will be derived from Python code introspection. It shall include the user function’s docstring and information about the arguments and its return values, exploiting any type annotations. For example, the following properties can be associated with input arguments: data type, default value, value set, valid range, whether it is mandatory or optional, and the expected dataset schema so that operations can be ECV-specific. Meta-information is required to let an operation explain itself when used in an (IPython) REPL or when a web service is requested to respond with an operation’s capabilities. API users shall be able to extend the initial meta-information derived from Python code. |
---|---|
URD-Source: |
|
Static annotation vs. dynamic, programmatic registration
Description: | Operation registration and meta-information extension shall also be done by operation class / function decorators. The API shall provide a simple set of dedicated decorators that API users attach to their operations. They will automatically register the user function as an operation and add any extra meta-information. |
---|
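Decorator-based registration as required above can be sketched in a few lines. The sketch is self-contained and does not use Cate: `SKETCH_REGISTRY` and `sketch_op` are hypothetical names, and storing the meta-information as a plain dictionary under a module-qualified key is an assumption made for illustration.

```python
from typing import Callable, Dict

# Hypothetical registry: maps qualified names to the callable and its header.
SKETCH_REGISTRY: Dict[str, dict] = {}


def sketch_op(registry: Dict[str, dict] = SKETCH_REGISTRY, **properties) -> Callable:
    """Return a decorator that registers a function and records extra properties."""

    def decorator(func: Callable) -> Callable:
        key = f'{func.__module__}.{func.__qualname__}'
        # The description is derived from the docstring by introspection.
        header = {'description': (func.__doc__ or '').strip()}
        # Explicitly given properties extend or override derived ones.
        header.update(properties)
        registry[key] = {'callable': func, 'header': header}
        return func  # the function itself stays directly callable

    return decorator


@sketch_op(version='1.0', tags=['arithmetic'])
def add(x: float, y: float) -> float:
    """Add two numbers."""
    return x + y
```

Returning the undecorated function keeps user code callable as ordinary Python while the registry holds the extra meta-information on the side.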
Operation monitoring
Description: | Operation registration should recognise an optional monitor argument of a user function:
f(*args, monitor=Monitor.NONE, **kwargs). In this case, a monitor (of type Monitor)
will be passed by the framework to the user function in order to observe the progress and to cancel an operation. |
---|
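Recognising an optional monitor argument can be done with signature introspection. The following is a minimal sketch under stated assumptions: `SketchMonitor` is a hypothetical stand-in for Cate's Monitor type with a single made-up `progress` method, and `invoke` is an illustrative helper, not a framework function.

```python
import inspect
from typing import Callable


class SketchMonitor:
    """Hypothetical progress observer that counts the work units reported."""

    def __init__(self):
        self.worked = 0

    def progress(self, work: int) -> None:
        self.worked += work


def accepts_monitor(func: Callable) -> bool:
    """Return True if func declares a parameter named 'monitor'."""
    return 'monitor' in inspect.signature(func).parameters


def invoke(func: Callable, monitor: SketchMonitor, *args, **kwargs):
    # Pass the monitor only to functions that declare it.
    if accepts_monitor(func):
        kwargs['monitor'] = monitor
    return func(*args, **kwargs)


def slow_sum(values, monitor=None):
    total = 0
    for value in values:
        total += value
        if monitor is not None:
            monitor.progress(1)
    return total


def plain_sum(values):
    return sum(values)


monitor = SketchMonitor()
result = invoke(slow_sum, monitor, [1, 2, 3])
```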
9.2.3. Verification
The module’s unit-tests are located in
test/test_op.py and may be executed using
$ py.test test/test_op.py --cov=cate/core/op.py
for extra code coverage information.
9.2.4. Components
-
cate.core.op.
OP_REGISTRY
= OP_REGISTRY¶ The default operation registry of type
cate.core.op.OpRegistry
.
-
class
cate.core.op.
OpRegistry
[source]¶ An operation registry allows for addition, removal, and retrieval of operations.
-
add_op
(operation: Callable, fail_if_exists=True, replace_if_exists=False) → cate.core.op.Operation[source] Add a new operation registration.
Return type: cate.core.op.Operation
Parameters:
- operation (Callable) – An operation object such as a class or any callable.
- fail_if_exists (bool) – raise ValueError if the operation was already registered
- replace_if_exists (bool) – replaces an existing operation if fail_if_exists is False
Returns: a new or existing cate.core.op.Operation
-
get_op
(operation, fail_if_not_exists=False) → cate.core.op.Operation[source] Get an operation registration.
Return type: cate.core.op.Operation
Parameters:
- operation – A fully qualified operation name or operation object such as a class or any callable.
- fail_if_not_exists (bool) – raise ValueError if no such operation was found
Returns: a cate.core.op.Operation object or None if fail_if_not_exists is False.
-
get_op_key
(operation: Union[str, Callable])[source]¶ Get a key under which the given operation will be registered.
Parameters: operation – A fully qualified operation name or a callable object
Returns: The operation key
-
op_registrations
Get all operation registrations of type cate.core.op.Operation.
Returns: a mapping of fully qualified operation names to operation registrations
-
remove_op
(operation: Callable, fail_if_not_exists=False) → Optional[cate.core.op.Operation][source] Remove an operation registration.
Parameters:
- operation (Callable) – A fully qualified operation name or operation object such as a class or any callable.
- fail_if_not_exists (bool) – raise ValueError if no such operation was found
Returns: the removed cate.core.op.Operation object or None if fail_if_not_exists is False.
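The add, get, and remove semantics documented for OpRegistry can be sketched with a small dictionary-backed class. This is not the Cate implementation: `SketchOpRegistry` is a hypothetical name, the sketch stores the callable itself rather than an Operation wrapper, and deriving the key from module and qualified name is an assumption.

```python
from typing import Callable, Dict, Optional, Union


class SketchOpRegistry:
    """Hypothetical registry mirroring the documented flags of OpRegistry."""

    def __init__(self):
        self._ops: Dict[str, Callable] = {}

    @staticmethod
    def get_op_key(operation: Union[str, Callable]) -> str:
        # Assumption: callables are keyed by module and qualified name.
        if isinstance(operation, str):
            return operation
        return f'{operation.__module__}.{operation.__qualname__}'

    def add_op(self, operation: Callable,
               fail_if_exists: bool = True,
               replace_if_exists: bool = False) -> Callable:
        key = self.get_op_key(operation)
        if key in self._ops:
            if fail_if_exists:
                raise ValueError(f'operation already registered: {key}')
            if not replace_if_exists:
                return self._ops[key]  # keep and return the existing one
        self._ops[key] = operation
        return operation

    def get_op(self, operation, fail_if_not_exists: bool = False) -> Optional[Callable]:
        key = self.get_op_key(operation)
        if key not in self._ops and fail_if_not_exists:
            raise ValueError(f'operation not registered: {key}')
        return self._ops.get(key)

    def remove_op(self, operation, fail_if_not_exists: bool = False) -> Optional[Callable]:
        key = self.get_op_key(operation)
        if key not in self._ops and fail_if_not_exists:
            raise ValueError(f'operation not registered: {key}')
        return self._ops.pop(key, None)


def double(x):
    return 2 * x


op_registry = SketchOpRegistry()
op_registry.add_op(double)
```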
-
-
class
cate.core.op.
Operation
(wrapped_op: Callable, op_meta_info=None)[source]¶ An Operation comprises a wrapped callable (e.g. function, constructor, lambda form) and additional meta-information about the wrapped operation itself and its inputs and outputs.
Parameters: - wrapped_op – some callable object that will be wrapped.
- op_meta_info – operation meta information.
-
op_meta_info
¶ Returns: Meta-information about the operation, see cate.core.op.OpMetaInfo
.
-
wrapped_op
¶ Returns: The actual operation object which may be any callable.
-
cate.core.op.
new_expression_op
(op_meta_info: cate.util.opmetainf.OpMetaInfo, expression: str) → cate.core.op.Operation[source] Create an operation that wraps a Python expression.
Return type: cate.core.op.Operation
Parameters:
- op_meta_info (OpMetaInfo) – Meta-information about the resulting operation and the operation’s inputs and outputs.
- expression (str) – The Python expression. May refer to any name given in op_meta_info.input.
Returns: The Python expression wrapped into an operation.
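Wrapping a Python expression so that it can refer to named inputs might look like the sketch below. It does not use Cate: `make_expression_op` is a hypothetical helper, and the list of input names stands in for the inputs that would be declared in op_meta_info.input.

```python
from typing import Callable, Sequence


def make_expression_op(input_names: Sequence[str], expression: str) -> Callable:
    """Wrap a Python expression into a callable that takes the named inputs."""
    code = compile(expression, '<expression>', 'eval')
    allowed = set(input_names)

    def run(**inputs):
        unknown = set(inputs) - allowed
        if unknown:
            raise ValueError(f'unknown inputs: {sorted(unknown)}')
        # Builtins are hidden so that only the inputs are visible as names.
        # This keeps the namespace small; it is not a security sandbox.
        return eval(code, {'__builtins__': {}}, dict(inputs))

    return run


scale = make_expression_op(['x', 'factor'], 'x * factor + 1')
scaled = scale(x=2, factor=3)
```

Compiling the expression once means that syntax errors surface when the operation is created, not when it is first invoked.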
-
cate.core.op.
new_subprocess_op
(op_meta_info: cate.util.opmetainf.OpMetaInfo, command_pattern: str, run_python: bool = False, cwd: Optional[str] = None, env: Dict[str, str] = None, shell: bool = False, started: Union[str, Callable] = None, progress: Union[str, Callable] = None, done: Union[str, Callable] = None) → cate.core.op.Operation[source] Create an operation for a child program run in a new process.
Return type: cate.core.op.Operation
Parameters:
- op_meta_info (OpMetaInfo) – Meta-information about the resulting operation and the operation’s inputs and outputs.
- command_pattern (str) – A pattern that will be interpolated to obtain the actual command to be executed. May contain “{input_name}” fields which will be replaced by the actual input value converted to text. input_name must refer to a valid operation input name in op_meta_info.input or it must be the value of either the “write_to” or “read_from” property of another input’s property map.
- run_python (bool) – If True, command_pattern refers to a Python script which will be executed with the Python interpreter that Cate uses.
- cwd – Current working directory to run the command line in.
- env (Dict) – Environment variables passed to the shell that executes the command line.
- shell (bool) – Whether to use the shell as the program to execute.
- started – Either a callable that receives a text line from the executable’s stdout and returns a tuple (label, total_work) or a regex that must match in order to signal the start of progress monitoring. The regex must provide the group names “label” or “total_work” or both, e.g. “(?P<label>\w+)” or “(?P<total_work>\d+)”.
- progress – Either a callable that receives a text line from the executable’s stdout and returns a tuple (work, msg) or a regex that must match in order to signal progress. The regex must provide the group names “work” or “msg” or both, e.g. “(?P<msg>\w+)” or “(?P<work>\d+)”.
- done – Either a callable that receives a text line from the executable’s stdout and returns True or False or a regex that must match in order to signal the end of progress monitoring.
Returns: The executable wrapped into an operation.
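The command_pattern interpolation and the regex form of the progress parameter can be illustrated with the standard library alone. This sketch only shows the two string-handling steps and starts no process. The helper names, the `convert` command, and the output line format `40% resampling` are made up for the example.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical progress pattern with the named groups "work" and "msg".
PROGRESS = re.compile(r'(?P<work>\d+)% (?P<msg>\w+)')


def interpolate(command_pattern: str, inputs: Dict[str, object]) -> str:
    """Replace "{input_name}" fields by the input values converted to text."""
    return command_pattern.format(**inputs)


def parse_progress(line: str) -> Optional[Tuple[int, str]]:
    """Return (work, msg) if the line signals progress, else None."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    return int(match.group('work')), match.group('msg')


command = interpolate('convert {in_file} --level {level}',
                      {'in_file': 'a.nc', 'level': 3})
progress = parse_progress('step 40% resampling')
```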
-
cate.core.op.
op
(tags=UNDEFINED, version=UNDEFINED, res_pattern=UNDEFINED, deprecated=UNDEFINED, registry=OP_REGISTRY, **properties)[source] op is a decorator function that registers a Python function or class in the default operation registry or the one given by registry, if any. Any other keyword arguments in properties are added to the operation’s meta-information header. Classes annotated by this decorator must have callable instances.
When a function is registered, an introspection is performed. During this process, the initial value of the meta-information header property description is derived from the function’s docstring.
If any output of this operation will have its history information automatically updated, there should be version information found in the operation header. Thus it’s always a good idea to add it to all operations:
@op(version='X.x')
Parameters:
- tags – An optional list of string tags.
- version – An optional version string.
- res_pattern – An optional pattern that will be used to generate the names for data resources that are used to hold a reference to the objects returned by the operation and that are cached in a Cate workspace. Currently, the only pattern variable that is supported and that must be present is {index}, which will be replaced by an integer number that is guaranteed to produce a unique resource name.
- deprecated – An optional boolean or a string. If a string is used, it should explain why the operation has been deprecated and which new operation to use instead. If set to True, the operation’s doc-string should explain the deprecation.
- registry – The operation registry.
- properties – Other properties (keyword arguments) that will be added to the meta-information of the operation.
-
cate.core.op.
op_input
(input_name: str, default_value=UNDEFINED, units=UNDEFINED, data_type=UNDEFINED, nullable=UNDEFINED, value_set_source=UNDEFINED, value_set=UNDEFINED, value_range=UNDEFINED, script_lang=UNDEFINED, deprecated=UNDEFINED, position=UNDEFINED, context=UNDEFINED, registry=OP_REGISTRY, **properties)[source] op_input is a decorator function that provides meta-information for an operation input identified by input_name. If the decorated function or class is not registered as an operation yet, it is added to the default operation registry or the one given by registry, if any.
When a function is registered, an introspection is performed. During this process, initial operation meta-information input properties are derived for each positional and keyword argument named input_name:
Derived property | Source |
---|---|
position | The position of a positional argument, e.g. 2 for input z in def f(x, y, z, c=2). |
default_value | The value of a keyword argument, e.g. 52.3 for input latitude from argument definition latitude:float=52.3 |
data_type | The type annotation type, e.g. float for input latitude from argument definition latitude:float |
The derived properties listed above plus any of value_set, value_range, and any key-value pairs in properties are added to the input’s meta-information. A key-value pair in properties will always overwrite the derived properties listed above.
Parameters:
- input_name (str) – The name of an input.
- default_value – A default value.
- units – The geo-physical units of the input value.
- data_type – The data type of the input values. If not given, the type of any given, non-None default_value is used.
- nullable – If True, the value of the input may be None. If not given, it will be set to True if the default_value is None.
- value_set_source – The name of an input, which can be used to generate a dynamic value set.
- value_set – A sequence of the valid values. Note that all values in this sequence must be compatible with data_type.
- value_range – A sequence specifying the possible range of valid values.
- script_lang – The programming language for a parameter of data_type “str” that provides source code of a script, e.g. “python”.
- deprecated – An optional boolean or a string. If a string is used, it should explain why the input has been deprecated and which new input to use instead. If set to True, the input’s doc-string should explain the deprecation.
- position – The zero-based position of an input.
- context – If True, the value of the operation input will be a dictionary representing the current execution context. For example, when the operation is executed from a workflow, the dictionary will hold at least three entries: workflow provides the current workflow, step is the currently executed step, and value_cache is a mapping from step identifiers to step outputs. If context is a string, the value of the operation input will be the result of evaluating the string as a Python expression with the current execution context as local environment. This means that context may be an expression such as ‘value_cache’, ‘workspace.base_dir’, ‘step’, ‘step.id’.
- properties – Other properties (keyword arguments) that will be added to the meta-information of the named input.
- registry – Optional operation registry.
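The derivation of position, default_value, and data_type described for op_input can be reproduced with the standard inspect module. This is a sketch of the introspection step only, not Cate's code: `derive_input_properties` is a hypothetical helper, and reporting position only for arguments without a default is an assumption.

```python
import inspect
from typing import Callable, Dict


def derive_input_properties(func: Callable) -> Dict[str, dict]:
    """Derive initial input properties from a function signature."""
    inputs = {}
    parameters = inspect.signature(func).parameters
    for position, (name, param) in enumerate(parameters.items()):
        props = {}
        if param.default is inspect.Parameter.empty:
            # Argument without a default: record its zero-based position.
            props['position'] = position
        else:
            props['default_value'] = param.default
        if param.annotation is not inspect.Parameter.empty:
            props['data_type'] = param.annotation
        inputs[name] = props
    return inputs


def f(x, y, z, latitude: float = 52.3):
    return x + y + z + latitude


inputs = derive_input_properties(f)
```

Here `inputs['z']` carries position 2 and `inputs['latitude']` carries the default value 52.3 and the data type float, matching the examples in the table for op_input.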
-
cate.core.op.
op_output
(output_name: str, data_type=UNDEFINED, deprecated=UNDEFINED, registry=OP_REGISTRY, **properties)[source] op_output is a decorator function that provides meta-information for an operation output identified by output_name. If the decorated function or class is not registered as an operation yet, it is added to the default operation registry or the one given by registry, if any.
If your function does not return multiple named outputs, use the op_return() decorator function. Note that:
@op_return(...)
def my_func(...): ...
is equivalent to:
@op_output('return', ...)
def my_func(...): ...
To automatically add information about cate, its version, this operation and its inputs, to this output, set ‘add_history’ to True:
@op_output('name', add_history=True)
Note that the operation should have version information added to it when add_history is True:
@op(version='X.x')
Parameters:
- output_name (str) – The name of the output.
- data_type – The data type of the output value.
- deprecated – An optional boolean or a string. If a string is used, it should explain why the output has been deprecated and which new output to use instead. If set to True, the output’s doc-string should explain the deprecation.
- properties – Other properties (keyword arguments) that will be added to the meta-information of the named output.
- registry – Optional operation registry.
-
cate.core.op.
op_return
(data_type=UNDEFINED, registry=OP_REGISTRY, **properties)[source]¶ op_return
is a decorator function that provides meta-information for a single, anonymous operation return value (whose output name is"return"
). If the decorated function or class is not registered as an operation yet, it is added to the default operation registry or the one given by registry, if any. Any other keywords arguments in properties are added to the output’s meta-information.When a function is registered, an introspection is performed. During this process, initial operation meta-information output properties are derived from the function’s return type annotation, that is data_type will be e.g.
float
if a function is annotated asdef f(x, y) -> float: ...
.The derived data_type property and any key-value pairs in properties are added to the output’s meta-information. A key-value pair in properties will always overwrite a derived data_type.
If your function returns multiple named outputs, use the op_output() decorator function. Note that:
@op_return(...)
def my_func(...): ...
is equivalent to:
@op_output('return', ...)
def my_func(...): ...
To automatically add information about cate, its version, this operation and its inputs, to this output, set ‘add_history’ to True:
@op_return(add_history=True)
Note that the operation should have version information added to it when add_history is True:
@op(version='X.x')
Parameters: - data_type – The data type of the return value.
- properties – Other properties (keyword arguments) that will be added to the meta-information of the return value.
- registry – The operation registry.
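The derivation of data_type from a return annotation, described above for op_return, can be illustrated with a minimal, self-contained sketch. It does not use Cate's registry; the helper derive_output_meta and the UNDEFINED sentinel below are hypothetical stand-ins, and the precedence of an explicit data_type over the annotation is an assumption.

```python
import inspect

UNDEFINED = object()  # hypothetical sentinel meaning "no data type given"


def derive_output_meta(func, data_type=UNDEFINED, **properties):
    """Build meta-information for the single output named 'return'."""
    annotation = inspect.signature(func).return_annotation
    meta = {}
    # Derive data_type from the return annotation, if there is one.
    if annotation is not inspect.Signature.empty:
        meta["data_type"] = annotation
    # Assumption: an explicitly given data_type overwrites the derived one.
    if data_type is not UNDEFINED:
        meta["data_type"] = data_type
    # Remaining keyword properties are added last and may overwrite.
    meta.update(properties)
    return {"return": meta}


def f(x, y) -> float:
    return x * y


print(derive_output_meta(f))
print(derive_output_meta(f, data_type=int, units="m"))
```

A function without a return annotation simply yields no data_type entry in this sketch.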
9.3. Module cate.core.workflow
¶
9.3.1. Description¶
Provides classes that are used to construct processing workflows (networks, directed acyclic graphs) from processing steps including Python callables, Python expressions, external processes, and other workflows.
This module provides the following data types:
- A Node has zero or more inputs and zero or more outputs and can be invoked.
- A Workflow is a Node that is composed of Step objects.
- A Step is a Node that is part of a Workflow and performs some kind of data processing.
- An OpStep is a Step that invokes a Python operation (any callable).
- An ExpressionStep is a Step that executes a Python expression string.
- A WorkflowStep is a Step that executes a Workflow loaded from an external (JSON) resource.
- A NodePort belongs to exactly one Node. Node ports represent both the named inputs and outputs of a node. A node port has a name, a property source, and a property value. If source is set, it must be another NodePort that provides the actual port’s value. The value of the value property can be basically anything that has an external (JSON) representation.
Workflow input ports are usually unspecified, but value may be set.
Workflow output ports and a step’s input ports are usually connected with output ports of other contained steps or inputs of the workflow via the source attribute.
A step’s output ports are usually unconnected because their value attribute is set by a step’s concrete implementation.
Step node inputs and workflow outputs are indicated in the input specification of a node’s external JSON representation:
- {"source": "NODE_ID.PORT_NAME"}: the output (or input) named PORT_NAME of another node given by NODE_ID.
- {"source": ".PORT_NAME"}: the current step’s output (or input) named PORT_NAME, or that of any of its parents.
- {"source": "NODE_ID"}: the one and only output of a workflow or of one of its nodes given by NODE_ID.
- {"value": NUM|STR|LIST|DICT|null}: a constant (JSON) value.
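The three forms of a source reference can be told apart with a few lines of string handling. The following is only a sketch: parse_source_ref is a hypothetical helper, the SourceRef tuple is redefined locally as a stand-in for the one documented in this module, and it assumes that node IDs contain no dots.

```python
from collections import namedtuple

# Local stand-in for the (node_id, port_name) tuple documented in this module.
SourceRef = namedtuple("SourceRef", ["node_id", "port_name"])


def parse_source_ref(source):
    """Split a source reference string into (node_id, port_name).

    A part that has to be resolved from context is returned as None.
    """
    if not source or source == ".":
        raise ValueError("source reference must not be empty")
    # Assumption: the first dot separates the node ID from the port name.
    node_id, sep, port_name = source.partition(".")
    if sep and not port_name:
        raise ValueError("missing port name in %r" % source)
    return SourceRef(node_id or None, port_name or None)


print(parse_source_ref("op1.result"))  # node and port given
print(parse_source_ref(".lat"))        # port of the current step or a parent
print(parse_source_ref("op1"))         # the one and only output of op1
```

Resolving the returned tuple against actual nodes and ports is a separate step, which update_source() is documented to perform.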
Workflows are callable by the CLI in the same way as single operations. The command line form for calling an operation is currently:
cate run OP|WORKFLOW [ARGS]
Where OP is a registered operation and WORKFLOW is a JSON file containing a JSON workflow representation.
9.3.2. Technical Requirements¶
Combine processors and other operations to create operation chains or processing graphs
Description: | Provide the means to connect multiple processing steps, which may be registered operations, operating system calls, or remote service invocations. |
---|---|
URD-Sources: |
|
Integration of external, ECV-specific programs
Description: | Some processing steps can only be performed by executing an external tool. Therefore, a special workflow step shall allow for the invocation of external programs, thereby mapping input values to program arguments, and program outputs to step outputs. It shall also be possible to monitor the state of the running sub-process. |
---|---|
URD-Source: |
|
Programming language neutral representation
Description: | Processing graphs must be representable in a programming language neutral representation such as XML, JSON, YAML, so they can be designed by non-programmers and can be easily serialised, e.g. for communication with a web service. |
---|---|
URD-Source: |
|
9.3.3. Verification¶
The module’s unit-tests are located in test/test_workflow.py and may be executed using $ py.test test/test_workflow.py --cov=cate/core/workflow.py for extra code coverage information.
9.3.4. Components¶
-
class
cate.core.workflow.
ExpressionStep
(expression: str, inputs=None, outputs=None, node_id=None)[source]¶ An
ExpressionStep
is a step node that computes its output from a simple (Python) expression string.
Parameters:
- expression – A simple (Python) expression string.
- inputs – input name to input properties mapping.
- outputs – output name to output properties mapping.
- node_id – A node ID. If None, an ID will be generated.
-
class
cate.core.workflow.
NoOpStep
(inputs: dict = None, outputs: dict = None, node_id: str = None)[source]¶ A
NoOpStep
“performs” a no-op, which basically means that it does nothing. However, it might still be useful to define a step that duplicates or renames output values by connecting its own output ports with any of its own input ports. In other cases it might be useful to have a NoOpStep as a placeholder or black box for some other real operation that will be put into place at a later point in time.
Parameters:
- inputs – input name to input properties mapping.
- outputs – output name to output properties mapping.
- node_id – A node ID. If None, an ID will be generated.
-
class
cate.core.workflow.
Node
(op_meta_info: cate.util.opmetainf.OpMetaInfo, node_id: str = None)[source]¶ Base class for all nodes including parent nodes (e.g.
Workflow
) and child nodes (e.g. Step).
All nodes have inputs and outputs, and can be invoked to perform some operation.
Inputs and outputs are exposed as attributes of the input and output properties and are both of type NodePort.
Parameters: node_id – A node ID. If None, a name will be generated.
-
call
(context: Dict = None, monitor=Monitor.NONE, input_values: Dict = None)[source]¶ Calls this workflow with given input_values and returns the result.
The method does the following:
1. Set default_value where input values are missing in input_values.
2. Validate the input_values using this workflow’s meta-info.
3. Set this workflow’s input port values.
4. Invoke this workflow with the given context and monitor.
5. Get this workflow’s output port values. Named outputs will be returned as a dictionary.
Returns: The output values.
-
collect_predecessors
(predecessors: List[Node], excludes: List[Node] = None)[source]¶ Collect this node (self) and preceding nodes in predecessors.
-
find_node
(node_id) → Optional[cate.core.workflow.Node][source]¶ Find a (child) node with the given node_id.
-
find_port
(name) → Optional[cate.core.workflow.NodePort][source]¶ Find the port with the given name. Output ports are searched first, then input ports.
Parameters: name – The port name.
Returns: The port, or None if it couldn’t be found.
-
id
¶ The node’s identifier.
-
inputs
¶ The node’s inputs.
-
invoke
(context: Dict = None, monitor: cate.util.monitor.Monitor = Monitor.NONE) → None[source]¶ Invoke this node’s underlying operation with input values from input. Output values in output will be set from the underlying operation’s return value(s).
Parameters:
- context (Dict) – An optional execution context.
- monitor (Monitor) – An optional progress monitor.
-
max_distance_to
(other_node: cate.core.workflow.Node) → int[source]¶ If other_node is a source of this node, return the number of connections from this node to other_node: 1 if it is a direct source, 2 if it is a source of a source of this node, and so on. If other_node is this node, return 0. If other_node is not a source of this node, return -1.
Return type: int
Parameters: other_node – The other node.
Returns: The distance to other_node.
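The return values listed for max_distance_to can be reproduced on a plain dictionary of source links. This is a sketch only: it assumes that "max" means the longest chain of connections, that the graph is acyclic, and it uses a hypothetical free function on node names rather than Node objects.

```python
def max_distance_to(sources, node, other):
    """Longest chain of source links leading from `node` back to `other`.

    `sources` maps a node name to the names of its direct sources.
    Returns 0 if both names are equal and -1 if `other` is not a source.
    """
    if node == other:
        return 0
    best = -1
    for src in sources.get(node, ()):
        distance = max_distance_to(sources, src, other)
        if distance >= 0:
            # One more connection than the distance from the direct source.
            best = max(best, distance + 1)
    return best


# "c" reads from "a" directly and also via "b".
sources = {"a": [], "b": ["a"], "c": ["a", "b"]}
print(max_distance_to(sources, "c", "a"))
print(max_distance_to(sources, "a", "c"))
```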
-
op_meta_info
¶ The node’s operation meta-information.
-
outputs
¶ The node’s outputs.
-
parent_node
¶ The node’s parent node or
None
if this node has no parent.
-
requires
(other_node: cate.core.workflow.Node) → bool[source]¶ Does this node require other_node for its computation? Is other_node a source of this node?
Return type: bool
Parameters: other_node – The other node.
Returns: True if this node is a target of other_node.
-
root_node
¶ The root node.
-
set_id
(node_id: str) → None[source]¶ Set the node’s identifier.
Parameters: node_id (str) – The new node identifier. Must be unique within a workflow.
-
-
class
cate.core.workflow.
NodePort
(node: cate.core.workflow.Node, name: str)[source]¶ Represents a named input or output port of a
Node
.-
to_json
(force_dict=False)[source]¶ Return a JSON-serializable dictionary representation of this object.
Returns: A JSON-serializable dictionary
-
update_source
()[source]¶ Resolve this node port’s source reference, if any.
If the source reference has the form node-id.port-name then node-id must be the ID of the workflow or any contained step and port-name must be a name either of one of its input or output ports.
If the source reference has the form .port-name then node-id will refer to either the current step or any of its parent nodes that contains an input or output named port-name.
If the source reference has the form node-id then node-id must be the ID of the workflow or any contained step which has exactly one output.
If node-id refers to a workflow, then port-name is resolved first against the workflow’s inputs followed by its outputs. If node-id refers to a workflow’s step, then port-name is resolved first against the step’s outputs followed by its inputs.
Raises: ValueError – if the source reference is invalid.
-
update_source_node_id
(node: cate.core.workflow.Node, old_node_id: str) → None[source]¶ A node identifier has changed so we update the source references and clear the source of input and output ports from old_node_id to node.id.
Parameters:
- node (Node) – The node whose identifier changed.
- old_node_id (str) – The former node identifier.
-
-
class
cate.core.workflow.
OpStep
(operation, node_id: str = None, registry=OP_REGISTRY)[source]¶ An OpStep is a step node that invokes a registered operation of type
Operation
.
Parameters:
- operation – A fully qualified operation name or operation object such as a class or callable.
- registry – An operation registry to be used to look up the operation, if given by name.
- node_id – A node ID. If None, a unique ID will be generated.
-
class
cate.core.workflow.
OpStepBase
(op: cate.core.op.Operation, node_id: str = None)[source]¶ Base class for concrete steps based on an
Operation
.
Parameters:
- op – An Operation object.
- node_id – A node ID. If None, a unique ID will be generated.
-
op
¶ The operation registration. See
cate.core.op.Operation
-
class
cate.core.workflow.
SourceRef
(node_id, port_name)¶ -
node_id
¶ Alias for field number 0
-
port_name
¶ Alias for field number 1
-
-
class
cate.core.workflow.
Step
(op_meta_info: cate.util.opmetainf.OpMetaInfo, node_id: str = None)[source]¶ A step is an inner node of a workflow.
Parameters: node_id – A node ID. If None, a name will be generated. -
enhance_json_dict
(node_dict: collections.OrderedDict)[source]¶ Enhance the given JSON-compatible node_dict by step specific elements.
-
classmethod
new_step_from_json_dict
(json_dict, registry=OP_REGISTRY) → Optional[cate.core.workflow.Step][source]¶ Create a new step node instance from the given json_dict
-
parent_node
¶ The node’s parent node.
-
persistent
¶ Return whether this step is persistent. That is, if the current workspace is saved, the result(s) of a persistent step may be written to a “resource” file in the workspace directory using this step’s ID as filename. The file format and filename extension will be chosen according to each result’s data type. On the next attempt to execute the step, e.g. if a workspace is opened, persistent steps may read the “resource” file to produce the result rather than performing an expensive re-computation.
Returns: True if the step is persistent, False otherwise.
-
-
class
cate.core.workflow.
SubProcessStep
(command: str, run_python: bool = False, env: Dict[str, str] = None, cwd: str = None, shell: bool = False, started_re: str = None, progress_re: str = None, done_re: str = None, inputs: Dict[str, Dict] = None, outputs: Dict[str, Dict] = None, node_id: str = None)[source]¶ A
SubProcessStep
is a step node that computes its output by a sub-process created from the given program.
Parameters:
- command – A pattern that will be interpolated by input values to obtain the actual command (program with arguments) to be executed. May contain “{input_name}” fields which will be replaced by the actual input value converted to text. input_name must refer to a valid operation input name in op_meta_info.input or it must be the value of either the “write_to” or “read_from” property of another input’s property map.
- run_python – If True, command refers to a Python script which will be executed with the Python interpreter that Cate uses.
- cwd – Current working directory to run the command line in.
- env – Environment variables passed to the shell that executes the command line.
- shell – Whether to use the shell as the program to execute.
- started_re – A regex that must match a text line from the process’ stdout in order to signal the start of progress monitoring. The regex must provide the group names “label” or “total_work” or both, e.g. “(?P<label>\w+)” or “(?P<total_work>\d+)”.
- progress_re – A regex that must match a text line from the process’ stdout in order to signal progress. The regex must provide the group names “work” or “msg” or both, e.g. “(?P<msg>\w+)” or “(?P<work>\d+)”.
- done_re – A regex that must match a text line from the process’ stdout in order to signal the end of progress monitoring.
- inputs – input name to input properties mapping.
- outputs – output name to output properties mapping.
- node_id – A node ID. If None, an ID will be generated.
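How the command pattern and the three regular expressions might work together is sketched below. The function track_progress, the sample command and the sample output lines are all hypothetical; the sketch only mirrors the parameter descriptions above and makes no claim about how SubProcessStep is implemented.

```python
import re


def track_progress(lines, started_re, progress_re, done_re):
    """Turn stdout lines into a list of (event, named-groups) tuples."""
    events = []
    started = False
    for line in lines:
        if not started:
            # Ignore everything until the start pattern matches.
            match = re.search(started_re, line)
            if match:
                started = True
                events.append(("started", match.groupdict()))
            continue
        if re.search(done_re, line):
            events.append(("done", {}))
            break
        match = re.search(progress_re, line)
        if match:
            events.append(("progress", match.groupdict()))
    return events


# "{input_name}" fields are replaced by input values converted to text.
command = "gapfill --input {file} --level {level}".format(file="sst.nc", level=3)

stdout = ["loading", "start: total=3", "step 1", "step 2", "finished"]
events = track_progress(
    stdout,
    started_re=r"start: total=(?P<total_work>\d+)",
    progress_re=r"step (?P<work>\d+)",
    done_re=r"finished",
)
print(command)
print(events)
```

The named groups arrive as strings, so a caller would still have to convert "total_work" and "work" to numbers before reporting them to a progress monitor.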
-
class
cate.core.workflow.
ValueCache
[source]¶ ValueCache
is a closable dictionary that maintains unique IDs for its keys. If a ValueCache is closed, all closable values are also closed. A value is closable if it has a close attribute whose value is a callable.
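The idea of a closable dictionary can be sketched in a few lines. ClosableDict and its get_id method are hypothetical names chosen for illustration; the actual ValueCache interface may differ.

```python
import io


class ClosableDict(dict):
    """A dict that gives each key a stable integer ID and can close its values."""

    def __init__(self):
        super().__init__()
        self._ids = {}
        self._last_id = 0

    def __setitem__(self, key, value):
        # A key keeps its ID even if its value is replaced later.
        if key not in self._ids:
            self._last_id += 1
            self._ids[key] = self._last_id
        super().__setitem__(key, value)

    def get_id(self, key):
        return self._ids.get(key)

    def close(self):
        # Close every value that has a callable "close" attribute.
        for value in self.values():
            close = getattr(value, "close", None)
            if callable(close):
                close()
        self.clear()


cache = ClosableDict()
stream = io.StringIO("some data")
cache["ds"] = stream
cache["count"] = 42  # not closable, simply dropped on close()
cache.close()
print(stream.closed)
```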
-
cate.core.workflow.
WORKFLOW_SCHEMA_VERSION
= 1¶ Version number of Workflow JSON schema. Will be incremented with the first schema change after public release.
-
class
cate.core.workflow.
Workflow
(op_meta_info: cate.util.opmetainf.OpMetaInfo, node_id: str = None)[source]¶ A workflow of (connected) steps.
Parameters:
- op_meta_info – Meta-information object of type OpMetaInfo.
- node_id – A node ID. If None, an ID will be generated.
-
find_node
(step_id: str) → Optional[cate.core.workflow.Step][source]¶ Find a (child) node with the given step_id.
-
find_steps_to_compute
(step_id: str) → List[cate.core.workflow.Step][source]¶ Compute the list of steps required to compute the output of the step with the given step_id. The returned list is in execution order, with the step given by step_id being the last one.
Return type: List
Parameters: step_id (str) – The step to be computed last and whose output value is requested.
Returns: A list of steps, which is never empty.
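One common way to obtain such an execution order is a depth-first, post-order walk over the source links. The sketch below works on a hypothetical dictionary of step IDs and is not taken from the Cate code base; it assumes the graph is acyclic.

```python
def find_steps_to_compute(sources, step_id):
    """Return the IDs of all required steps in execution order.

    `sources` maps a step ID to the IDs of the steps it reads from.
    The requested step is always the last element of the result.
    """
    ordered = []
    visited = set()

    def visit(current):
        if current in visited:
            return
        visited.add(current)
        for src in sources.get(current, ()):
            visit(src)
        # Post-order: a step is appended after all of its sources.
        ordered.append(current)

    visit(step_id)
    return ordered


sources = {
    "open": [],
    "mean": ["open"],
    "plot": ["mean"],
    "unused": ["open"],  # not needed for "plot", so it is left out
}
print(find_steps_to_compute(sources, "plot"))
```

Steps that do not contribute to the requested output never appear in the result, which is what makes computing a single step cheaper than running the whole workflow.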
-
invoke_steps
(steps: List[Step], context: Dict = None, monitor_label: str = None, monitor=Monitor.NONE) → None[source]¶ Invoke just the given steps.
Parameters:
-
classmethod
load
(file_path_or_fp: Union[str, io.IOBase], registry=OP_REGISTRY) → cate.core.workflow.Workflow[source]¶ Load a workflow from a file or file pointer. The format is expected to be “Workflow JSON”.
Parameters: - file_path_or_fp – file path or file pointer
- registry – Operation registry
Returns: a workflow
-
remove_orphaned_sources
(removed_node: cate.core.workflow.Node)[source]¶ Remove all input/output ports whose source is still referring to removed_node.
Parameters: removed_node (Node) – A removed node.
-
classmethod
sort_steps
(steps: List[Step])[source]¶ Sorts the list of workflow steps in the order they can be executed.
-
sorted_steps
¶ The workflow steps in the order they can be executed.
-
steps
¶ The workflow steps in the order they were added.
-
store
(file_path_or_fp: Union[str, io.IOBase]) → None[source]¶ Store a workflow to a file or file pointer. The format is “Workflow JSON”.
Parameters: file_path_or_fp – file path or file pointer
-
class
cate.core.workflow.
WorkflowStep
(workflow: cate.core.workflow.Workflow, resource: str, node_id: str = None)[source]¶ A WorkflowStep is a step node that invokes an externally stored
Workflow
.
Parameters:
- workflow – The referenced workflow.
- resource – A resource (e.g. file path, URL) from which the workflow was loaded.
- node_id – A node ID. If None, an ID will be generated.
-
enhance_json_dict
(node_dict: collections.OrderedDict)[source]¶ Enhance the given JSON-compatible node_dict by step specific elements.
-
classmethod
new_step_from_json_dict
(json_dict, registry=OP_REGISTRY)[source]¶ Create a new step node instance from the given json_dict
-
resource
¶ The workflow’s resource path (file path, URL).
-
workflow
¶ The workflow.
-
cate.core.workflow.
new_workflow_op
(workflow_or_path: Union[str, cate.core.workflow.Workflow]) → cate.core.op.Operation[source]¶ Create an operation from a workflow read from the given path.
Return type: Operation
Parameters: workflow_or_path – Either a path to a Workflow JSON file or a Workflow object.
Returns: The workflow operation.
9.4. Module cate.core.plugin
¶
9.4.1. Description¶
The cate.core.plugin module exposes Cate’s plugin REGISTRY, which is a mapping from Cate entry point names to plugin meta information. A Cate plugin is any callable in an internal/extension module registered with the cate_plugins entry point.
Clients register a Cate plugin in the setup() call of their setup.py script. The following plugin example comprises a main module cate_wavelet_gapfill which provides the entry point function cate_init:
setup(
    name="cate-gapfill-wavelet",
    version="0.5",
    description='A wavelet-based gap-filling algorithm for the ESA CCI Toolbox',
    license='GPL 3',
    author='John Doe',
    packages=['cate_wavelet_gapfill'],
    entry_points={
        'cate_plugins': [
            'cate_wavelet_gapfill = cate_wavelet_gapfill:cate_init',
        ],
    },
    install_requires=['pywavelets >= 2.1'],
)
The entry point callable should have the following signature:
def cate_init(*args, **kwargs):
    pass
or:
class EctInit:
    def __init__(self, *args, **kwargs):
        pass
The return values are ignored.
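For illustration, entry points of a group such as cate_plugins can be discovered and called with the standard library alone. The sketch below uses importlib.metadata (Python 3.8 or later) and a hypothetical load_plugins function; it is not the loader used by cate.core.plugin, and the shape of the returned registry is an assumption.

```python
from importlib import metadata


def load_plugins(group="cate_plugins"):
    """Call every entry point registered for `group` and collect their names."""
    entry_points = metadata.entry_points()
    if hasattr(entry_points, "select"):
        # Python 3.10 and later
        selected = entry_points.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dict of lists
        selected = entry_points.get(group, [])
    registry = {}
    for entry_point in selected:
        init = entry_point.load()  # imports the module and resolves the callable
        init()                     # the return value is ignored
        registry[entry_point.name] = {"entry_point": entry_point.value}
    return registry


print(load_plugins("cate_plugins"))
```

With no plugin installed the function returns an empty dictionary.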
9.4.2. Verification¶
The module’s unit-tests are located in
test/test_plugin.py
and may be executed using
$ py.test test/test_plugin.py --cov=cate/core/plugin.py
for extra code coverage information.
9.5. Module cate.conf
¶
9.6. Module cate.ds
¶
9.6.1. Description¶
The ds package comprises all specific data source implementations.
This is a plugin package automatically imported by the installation script’s entry point cate_ds (see the project’s setup.py file).
9.6.2. Verification¶
The module’s unit-tests are located in test/ds and may be executed using $ py.test test/ds/test_<MODULE>.py --cov=cate/ds/<MODULE>.py for extra code coverage information.
9.6.3. Components¶
9.7. Module cate.ops
¶
9.8. Module cate.cli.main
¶
9.8.1. Description¶
This module provides Cate’s CLI executable.
To use the CLI executable, invoke the module file as a script: python3 cate/cli/main.py [ARGS] [OPTIONS].
Type python3 cate/cli/main.py --help for usage help.
The CLI operates on sub-commands. New sub-commands can be added by inheriting from the Command
class
and extending the Command.REGISTRY
list of known command classes.
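The sub-command pattern described above can be sketched with argparse alone. The Command base class, the HelloCommand example and the run function below are hypothetical and much simpler than the classes in cate.cli.main; the list name COMMAND_REGISTRY merely mirrors the variable documented in the Components section.

```python
import argparse


class Command:
    """Hypothetical base class for a CLI sub-command."""

    name = None

    @classmethod
    def configure_parser(cls, parser):
        pass

    def execute(self, command_args):
        raise NotImplementedError


class HelloCommand(Command):
    name = "hello"

    @classmethod
    def configure_parser(cls, parser):
        parser.add_argument("who")

    def execute(self, command_args):
        return "Hello, %s!" % command_args.who


# New sub-commands are made known by appending their class to this list.
COMMAND_REGISTRY = [HelloCommand]


def run(argv):
    parser = argparse.ArgumentParser(prog="demo")
    subparsers = parser.add_subparsers(dest="command_name", required=True)
    for command_class in COMMAND_REGISTRY:
        sub_parser = subparsers.add_parser(command_class.name)
        command_class.configure_parser(sub_parser)
        # Remember which class handles this sub-command.
        sub_parser.set_defaults(command_class=command_class)
    args = parser.parse_args(argv)
    return args.command_class().execute(args)


print(run(["hello", "world"]))
```

Because each command configures its own parser, a plugin only has to supply one class and register it; the main program does not need to change.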
9.8.2. Technical Requirements¶
Extensible CLI with multiple sub-commands
Description: | The CCI Toolbox should only have a single CLI executable that comes with multiple sub-commands instead of maintaining a number of different executables for each purpose. Plugins shall be able to add new CLI sub-commands. |
---|---|
URD-Source: |
|
Run operations and workflows
Description: | Allow for executing registered operations and workflows composed of operations. |
---|---|
URD-Source: |
|
List available data, operations and extensions
Description: | Allow for listing dynamic content including available data, operations and plugin extensions. |
---|---|
URD-Source: |
|
Display information about available climate data sources
Description: | Before downloading ECV datasets to the local computer, users shall be able to display information about them, e.g. included variables, total size, spatial and temporal resolution. |
---|---|
URD-Source: |
|
Synchronize locally cached climate data
Description: | Allow for listing dynamic content including available data, operations and plugin extensions. |
---|---|
URD-Source: |
|
9.8.3. Verification¶
The module’s unit-tests are located in
test/cli/test_main.py
and may be executed using $ py.test test/cli/test_main.py --cov=cate/cli/main.py
for extra code coverage information.
9.8.4. Components¶
-
cate.cli.main.
CLI_NAME
= 'cate'¶ Name of the Cate CLI executable (=
cate
).
-
cate.cli.main.
COMMAND_REGISTRY
= [<class 'cate.cli.main.DataSourceCommand'>, <class 'cate.cli.main.OperationCommand'>, <class 'cate.cli.main.WorkspaceCommand'>, <class 'cate.cli.main.ResourceCommand'>, <class 'cate.cli.main.RunCommand'>, <class 'cate.cli.main.IOCommand'>, <class 'cate.cli.main.UpdateCommand'>]¶ List of sub-commands supported by the CLI. Entries are classes derived from
Command
class. Cate plugins may extend this list by their commands during plugin initialisation.
-
class
cate.cli.main.
DataSourceCommand
[source]¶ The
ds
command implements various operations w.r.t. datasets.-
classmethod
configure_parser_and_subparsers
(parser, subparsers)[source]¶ Configure the given parser and its sub-parsers.
Overrides of this method must configure the sub-parsers, e.g.:
list_parser = subparsers.add_parser('list', ...)
# ... configure list_parser here, and finally set its "sub_command_function" like so:
list_parser.set_defaults(sub_command_function=cls._execute_list)
Sub-command functions shall raise a CommandError instance on failure.
Parameters:
- parser – The command parser to configure.
- subparsers – A factory for sub-command parsers.
-
classmethod
parser_kwargs
()[source]¶ Return the parser keyword arguments dictionary passed to an argparse.ArgumentParser(**parser_kwargs) call.
For the possible keywords in the returned dictionary, refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.
Returns: A keyword arguments dictionary.
-
class
cate.cli.main.
IOCommand
[source]¶ The
io
command implements various operations w.r.t. supported data and file formats.-
classmethod
configure_parser_and_subparsers
(parser, subparsers)[source]¶ Configure the given parser and its sub-parsers.
Overrides of this method must configure the sub-parsers, e.g.:
list_parser = subparsers.add_parser('list', ...)
# ... configure list_parser here, and finally set its "sub_command_function" like so:
list_parser.set_defaults(sub_command_function=cls._execute_list)
Sub-command functions shall raise a CommandError instance on failure.
Parameters:
- parser – The command parser to configure.
- subparsers – A factory for sub-command parsers.
-
classmethod
parser_kwargs
()[source]¶ Return the parser keyword arguments dictionary passed to an argparse.ArgumentParser(**parser_kwargs) call.
For the possible keywords in the returned dictionary, refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.
Returns: A keyword arguments dictionary.
-
class
cate.cli.main.
OperationCommand
[source]¶ The
op
command implements various operations w.r.t. operations.-
classmethod
configure_parser_and_subparsers
(parser, subparsers)[source]¶ Configure the given parser and its sub-parsers.
Overrides of this method must configure the sub-parsers, e.g.:
list_parser = subparsers.add_parser('list', ...)
# ... configure list_parser here, and finally set its "sub_command_function" like so:
list_parser.set_defaults(sub_command_function=cls._execute_list)
Sub-command functions shall raise a CommandError instance on failure.
Parameters:
- parser – The command parser to configure.
- subparsers – A factory for sub-command parsers.
-
classmethod
parser_kwargs
()[source]¶ Return the parser keyword arguments dictionary passed to an argparse.ArgumentParser(**parser_kwargs) call.
For the possible keywords in the returned dictionary, refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.
Returns: A keyword arguments dictionary.
-
class
cate.cli.main.
PluginCommand
[source]¶ The
pi
command lists the content of the various plugin registries.
-
classmethod
configure_parser_and_subparsers
(parser, subparsers)[source]¶ Configure the given parser and its sub-parsers.
Overrides of this method must configure the sub-parsers, e.g.:
list_parser = subparsers.add_parser('list', ...)
# ... configure list_parser here, and finally set its "sub_command_function" like so:
list_parser.set_defaults(sub_command_function=cls._execute_list)
Sub-command functions shall raise a CommandError instance on failure.
Parameters:
- parser – The command parser to configure.
- subparsers – A factory for sub-command parsers.
-
classmethod
parser_kwargs
()[source]¶ Return the parser keyword arguments dictionary passed to an argparse.ArgumentParser(**parser_kwargs) call.
For the possible keywords in the returned dictionary, refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.
Returns: A keyword arguments dictionary.
-
class
cate.cli.main.
ResourceCommand
[source]¶ The
res
command implements various operations w.r.t. workspaces.-
classmethod
configure_parser_and_subparsers
(parser, subparsers)[source]¶ Configure the given parser and its sub-parsers.
Overrides of this method must configure the sub-parsers, e.g.:
list_parser = subparsers.add_parser('list', ...)
# ... configure list_parser here, and finally set its "sub_command_function" like so:
list_parser.set_defaults(sub_command_function=cls._execute_list)
Sub-command functions shall raise a CommandError instance on failure.
Parameters:
- parser – The command parser to configure.
- subparsers – A factory for sub-command parsers.
-
classmethod
parser_kwargs
()[source]¶ Return the parser keyword arguments dictionary passed to an argparse.ArgumentParser(**parser_kwargs) call.
For the possible keywords in the returned dictionary, refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.
Returns: A keyword arguments dictionary.
-
class
cate.cli.main.
RunCommand
[source]¶ The
run
command is used to invoke registered operations and JSON workflows.-
classmethod
configure_parser
(parser)[source]¶ Configure parser, i.e. make any required parser.add_argument(*args, **kwargs) calls. See https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.add_argument
Parameters: parser – The command parser to configure.
-
execute
(command_args)[source]¶ Execute this command.
The command’s arguments in command_args are attributes of the namespace returned by argparse.ArgumentParser.parse_args(). Also refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.parse_args
execute implementations shall raise a CommandError instance on failure.
Parameters: command_args – The command’s arguments.
-
classmethod
parser_kwargs
()[source]¶ Return the parser keyword arguments dictionary passed to an argparse.ArgumentParser(**parser_kwargs) call.
For the possible keywords in the returned dictionary, refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.
Returns: A keyword arguments dictionary.
-
class
cate.cli.main.
UpdateCommand
[source]¶ The
update
command is used to update an existing cate environment to a specific or the latest cate version.-
classmethod
configure_parser
(parser)[source]¶ Configure parser, i.e. make any required parser.add_argument(*args, **kwargs) calls. See https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.add_argument
Parameters: parser – The command parser to configure.
-
execute
(command_args)[source]¶ Execute this command.
The command’s arguments in command_args are attributes of the namespace returned by argparse.ArgumentParser.parse_args(). Also refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.parse_args
execute implementations shall raise a CommandError instance on failure.
Parameters: command_args – The command’s arguments.
-
classmethod
parser_kwargs
()[source]¶ Return the parser keyword arguments dictionary passed to an argparse.ArgumentParser(**parser_kwargs) call.
For the possible keywords in the returned dictionary, refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.
Returns: A keyword arguments dictionary.
-
class
cate.cli.main.
WorkspaceCommand
[source]¶ The
ws
command implements various operations w.r.t. workspaces.-
classmethod
configure_parser_and_subparsers
(parser, subparsers)[source]¶ Configure the given parser and its sub-parsers.
Overrides of this method must configure the sub-parsers, e.g.:
list_parser = subparsers.add_parser('list', ...)
# ... configure list_parser here, and finally set its "sub_command_function" like so:
list_parser.set_defaults(sub_command_function=cls._execute_list)
Sub-command functions shall raise a CommandError instance on failure.
Parameters:
- parser – The command parser to configure.
- subparsers – A factory for sub-command parsers.
-
classmethod
parser_kwargs
()[source]¶ Return the parser keyword arguments dictionary passed to an argparse.ArgumentParser(**parser_kwargs) call.
For the possible keywords in the returned dictionary, refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.
Returns: A keyword arguments dictionary.
9.9. Module cate.webapi
¶
9.10. Module cate.util
¶
9.10.1. Description¶
The cate.util
package provides application-independent utility functions.
This package is independent of other cate.* packages and can therefore be used stand-alone.
9.10.2. Verification¶
The module’s unit-tests are located in
test/util and may be executed using
$ py.test test/util --cov=cate/util
for extra code coverage information.
9.10.3. Components¶
9.11. Module cate.util.cache
¶
9.11.1. Description¶
This module defines the Cache class which represents a general-purpose cache. A cache is configured by a CacheStore which is responsible for storing and reloading cached items.
The default cache stores are
- MemoryCacheStore
- FileCacheStore
Every cache has a capacity in physical units defined by the CacheStore. When the cache capacity is exceeded, a replacement policy for cached items is applied until the cache size falls below a given ratio of the total capacity.
The default replacement policies are
- POLICY_LRU
- POLICY_MRU
- POLICY_LFU
- POLICY_RR
This package is independent of other cate.* packages and can therefore be used stand-alone.
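The interplay of capacity, threshold and replacement policy can be shown with a small in-memory sketch. TinyCache is a hypothetical class that implements only the least-recently-used policy; whether trimming stops at or strictly below the threshold ratio is an assumption here.

```python
class TinyCache:
    """Minimal LRU cache that trims itself once its capacity is exceeded."""

    def __init__(self, capacity=4, threshold=0.75):
        self.capacity = capacity
        self.threshold = threshold
        self.size = 0
        self._items = {}   # key -> (value, size)
        self._access = {}  # key -> logical time of last access
        self._clock = 0

    def _touch(self, key):
        self._clock += 1
        self._access[key] = self._clock

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._touch(key)
        return self._items[key][0]

    def put(self, key, value, size=1):
        if key in self._items:
            self.size -= self._items[key][1]
        self._items[key] = (value, size)
        self.size += size
        self._touch(key)
        if self.size > self.capacity:
            self._trim()

    def _trim(self):
        # Discard least recently used items until size <= capacity * threshold.
        target = self.capacity * self.threshold
        for key in sorted(self._access, key=self._access.get):
            if self.size <= target:
                break
            self.size -= self._items.pop(key)[1]
            del self._access[key]


cache = TinyCache(capacity=4, threshold=0.75)
for name in "abcd":
    cache.put(name, name.upper())
cache.get("a")        # "a" becomes the most recently used item
cache.put("e", "E")   # exceeds the capacity and triggers trimming
print(sorted(cache._items))
```

Trimming below the full capacity, rather than evicting a single item, avoids running the replacement policy again on every subsequent insertion.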
9.11.2. Components¶
-
class
cate.util.cache.
Cache
(store=<cate.util.cache.MemoryCacheStore object>, capacity=1000, threshold=0.75, policy=<function _policy_lru>, parent_cache=None)[source]¶ An implementation of a cache. See https://en.wikipedia.org/wiki/Cache_algorithms
-
class
cate.util.cache.
CacheStore
[source]¶ Represents a store into which cached values can be stored and from which they can be restored.
-
can_load_from_key
(key) → bool[source]¶ Test whether a stored value representation can be loaded from the given key.
Return type: bool
Parameters: key – The key.
Returns: True, if so.
-
discard_value
(key, stored_value)[source]¶ Discard a value from its storage.
Parameters:
- key – The key.
- stored_value – The stored representation of the value.
-
load_from_key
(key)[source]¶ Load a stored value representation of the value and its size from the given key.
Parameters: key – The key.
Returns: A 2-element sequence containing the stored representation of the value and its size.
-
-
class
cate.util.cache.
FileCacheStore
(cache_dir: str, ext: str)[source]¶ Simple file store for values which can be written and read as bytes, e.g. encoded PNG images.
-
can_load_from_key
(key) → bool[source]¶ Test whether a stored value representation can be loaded from the given key. :rtype:
bool
:param key: the key :return: True, if so
-
discard_value
(key, stored_value)[source]¶ Discard a value from it’s storage. :param key: the key :param stored_value: the stored representation of the value
-
load_from_key
(key)[source]¶ Load the stored representation of a value and its size from the given key.
Parameters: key – the key
Returns: a 2-element sequence containing the stored representation of the value and its size
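As a rough illustration of the three methods documented here, the following sketch implements a file-backed store. It is a stand-in under stated assumptions, not the FileCacheStore source: the class name SketchFileStore, the mapping from key to file path, and the choice of the file path as the "stored representation" are all hypothetical.

```python
import os
import tempfile


class SketchFileStore:
    """Hypothetical file-backed store; one file per key."""

    def __init__(self, cache_dir, ext):
        self.cache_dir = cache_dir
        self.ext = ext

    def _path(self, key):
        # Assumed layout: <cache_dir>/<key><ext>
        return os.path.join(self.cache_dir, str(key) + self.ext)

    def can_load_from_key(self, key):
        # True if a file for this key already exists.
        return os.path.isfile(self._path(key))

    def load_from_key(self, key):
        # Return the stored representation (here: the path) and its size in bytes.
        path = self._path(key)
        return path, os.path.getsize(path)

    def discard_value(self, key, stored_value):
        # Remove the file; tolerate the case where it is already gone.
        try:
            os.remove(stored_value)
        except FileNotFoundError:
            pass


with tempfile.TemporaryDirectory() as cache_dir:
    store = SketchFileStore(cache_dir, ".png")
    with open(os.path.join(cache_dir, "tile-0.png"), "wb") as f:
        f.write(b"\x89PNG")
    found_before = store.can_load_from_key("tile-0")
    stored_value, size = store.load_from_key("tile-0")
    store.discard_value("tile-0", stored_value)
    found_after = store.can_load_from_key("tile-0")
```

Returning the size together with the stored representation lets a cache account for an item's footprint without knowing how the store measures it.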
-
-
class
cate.util.cache.
MemoryCacheStore
[source]¶ Simple memory store.
-
can_load_from_key
(key) → bool[source]¶ Test whether a stored value representation can be loaded from the given key.
Parameters: key – the key
Returns: True, if so
Return type: bool
-
discard_value
(key, stored_value)[source]¶ Clears the value in the given stored_value.
Parameters:
- key – the key
- stored_value – the stored representation of the value
-
load_from_key
(key)[source]¶ Load the stored representation of a value and its size from the given key.
Parameters: key – the key
Returns: a 2-element sequence containing the stored representation of the value and its size
-
-
cate.util.cache.
POLICY_LFU
(item)¶ Discard Least Frequently Used items first
-
cate.util.cache.
POLICY_LRU
(item)¶ Discard Least Recently Used items first
-
cate.util.cache.
POLICY_MRU
(item)¶ Discard Most Recently Used items first
-
cate.util.cache.
POLICY_RR
(item)¶ Discard items by Random Replacement
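Each policy takes a single item argument, which suggests that a policy can be thought of as a function mapping a cached item to a sort key. The sketch below works under that assumption. The Item record, its access_time and access_count fields, and the convention that the smallest key is discarded first are hypothetical and not taken from the Cate source.

```python
import random
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical bookkeeping record for one cached item."""
    key: str
    access_time: float   # time of the last access
    access_count: int    # number of accesses so far


# Each policy maps an item to a sort key; the smallest key is discarded first.
def policy_lru(item):
    return item.access_time      # oldest access first


def policy_mru(item):
    return -item.access_time     # newest access first


def policy_lfu(item):
    return item.access_count     # fewest accesses first


def policy_rr(item):
    return random.random()       # arbitrary order


items = [Item("a", 10.0, 5), Item("b", 30.0, 1), Item("c", 20.0, 3)]
lru_order = [i.key for i in sorted(items, key=policy_lru)]
mru_order = [i.key for i in sorted(items, key=policy_mru)]
lfu_order = [i.key for i in sorted(items, key=policy_lfu)]
rr_order = [i.key for i in sorted(items, key=policy_rr)]
```

Expressing policies as key functions keeps the eviction loop itself independent of the policy: it sorts the items once and discards from the front.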
9.12. Module cate.util.cli
¶
-
class
cate.util.cli.
Command
[source]¶ Represents a (sub-)command of a command-line interface.
-
classmethod
configure_parser
(parser: argparse.ArgumentParser) → None[source]¶ Configure parser, i.e. make any required parser.add_argument(*args, **kwargs) calls. See https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.add_argument
Parameters: parser ( ArgumentParser ) – The command parser to configure.
-
execute
(command_args: argparse.Namespace) → None[source]¶ Execute this command.
The command’s arguments in command_args are attributes of the namespace returned by argparse.ArgumentParser.parse_args(). Also refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.parse_args
execute implementations shall raise a CommandError instance on failure.
Parameters: command_args ( Namespace ) – The command’s arguments.
-
classmethod
new_monitor
() → cate.util.monitor.Monitor[source]¶ Create a new console progress monitor.
Returns: a new Monitor instance.
-
classmethod
parser_kwargs
() → dict[source]¶ Return the parser keyword arguments dictionary passed to an argparse.ArgumentParser(**parser_kwargs) call.
For the possible keywords in the returned dictionary, refer to https://docs.python.org/3.5/library/argparse.html#argparse.ArgumentParser.
Returns: A keyword arguments dictionary.
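The configure/execute pattern described for Command can be sketched with plain argparse. The example deliberately does not import cate: GreetCommand and SketchCommandError are hypothetical stand-ins, and execute returns a list here only so that the result can be inspected, whereas the documented method returns None.

```python
import argparse


class SketchCommandError(Exception):
    """Hypothetical stand-in for a command-line error type."""


class GreetCommand:
    """Hypothetical command following the configure/execute pattern."""

    @classmethod
    def parser_kwargs(cls):
        # Keyword arguments for argparse.ArgumentParser(**kwargs).
        return dict(prog="greet", description="Print a greeting.")

    @classmethod
    def configure_parser(cls, parser):
        # Declare the arguments this command understands.
        parser.add_argument("name", help="Who to greet.")
        parser.add_argument("--times", type=int, default=1)

    def execute(self, command_args):
        # command_args is the Namespace returned by parse_args().
        if command_args.times < 1:
            raise SketchCommandError("--times must be at least 1")
        return ["Hello, %s!" % command_args.name] * command_args.times


parser = argparse.ArgumentParser(**GreetCommand.parser_kwargs())
GreetCommand.configure_parser(parser)
result = GreetCommand().execute(parser.parse_args(["Ada", "--times", "2"]))
```

Keeping parser configuration in class methods means a CLI driver can build all parsers without instantiating any command.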
-
exception
cate.util.cli.
CommandError
(message)[source]¶ An error type signaling command-line errors.
Parameters: message – Error message
-
class
cate.util.cli.
NoExitArgumentParser
(*args, **kwargs)[source]¶ Special argparse.ArgumentParser that never directly exits the current process. It raises an ExitException instead.
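One common way to obtain such behaviour is to override argparse.ArgumentParser.exit(), which argparse calls both for --help and, through error(), for invalid usage. The sketch below takes that approach. Whether NoExitArgumentParser is implemented this way is not stated here, and the names SketchExit and SketchNoExitParser are hypothetical.

```python
import argparse


class SketchExit(Exception):
    """Hypothetical exception carrying the would-be exit status."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status
        self.message = message


class SketchNoExitParser(argparse.ArgumentParser):
    def exit(self, status=0, message=None):
        # Raise instead of calling sys.exit(), so callers stay in control.
        raise SketchExit(status, message)


parser = SketchNoExitParser(prog="demo")
parser.add_argument("--count", type=int)
try:
    parser.parse_args(["--count", "not-a-number"])
    status = 0
except SketchExit as e:
    status = e.status
```

This is useful when a CLI is embedded in a long-running process, such as a test runner or an interactive shell, where terminating the interpreter on a usage error would be undesirable.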
-
class
cate.util.cli.
SubCommandCommand
[source]¶ -
classmethod
configure_parser
(parser: argparse.ArgumentParser) → None[source]¶ Add new sub-parsers to the given parser. Call configure_parser_and_subparsers with the new sub-parsers.
Parameters: parser ( ArgumentParser ) – The command parser to configure.
-
classmethod
configure_parser_and_subparsers
(parser, subparsers)[source]¶ Configure the given parser and its sub-parsers.
Overrides of this method must configure the sub-command parsers, e.g.:
    list_parser = subparsers.add_parser('list', ...)
    # ... configure list_parser here, and finally set its "sub_command_function" like so:
    list_parser.set_defaults(sub_command_function=cls._execute_list)
Sub-command functions shall raise a CommandError instance on failure.
Parameters:
- parser – The command parser to configure.
- subparsers – A factory for sub-command parsers.
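The sub_command_function convention can be tried out with plain argparse, as in the following sketch. ItemsCommand, its two sub-commands, and the dispatching execute method are hypothetical. In particular, how SubCommandCommand itself dispatches to the registered function is an assumption, and the sub-command functions return strings only to make the outcome visible.

```python
import argparse


class ItemsCommand:
    """Hypothetical command with 'list' and 'add' sub-commands."""

    @classmethod
    def configure_parser(cls, parser):
        subparsers = parser.add_subparsers(dest="sub_command_name")
        cls.configure_parser_and_subparsers(parser, subparsers)

    @classmethod
    def configure_parser_and_subparsers(cls, parser, subparsers):
        list_parser = subparsers.add_parser("list", help="List items.")
        list_parser.set_defaults(sub_command_function=cls._execute_list)

        add_parser = subparsers.add_parser("add", help="Add an item.")
        add_parser.add_argument("item")
        add_parser.set_defaults(sub_command_function=cls._execute_add)

    @classmethod
    def _execute_list(cls, command_args):
        return "listing"

    @classmethod
    def _execute_add(cls, command_args):
        return "adding " + command_args.item

    def execute(self, command_args):
        # Dispatch to whichever function the chosen sub-parser registered.
        function = getattr(command_args, "sub_command_function", None)
        if function is None:
            raise ValueError("no sub-command given")
        return function(command_args)


parser = argparse.ArgumentParser(prog="items")
ItemsCommand.configure_parser(parser)
outcome = ItemsCommand().execute(parser.parse_args(["add", "apple"]))
```

Because set_defaults stores the function on the parsed namespace, the dispatch step needs no lookup table of sub-command names.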
-
cate.util.cli.
run_main
(name: str, description: str, version: str, command_classes: Sequence[cate.util.cli.Command], license_text: str = None, docs_url: str = None, error_message_trimmer=None, args: Sequence[str] = None) → int[source]¶ A CLI’s entry point function.
To be used in your own code as follows:
>>> if __name__ == '__main__':
...     sys.exit(run_main(...))
Return type: int
Parameters:
- name ( str ) – The program’s name.
- description ( str ) – The program’s description.
- version ( str ) – The program’s version string.
- command_classes ( Sequence ) – The CLI commands.
- license_text ( str ) – An optional license text.
- docs_url ( str ) – An optional documentation URL.
- error_message_trimmer – An optional callable (str)->str that trims error message strings.
- args ( Sequence ) – List of command-line arguments. If not passed, sys.argv[1:] is used.
Returns: An exit code where 0 stands for success.
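The general shape of such an entry point (fall back to sys.argv[1:], run the command, and translate expected failures into a non-zero exit code) can be sketched as follows. This is not run_main itself: run_main_sketch, its reduced parameter list, the --fail flag, and the choice of 1 as the failure code are assumptions for illustration.

```python
import argparse
import sys


class SketchCommandError(Exception):
    """Hypothetical error type for expected command failures."""


def run_main_sketch(name, description, execute, args=None):
    """Parse args, run execute(namespace), and return an exit code."""
    if args is None:
        args = sys.argv[1:]
    parser = argparse.ArgumentParser(prog=name, description=description)
    parser.add_argument("--fail", action="store_true")
    try:
        execute(parser.parse_args(args))
    except SketchCommandError as error:
        # Expected failures become an error message and a non-zero code.
        sys.stderr.write("%s: %s\n" % (name, error))
        return 1
    return 0


def execute(command_args):
    if command_args.fail:
        raise SketchCommandError("requested failure")


ok_code = run_main_sketch("demo", "A demo CLI.", execute, args=[])
error_code = run_main_sketch("demo", "A demo CLI.", execute, args=["--fail"])
```

Returning the exit code instead of calling sys.exit() inside the function keeps the entry point testable; the caller decides whether to pass the code on to sys.exit().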
9.13. Module cate.util.im
¶
9.13.1. Description¶
The cate.util.im package provides application-independent utility functions for working with tiled image pyramids.
The Cate project uses this package for implementing a RESTful web service that provides image tiles from image pyramids.
This package is independent of other cate.* packages, but it depends on the following external packages:
- numpy
- pillow (for PIL)
- matplotlib
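For readers unfamiliar with tiled image pyramids, the following sketch computes the levels of a simple pyramid. It does not use the cate.util.im API. The function name pyramid_levels, the tile size of 256, and the scheme of halving the resolution until the image fits into a single tile are assumptions chosen for illustration.

```python
import math


def pyramid_levels(width, height, tile_size=256):
    """Return (level, width, height, tiles_x, tiles_y) from coarsest to finest.

    Assumes each coarser level halves the resolution (rounding up) until
    the whole image fits into a single tile.
    """
    sizes = [(width, height)]
    while sizes[-1][0] > tile_size or sizes[-1][1] > tile_size:
        w, h = sizes[-1]
        sizes.append((math.ceil(w / 2), math.ceil(h / 2)))
    sizes.reverse()  # level 0 is the coarsest
    return [
        (level, w, h, math.ceil(w / tile_size), math.ceil(h / tile_size))
        for level, (w, h) in enumerate(sizes)
    ]


levels = pyramid_levels(1024, 512, tile_size=256)
```

A tile service built on such a pyramid only needs to read the tiles of the level whose resolution matches the client's current zoom.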
9.13.2. Verification¶
The module’s unit-tests are located in test/util/im and may be executed using
$ py.test test/util/im --cov=cate/util/im
for extra code coverage information.