“Prescriptive” orchestration languages (such as Ansible or StackStorm) have already addressed the issue of how and where to process arbitrary scripts. I suggest we borrow from the approaches taken by these tools to generalize support for
arbitrary artifacts in TOSCA.
Ansible playbooks contain a list of plays, where each play consists of a list of tasks, each of which is executed by an Ansible “module”. Modules are intended to communicate with a remote “host” (typically specified in an inventory
file) to configure that host and provision services on that host.
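To make that structure concrete, a minimal playbook might look like the following sketch (host group and tasks are illustrative):

```yaml
# A playbook is a list of plays; each play targets hosts from the
# inventory and runs a list of tasks, each backed by a module.
- hosts: webservers
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present
    - name: Start nginx
      service:
        name: nginx
        state: started
```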
An Ansible module is typically (but not always) a piece of python code that runs on the Ansible host. Modules expect input parameters in a certain format and return results and errors in JSON format.
Ansible includes a “commands” module that is intended to execute arbitrary commands and/or scripts on a remote host. There are a number of flavors of these commands, but in general Ansible commands work in a way that is similar to how
we currently envision implementing operations in TOSCA.
Ansible plays have a “transport” parameter that specifies how modules associated with the play communicate with the host. Typically, the transport will be set to “ssh”, but other values are possible (for example, many network devices
use “cli”, or “rest”, or “netconf”).

Ansible allows the value of the “transport” variable to be set to “local”. When used with the commands module, this value indicates that the command or script will be executed on the Ansible control host rather than on the remote host.
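As a sketch, a play that runs a command on the Ansible control host rather than on a managed node might look like this (note that in playbooks the play-level keyword for this is spelled `connection`; the script path is made up):

```yaml
# Run a command on the control host itself rather than on a remote host.
- hosts: localhost
  connection: local   # play-level keyword corresponding to the "transport"
  tasks:
    - name: Run a script locally
      command: /usr/local/bin/prepare_artifacts.sh   # illustrative path
```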
StackStorm is built around workflows that execute “actions”. An “action” in StackStorm is an arbitrary piece of executable code (although in most cases it’s a bash script or a python script), bundled with metadata (in YAML) that specifies
how the script is supposed to be run by StackStorm.
The main parameter of action metadata is the “runner”. StackStorm runners are part of the StackStorm platform and are responsible for “running” the script specified by the action. Runners are similar to the “artifact processors” that
I proposed in my email below.
StackStorm comes with a number of built-in runners. The most notable ones are:
local-shell-cmd - executes a Linux command on the same host where the StackStorm components are running.
local-shell-script - executes a script on the same host where the StackStorm components are running.
remote-shell-cmd - executes a Linux command on one or more remote hosts provided by the user.
remote-shell-script - executes a script on one or more remote hosts provided by the user.
python-script - executes an action implemented as a Python class with a run() method, locally on the same machine where the StackStorm components are running.
http-request - an HTTP client that performs HTTP requests for running HTTP actions.
This shows that in StackStorm, the location where the action is run is implicitly specified by selecting the type of runner, rather than by specifying a parameter value.
StackStorm action metadata also specifies how the script expects its input values (e.g. via named or positional command line arguments).
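For illustration, a StackStorm action metadata file along these lines ties a script to a runner and declares the inputs the script expects (the action name, script, and parameters below are made up):

```yaml
# actions/install_app.yaml -- metadata for a hypothetical action
---
name: install_app
description: Install the application on a remote host
runner_type: remote-shell-script
entry_point: scripts/install_app.sh
enabled: true
parameters:
  version:
    type: string
    required: true
    position: 0      # passed to the script as the first positional argument
  port:
    type: integer
    default: 8080    # passed to the script as a named argument
```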
Recommendations for TOSCA
These examples show that other orchestrators give script developers a lot of options for what types of scripts can be supported, where these scripts are run, and how to connect to the “hosts” to which the orchestration applies. I suggest
TOSCA should be equally flexible.
I personally like the StackStorm approach better than the Ansible approach, since it is closer to what we already do in TOSCA. Lifecycle operations in TOSCA are expressed in a way that is similar to how StackStorm actions are specified.
Specifically, the “inputs” section of an operation is “metadata” for the script that describes the input variables expected by the script.
As stated below, I recommend introducing the concept of an “artifact processor” that specifies how the artifact is supposed to be run (similar to action runners in StackStorm). This processor would be specified in a “processor” keyname
under the “operation” section of a TOSCA interface. The TOSCA spec needs to include a number of built-in processors, but should also allow for development of user-provided processors.
By default (and to preserve current behavior), TOSCA will use a “remote shell script” processor that uses SSH to connect to the remote host.
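Under this proposal, an operation definition might gain a “processor” keyname roughly as follows (the keyname and the processor names are hypothetical, not part of the current spec):

```yaml
node_types:
  example.nodes.WebServer:
    derived_from: tosca.nodes.SoftwareComponent
    interfaces:
      Standard:
        create:
          implementation: scripts/install.sh
          # Hypothetical keyname from this proposal; the default
          # "remote-shell-script" processor would preserve today's
          # SSH-to-the-host behavior.
          processor: remote-shell-script
          inputs:
            port: 8080
```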
Artifact processors may need to include a parameter that specifies how the processor connects to the host (similar to the “transport” parameter in Ansible). Alternatively, different processor types could be introduced for different types of transports.
If we introduce the concept of “artifact processor”, then it’s not clear whether there is any value in also specifying artifact types (since presumably artifact processors would need to specify the mime types and/or file extensions of the artifacts
they are able to process).
For tomorrow’s Simple Profile meeting, I suggest we keep thinking about how to “formalize” mechanisms that describe how artifacts need to be processed.
Just to recap: most (if not all) of the prose in the document uses examples where artifacts are “install scripts” that need to be run on a “Host”, where a host is assumed to be a Compute node
that is the target of a HostedOn relationship.
However, in practice we need to be able to handle artifacts other than install scripts. I can think of the following four different types of artifacts (there may be others):
Install scripts: like the install scripts just described
API scripts: scripts that “deploy” nodes by making API calls to an external entity (e.g. Python scripts that call OpenStack or OpenDaylight APIs)
Playbooks/recipes (e.g. Ansible playbooks, or Chef recipes)
Images: “snapshots” of deployed entities.
Each of these types of artifacts requires a different mechanism for getting the artifact deployed. Said a different way, each of these types of artifacts may need to get “processed” differently.
This means that in order to fully specify operations, we can’t just specify the artifact for the operation, we also need to be clear about the processor that is needed to process that artifact:
operation: <artifact> + <artifact processor>
Flexible artifact processing, then, requires the following:
Specifying the type of processor required for the artifact
Specifying any configuration parameters for the artifact
Specifying tenant/user-specific parameters for the artifact
Specifying the type of processor
Ideally, each type of artifact would have a unique artifact processor, which would allow us to “standardize” on artifact processors based on the type of artifact. However, how do we handle
artifacts that could belong to multiple types? For example:
A Python script could be an install script to be run on a Host
A Python script could be an API script to be run by the Orchestrator
If we statically “define” artifact processor types, we can’t base processor selection on file extensions or artifact types alone.
In order to “use” a processor, we may need configuration parameters for this processor. This could involve:
DNS names (or IP addresses) for contacting the processor (e.g. Chef servers, or API servers).
In some cases, the processor may not already be running, in which case the processor itself might need to get orchestrated (e.g. using TOSCA). In this case, the configuration parameters would
be the result of the orchestration, but we would need a CSAR file representing the processor.
Some processor-related parameters may be necessary to “use” the processor, for example user credentials. We may need to specify those.
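Putting these pieces together, a user-provided processor definition might need to carry both connection details and credentials. A purely illustrative sketch (the “processors” section and all of its keynames are invented for this discussion):

```yaml
# Hypothetical syntax: declaring a processor together with the
# configuration parameters needed to "use" it.
processors:
  chef_processor:
    type: chef-recipe
    configuration:
      server_url: https://chef.example.com   # DNS name of the Chef server
    credentials:
      user: { get_input: chef_user }         # tenant/user-specific parameters
      key_file: { get_input: chef_key_file }
```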
Let’s discuss whether this is the “right” way to think about artifact processing, and if so, how we reflect it in the TOSCA spec.