OASIS Mailing List Archives

 



tosca message



Subject: RE: [tosca] RE: artifact processing


Very good discussion this morning.

 

Luc, one follow-up question related to the difference between implementation artifacts and deployment artifacts. One reason to differentiate between different types of artifacts is because they need to be treated differently. Based on your understanding, how should an orchestrator treat deployment artifacts differently from implementation artifacts?

 

Thanks,

 

Chris

 

From: BOUTIER, LUC [mailto:luc.boutier@fastconnect.fr]
Sent: Tuesday, January 10, 2017 8:54 AM
To: Luca Gioppo <luca.gioppo@csi.it>; Chris Lauwers <lauwers@ubicity.com>; tosca@lists.oasis-open.org
Subject: Re: [tosca] RE: artifact processing

 

Hi Chris, Luca and others,

 

First of all, I wish you a happy New Year and lots of success in your respective projects and in our common TOSCA work :-)

 

Now, switching back to the technical discussion, I personally feel that there are multiple subjects here, and they are not mutually exclusive.

My feeling is that:

 

Luca is expressing a TOSCA modelling best practice: modellers should first model abstract components with all their properties/attributes/capabilities/requirements etc. before thinking about a specific implementation. This is certainly the first level of portability of a template, as it allows people to implement a component in a different way if, for some reason, an orchestrator does not support the implementation they have built.

 

However, I think it is a great value of TOSCA not to care only about the modelling but also to try to allow portability of the work people put into implementing the abstract components. There may, by the way, be multiple implementations of a given component using shell, Python, or extended artifacts that we don’t yet specify officially (Puppet/Ansible/Chef, etc.).

 

Where I agree with Luca is that, while I think we should detail where and how a given artifact should be executed, and how inputs are provided to the artifact and outputs fetched after execution completes, we should not impose on implementers how they build their orchestrators, how they connect to machines, and so on, as there are many concerns in that area (I expressed them in a mail last year, though perhaps only to some people from the YAML ad hoc group).

 

That said, there is something that could, I think, allow people to express Chris’s idea of artifact executors. Since we support shell scripts and Python as official artifacts, people could write a wrapper in shell or Python that calls another type of artifact (the wrapper itself is invoked just like any shell or Python artifact). We could then specify a way to declare these artifact-processor extensions so they can be given to an orchestrator: if the orchestrator does not handle a given artifact out of the box (with all its own features for security, agents, and so on), it could call the processor as a usual TOSCA artifact, perhaps with one or two additional parameters.
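To make the idea concrete, here is a minimal sketch of such a wrapper executor, assuming a hypothetical convention (the function and parameter names are illustrative, not part of any TOSCA spec): a plain Python artifact that an orchestrator can run like any official artifact, and that in turn builds the invocation of a non-official artifact type, here an Ansible playbook.

```python
# Hypothetical sketch of a "wrapper executor": a plain Python artifact
# that an orchestrator runs like any official artifact, and that in turn
# builds the invocation for a non-official artifact type (here: Ansible).
# The names (artifact_path, inputs, target_host) are assumptions.

def build_ansible_command(artifact_path, inputs, target_host=None):
    """Build (but do not run) the command line to process an Ansible artifact."""
    cmd = ["ansible-playbook", artifact_path]
    if target_host:
        # Restrict execution to the artifact's execution target, if any.
        cmd += ["--limit", target_host]
    for name, value in sorted(inputs.items()):
        # Pass TOSCA operation inputs as extra variables.
        cmd += ["--extra-vars", f"{name}={value}"]
    return cmd

if __name__ == "__main__":
    cmd = build_ansible_command("install_mysql.yml",
                                {"db_port": 3306}, target_host="db-vm")
    print(cmd)
    # An orchestrator would then execute cmd, e.g. with subprocess.run().
```

The wrapper stays a normal shell/Python artifact from the orchestrator's point of view, which is exactly what makes it optional: an orchestrator with native Ansible support never needs it.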

 

Now, the last discussion point was the artifact “execution target”, which in my opinion is not related to the type of artifact and should rely on the host vs. no-host elements in TOSCA. Basically, an Ansible artifact can be executed on the Ansible master (when calling APIs, for example, to start an Amazon VM) or target a host (the started Amazon VM, for example) to install something on it.

 

If I summarize my thinking, what we should do is the following:

  • We should express Luca’s best practice somewhere and push people to write abstract components before implementing them in an extended type.
  • We should work on identifying additional artifacts (Ansible, Puppet, Chef, others?) that we want to support as implementation artifacts for TOSCA, and specify how to:
    • Provide inputs
    • Get outputs
    • Provide an execution target, if one is required by the “official artifact processor” (Ansible runtime, Chef, Puppet), where applicable (host and connection parameters).
  • Work on the elaboration of an execution “wrapper executor” syntax so people can extend artifact execution through already supported artifacts.
    • These wrappers won’t be required; if an orchestrator supports the artifact out of the box, then it has no need for a wrapper executor.
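The “provide inputs / get outputs” items above could take many forms; as one hypothetical convention (not something any spec mandates), inputs could be handed to the artifact as environment variables and outputs reported as a JSON object on stdout:

```python
# One possible (hypothetical) convention for the inputs/outputs items in
# the list above: operation inputs become environment variables for the
# artifact, and the artifact reports outputs as JSON on stdout.
import json

def prepare_environment(operation_inputs):
    """Map TOSCA operation inputs to environment variables for the artifact."""
    return {name.upper(): str(value) for name, value in operation_inputs.items()}

def parse_outputs(stdout_text):
    """Parse the artifact's stdout as a JSON object of operation outputs."""
    return json.loads(stdout_text)

env = prepare_environment({"db_name": "wordpress", "db_port": 3306})
outputs = parse_outputs('{"db_url": "mysql://10.0.0.5:3306/wordpress"}')
```

Whatever convention is chosen, specifying it per artifact type is what would let two orchestrators run the same implementation artifact interchangeably.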

 

At some point, for non-official artifacts, we may have some orchestrators that support them and some that don’t (as long as there is no “wrapper executor”). As long as they are flagged as “extension artifacts”, I think this is fine: people who want to use them know exactly the limitations they may encounter.

 

Luc

 

From: <tosca@lists.oasis-open.org> on behalf of Luca Gioppo <luca.gioppo@csi.it>
Date: Tuesday, 10 January 2017 at 11:09
To: Chris Lauwers <lauwers@ubicity.com>, "tosca@lists.oasis-open.org" <tosca@lists.oasis-open.org>
Subject: Re: [tosca] RE: artifact processing

 

Hi,

I'm rethinking on the artifact processing topic and I want to propose an alternative point of view.

What if we are looking at the problem with a wrong approach?

 

The problem of "where to process the artifact" is trying to solve HOW the orchestrator has to work, but this is an "imperative" problem.

The real trouble I also had was asking myself: "I have this script in the TOSCA archive; how do I instruct the orchestrator where to execute it?"

The question was wrong, since I should not have a script in the TOSCA archive in the first place: that is, again, IMPERATIVE.

We cannot mix declarative and imperative.

The real problem is that we have an oversimplified set of properties for many nodes, and if we look at the "code in the script" we find many things that should be node properties.

This is obviously due to the fact that we needed a simple example to work with; but to have a simple working example, we had to hardcode the missing information somewhere, and it ended up in the shell script.

 

Probably the better solution would be to place all the needed information in the proper node (with proper relationships); then the orchestrator (which could be implemented in any way) will use that info to implement the topology.

 

In my case I do not implement much in the orchestrator, but use an existing DevOps tool like Puppet, and all the properties for the node go into a template that is associated with the node.

I do not have any script in the TOSCA archive and leverage existing tools (which I do not have to code myself and which have a wide range of modules available for many things).

The properties I use in the nodes are very detailed (like the reverse proxy rule of Apache's httpd.conf, for example), but this allows the orchestrator to use the information and do what it likes with it, and it could potentially be much more interoperable than a shell script.
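As a minimal sketch of this approach (property names and the template are illustrative assumptions, not an agreed TOSCA modelling): the detailed node properties live in the template, and the orchestrator renders them into a configuration fragment rather than shipping an opaque script in the archive.

```python
# Hypothetical sketch of the approach above: detailed node properties are
# declared declaratively, and the orchestrator renders them into a config
# fragment (here, an Apache reverse proxy rule) instead of executing an
# opaque shell script shipped inside the archive.
from string import Template

REVERSE_PROXY_TEMPLATE = Template(
    "ProxyPass ${path} ${backend}\n"
    "ProxyPassReverse ${path} ${backend}\n"
)

def render_reverse_proxy(properties):
    """Render node properties into an httpd.conf fragment."""
    return REVERSE_PROXY_TEMPLATE.substitute(
        path=properties["path"], backend=properties["backend"])

fragment = render_reverse_proxy(
    {"path": "/app", "backend": "http://10.0.0.7:8080/"})
```

Because only the properties travel in the archive, a Puppet-based orchestrator can feed them into a manifest while a script-based one compiles them into a shell script, with no change to the template.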

 

In my case I use Puppet: I can either add the various manifests that the orchestrator dynamically generates to a Puppet master, so that the newly created machine gets the catalog and applies it, or I can copy all the files over SSH and run puppet apply on the VM. That is the imperative work related to how I implemented the orchestrator, and it does not concern the TOSCA archive designer.

 

I believe that if we look at how those various DevOps tools represent, for example, Apache, we could come up with a better example of an Apache node: one that has all the right properties, properties that are interoperable and represent the set of information of a real production installation.

Then the orchestrator could also work by invoking shell scripts, but it would compile the final script from a template using the information provided in the TOSCA file.

 

I believe that this should be the philosophy of TOSCA.

 

Luca

 


On 10 January 2017 at 2:27, Chris Lauwers <lauwers@ubicity.com> wrote:


“Prescriptive” orchestration languages (such as Ansible or StackStorm) have already addressed the issue of how and where to process arbitrary scripts. I suggest we borrow from the approaches taken by these tools to generalize support for arbitrary artifacts in TOSCA.

 

ANSIBLE

 

1.       Ansible playbooks contain a list of plays, where each play consists of a list of tasks, each of which is executed by an Ansible “module”.  Modules are intended to communicate with a remote “host” (typically specified in an inventory file) to configure that host and provision services on that host.

2.       An Ansible module is typically (but not always) a piece of python code that runs on the Ansible host. Modules expect input parameters in a certain format and return results and errors in JSON format.

3.       Ansible includes a “commands” module that is intended to execute arbitrary commands and/or scripts on a remote host. There are a number of flavors of these commands, but in general Ansible commands work in a way that is similar to how we currently envision implementing operations in TOSCA.

4.       Ansible plays have a “transport” parameter that specifies how modules associated with the play communicate with the host. Typically, the transport will be set to “ssh”, but other values are possible (for example, many network devices use “cli”, or “rest”, or “netconf”)

5.       Ansible allows the value of the “transport” variable to be set to “local”. When used with the commands module, this value indicates that the command or script will be executed on the local host rather than on the remote host.
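The dispatch rule in points 4 and 5 can be sketched as follows (a deliberate simplification of what Ansible actually does, with illustrative names):

```python
# Simplified sketch of the rule in points 4-5 above: the play's
# "transport" value decides where the commands module executes a command.
def execution_location(transport, remote_host):
    """Return where a command runs for a given transport setting."""
    if transport == "local":
        # "local" overrides the inventory host: run on the Ansible host.
        return "ansible-host"
    # Otherwise the module connects to the host (ssh, cli, rest, netconf...).
    return remote_host

assert execution_location("local", "web-vm") == "ansible-host"
assert execution_location("ssh", "web-vm") == "web-vm"
```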

 

 

STACKSTORM

 

1.       StackStorm is built around workflows that execute “actions”. An “action” in StackStorm is an arbitrary piece of executable code (although in most cases it’s a bash script or a python script), bundled with metadata (in YAML) that specifies how the script is supposed to be run by StackStorm.

2.       The main parameter of action metadata is the “runner”. StackStorm runners are part of the StackStorm platform and are responsible for “running” the script specified by the action. Runners are similar to the “artifact processors” that I proposed in my email below.

3.       StackStorm comes with a number of built-in runners. The most notable ones are:

·         local-shell-cmd - executes a Linux command on the same host where StackStorm components are running.

·         local-shell-script – executes a script on the same host where StackStorm components are running.

·         remote-shell-cmd - executes a Linux command on one or more remote hosts provided by the user.

·         remote-shell-script - Actions are implemented as scripts. They run on one or more remote hosts provided by the user.

·         python-script - This is a Python runner. Actions are implemented as Python classes with a run() method. They run locally on the same machine where StackStorm components are running.

·         http-request - HTTP client which performs HTTP requests for running HTTP actions.

4.       This shows that in StackStorm, the location where the action is run is implicitly specified by selecting the type of runner, rather than by specifying a parameter value.

5.       StackStorm action metadata also specify how the script expects input values (e.g. via named or positional command line arguments).
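The runner-selection pattern described in points 1–4 can be sketched as a simple registry lookup (the registry contents and function names below are illustrative, not StackStorm’s actual implementation):

```python
# Sketch of the StackStorm pattern described above: action metadata names
# a runner, the platform looks it up in a registry, and the runner itself
# encodes where the script executes. Registry contents are illustrative.
RUNNERS = {
    "local-shell-script": {"location": "stackstorm-host", "kind": "script"},
    "remote-shell-script": {"location": "remote-hosts", "kind": "script"},
    "python-script": {"location": "stackstorm-host", "kind": "python"},
}

def resolve_runner(action_metadata):
    """Look up the runner named in the action's metadata."""
    try:
        return RUNNERS[action_metadata["runner"]]
    except KeyError:
        raise ValueError(f"unknown runner: {action_metadata['runner']!r}")

runner = resolve_runner({"runner": "remote-shell-script"})
```

Note how the execution location falls out of the runner choice alone; no separate “where does this run” parameter is needed.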

 

 

Recommendations for TOSCA

 

1.       These examples show that other orchestrators give script developers a lot of options for what types of scripts can be supported, where these scripts are run, and how to connect to the “hosts” to which the orchestration applies. I suggest TOSCA should be equally flexible.

2.       I personally like the StackStorm approach better than the Ansible approach, since it is closer to what we already do in TOSCA. Lifecycle operations in TOSCA are expressed in a way that is similar to how StackStorm actions are specified. Specifically, the “inputs” section of an operation is “metadata” for the script, describing the input variables the script expects.

3.       As stated below, I recommend introducing the concept of an “artifact processor” that specifies how the artifact is supposed to be run (similar to action runners in StackStorm). This processor would be specified in a “processor” keyname under the “operation” section of a TOSCA interface. The TOSCA spec needs to include a number of built-in processors, but should also allow for development of user-provided processors.

4.       By default (and to preserve current behavior), TOSCA will use a “remote shell script” processor that uses SSH to connect to the remote host.

5.       Artifact processors may need to include a parameter that specifies how the processor connects to the host (similar to the “transport” parameter in Ansible). Alternatively, different processor types could be introduced for different types of transport.

6.       If we introduce the concept of “artifact processor”, then it’s not clear if there is any value in also specifying artifact types (since presumably artifact processors would need to specify mime types and/or file extensions of the artifacts they are able to process).
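The ambiguity in point 6 can be illustrated with a sketch (the processor names and their extension declarations are hypothetical): if each processor declares the file extensions it handles, a “.py” artifact matches more than one processor, so the extension alone cannot select one, and an explicit “processor” keyname is still needed.

```python
# Sketch of point 6: if each (hypothetical) processor declares the file
# extensions it can handle, a ".py" artifact matches more than one
# processor, so extension alone cannot select it; an explicit "processor"
# keyname on the operation is still needed to disambiguate.
PROCESSORS = {
    "remote-shell-script": {".sh"},
    "local-python-script": {".py"},   # API script run by the orchestrator
    "remote-python-script": {".py"},  # install script run on the Host
}

def candidate_processors(artifact_filename):
    """Return every processor whose declared extensions match the artifact."""
    ext = "." + artifact_filename.rsplit(".", 1)[-1]
    return sorted(name for name, exts in PROCESSORS.items() if ext in exts)

matches = candidate_processors("configure.py")
# Two candidates remain: the template author must pick one explicitly.
```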

 

 

Thanks,

 

Chris

 

From: tosca@lists.oasis-open.org [mailto:tosca@lists.oasis-open.org] On Behalf Of Chris Lauwers

Sent: Monday, December 12, 2016 11:26 AM

To: tosca@lists.oasis-open.org

Subject: [tosca] artifact processing

 

For tomorrow’s Simple Profile meeting, I suggest we keep thinking about how to “formalize” mechanisms that describe how artifacts need to be processed.

Just to recap: most (if not all) of the prose in the document uses examples where artifacts are “install scripts” that need to be run on a “Host”, where a host is assumed to be a Compute node that is the target of a HostedOn relationship.

However, in practice we need to be able to handle artifacts other than install scripts. I can think of the following four different types of artifacts (there may be others):

  1. Install scripts: like the install scripts just described
  2. API scripts: scripts that “deploy” nodes by making API calls to an external entity (e.g. Python scripts that call OpenStack or OpenDaylight APIs)
  3. Playbooks/recipes (e.g. Ansible playbooks, or Chef recipes)
  4. Images: “snapshots” of deployed entities.

 

Each of these types of artifacts requires a different mechanism for getting the artifact deployed. Said a different way, each of these types of artifacts may need to get “processed” differently. This means that in order to fully specify operations, we can’t just specify the artifact for the operation, we also need to be clear about the processor that is needed to process that artifact:

                operation: <artifact> + <artifact processor>

Flexible artifact processing, then, requires the following:

  1. Specifying the type of processor required for the artifact
  2. Specifying any configuration parameters for the artifact
  3. Specifying tenant/user-specific parameters for the artifact

 

Specifying the type of processor

Ideally, each type of artifact would have a unique artifact processor, which would allow us to “standardize” on artifact processors based on the type of artifact. However, how do we handle similar artifacts that can belong to multiple types? For example:

  • A Python script could be an install script to be run on a Host
  • A Python script could be an API script to be run by the Orchestrator

This means that if we statically “define” artifact processor types, we can’t base them purely on file extensions or artifact types.

Processor configuration

In order to “use” a processor, we may need configuration parameters for it. This could involve:

  • DNS names (or IP addresses) for contacting the processor (e.g. Chef servers, or API servers).

In some cases, the processor may not already be running, in which case the processor itself might need to get orchestrated (e.g. using TOSCA). In this case, the configuration parameters would be the result of the orchestration, but we would need a CSAR file representing the processor.

Tenant-Specific parameters

Some processor-related parameters may be necessary to “use” the processor, for example user credentials. We may need to specify those.

 

Let’s discuss if this is the “right” way to think about artifact processing, and if so how do we reflect this in the TOSCA spec.

 

Thanks,

 

Chris

 

 



