Re: [tosca] RE: updated operational model

Thanks Peter,

There's almost too much to unpack here. I'll try to be as succinct as possible.

I absolutely want us to address Day 2. I just don't think designing an orchestrator is the way to achieve that goal for TOSCA. And it's not the right thing for us to do as a group comprising companies that make orchestrators that sometimes compete. I joined the TOSCA TC to design a language, not another product.

Day 2 is hard. Not just for TOSCA, but for any orchestration scenario. For a service to truly be "fire and forget" (are you sure you really mean "forget" here? more on that below) it means that the Day 0 design should already have precise information (rules) on how the service can change, and also why it would change (transition states). The key to this, in my view, is policies. And this includes rich policy types, included in the profiles, that exactly expose what rules are possible for the specific domain or platform. I suspect that a true overhaul of TOSCA policies would have to wait for TOSCA 2.1 or beyond. It seems like far more than we can chew right now.

But there are some high-level and useful things we can say right now that would also be true for a later version of TOSCA.

1) The normalized representations are the common ground of truth. It might be "late truth" (eventual consistency) or "early truth" (not yet instantiated), but that's the only way our operational model can make any kind of sense. Of course they are persisted, Peter. I'm not sure how you understood that I was suggesting otherwise. I was specifically pointing out that the database may already exist. Terraform is based around a powerful, cloud-aware, transactional db. Ansible has its generic "facts" and "inventories". Kubernetes has etcd. If you are using these but want to create yet another database, then that's up to you. I'm not in the market for a new orchestrator-of-orchestrators at the moment. I'm interested in TOSCA as a common language for those I already have.

2) Where do these representations come from? From the TOSCA perspective we can say that TOSCA generates some of them. But we can expect others to come from other sources. Importantly, TOSCA doesn't necessarily "own them", even those that it generates. Who owns them? Sometimes it is the orchestrator -- e.g. a lifecycle manager for virtual machines, each with a UUID, per your example. In other cases (in many cases) it is a controller. For example, in Kubernetes it is rare to create a Pod directly. Rather, one would create a Deployment. The Deployment owns a ReplicaSet, which in turn is the owner of the Pods: it is the one that maintains the Pod IDs. And yet both Deployments and Pods are represented in etcd. They are "representations". How we model all this is up to us. We can create models for all of these representations, one node template for each. Or we can decide on, say, a single higher-level "Compute" node template, which behind the scenes has these several representations, with only the Deployment ID being necessary for grammatical association. (As I say, it's "zero or more representations" per node template.)
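To make the "zero or more representations" idea concrete, here is a minimal Python sketch. It is an illustration only, not anything defined by TOSCA or Kubernetes: the class names, the `association_key` helper, and the IDs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Representation:
    """One platform-side object (hypothetical model, not a TOSCA type)."""
    kind: str                                   # e.g. "Deployment", "ReplicaSet", "Pod"
    platform_id: str                            # ID maintained by whoever owns the object
    owner: Optional["Representation"] = None    # the controller object that owns this one


@dataclass
class NodeTemplate:
    """A node template associated with zero or more representations."""
    name: str
    representations: List[Representation] = field(default_factory=list)

    def association_key(self) -> Optional[str]:
        """Return the ID of the first ownerless representation, if any.

        Assumption: only the top-level object (here the Deployment) is
        needed to tie the node template to the platform.
        """
        for rep in self.representations:
            if rep.owner is None:
                return rep.platform_id
        return None


# The ownership chain from the Kubernetes example: Deployment -> ReplicaSet -> Pods.
deployment = Representation("Deployment", "deploy-web")
replica_set = Representation("ReplicaSet", "rs-web-1", owner=deployment)
pods = [Representation("Pod", f"pod-web-{i}", owner=replica_set) for i in range(2)]

# A single higher-level "Compute" node template backed by all of them.
compute = NodeTemplate("compute", [deployment, replica_set, *pods])

# A node template that has no representation (yet).
placeholder = NodeTemplate("placeholder")
```

In this sketch `compute.association_key()` yields the Deployment ID, while the ReplicaSet and Pod IDs remain the controller's business.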

3) There's some limited, but important, Day 2 stuff in TOSCA already. A TOSCA attribute only makes sense in Day 2. That is why "forget", from "fire and forget", made me a bit uneasy. If it's truly "forgotten" then it's hard to see TOSCA's operational model being applicable to anything beyond that first nanosecond of deployment. An attribute represents new state. Likewise, operations can only make sense in terms of some kind of event model, which is, again, new state. Of course the constraint here is that we are still talking about the same node templates, but again the node representations (and their IDs) might change.

4) Then there are the big changes: adding/removing nodes (and relationships) on Day 2. I'm interested in cloud native orchestration. So, yes, I emphasize that controllers would often be owning the representations and we should operate them by providing them with policies. It seems you're interested here in explicit, manual changes. You mention changing TOSCA inputs in order to create topological changes, but actually TOSCA inputs allow for very little variability. Let's be even more radical and consider that someone could manually edit the TOSCA YAML file and then want to make that change happen. That seems like a reasonable Day 2 action to me. Perhaps it might make sense for us to discuss what it would mean to re-apply a TOSCA service template to an already existing deployment and how a diff would be calculated, for example, how node template names are preserved. By the way, I recommend looking at Terraform for inspiration, because this is exactly how its aggressively declarative state management works.
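As a rough illustration of what "re-applying" could mean, here is a minimal Python sketch of a diff keyed on node template names. It assumes that a node template's name is its stable identity across edits, which is exactly the open question above. The function name and the dictionary layout are hypothetical, and this is far simpler than what Terraform actually does.

```python
from typing import Dict, List


def diff_by_name(deployed: Dict[str, dict], desired: Dict[str, dict]) -> Dict[str, List[str]]:
    """Compare two name -> properties mappings and classify each node template.

    Assumption: a node template that keeps its name is "the same" node,
    so a changed body means update, not remove-and-add.
    """
    deployed_names, desired_names = set(deployed), set(desired)
    kept = deployed_names & desired_names
    return {
        "add": sorted(desired_names - deployed_names),
        "remove": sorted(deployed_names - desired_names),
        "update": sorted(n for n in kept if deployed[n] != desired[n]),
        "unchanged": sorted(n for n in kept if deployed[n] == desired[n]),
    }


# State of the existing deployment (what was applied on Day 0).
deployed = {
    "web": {"replicas": 2},
    "db": {"size": "small"},
    "cache": {},
}

# The manually edited service template that is re-applied on Day 2.
desired = {
    "web": {"replicas": 3},
    "db": {"size": "small"},
    "queue": {},
}

plan = diff_by_name(deployed, desired)
```

Here `plan` marks "queue" for addition, "cache" for removal, and "web" for update, and leaves "db" alone. Renaming a node template would show up as a remove plus an add, which is why name preservation matters.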

On Mon, Feb 14, 2022 at 2:00 AM Bruun, Peter Michael (CMS RnD Orchestration) <peter-michael.bruun@hpe.com> wrote:

Hi Tal,

There is a spectrum of possible stances around the semantics of TOSCA, particularly when we begin to discuss day 2 scenarios.

- At one end of the spectrum is the way XML/XSLT, YAML or YANG work: pure syntax and syntactic manipulation, open for attaching any semantics you want.

- At the other end of the spectrum is a specific orchestrator architecture and implementation with no room for interpretation by individual orchestrators.

In between there is a sliding scale of under-specification. TOSCA comes with a specified static semantics: type-checking. There is also a level of well-defined interpretation of functions and substitutions.

However, when we start specifying more detail in the orchestrator side of the normalized templates we get into trouble with the inputs. For this to be TOSCA at all, you have to accept that inputs belong to what we need to define semantics for, but this raises the question: What should happen if, on day 2, inputs change?

There is a possible fire-and-forget position: once a template has been deployed, all further management and modification happens directly in the native orchestration technology, which in your case could be Kubernetes. Additionally, templates may provide a pane of glass through which to view what is going on with the services.

Here we branch out. If the actors who deploy templates are also those who authored the templates, interfaces and artifacts, then clearly they have a deep understanding of the target technology, for example containers, Kubernetes, etc. So these actors will be able to take control of their service as deployed through the interfaces of the native underlying technologies.

In cases where template authors are not the same as the template users, the case is completely different, because the only way the template users know their service is by the overall functional contract of the template and the specific inputs that they provided. Internal structure, substitutions, etc. would be completely alien to such users, and so there will be a strong and unyielding requirement that *all* subsequent day 2 management and maintenance of a deployed service can happen through the use of TOSCA templates and inputs. There would be no allowable exceptions to that "all", because the template users do not have the skills to manage the underlying orchestration technology, let alone the target technology.

Tal, if you insist that all day 2 change-management must happen through the native orchestration technologies, then you are basically limiting the use of TOSCA to a fairly small domain of technology experts, and telling everybody else to find something other than TOSCA to work with.

If, on the other hand, you agree that our standardization work on TOSCA should indeed cover day 2 change-management, then the purpose of the Operational Model is to provide a *mental model* that template designers can use for understanding how their templates are going to work for their template users, and that would have to include a detailed understanding of how the template users would be allowed to change an already deployed service. For example, can they change all inputs, or only some inputs? Can they select a new template for a service that was created from another template, or will they have to work from the original template?

I do agree that the Operational Model has not yet converged, and it seems to answer the question "Can the user select a different template?" with a "No", contradicting what Calin answered about versioning. But if users can select another template, then we would be forced to specify how nodes in one template map to nodes in another template (by name? Other ideas?).

The Operational Model must also tell the template designer what functionalities an orchestrator would have to provide in order to allow the designer to cater for day 2 scenarios.

The absolute minimum functionality for day 2 modifications would be that the orchestrator could tear down the old service and create the new one. The delta-computations are strictly a non-functional "optimization" of that.

What does it take for an orchestrator to be able to tear down, in the infrastructure, a previously deployed service? We note that the template user cannot be required to know about any technology specific identifications of the deployed service. I guess that there can be scenarios, where somehow the identification of the elements of the external implementation can be uniquely derived from the inputs, but there are certainly also cases, where, when an orchestrator creates an element, the underlying management system provides the ID that it would need in order to subsequently work with it.

Example: If you create a VM in OpenStack, then OpenStack generates and returns a UUID. Without that UUID there is no way to subsequently remove that VM. By the way, that paradigm was designed by someone who didn't understand automation, because if that UUID is lost (e.g. due to network failure at the wrong time), then that VM would have "leaked" as a resource.

Anyway, this means that

a) If template designer and template user roles can be assumed to be different, and

b) If TOSCA is intended to be strong enough to orchestrate infrastructures that provide attributes required for teardown/modification

then some persistent or pseudo-persistent storage (database, files, Kafka bus, ...) is a required and unavoidable component in the model.

Tal, you are right that there are use-cases where a) and b) are not needed, and where therefore a database and many other functions would not be required. But without the Topology Representation being persistent, there are many uses of TOSCA that become impossible.

So we need to resolve this so we can proceed. If we mandate persistence-dependent change management, we rule out use-cases that don't need it, but if we don't specify it we rule out use-cases that do need it.

I suggest we identify some levels of compliance:

1. Design only: no orchestration

2. Deploy only: check consistency and compile for external orchestration platform (fire-and-forget)

3. Continued change management

   a. Input change only

   b. Map existing service to new template (or new version of template)

There may be more, for example a monitoring pane-of-glass without change management.

That way we can be allowed to standardize features for 3 without ruling out level 1 and 2 compliant orchestrators that do not need those features.

Peter

From: Tal Liron [mailto:tliron@redhat.com]
Sent: 12. februar 2022 23:35
To: Chris Lauwers <lauwers@ubicity.com>
Cc: Bruun, Peter Michael (CMS RnD Orchestration) <peter-michael.bruun@hpe.com>; Calin Curescu <calin.curescu@ericsson.com>; tosca@lists.oasis-open.org
Subject: Re: [tosca] RE: updated operational model

Chris, without going into a detailed critique, what I see here is a design of a potential orchestrator built on TOSCA. In my opinion, this goes way beyond the scope of what TOSCA should assume. What I keep emphasizing is that not all TOSCA users intend to build a new orchestrator. We already have mature orchestrators that, for better or for worse, are not going to be discarded, nor do users feel comfortable about adding a higher-level orchestrator to bridge with TOSCA. And I'm sure that we all want them to participate in the TOSCA ecosystem.

One particular aspect that is a thorn in our discussion is those "normalized representations". The happy compromise we came up with was to put that box in between the processor and the "orchestrator/platform". We all understood that we were passing the buck here, and that the definition ended up being somewhat amorphous. I'm personally very satisfied with that. One possible implementation of the model is to indeed build a new system that stores these representations in a database (like Ubicity?). But, when dealing with existing orchestrators and platforms, they have their own mechanisms for resource identification, state management, inventory, lifecycle, etc. Specifically they already have an opinionated methodology for "normalized representations". The work of integrating those into TOSCA is to find a way to associate them with TOSCA's grammatical features, which were carefully designed to be general purpose. The common-denominator assumption is that every declarative environment will encapsulate "nodes" with both static and dynamic "properties", and that there are custom-semantic relationships between them (the topology). It won't always be a 1:1 match with TOSCA and indeed it's not expected that every cloud in existence would support all TOSCA features. For example, artifacts and operations may not have native equivalents everywhere. This is part of the reason I keep saying that a node template can be associated with "zero or more" representations. Sometimes it might be 1:1 (Terraform). Sometimes there might be multiple subsystems, different ones for different kinds of nodes. Semantics vary. The hope is that TOSCA's grammatical abstractions would make sense for the majority of implementation. I think they do. (Or, they can, unless we bloat the model with too many assumptions.)

In my own suggestion for an enhancement of the operational model I more modestly added artifacts to the diagram, and also proposed a generic async event model that I don't think favors any specific semantic.

On Sat, Feb 12, 2022 at 1:59 PM Chris Lauwers <lauwers@ubicity.com> wrote:

Here is another attempt at the diagram based on feedback from Peter and Calin.

Thanks,

Chris

From: Chris Lauwers
Sent: Friday, February 11, 2022 3:12 PM
To: Bruun, Peter Michael (CMS RnD Orchestration) <peter-michael.bruun@hpe.com>; Calin Curescu <calin.curescu@ericsson.com>; tosca@lists.oasis-open.org
Subject: RE: [tosca] RE: updated operational model

From: Bruun, Peter Michael (CMS RnD Orchestration) <peter-michael.bruun@hpe.com>
Sent: Thursday, February 10, 2022 9:15 AM
To: Calin Curescu <calin.curescu@ericsson.com>; Chris Lauwers <lauwers@ubicity.com>; tosca@lists.oasis-open.org
Subject: RE: [tosca] RE: updated operational model

Hi Calin,

I concur totally with what you propose. I am sure that several of us have actually implemented something like this, and for me it works nicely.

Yes, my implementation works exactly like that.

I am, however, noting that some members object strongly to the idea of a "dependency graph" driving orchestration. Deriving a dependency graph from Relationships is, as far as I am aware, not part of our TOSCA standardization. So this would be completely up to the profiles to define.

I'm not sure I understand this statement. Isn't a TOSCA topology graph (which consists of nodes and relationships) an explicit encoding of the dependencies between the various components that make up a service? If you're using "pure TOSCA" orchestration, then this graph absolutely drives orchestration. On the other hand, you could also ignore TOSCA's orchestration features (such as interfaces, operations, notifications, workflows, etc.) and just hand off the "topology representation graph" created using TOSCA to a 3rd-party orchestrator. Tal has now shown us three different examples of this approach (with Kubernetes, Ansible, and Terraform).

The ETSI standard does define how a template can refer to a previous/subsequent version of itself, although the way it is done there is perhaps not the best design.

This statement would suggest that your earlier "dependency" comment refers to dependencies between different versions of the same template. Am I misunderstanding?

Peter

Thanks,

Chris
