

Subject: RE: [tosca] Groups - TOSCA Operational Modalities uploaded


Relaying to the ‘tosca’ distribution list

 

From: Bruun, Peter Michael (CMS RnD Orchestration) <peter-michael.bruun@hpe.com>
Sent: Monday, December 6, 2021 10:11 AM
To: Tal Liron <tliron@redhat.com>; tosca@lists.oasis-open.org
Cc: Chris Lauwers <lauwers@ubicity.com>
Subject: RE: [tosca] Groups - TOSCA Operational Modalities uploaded

 

Some comments (please relay to the rest of the group) on “TOSCA Operational Modalities”.

 

I have two concerns about the “Orchestration Centrifuge”:

 

  • If event flows turn circular, die out too quickly, or take a long time to converge, it is hard to debug or analyze where the problem lies, and even harder to determine what to do about it. I am aware that there are tools for this, but as Telco systems scale to topologies with tens of thousands of nodes, it becomes unmanageable.
  • Why is “Bandwidth Scaler” a special “hard-coded” entity in the diagram? My guess is that things like “bandwidth” are not represented as anything the cloud-native management systems (Kubernetes, etc.) would know about, so it pops out as requiring a special platform. In real life there can be hundreds to thousands of such “special” entity types.

 

I have (unfortunately) tested this idea of autonomously collaborating subsystems without central coordination of events at production scale, and customers were not terribly happy.

 

I have even tried this in various scenarios, at both low and high levels of the slide 3 pyramid. The critical problems are:

  • There is a high risk of circular event storms that perpetuate themselves without ever converging to a stable state.
  • Even when there is convergence towards a stable state, the time to converge for N collaborating subsystems tends to grow as N^2 due to ripple effects. This means that the time to set up a topology of hundreds or thousands of nodes becomes unacceptable, as do the resources required for event processing and network capacity during the convergence period.
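
Both problems can be illustrated with a toy model. The sketch below is not taken from any of the systems mentioned above; the function names (count_ripple_events, run_ring), the chain and ring topologies, and the "damped" flag are all assumptions made purely for illustration. The first function counts event deliveries when each of N chained subsystems makes one local change that ripples to everything downstream of it. The second passes a single event around a ring, with and without a rule that suppresses events a subsystem has already handled.

```python
from collections import deque


def count_ripple_events(n):
    """Count event deliveries when each of n chained subsystems changes once.

    Hypothetical model: subsystem i notifies subsystem i + 1 whenever its
    own state changes, so a change at i ripples to every later subsystem.
    """
    state = [0] * n
    queue = deque()
    delivered = 0
    for origin in range(n):
        state[origin] += 1              # local change at the origin
        if origin + 1 < n:
            queue.append(origin + 1)    # notify the next subsystem
        while queue:
            target = queue.popleft()
            delivered += 1
            state[target] += 1          # react to the upstream change
            if target + 1 < n:
                queue.append(target + 1)
    return delivered


def run_ring(n, max_events, damped):
    """Pass one event around a ring of n subsystems.

    Returns (events delivered, whether the event queue drained).
    With damped=False every subsystem always forwards the event, so the
    flow is circular and only stops at the max_events cut-off.
    With damped=True a subsystem drops an event it has already handled.
    """
    seen = [False] * n
    queue = deque([0])
    delivered = 0
    while queue and delivered < max_events:
        target = queue.popleft()
        delivered += 1
        if damped and seen[target]:
            continue                    # already handled: do not forward
        seen[target] = True
        queue.append((target + 1) % n)
    return delivered, not queue


if __name__ == "__main__":
    for n in (10, 20, 100):
        print(n, count_ripple_events(n))
    print(run_ring(5, 1000, damped=False))
    print(run_ring(5, 1000, damped=True))
```

In this model the chain delivers N(N-1)/2 events, so doubling N roughly quadruples the work, and the undamped ring never drains its queue no matter how large the cut-off is. Real topologies are far less regular, so this only shows the shape of the problem, not its actual magnitude.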

 

So I am fully aware of this beautiful dream, but in my experience, it does not scale to Telco grade topologies.

 

There needs to be central coordination to ensure a convergent process, and that, in my experience, is exactly the role of the orchestrator. It may be that this is what slide 2 is expressing, but that is not clear to me.

 

Peter

 

 

From: tosca@lists.oasis-open.org [mailto:tosca@lists.oasis-open.org] On Behalf Of Tal Liron
Sent: 6. december 2021 17:11
To: tosca@lists.oasis-open.org
Subject: [tosca] Groups - TOSCA Operational Modalities uploaded

 

Submitter's message
I'm uploading a new version of this presentation with two extra slides that I hope will assist in explaining the previous two.
-- Tal Liron

Document Name: TOSCA Operational Modalities


No description provided.


Submitter: Tal Liron
Group: OASIS Topology and Orchestration Specification for Cloud Applications (TOSCA) TC
Folder: Working Documents
Date submitted: 2021-12-06 08:11:05
Revision: 1

 


