Relaying to the ‘tosca’ distribution list
From: Bruun, Peter Michael (CMS RnD Orchestration) <peter-michael.bruun@hpe.com>
Sent: Monday, December 6, 2021 10:11 AM
To: Tal Liron <tliron@redhat.com>; tosca@lists.oasis-open.org
Cc: Chris Lauwers <lauwers@ubicity.com>
Subject: RE: [tosca] Groups - TOSCA Operational Modalities uploaded
Some comments (please relay to the rest of the group) on “TOSCA Operational Modalities”.
I have some concerns about the “Orchestration Centrifuge”:
- If event flows turn circular, die out too quickly, or take a long time to converge, it is difficult to debug or analyze where the problem lies, and even harder to determine what to do about it. I am aware that there are tools for this, but as Telco systems scale to topologies with tens of thousands of nodes, this becomes unmanageable.
- Why is “Bandwidth Scaler” a special “hard-coded” entity in the diagram? My guess is that things like “bandwidth” are not represented as anything the cloud-native management systems (K8s, etc.) would know about, so it pops out as requiring a special platform. In real life there can be hundreds to thousands of such “special” entity types.
I have (unfortunately) tested this idea of autonomously collaborating subsystems without central coordination of events at production scale, and customers were not terribly happy.
I have even tried this in various scenarios at both low and high levels in the slide 3 pyramid. The critical problems are:
- There is a high risk of circular event storms that perpetuate themselves without ever converging to a stable state.
- Even when there is convergence towards a stable state, the time to converge for N collaborating subsystems tends to grow as N^2 due to ripple effects. This means that the time to set up a topology of hundreds or thousands of nodes becomes unacceptable, as do the required resources in terms of event processing and network capacity during the convergence period.
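A toy model can illustrate the N^2 growth described above. The sketch below is purely hypothetical and is not taken from the presentation or from any production system: it assumes a chain of N subsystems where subsystem i is stable only when its state equals its upstream neighbour's state plus one, and it counts state updates as a stand-in for event processing. The function names and the stability rule are assumptions made for illustration only.

```python
def uncoordinated_updates(n):
    """Count updates when all subsystems react in parallel, without ordering.

    Assumed toy rule: subsystem i is stable when state[i] == state[i-1] + 1.
    Each round, every subsystem reacts to the last state it observed
    (a snapshot), so corrections ripple down the chain one hop per round.
    """
    state = [0] * (n + 1)  # index 0 is a fixed external input
    updates = 0
    while True:
        snapshot = list(state)  # what each subsystem saw before this round
        changed = False
        for i in range(1, n + 1):
            wanted = snapshot[i - 1] + 1
            if state[i] != wanted:
                state[i] = wanted
                updates += 1
                changed = True
        if not changed:
            return updates


def coordinated_updates(n):
    """Count updates when a central coordinator applies changes in
    dependency order, so each subsystem is configured exactly once."""
    state = [0] * (n + 1)
    updates = 0
    for i in range(1, n + 1):
        state[i] = state[i - 1] + 1
        updates += 1
    return updates


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, uncoordinated_updates(n), coordinated_updates(n))
```

In this model the uncoordinated variant performs N(N+1)/2 updates while the coordinated one performs N. Real topologies are not simple chains, so this only shows the shape of the ripple effect, not actual convergence times.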
So I am fully aware of this beautiful dream, but in my experience, it does not scale to Telco-grade topologies.
There needs to be central coordination in order to ensure a convergent process, and that, in my experience, is exactly the role of the orchestrator. It may be that this is what slide 2 is expressing, but that is not clear to me.
Peter
From: tosca@lists.oasis-open.org [mailto:tosca@lists.oasis-open.org] On Behalf Of Tal Liron
Sent: December 6, 2021 17:11
To: tosca@lists.oasis-open.org
Subject: [tosca] Groups - TOSCA Operational Modalities uploaded
Submitter's message
I'm uploading a new version of this presentation with two extra slides that I hope will assist in explaining the previous two.
-- Tal Liron