RE: [tosca-comment] Identifiers in reference to multiplicity discussions

Tal,

My suggestion would be that by default a node_template refers to one and only one node instance in the real world. That way there is a clear distinction between node type definitions (which are the specification) and node templates (which are instances).

That default position would be overridden where there would otherwise be a need to write out in full multiple node templates which are all derived from the same node type definition and differ only in parameter value assignments. In effect the 'occurrences' syntax would act as a loop over the node template.

The syntax for creating that loop would include a mandatory indicator of which parameter or attribute definition (defined in the node type definition) is to be used as the ID for items in the range. The assigned value for that ID would only need to be unique within the scope of the cluster and must be invariant for the lifetime of the instance.

In many cases the ID value will be assigned by the system or the orchestrator, in which case the ID would be defined as an attribute. In other cases the ID value would be assigned to a property. Property assignments would need a function which could be evaluated for each cluster member. TOSCA currently has very few intrinsic functions and those which exist are unlikely to be enough for this purpose. The availability of more functions is one area where Helm currently has more functionality than TOSCA. property_value_expression does not seem to be defined at all at the moment and I’m not sure how a processor would definitively distinguish between a property value assignment and an expression.
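
To make the loop idea concrete, here is a minimal Python sketch. It is not TOSCA syntax and does not describe how any existing processor works. The helper expand_occurrences, the naming_function callback and the dictionary layout are all hypothetical; the sketch only illustrates evaluating a property-assignment function once per cluster member and checking that the resulting ID values are unique within the cluster.

```python
from typing import Callable, Dict, List


def expand_occurrences(
    template_name: str,
    identifier: str,
    instance_count: int,
    naming_function: Callable[[int], str],
) -> List[Dict[str, str]]:
    """Expand one node template into instance_count cluster members.

    Hypothetical helper: naming_function is evaluated once per member
    (index 0 .. instance_count - 1) to produce the identifier value.
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")

    instances = []
    seen = set()
    for index in range(instance_count):
        value = naming_function(index)
        # The identifier only has to be unique within this cluster.
        if value in seen:
            raise ValueError(f"duplicate {identifier} value: {value!r}")
        seen.add(value)
        instances.append({"template": template_name, identifier: value})
    return instances


# Assumed naming scheme, purely for illustration.
servers = expand_occurrences("server", "hostname", 3, lambda i: f"host-{i + 1}")
for node in servers:
    print(node)
```

A naming function that returned the same value twice would be rejected here, which is one way a processor could enforce the uniqueness requirement at expansion time.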

Rather than defining a programming language syntax within TOSCA, I believe it would be preferable to allow breakout to an existing language. I think that Puccini has this ability.

I agree with you that the range ID will need to be qualified to make it unique within the template, so that the cluster members can be referenced from elsewhere in the template, but I think we should make it mandatory that the qualifier is the <modelable_entity_name> (which is what you happen to have chosen in your server example). This departs less from the current usage for single nodes.

As for the select statement, I very much dislike the way that we drift off into implementation-specific syntax at certain points. I would much prefer that the query language used be explicitly declared. The same comment applies to the syntax used for schema definitions. I don’t think these language declarations need to be included each time a select directive or a schema statement occurs in the template. Instead, I suggest that the TOSCA header includes a statement which defines the context for the whole document.

tosca_definitions_version: tosca_2_0

schema_syntax_context:
  path: org.json-schema/specification.html
  version: 2019-09

select_syntax_context:
  path: net.goessner/articles/JsonPath/
  version: 2007-02-21| e1

function_syntax_context:
  path: org.golang
  version: 1.16

node_types:
  Server:
    properties:
      hostname:
        type: string
    attributes:
      current_ram:
        type: scalar-unit.size

topology_template:
  inputs:
    os:
      type: string
      default: linux
    numberOfServersInCluster:
      type: integer
  node_templates:
    server:
      type: Server
      occurrences:
        identifier: hostname
        limit: [1, UNBOUNDED]
        instance_count: { get_input: numberOfServersInCluster }
      properties:
        hostname: [naming_function_in_golang]
  outputs:
    ram_use_for_named_server:
      get_attribute: [server, current_ram, hostname_one ]
    ram_use_array_all_servers:
      select: [$.server..current_ram] # a JSONPath query
    ram_use_array_selected_:
      # A JSONPath query, but how to pass in the os input?
      # In what order are the different languages substituted/processed?
      select: [$.server.????.current_ram]
    ram_use_sum:
      # ??? Some function for summing the results of the array.
      # In what order are the different languages substituted/processed?
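
For the two open questions in the outputs above, here is a rough Python sketch of what a processor might do once the select context is known. It is not a JSONPath implementation: collect_descendants is a hypothetical stand-in that mimics only the recursive-descent step of $.server..current_ram. The instance data, including the os field and the RAM values in bytes, is invented for illustration, and doing the os filter and the sum in the host language is an assumption about processing order, not an answer from any specification.

```python
from typing import Any, List


def collect_descendants(node: Any, key: str) -> List[Any]:
    """Return every value stored under key anywhere below node."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)
            # Keep descending so nested matches are collected too.
            found.extend(collect_descendants(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_descendants(item, key))
    return found


# Invented data: instances of the "server" template keyed by hostname.
representation = {
    "server": {
        "hostname_one": {"os": "linux", "current_ram": 4_000_000_000},
        "hostname_two": {"os": "linux", "current_ram": 8_000_000_000},
        "hostname_three": {"os": "windows", "current_ram": 2_000_000_000},
    }
}

# Rough equivalent of select: [$.server..current_ram]
ram_use_array_all_servers = collect_descendants(
    representation["server"], "current_ram"
)

# Assumption: the os input is resolved first, then applied as a filter.
os_input = "linux"
ram_use_array_selected = [
    instance["current_ram"]
    for instance in representation["server"].values()
    if instance["os"] == os_input
]

# Assumption: summing happens after the query has been evaluated.
ram_use_sum = sum(ram_use_array_all_servers)

print(ram_use_array_all_servers, ram_use_array_selected, ram_use_sum)
```

The sketch fixes one particular order (inputs, then query, then function), which is exactly the ordering the template would need to state explicitly.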

Paul Jordan
OSS Specialist
BT Technology | Tel +44 (0) 3316252643 | paul.m.jordan@bt.com

We monitor our email systems and may record all our emails.
British Telecommunications plc
R/O : 81 Newgate Street, London EC1A 7AJ
Registered in England: No 1800000

From: Tal Liron <tliron@redhat.com>
Sent: 17 February 2021 19:12
To: Jordan,PM,Paul,TNK6 R <paul.m.jordan@bt.com>; tosca-comment@lists.oasis-open.org
Subject: Re: [tosca-comment] Identifiers in reference to multiplicity discussions

Thanks Paul, we started to discuss this challenge in depth in the ad-hoc meeting.

In my view, you're on the right track. For specific systems that need to identify node instances we can use attributes. That allows the values to be filled by an external system, whether it's the platform itself or an orchestrator that manages IDs. TOSCA would then let you model the data type for that identifying attribute as is appropriate.

The problem is that we still don't entirely understand what attributes are in TOSCA. :) Even more specifically we don't have tools for using attributes as unique identifiers.

Here's one possible approach:

We add a keyword called "unique" (boolean) to attribute declarations. When "unique: true" is set on an attribute it is a signal to orchestrators that multiple instances (whatever that would mean in any specific implementation) would require unique identification management for this attribute. How this is done would be out of scope for TOSCA, and indeed in many cases would be handled by the platform itself, e.g. you spin up a virtual machine and get a GUID after it's created. Is it not a GUID, but rather an ID that is unique only per cluster? Then maybe your attribute needs to comprise a combination of cluster ID and resource ID in order to be unique. You can model that easily in TOSCA. Note that it might also be possible to have multiple attributes marked as unique. Why not? There might be different unique IDs coming from different parts of the system but all refer to the same "node" as you've encapsulated it in TOSCA.
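
As a small illustration of the composite-ID point, the following Python sketch models an identifier made of a cluster name and a per-cluster serial number (the field names are borrowed from the ID data type in the example further down). The InstanceID class and the all_unique check are hypothetical; how an orchestrator would really manage uniqueness is out of scope, as noted above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class InstanceID:
    """Composite identifier: unique only as a (cluster, serial) pair."""

    cluster_name: str
    serial_number: int


def all_unique(ids: Iterable[InstanceID]) -> bool:
    """True when no two identifiers in ids compare equal."""
    id_list = list(ids)
    return len(set(id_list)) == len(id_list)


# Serial numbers repeat across clusters, yet the composite IDs differ.
ids = [
    InstanceID("east", 1),
    InstanceID("east", 2),
    InstanceID("west", 1),
]
serials_unique = len({i.serial_number for i in ids}) == len(ids)

print(serials_unique, all_unique(ids))
```

Here the serial number alone is not unique, while the combination with the cluster name is.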

So why have the keyword at all? Well, now that we know this attribute is unique we can use it as a grammatical reference in functions, specifically the get_attribute function (but also get_artifact and possibly others). Right now get_attribute uses either a node template name or a magic keyword (SELF, SOURCE, TARGET) as the "modelable entity name", but I think we would all agree that this is poorly defined. A unique ID attribute can help us narrow what we mean. In trying to brainstorm, here's something I came up with:

A new intrinsic function called "select". This implementation-specific function can search through the runtime or orchestration universes and return a list of IDs. This result (the list of IDs) can then be used as the "modelable entity name" for get_attribute. An example:

data_types:
  ID:
    properties:
      cluster_name:
        type: string
      serial_number:
        type: integer

node_types:
  Server:
    attributes:
      identifier:
        type: ID
        unique: true
      current_ram:
        type: scalar-unit.size

topology_template:
  inputs:
    os:
      type: string
      default: linux
  node_templates:
    server:
      type: Server
  outputs:
    total_ram_use: { get_attribute: [ { select: [ server, identifier, "where os = ", { get_input: os } ] }, current_ram ] }

The "select" function here has the following arguments: the first is the template name (or SELF, SOURCE, TARGET) and the second is the name of an attribute that must be marked as "unique: true". The remaining arguments are implementation-specific. In this example I'm assuming an orchestrator that has some kind of textual querying language. The get_attribute function then uses the result of this function as its first argument, which would be zero or more instance IDs of that specific node template. For those instances it would extract the "current_ram" attribute. What would the result be? I think it should be a dict of ID mapped to attribute value. That output can then be used by other parts of orchestration to calculate averages, create a total sum, etc.
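
The proposed behaviour can be sketched in Python as follows. This is only an illustration under stated assumptions: the in-memory INSTANCES table, the os field, the RAM values (in GiB) and the predicate-based select signature are all invented, standing in for whatever query mechanism a real orchestrator would offer.

```python
from typing import Callable, Dict, List, Tuple

# An ID is modelled as (cluster_name, serial_number).
ID = Tuple[str, int]
Attributes = Dict[str, object]

# Invented runtime state for instances of the "server" template.
INSTANCES: Dict[ID, Attributes] = {
    ("east", 1): {"os": "linux", "current_ram": 4},
    ("east", 2): {"os": "windows", "current_ram": 8},
    ("west", 1): {"os": "linux", "current_ram": 16},
}


def select(
    instances: Dict[ID, Attributes],
    predicate: Callable[[Attributes], bool],
) -> List[ID]:
    """Return the IDs of zero or more instances matching predicate."""
    return [iid for iid, attrs in instances.items() if predicate(attrs)]


def get_attribute(
    instances: Dict[ID, Attributes], ids: List[ID], name: str
) -> Dict[ID, object]:
    """Return a dict mapping each selected ID to its attribute value."""
    return {iid: instances[iid][name] for iid in ids}


os_input = "linux"
linux_ids = select(INSTANCES, lambda attrs: attrs["os"] == os_input)
ram_by_id = get_attribute(INSTANCES, linux_ids, "current_ram")

# Downstream tooling can aggregate the dict however it likes.
total_ram_use = sum(ram_by_id.values())
print(ram_by_id, total_ram_use)
```

An empty selection simply yields an empty dict, which matches the "zero or more instance IDs" case.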

And those IDs are also consistent. So if, for example, you have multiple outputs with different "select" queries and different attributes, well, that's fine, somewhere down the toolchain you are guaranteed that they refer to the same instance of that node template, so you can cross-reference and construct whatever totals or graphs you want from those outputs.
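
A short Python sketch of that cross-referencing, assuming two outputs that are each a dict keyed by the same instance IDs. The cpu_load attribute and all values are hypothetical; only current_ram appears in the example above.

```python
# Two hypothetical outputs produced by different select queries.
ram_by_id = {("east", 1): 4, ("east", 2): 8}
cpu_load_by_id = {("east", 2): 0.5, ("west", 1): 0.9}

# Because the IDs are consistent, a plain key intersection joins them.
shared_ids = ram_by_id.keys() & cpu_load_by_id.keys()
joined = {iid: (ram_by_id[iid], cpu_load_by_id[iid]) for iid in shared_ids}

print(joined)
```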

On Wed, Feb 17, 2021 at 6:01 AM <paul.m.jordan@bt.com> wrote:

I see in the TOSCA TC mailing list a discussion about multiplicity. Good, as that has been an area of confusion for me.

In particular an email from Peter Bruun about node identities.

I’d just like to mention that in my work on mapping TMForum to TOSCA I have a profile where every one of my node types is derived from a root type which represents the SID RootEntity. The SID RootEntity is defined as having a mandatory Name and ID plus an optional description thus.

ID (RootEntity, required): Unambiguously distinguishes different object instances.

description (RootEntity, optional): This is a string, and defines a textual free-form description of the object. Notes: This attribute doesn’t exist in M.3100. The CIM has two attributes for this purpose, Caption (a short description) and Description.

name (RootEntity, required): Represents a user-friendly identifier of an object. It is a (possibly ambiguous) name by which the object is commonly known in some limited scope (such as an organization) and conforms to the naming conventions of the country or culture with which it is associated. It is NOT used as a naming attribute (i.e., to uniquely identify an instance of the object).

Paul Jordan
OSS Specialist
BT Technology | Tel +44 (0) 3316252643 | paul.m.jordan@bt.com
