Thank you, Calin, for taking a first stab at this crucial topic!
I'd like to try to clarify and expand on some comments I made in the discussion today. Prepare for a long email. :)
There
are many different flavors of handling the interplay of re-usability
and contracts, each with far-reaching implications and vitally different
rationales. Let's examine a few representative examples.
Consider
a relatively strict flavor: Java. Java's modus operandi is centered on
design-time adherence to interfaces and signatures. Java won't let you
compile a class unless it implements the method signatures of its
interfaces, including all declared thrown exceptions, and instance
casting is enforced at the byte-code level. This strictness has given
Java a reputation for reliability: users have the reassurance of knowing
that implementations match specifications. If you can't improve on code
by extending an interface, then you have to create a new interface,
e.g. MyInterfaceVersion2, and either support both or provide runtime
warnings to users. This design-time strictness also allows for powerful
code refactoring with IDEs.
But this
strictness also makes it very hard to change large, integrated systems
comprising interacting parts. Essentially, in Java the compilation phase
functions like a required unit test. One consequence is that there is a lot of "DLL hell" in
Java, whereby
changes to one library can cause cascading compilation breakage. This
has given the language a reputation for inflexibility and slow
development progress. It's often very hard to test out ideas without
changing a lot of code, and because of the compilation checks this means
that you might have to change code in 3rd-party libraries. We have seen
the Java community, and the
language itself, evolve to work around the strictness. For example,
annotations (and IoC generally) have allowed for "dynamic" formation and
testing of contracts at runtime, turning it from a design issue to a
configuration issue. Meta-systems like OSGi and Java 9 modules have separated contracts
from code itself. But it's all still complicated and difficult.
And
now consider Python: a "dynamic" object-oriented language that doesn't
have interfaces, allows for multiple inheritance, runtime
monkey-patching of classes, etc. This doesn't mean that strictness is
impossible, it just means that it's in the hands of the programmer: it's
up to you to decide what a contract means and when and how to enforce
it. You could provide a set of compliance tests or do runtime
checks, and there are indeed libraries that can help. This
inherent looseness has given Python a reputation for speed of
development at the expense of reliability. Breakage can often be
discovered only at runtime, and only in certain situations. There's no
compiler to provide that initial sanity check.
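To make "it's in the hands of the programmer" a bit more concrete, here is a minimal Python sketch of a contract that is only checked when the programmer chooses to check it. The names `Deployable`, `Server`, `Broken`, and `check_contract` are hypothetical and invented for this illustration; only `typing.Protocol` and `runtime_checkable` come from the standard library.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Deployable(Protocol):
    """Hypothetical contract: anything that has a deploy() method."""

    def deploy(self) -> str: ...


class Server:
    # Server never declares that it "implements" Deployable;
    # it simply happens to have the right method.
    def deploy(self) -> str:
        return "server deployed"


class Broken:
    # No deploy() method, so this class does not satisfy the contract.
    pass


def check_contract(obj: object) -> bool:
    # The contract is verified only when this line actually runs.
    # Nothing stops Broken from being defined, imported, or passed around.
    return isinstance(obj, Deployable)
```

Note that nothing complains about `Broken` until `check_contract` is actually called on an instance, which is the "only at runtime, and only in certain situations" problem in miniature.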
And
now consider Go, which is somewhere in between. Go is strictly typed,
in some ways more so than Java. It also relies strongly on interfaces
and signatures, which the compiler checks wherever a value is used as
an interface and which the runtime checks on type assertions (Go compiles
to native code, so there is no byte-code involved). But it does not
require design-time declarations of interface implementation. You can declare an interface anywhere
and anytime and even name it whatever you want. You can declare it
repeatedly in different codebases that do not have to know about each
other. And you don't even have to declare its implementation: you just
implement it. It's an ad-hoc contract, but the language does
provide a mechanism to enforce it. This balance
ends up being very practical, allowing the "Java advantage" to be used
when needed. We don't get the powerful code refactoring abilities of
Java, but the pros could very well outweigh the cons. And Go does
something else that's different from both Java and Python: it is not
object-oriented. Contracts are made *only* of interfaces and
signatures, not "classes", and there is indeed no inheritance built into
the language. Re-usability is in the hands of the programmer: you can
assemble types together to create something that looks like inheritance,
but Go won't dictate how. (Note: Haskell could also be used as an
example here, but it's harder to compare it to Java and Python.)
Before I go too far with this analogy, let's underscore that TOSCA is not a programming language, but a
modeling language. The contracts thus involve three parties: 1) the
modeler, who creates a set of types ("profile") to represent cloud features, 2) the architect, who assembles
those models into topologies that represent resources, and 3) the orchestrator,
which makes them a reality. The ideal constructs are, respectively: 1) types,
2) templates, and 3) instances. The complexity we are dealing with in
this discussion involves the fact that the architect is also a bit of a
modeler in that the architect might also need to declare types. This is
necessary because of TOSCA's design, specifically how node templates
must adhere to their node type. And because the architect's node type
must make use of the modeler's types, the rules of derivation become
critical to the architect's work. So I think the better we understand
the relationship between the modeler and architect roles, the more clearly
we'll understand how to formulate derivation rules for TOSCA.
I'll try to take a stab at this. Where do strict
contracts help us? They are *necessary* between the modeler and the
orchestrator, because models are only useful insofar as they can be
implemented. They are also necessary between the modeler and the
architect, because the architect's palette can only use realistic
models, which can only be used together in specific ways. Any strictness
that exceeds those rationales will be limiting, even crippling.
A few years ago I worked out
my own version of a "TOSCA 2.0" vision.
I wasn't intentionally channeling Go, but there is some convergence.
One argument I make there that could be relevant to this discussion is
how to think about contracts. Without making the sweeping changes I advocate there (I argue against an object-oriented
approach in TOSCA),
let's think of this in terms of TOSCA 1.X. Specifically, consider this:
capability types are much more important than node types. A TOSCA node
template must have a node type, and that node type can have any number
of capabilities. And it's the capabilities that are important in the
creation of the topology graph: requirements are the "plug" that connects
to the capability "socket". The capability is the
*actual*
contract here: the node type is really just a container for multiple
contracts. Relatedly, I have argued elsewhere that the best kinds of
TOSCA profiles don't have many node types, or indeed none at all.
Instead they define many capability types and associated relationship
types. Service template architects can then create their own node types
that are in effect "assemblages" of capabilities. They do not have to
worry about inheriting a base node type. They don't need to "deprecate" a
capability -- they simply don't include it in their custom node type.
This matches how real-world hardware and software resources work, in
that they can do some things and not other things. A computer might have
a GPU or might not. The same goes for a VM or a container. A network might have
IPv6 capabilities but no IPv4. A VNF might be able to handle both
routing and firewalls but not VPNs. A PNF or a CNF could have the same set of
abilities while lacking others. A database server
might support relational databases, or graph databases, or both. You
could handle these distinctions by creating various node types for all
combinations -- NetworkWithBothIPv6andIPv4, NetworkWithOnlyIPv6, etc. --
but obviously the capability construct is the right way to do this. The
node type is also the container for other kinds of contracts --
interfaces and notifications. But there, too, you can add whichever
interfaces and notifications are relevant to your custom type, so it's
still an "assemblage" in this sense. Another way to put this: the node
type is really just a syntactical convenience, rather than an
architectural reality. The reality is what the node is able to do
(capabilities), including in relation to other nodes. You could imagine
TOSCA without node types, just node templates where you declare
capabilities, interfaces, and notifications. It's more verbose, but
could be functionally the same.
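As a rough illustration of the "assemblage" idea, here is a small Python sketch in which a node template is described only by the capabilities it declares, with no node type at all. This is not TOSCA grammar: `NodeTemplate`, its `can` method, and the capability names are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NodeTemplate:
    """Hypothetical node template defined purely by what it can do."""

    name: str
    capabilities: frozenset = field(default_factory=frozenset)

    def can(self, capability: str) -> bool:
        # The "contract" question is asked of the capability set,
        # not of a node type hierarchy.
        return capability in self.capabilities


# Capability names are invented for illustration, not taken from any profile.
dual_stack = NodeTemplate("net-a", frozenset({"IPv4", "IPv6"}))
v6_only = NodeTemplate("net-b", frozenset({"IPv6"}))
```

Here there is no need for something like `NetworkWithOnlyIPv6`: the second network simply does not include the IPv4 capability.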
What
I am arguing for here is that strict derivation rules are important for
capability types, relationship types, and interface types, because they
are the critical contracts. But they are not useful for node types,
which is where architects need to do their work out of grammatical
necessity. I would like to see a complete overhaul of TOSCA grammar (see
my "vision"), but barring that, I think it's important that we allow
for node type derivation to be non-strict: to allow overriding anything
with anything, essentially treating it more like a re-usable assemblage ("copy-paste")
than a hierarchical type. Let architects do their work (à la Python) without fighting
the language (à la Java).
This will have
implications for other parts of TOSCA, because it would mean that
we can no longer rely on a node type as a contract. Examples: we shouldn't have
"valid_source_types" in capability types; we shouldn't have "node"
(which is a node type) in requirement definitions (although you could
specify a node template name in requirement assignments); and
substitution mapping should not specify a node type, but instead just
have mappings of capabilities, requirements, interfaces, etc. In tandem
with these changes, we should also beef up how we specify
requirements. The current "node_filter" works on properties. But what's
really important is which contracts are provided by the node. There
should be a way to say that the target node should have capabilities X,
Y, Z (or derivatives) and/or certain interfaces/notifications.
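To sketch what "capabilities X, Y, Z (or derivatives)" might mean in practice, here is a minimal Python example of matching a requirement against a node's capability types while honoring derivation. Everything in it is an assumption made for illustration: the `CAPABILITY_PARENTS` table, the capability names, and the `derives_from` and `satisfies` helpers are hypothetical and are not part of TOSCA or any orchestrator.

```python
# Hypothetical capability type hierarchy: child -> parent (None marks a root).
CAPABILITY_PARENTS = {
    "Routing": None,
    "BGPRouting": "Routing",
    "Firewall": None,
    "VPN": None,
}


def derives_from(cap_type: str, ancestor: str) -> bool:
    """Return True if cap_type is ancestor itself or is derived from it."""
    current = cap_type
    while current is not None:
        if current == ancestor:
            return True
        # Unknown types have no parent, so the walk ends there.
        current = CAPABILITY_PARENTS.get(current)
    return False


def satisfies(node_capabilities: set, required: set) -> bool:
    """Return True if every required capability type is provided by the
    node, either directly or through a derived capability type."""
    return all(
        any(derives_from(provided, needed) for provided in node_capabilities)
        for needed in required
    )
```

With this table, a node offering `BGPRouting` and `Firewall` would satisfy a requirement for `Routing` and `Firewall`, while a node offering only plain `Routing` would not satisfy a requirement for `BGPRouting`. The strict derivation rules live on the capability types, and the node type plays no part in the match.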
Sorry for being so verbose! I hope this will contribute to the discussion. What I'm trying to say is that before formulating derivation rules we should understand the basic contracts we want the language to handle.