Re: [xmile] SMILE and XMILE

Part 2.

Re: SMILE and XMILE

by Magne Myrtveit » Thu Aug 08, 2013 4:18 pm

A) On Bob's suggestion to stop using SMILE:

A language - such as a modelling language - is an abstract phenomenon, which requires a representation to become useful. The human-centric equations syntax and the corresponding graphical syntax are two representations of a system dynamics language, while a computer-centric XML-based document format is another representation.

This supports Bob's suggestion to work with just one name for the language.

My opinion is - for what it is worth - that the core system dynamics language (the standard) should be called something like SDL or SDCML (System Dynamics Community Model Language), while extensions to that language should be named by the respective vendors.

Since XMILE is de facto an isee language, given its elaborate definition of non-standard extensions to core SD, it should either be reserved for isee, or be scaled down to the bone, with isee asked to pick a new name for their extended version, for example The isee modelling language, or the iThink language.

In general, it makes no sense for a vendor's language and a community's standard language to be one and the same. One way to see why is to notice that the first extension or change this vendor makes to the language will introduce a conflict between the standard language and the vendor's language.

(The naming and scope of XMILE belong to a larger discussion of the potentially good and bad consequences of different ways to set the standard.)

B) To Bob's suggestions for layout of our discussion: I agree that we need to structure our exchange. But in my view, schemas (DTD and XSD) belong to the syntax level of the document representation of a language, i.e., quite far down in the language levels, as I described in http://www.systemdynamics.org/forum/viewtopic.php?f=21&t=350. A DTD can, if people learn how to read it, serve as a high-level definition of objects, object properties, and object relationships at the conceptual level of a model. However, it is better to use simple class diagrams or drawings at this level. (XSD is quite unreadable, and useful only at the final stages of specification, as a help for developers to create and test their implementation of the standard). Therefore, I think we can postpone schema definitions to quite late in the process.

Best regards from sunny Norway,
Magne

Magne Myrtveit: Posts: 51; Joined: Mon Jan 12, 2009 6:52 am

Re: SMILE and XMILE

by Magne Myrtveit » Fri Aug 09, 2013 3:42 am

Regarding naming of the SD standard core language, it is a pity that "J" is already taken by another computer language. It would have been great to honour the founder of the field by naming the language after him. However, "JF" too is a nice name for a language -- not to mention JFK.

Magne Myrtveit: Posts: 51; Joined: Mon Jan 12, 2009 6:52 am

Re: SMILE and XMILE

by Robert Eberlein » Fri Aug 09, 2013 8:44 am

Magne,

Thanks for the comments on this and your excellent writeup on data flow modeling frameworks.

I believe that you have ultimately argued in favor of a distinction between the name by which the computational language is called and the name for its encoding (not to mention the encoding for visual representations and control features). It would indeed have been cool to use a moniker like J for a common computational language, but perhaps something more mundane, such as SDCL for System Dynamics Computational Language, would work. I guess my biggest objection to SMILE is that it is always getting mixed up with XMILE.

Leaving naming aside for a moment, there is a more fundamental question of what is the clearest way of representing the underlying computational language. Dynamo and Vensim use the very traditional ordinary text representation with = for assignment as in:

A=B

In Dynamo that representation was exactly what there was, and users changed it by editing the text. In the modern SD languages, however, people rarely deal with such text. The XMILE encoding would be

<aux name="A">
<eqn>B</eqn>
</aux>

This is probably closer to what most users see (replacing the markup with a dialog of some sort). So the question I am struggling with is whether it is best to specify the entire language in a more traditional format, then specify the encoding which is necessarily a partially parsed version of that language, or specify the language in such a way that the encoding adds only tag formality and does not require any parsing relative to the specification.
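To make the relationship between the two notations concrete, here is a minimal Python sketch that reads the tagged fragment above and prints it back in the traditional assignment form. It assumes only the `aux`/`eqn` shape shown in the post; the function name `to_assignment` is made up for illustration, and a real XMILE file would carry far more structure than this.

```python
import xml.etree.ElementTree as ET

# The fragment from the post, written on one line.
FRAGMENT = '<aux name="A"><eqn>B</eqn></aux>'


def to_assignment(xml_text: str) -> str:
    """Render one tagged variable as traditional 'name=equation' text."""
    node = ET.fromstring(xml_text)
    name = node.get("name")
    # The right-hand side is treated as opaque text: no parsing happens here.
    eqn = node.findtext("eqn", default="").strip()
    return f"{name}={eqn}"


print(to_assignment(FRAGMENT))
```

Note that the name is already separated from the equation by the tags, so going from the encoding to the text form needs no parsing, while the reverse direction does.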

Originally I was leaning toward the former, which is not surprising given that this is exactly how Vensim works. Now, however, I am leaning somewhat more toward the latter, as it simplifies a number of issues. Two exemplars are subscripts and non-negative levels.

Using the traditional assignment representation when a variable appears with subscripts it is necessary to try to determine what those subscripts mean. Encountering "Population[1]" in an equation simply tells us that the variable "Population" has a single subscript, nothing about what the subscript is. If, however, the variable Population shows up with something like:

Level: Population
Subscripts: AgeGroup
Units: Person

and so on then "Population[1]" is perfectly well defined.
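As a rough illustration of why the declaration makes the reference well defined, the Python sketch below resolves a positional index against declared subscripts. The element names for AgeGroup (`Young`, `Adult`, `Old`) and the helper `resolve` are hypothetical, and treating `[1]` as a 1-based position is an assumption rather than something the post specifies.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Level:
    name: str
    subscripts: List[str]  # names of the declared dimensions, in order
    units: str


# Hypothetical element list; the post does not say what AgeGroup contains.
DIMENSIONS: Dict[str, List[str]] = {"AgeGroup": ["Young", "Adult", "Old"]}

population = Level(name="Population", subscripts=["AgeGroup"], units="Person")


def resolve(level: Level, indices: List[int]) -> List[Tuple[str, str]]:
    """Map 1-based positional indices to (dimension, element) pairs."""
    if len(indices) != len(level.subscripts):
        raise ValueError(f"{level.name} expects {len(level.subscripts)} subscript(s)")
    resolved = []
    for dim, i in zip(level.subscripts, indices):
        elements = DIMENSIONS[dim]
        if not 1 <= i <= len(elements):
            raise IndexError(f"index {i} is outside dimension {dim}")
        resolved.append((dim, elements[i - 1]))
    return resolved


print(resolve(population, [1]))
```

Without the `Subscripts` entry the same lookup has nothing to consult, which is the ambiguity described above.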

Similarly, for a non-negative level we might have something like

Level: WorkInProcess
Outflows: Rejects, Completions
Nonnegative: Completions, Rejects

and so on. (Two important comments. 1. I completely reject the use of nonnegative level constructs because they explicitly mask the feedback that must be in place to make them come true. 2. I do not think we should restrict Level formulations to only have explicitly named inputs and outputs as this is too constraining especially for people (like me) who never use SMOOTH/SMTH1).

I think the current computational description (the SMILE draft) should reflect these structures more clearly. I recognize that there is a bit of a chicken and egg problem here (that the XML definition depends on the spec which is created with the XML definition in mind). And that there are some programmatic overtones (what data structures are necessary to implement will partly drive the definitions). Still, I think we can achieve clarity and have the computational specification as readable by a typical modeler as it would be with the more traditional assignment notation.

I would very much like to hear the thoughts of others on this.

Robert Eberlein: Site Admin; Posts: 157; Joined: Sat Dec 27, 2008 8:09 pm

Re: SMILE and XMILE

by Magne Myrtveit » Sat Aug 10, 2013 4:47 pm

Hi Bob,

It would be great to have more people involved in this thread. Hopefully we will get company soon. Below are some comments to your recent post. (Page numbers refer to "A framework for dataflow models").

Naming

I guess my biggest objection to SMILE is that it is always getting mixed up with XMILE.

Agree. If used for the SD Core Language (SDCL) the X in XMILE is also misleading, since XML is a computer-centric representation of structured contents.

Equation format

So the question I am struggling with is whether it is best to specify the entire language in a more traditional format, then specify the encoding which is necessarily a partially parsed version of that language, or specify the language in such a way that the encoding adds only tag formality and does not require any parsing relative to the specification.

A language should be defined in terms of its abstract concepts, and represented in different ways, depending on the purpose or situation. Here are some examples:

Example 1: When listing model equations as part of a document (an article, a book, or similar), the conventional mathematical equation is the preferred form:

Code: name=definition

Example 2: Modern software typically provides access to individual object properties via widgets that are part of the GUI. When this is the case, the name will be edited in one widget, and the definition in another. (There is no need for the software to parse the equation to separate the name part from the definition part).

Code:
Name: <Enter name here>
Definition: <Enter definition here>

Example 3: Internally, the software represents a variable as an object, with multiple properties. When saving the object to file, a structured representation has many advantages over plain text. XML is one possible framework to use. In my definition of the dataflowml framework, I have specified that the definition property is plain text. The reason is that different vendors use significantly different syntax for the right-hand side of equations. Vendors can provide extra XML elements for the right-hand side in addition to the text. One possibility is to store the definition as mathml, and as a collection of other XML elements, the way XMILE does it. Following dataflowml, the XML representation for an equation is like this (page 38):

Code:
<m:Variable Id="..." Definition="...">
  <m:Names>
    <m:Name>...</m:Name>
  </m:Names>
  <m:Description/>
</m:Variable>

The XML format (like the example above) can obviously not be used for publishing equations for humans to read. We do need an equations syntax, that is clear.

Example 4: A DTD can be used to represent the structure of the XML representation of a model (see page 39). It is an "almost human-readable" format that can be used by programmers to define language structure. (I used it when writing the XML specification of Smia models. Smia documents still contain a complete DTD and XSD defining the document format. The DTD and XSD are auto-generated by Smia's code, so they are always up-to-date when a new Smia release is made).

Structure of expressions

For the text-based representation of equations, it would be nice if we could come up with a framework (a set of keywords) that can be recommended for use. As an example, here is what I suggest for stock definitions:

Code:
account = stock 1000USD
inflow interest
inflow deposits
outflow withdrawals

Subscripts

Encountering "Population[1]" in an equation simply tells us that the variable "Population" has a single subscript, nothing about what the subscript is.

The equation where Population is defined will contain the necessary information about the dimensions of Population. When the variable is used on the right-hand side of some equation, subscripts will be checked against the definition of the subscripted variable. This will be the case no matter how the variables are represented, assuming that the equations-based representation contains as much information as any structured representation that might be used instead. To be specific, let us look at your example:

Code:
Level: Population
Subscripts: AgeGroup
Units: Person

A single equation can capture exactly the same information, like this:

Code: Population = stock {AgeGroup | ... } as Person

Here I make use of my suggestion to use braces {...} for arrays and "as ..." for giving the type and/or unit of a variable. Note that "as Person" is needed only if the expression defining the variable does not implicitly determine the data type and unit of the variable.

A problem with representing the definition of a variable as, say, a structured text is that this would require a very elaborate structure definition if we are going to support many vendors' dialects (and extensions) of the grammar (syntax as well as semantics) of equations.

Non-negative stocks

I completely reject the use of nonnegative level constructs

Agree 100%

Flow expressions

I do not think we should restrict Level formulations to only have explicitly named inputs and outputs as this is too constraining

Agree 100%. In Smia, any mathematical expression can be used in the position of a flow. As mentioned on p 49 in my dataflow PDF, it is conceptually wrong to define a separate variable type for flows. It is up to each vendor, of course, to break concepts, but it should not be permitted in a standard for SD.

Best regards,
Magne

Magne Myrtveit: Posts: 51; Joined: Mon Jan 12, 2009 6:52 am

Re: SMILE and XMILE

by Robert Eberlein » Sun Aug 11, 2013 5:47 am

Hi Magne,

Just one comment on this. A pretty standard equation for population in Vensim is:

Population [gender, AG00]= INTEG (
+ births[gender]- deaths[gender,AG00] + net migrating[gender,AG00],
init pop[gender,AG00]) ~~|
Population[gender,AG01andOver]= INTEG (
-deaths[gender,AG01andOver] + net migrating[gender,AG01andOver],
init pop[gender,AG01andOver])
~ person
~ |

or, if you are not using a cohort control function

Population[gender,A0]= INTEG (
births[gender]-deaths[gender,A0]-aging[gender,A0],
initial population[gender,A0]) ~~|
Population[gender,AR1To99]= INTEG (
aging[gender,AR0To98]-deaths[gender,AR1To99]-aging[gender,AR1To99],
initial population[gender,AR1To99]) ~~|
Population[gender,A100P]= INTEG (
aging[gender,A99]-deaths[gender,A100P],
initial population[gender,A100P])
~ Person
~ |

In both cases, and others like this, the subscripts on population can only be determined by knowing that the subscript elements and subranges used are part of the subscript Age. In Vensim this works because no numbers are allowed and all subscript element names must be unique names. Without that (or tagging) the models would become indeterminate in some cases.
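The uniqueness requirement can be illustrated with a small Python sketch: if every name that may appear inside the brackets belongs to exactly one subscript, a plain lookup recovers the dimensions. The subscript definitions below (two gender elements, ages A0 to A99 plus A100P, and the two subranges) are assumptions modelled loosely on the second example, not the actual model, and this is not how Vensim itself is implemented.

```python
# Hypothetical subscript definitions, modelled loosely on the second example.
SUBSCRIPTS = {
    "gender": ["female", "male"],
    "Age": [f"A{i}" for i in range(100)] + ["A100P"],
}
SUBRANGES = {
    "AR1To99": ("Age", [f"A{i}" for i in range(1, 100)]),
    "AR0To98": ("Age", [f"A{i}" for i in range(0, 99)]),
}


def build_index(subscripts, subranges):
    """Map every name usable inside [...] to the subscript that owns it."""
    owner = {}
    for dim, elements in subscripts.items():
        for name in [dim] + elements:
            if name in owner:
                raise ValueError(f"ambiguous subscript name: {name}")
            owner[name] = dim
    for name, (dim, _elements) in subranges.items():
        if name in owner:
            raise ValueError(f"ambiguous subscript name: {name}")
        owner[name] = dim
    return owner


OWNER = build_index(SUBSCRIPTS, SUBRANGES)


def dimensions_of(indices):
    """Return the owning subscript for each bracketed name, in order."""
    return [OWNER[name] for name in indices]


print(dimensions_of(["gender", "AR1To99"]))
```

If two subscripts both contained an element called `1`, `build_index` would raise, which corresponds to the indeterminate case mentioned above.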

Robert Eberlein: Site Admin; Posts: 157; Joined: Sat Dec 27, 2008 8:09 pm

Re: SMILE and XMILE

by Magne Myrtveit » Mon Aug 12, 2013 4:32 am

Hi Bob,

First, allow me to correct something in my previous posting. My Example 4 does not actually belong together with the other three examples, as a DTD is a representation of a language, rather than a model. (DTD is similar to BNF in many ways). Example 4 can be replaced by stock-and-flow or causal-loop representations of models.

Also, I don't know if the point I meant to make about equations language disappeared inside my lengthy text. Restating it here: Model equations are needed when writing about models (in forums like this, for example). Equations should be complete in the sense that it must be possible for humans, and computers as well, to reconstruct and evaluate the model from its equations. It is up to each vendor to determine if the equations syntax will be used by their software also when writing models to file (the MDL (Vensim) way), or if model files use a structured representation of models (the SIP (Powersim), DPA (Dynaplan), Excel, and XMILE way). Smia has a File | Save as feature, that allows the user to store a model in binary format, XML format, or equations format (plain text or HTML).

To your statement in your most recent post:

The subscripts on population can only be determined by knowing that the subscript elements and subranges used are part of the subscript Age. In Vensim this works because no numbers are allowed and all subscript element names must be unique names. Without that (or tagging) the models would become indeterminate in some cases.

There is another solution to this problem, which is to make the full dimensions available at the right-hand side of Population's equation, and to merge multiple equations into one, using the approach shown in Figure 66, page 48, in "A framework for dataflow models". This also opens up for using any data type for dimensions; not only lists of named elements.

Let me demonstrate with the basis in your first code example, which looks like this:

Code:
Population [gender, AG00]= INTEG (
+ births[gender]- deaths[gender,AG00] + net migrating[gender,AG00],
init pop[gender,AG00]) ~~|
Population[gender,AG01andOver]= INTEG (
-deaths[gender,AG01andOver] + net migrating[gender,AG01andOver],
init pop[gender,AG01andOver])
~ person
~ |

A single-equation version would look something like this:

Code:
Population = stock {gender, Age | 'init pop' }
inflow { gender, AG00 => births(gender) }
outflow deaths
inflow 'net migrating'

Line-by-line explanation:

Population is defined as a stock with dimensions gender and Age and initial value init pop.
Births is defined as an inflow, capturing the information that Vensim puts into multiple equations for Population.
Deaths is defined as an outflow. Since all elements of Pop can use the same expression, no array syntax is needed here.
Net migration is defined as an inflow.

The tricks are located in lines 1 and 2, which define full dimensions (line 1) and multiple expressions (line 2).

Smia has many ways to express equations like the one above. One alternative syntax for line 2 makes use of index variables and dimension operators (==) and (:=):

Code:
Population = stock {g==gender, a==Age | 'init pop' }
inflow { g==*, a:=* | births(gender), 0 }
outflow deaths
inflow 'net migrating'

The operator == says that all elements of the dimension share the same expression, while := says that the first element has one expression, and the rest another one.
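One way to read the two operators is as rules for spreading expressions over the elements of a dimension. The Python sketch below follows that reading only; it is not Smia's implementation, the function `expand` is made up for illustration, and the three Age element names are hypothetical.

```python
def expand(elements, operator, expressions):
    """Assign an expression (kept as text) to each element of one dimension.

    '==' : every element shares expressions[0].
    ':=' : the first element gets expressions[0], the rest get expressions[1].
    """
    if operator == "==":
        return {e: expressions[0] for e in elements}
    if operator == ":=":
        first, rest = expressions
        return {e: (first if i == 0 else rest) for i, e in enumerate(elements)}
    raise ValueError(f"unknown dimension operator: {operator}")


# Hypothetical element names for the Age dimension.
AGE = ["AG00", "AG01", "AG02"]

births_by_age = expand(AGE, ":=", ["births(gender)", "0"])
deaths_by_age = expand(AGE, "==", ["deaths"])

for element in AGE:
    print(element, births_by_age[element], deaths_by_age[element])
```

Under this reading, only the first age group receives births and every later group receives 0, which is the information the two separate Vensim equations carry.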

Best regards,
Magne

Magne Myrtveit: Posts: 51; Joined: Mon Jan 12, 2009 6:52 am

Re: SMILE and XMILE

by Robert Eberlein » Mon Aug 12, 2013 7:31 am

Hi Magne,

I am with you on equations. I am just thinking about how, in the specifications we write, we distinguish these from the encoding. I will ask Ignacio Martinez to jump in with comments as well, as I know he has spent some time thinking about this.

You picked the easy subscript example, and I unfortunately can't see how the more complex one fits this pattern. If you could reconstruct the following it would be helpful to me:

WIP[Stage1]= INTEG (ProductionStarts-WIP[Stage1]/ProdTime[Stage1], 0) ~~|
WIP[StageMid]= INTEG (WIP[StagePrior]/ProdTime[StagePrior]-WIP[StageMid]/ProdTime[StageMid],0) ~~|
WIP[Stage22]= INTEG (WIP[Stage21]/ProdTime[Stage21]-shipments,0) ~ Widget ~|

In this case StageMid is Stage2...Stage21 and StagePrior is Stage1...Stage20, with a map to StageMid to make the second equation sensible.
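For readers less familiar with subrange mapping, here is a small Python sketch of the pairing this relies on: each element of StageMid is matched positionally with one element of StagePrior. The dictionary `PRIOR_OF` and the helper `inflow_source` are illustrative names only, and the positional pairing is an assumption about how the map is defined.

```python
STAGES = [f"Stage{i}" for i in range(1, 23)]  # Stage1 .. Stage22
STAGE_MID = STAGES[1:21]                      # Stage2 .. Stage21
STAGE_PRIOR = STAGES[0:20]                    # Stage1 .. Stage20

# Positional map: Stage2 -> Stage1, Stage3 -> Stage2, ..., Stage21 -> Stage20.
PRIOR_OF = dict(zip(STAGE_MID, STAGE_PRIOR))


def inflow_source(stage):
    """Return the text of the inflow term feeding WIP[stage]."""
    if stage == "Stage1":
        return "ProductionStarts"
    if stage == "Stage22":
        prior = "Stage21"  # the last stage is written out explicitly
    else:
        prior = PRIOR_OF[stage]
    return f"WIP[{prior}]/ProdTime[{prior}]"


for stage in ("Stage1", "Stage2", "Stage21", "Stage22"):
    print(stage, "<-", inflow_source(stage))
```

The point of the example is that the middle equation reads from a different element than the one it writes to, which is what a single-equation form would also need to express.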

Bob Eberlein

Robert Eberlein: Site Admin; Posts: 157; Joined: Sat Dec 27, 2008 8:09 pm

Re: SMILE and XMILE

by Travis Franck » Tue Aug 13, 2013 8:32 am

Apologies in advance if I missed a subtlety in the discussion -- it appears you both have thought about all this for a long time. But you did ask for company! ;)

In my head, there is the file format and what is printed in journals (i.e., the human readable form). The discussion appears to be whether the human-readable (say, SMILE) needs to be defined separate from the file format (XMILE) as a spec/standard.

Isn't the human-readable form pretty much defined already by standard mathematics? Right-hand side equals left-hand side? I guess I don't know how else you would write human-readable equations in a journal. I know there are many fields, e.g., economics, that code a model in a programming language, but then use standard math formatting when publishing and discussing. I did the same in my SD work.

So, this would lead me to think that defining the XML format is most important. That is, we might be able to get by having XMILE without SMILE.

And when it comes to model documentation, people would archive the XMILE, use XMILE in their dissertation appendix, use 'math' formatting in their papers and dissertation discussion, and use XMILE-to-HTML converters for cool documentation tools like the ones Ignacio makes.

I write this acknowledging that I didn't fully follow the above thread about subscripts, but maybe this is a moment to pause and bring more of us up to speed.

~Travis

Travis Franck: Posts: 21; Joined: Sun Jan 11, 2009 7:48 pm

Re: SMILE and XMILE

by Thomas Fiddaman » Tue Aug 13, 2013 2:18 pm

My possibly naive impression was that the purpose of SMILE is to define the constructs that constitute the equations and ancillary bits of a model, with XMILE as a file format for storing SMILE. But on reflection there seems to be quite a bit of overlap - both SMILE and XMILE define stocks and flows, for example, whereas an alternate approach would be to have a SMILE-like spec for what constitutes an equation, and then an XMILE spec for storing an equation that's agnostic about what kind of equation it is.

As long as XMILE defines an <eqn> field without caring what's in it, it seems that SMILE is needed in order to specify the operators, functions, naming etc. needed to interpret the RHS of the <eqn>.

Regarding naming, I don't mind the current options, but in the spirit of Travis' comment, we could use .J or .JK, with a nod to the Dynamo integration syntax.
