Subject: RE: [ebxml-msg] Multi-Hop Design Principles
I thought the 5 design principles I mentioned below were clear enough, but it looks like each principle may need a more detailed explanation to make sure it is completely clear.
For example, the last principle (principle #5) states that going from a peer-to-peer architecture to a Multi-Hop one should not introduce a change in the implementation of the two MSHs. I thought this was clear enough, but Dales said something that made me think that people do not necessarily understand things the way I understand them. During the conf call, Dales mentioned that he does not want the endpoint address of the PMode to be changed in order for a peer-to-peer exchange to be changed to a Multi-Hop one. What I understood from Dales' comment (I was on my cell phone outside and it was very hard for me to follow the discussion on the conf call) is that Dales wants the sender MSH to be unaware of whether it is doing peer-to-peer or multi-hop, since the PMode the sender MSH is using has not been changed to accommodate the change in the endpoint address of the ultimate receiver MSH. At first this sounds like an additional requirement or principle, but when you think about it, it is not.
First of all, let us be clear about what principle #5 means. What is a “change in the implementation”? A minimal meaning for this phrase is that the source code should not have to be re-compiled. If the same compiled code works for both peer-to-peer and multi-hop, then the implementation has not been changed. If, however, a change occurred in some configuration file, this is normally not considered “an implementation change”. But from Dales' comment (at least from how I understood it), Dales is pushing this principle a little further by saying that even a change in configuration files is not allowed (like changing the endpoint address of the PMode to accommodate the multi-hop case). This is not really a concern for me because it boils down to how things are deployed (so the answer to this question is really at a layer below, and the spec should not be concerned with such details).
For example, in one deployment scenario, the person who owns the sender MSH may go and buy an extension module (a module that knows how to work in a multi-hop topology) and deploy that module in the same stack in which the sender MSH lives. In this deployment scenario, the sender MSH code and its configuration files have not been altered. The sender MSH may not even be aware that an extension module is deployed beside it in the same stack. The sender MSH will always talk the same way, whether it is in a peer-to-peer or a multi-hop topology. This deployment case addresses Dales' concern that the PMode is not changed. Other people may argue that deploying an extension could be considered a “change in the MSH implementation” even though the binaries and configuration files of the sender MSH did not change. But such an argument is not valid. Whether you like it or not, someone has to do the job of switching from peer-to-peer to multi-hop.
If the sender MSH does not want to change any parameter in its PMode configuration, then a separate extension module would need to be deployed. There is no solution that both leaves the PMode data unchanged and also does not require the deployment of an extension module.
I have not yet had time to examine your proposal in detail and convince you that it cannot be a solution. But I will if you really want my feedback.
When we met in the F2F meeting in September, there were two possible paths for Intermediaries: (a) Intermediaries MUST be 100% transparent and do only routing, (b) Intermediaries may participate in the processing of routed messages, including reading their headers, modifying them, even consuming some of the payloads, and possibly inserting other payloads, etc… Initially, I was for option (b) because I always like to solve problems in their most general case (the most complex case). But I was quickly convinced by the TC during the F2F meeting that it would be too complex to consider case (b). In fact, case (b) is the equivalent of a Master's thesis, and it is just not worth the effort to solve such a problem in a spec. Case (a), however, is really extremely easy to solve (it is just an undergraduate problem and any good computer-science student can easily solve it).
To make case (a) clear, let me lay down the requirements (or mathematical axioms) that should be respected in solving problem (a). There are 5 axioms which are the following:
1. Content-Transparency: The Intermediaries MUST NOT modify a message, except possibly to add a new header, and only if doing so would not break signatures. If a solution can avoid the addition of new headers, such a solution would be even better.
2. Processing-Transparency: The Intermediaries MUST NOT do any processing of the message except routing. This means that ebMS headers, Reliability headers and Security headers should not be processed or even understood by the Intermediaries. A simple logical consequence of this principle is that the Intermediaries CANNOT rely on the contents of the ebMS headers, or reliability headers to help them in routing decision making. The ebMS headers and/or reliability headers may as well be all encrypted for confidentiality purposes, and the Intermediaries are not supposed to read them (and cannot read them).
3. All Messages Are Equal: The routing of messages across intermediaries MUST be agnostic about the type of message (ebMS message, ack message, reliable message, create-sequence signal, etc...) being routed, as long as the routing parameters (the parameters based on which the routing decisions are made) are present in the message. One simple logical consequence of this principle is that, for example, piggy-backing a signal on another message just to solve the routing issues MUST NOT be allowed as a hack/solution. All types of messages are equal in terms of routing. Also, as a consequence of principle #2, the routing parameters cannot be within ebMS headers and/or reliability headers.
4. No deployment of PModes across intermediaries is necessary. The Intermediaries SHOULD NOT have to know which PModes are being used in order to make routing decisions, such as whether to keep a connection open.
5. A peer-to-peer (an MSH talking directly to another MSH) architecture could be changed to a Multi-Hop architecture without introducing a change in the implementation of the MSHs.
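To make principles 1 to 3 concrete, here is a minimal sketch of a transparent intermediary. It is only an illustration, not a proposal from this thread: the `Envelope` type, the `routing_key` field and the routing-table layout are all hypothetical names, and the sketch simply assumes that the routing parameter travels outside the ebMS and reliability headers.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Envelope:
    """Hypothetical view of a message as seen by an intermediary."""
    routing_key: str  # assumed routing parameter, carried outside ebMS/reliability headers
    payload: bytes    # opaque to the intermediary; may be signed and/or encrypted


class Intermediary:
    """Routes on the routing key only and never inspects the payload."""

    def __init__(self, routes: Dict[str, str]) -> None:
        # routing_key -> address of the next hop (illustrative layout)
        self._routes = dict(routes)

    def forward(self, envelope: Envelope) -> Tuple[str, Envelope]:
        # Processing-transparency: only the routing key is read.
        next_hop = self._routes[envelope.routing_key]
        # Content-transparency: the very same envelope is passed on unmodified.
        return next_hop, envelope


hop = Intermediary({"partner-b": "http://hop2.example/msh"})
user_msg = Envelope("partner-b", b"<encrypted ebMS user message>")
ack = Envelope("partner-b", b"<acknowledgement signal>")
```

Because `forward` never looks at `payload`, a user message and an acknowledgement with the same routing key take the same path, which is what "all messages are equal" asks for.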
I do have a solution for problem (a), but I am willing to throw my solution in the trash and support your solution provided your solution respects the 5 principles mentioned above. So if you want to provide a solution, please keep in mind the 5 principles above when doing so, and you will have my full support.
In this scenario each leg in the communication between two endpoints is made reliable independently of the others. The end-to-end reliability is secured by relaying reliability signals between the legs (so a better name for this scenario would be relayed reliable messaging).
For the intermediaries, relaying means that they will copy each incoming reliability signal to an outgoing sequence to the next hop, which might be another intermediary or the endpoint.
- WS-RM is used to support the RM functionality
- Each intermediary knows the next hop for each ebMS message
- Only core V3 functionality is required at the endpoints
The communication using intermediaries involves three components: a WS-RM enabled MSH at each endpoint, and one or more intermediaries, which are a special kind of MSH. The difference between an endpoint and an intermediary is the way they handle the WS-RM signals: the endpoints handle them according to the core specification and the intermediaries as described here.
As the intermediary has to relay the reliability signals between two legs in the communication, it must keep a record of the legs, their accompanying sequences, and how those legs/sequences are related. This record, the SequenceTable, can be modeled as a collection in which each entry contains the source MSH, the sequence related to the source, the destination MSH and the sequence related to the destination. Both source and destination can be either an endpoint or another intermediary.
Upon arrival of a CreateSequence signal, an entry in the SequenceTable is created. Because the CreateSequence signal has no information about the ultimate destination, only the source and its sequence can be set. To set the other two fields, the destination has to be known and a sequence to it established. The intermediary will know this information after the first ebMS user message is sent on the newly created sequence. Based on that message it will be able to determine the next hop and set up a sequence to it by sending a CreateSequence. This way the entry is completed and all subsequent reliability signals can be relayed to the correct destination.
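The SequenceTable bookkeeping described above could be sketched roughly as follows. This is only one possible reading of the description: the class, method and field names are invented for illustration, and the next hop and the outgoing sequence identifier are passed in as plain values, since how the intermediary determines the next hop and performs its own CreateSequence exchange is outside the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class SequenceEntry:
    source: str                                  # source MSH (endpoint or intermediary)
    source_sequence: str                         # sequence on the incoming leg
    destination: Optional[str] = None            # next hop, unknown at CreateSequence time
    destination_sequence: Optional[str] = None   # sequence on the outgoing leg

    @property
    def complete(self) -> bool:
        return self.destination is not None and self.destination_sequence is not None


class SequenceTable:
    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], SequenceEntry] = {}

    def on_create_sequence(self, source: str, sequence_id: str) -> SequenceEntry:
        # Only the source half is known when CreateSequence arrives.
        entry = SequenceEntry(source, sequence_id)
        self._entries[(source, sequence_id)] = entry
        return entry

    def on_first_user_message(self, source: str, sequence_id: str,
                              next_hop: str, outgoing_sequence_id: str) -> SequenceEntry:
        # Assumption: the caller has already derived next_hop from the user
        # message and established outgoing_sequence_id with that hop.
        entry = self._entries[(source, sequence_id)]
        if not entry.complete:
            entry.destination = next_hop
            entry.destination_sequence = outgoing_sequence_id
        return entry

    def relay_target(self, source: str, sequence_id: str) -> Tuple[str, str]:
        # Where a reliability signal arriving on the incoming leg is copied to.
        entry = self._entries[(source, sequence_id)]
        if not entry.complete:
            raise LookupError("destination leg not established yet")
        return entry.destination, entry.destination_sequence


table = SequenceTable()
table.on_create_sequence("msh-a", "seq-in-1")
pending = table._entries[("msh-a", "seq-in-1")].complete
table.on_first_user_message("msh-a", "seq-in-1", "msh-b", "seq-out-7")
target = table.relay_target("msh-a", "seq-in-1")
```

The sketch only covers signals travelling from source to destination. Relaying signals in the opposite direction would need a second lookup keyed on the destination and its sequence.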
Attachment components.png shows the components playing a role in the relayed ack scenario.
The attachment example_seq_diagram.png shows a sequence diagram for this scenario with one intermediary. In the messages.zip archive you'll find examples of messages exchanged between endpoints and intermediaries. No security is used in these example messages.
+ Flexible because reliability is realised per leg;
+ No additional functionality required on endpoint other than core specification;
+ All intermediaries have the functionality;
- Intermediaries modify SOAP headers;
- No end-to-end security possible on reliability signals;