[ebxml-cppa] Discussions with XMLdsig about Messaging


I have included some of our discussions with Donald Eastlake -- XMLdsig Chair -- concerning implementation issues with ebXML Messaging. I probably need to put these in perspective...

One concern of the implementors is the way SOAP removes and adds elements as messages pass through SOAP nodes. When Messaging created the dSig transforms, they were careful to exclude anything that might fit this criterion, i.e. any element with an actor of next. However, the concern has been expressed that adding and removing XML elements will disturb the white space between them. If even a single space is added or a line terminator (CRLF) is removed, the signature becomes invalid. If intermediaries are extremely careful, they can add and remove elements without causing a problem, but relying on that level of care is fragile.
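To make the fragility concrete, here is a minimal Python sketch (not from the original discussion). It uses the standard library's `xml.etree.ElementTree.canonicalize`, which implements C14N 2.0 rather than the Canonical XML 1.0 algorithm named in the signatures discussed in this thread, so treat it as an analogy only. The element names are made up for illustration.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

def digest(xml_text: str) -> str:
    """SHA-1 over the canonical form, roughly as a signer would compute it."""
    canonical = canonicalize(xml_text)  # whitespace text nodes are kept by default
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()

# Hypothetical header as signed, and the same header after an intermediary
# re-serialized it with line breaks and indentation.
as_signed = "<Header><MessageId>42</MessageId><To>B</To></Header>"
as_relayed = "<Header>\n  <MessageId>42</MessageId>\n  <To>B</To>\n</Header>"

# Canonicalization absorbs attribute order, quoting and empty-tag syntax...
attrs_equal = digest("<a b='1' c='2'/>") == digest('<a c="2" b="1"></a>')
# ...but not white space between elements.
layout_equal = digest(as_signed) == digest(as_relayed)

print(attrs_equal, layout_equal)
```

The second comparison is the case the implementors are worried about: nothing a reader would call content has changed, yet the digests differ.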

A second concern has to do with the way XMLdsig signs payloads. For a payload, the digest is calculated over the payload body but not over its MIME headers. The dSig group acknowledged this hole. The Messaging group decided it was not a significant enough security hole to worry about, but a note must be added to the spec.
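A small Python sketch of the hole, using only the standard library. The `payload_digest` helper and the sample part are hypothetical; the point is only that a digest taken over the body cannot notice a changed header.

```python
import hashlib
from email.message import Message

def payload_digest(part: Message) -> str:
    """SHA-1 over the body only, mirroring a digest that skips MIME headers."""
    return hashlib.sha1(part.get_payload().encode("utf-8")).hexdigest()

part = Message()
part["Content-Type"] = "application/xml"
part["Content-ID"] = "<payload-1>"  # hypothetical identifier
part.set_payload("<Order><Qty>10</Qty></Order>")

before = payload_digest(part)

# Someone rewrites a header in transit; the body itself is untouched.
part.replace_header("Content-Type", "text/plain")
after = payload_digest(part)

print(before == after)  # the header change is invisible to the digest
```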

In response to these concerns, we asked one of our team members, who knows Mr. Eastlake, to discuss them with him and report back. The attached correspondence is part of that response.

Here is Donald Eastlake's response. At the bottom he suggests posting a query to the w3c-ietf-xmldsig@w3.org mailing list about the issue. I could do this, but I thought it might be better if an implementor did it, perhaps Sanjay. Direct experience counts for a lot when dealing with these kinds of issues.

The mailing list is open, so anyone can subscribe before sending a query. To subscribe, send a message with the word "subscribe" in the body to: w3c-ietf-xmldsig-request@w3.org

Jim

---------- Forwarded message ----------
Date: Wed, 19 Dec 2001 12:27:40 -0500
From: Donald E. Eastlake 3rd <dee3@torque.pothole.com>
To: James M Galvin <galvin@eListX.com>
Cc: dee3@torque.pothole.com
Subject: Re: whitespace in ds:signedInfo

Jim,

Sorry for the delay in response...

You have indeed hit the nail on the head... You need to stop changes to SignedInfo or have a smarter CanonicalizationMethod (CM). But there are limits to how smart it can be.

At first glance, it seems that, since XMLDSIG defines SignedInfo, it should know all about it and be able to just specify a fixed way to canonicalize, maybe a way that removed "insignificant" white space. And it could remove some insignificant white space. But all this stuff is a bit more subtle than it looks. In my mind, the recommended CM has already changed from Canonical XML to Exclusive XML Canonicalization after lots of problems cropped up with namespaces. If you went with white space stripping, things might work better until you got some case where, maybe, there is mixed content in some parameter to some Transform algorithm or other case where such white space turns out to be significant!

White space was certainly discussed in the WG but you can't generally tell if it's "significant" or not. Even if there is a DTD and validating parser and element content is specified so white space, for example, between an end tag and a start tag is defined as "insignificant", the XML spec says it is still supposed to be given to the application and the application could therefore still act on it, even though it is labeled "insignificant". (I suppose they were thinking the application might want to preserve it to preserve formatting but there is no restriction to that use.) The only thing the WG could figure that didn't descend into a rats nest of complexity was to retain and serialize all such white space, meaning that it has to be immutable to avoid breaking signatures.

Donald

PS: It would be perfectly reasonable to post a query about this to the XMLDSIG WG list. Or I could post a summary, either identifying your group or not as you wish...
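The point that white space between tags is still handed to the application can be seen with any DOM parser. A minimal Python sketch follows, using the standard library's `xml.dom.minidom`, which is non-validating; the tiny sample document is made up.

```python
from xml.dom import minidom

doc = minidom.parseString("<SignedInfo>\n  <Reference/>\n</SignedInfo>")
children = doc.documentElement.childNodes

# List what the parser delivered: the indentation shows up as text nodes.
for node in children:
    if node.nodeType == node.TEXT_NODE:
        print("text   ", repr(node.data))
    else:
        print("element", node.tagName)

whitespace_nodes = [n for n in children
                    if n.nodeType == n.TEXT_NODE and n.data.strip() == ""]
print(len(children), "children,", len(whitespace_nodes), "of them white space only")
```

Whether the application then ignores those text nodes is up to the application, which is why the WG could not treat them as safely removable.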

I thought everyone should see this reply from Donald Eastlake regarding whitespace.

Jim

---------- Forwarded message ----------
Date: Mon, 07 Jan 2002 00:03:50 -0500
From: Donald E. Eastlake 3rd <dee3@torque.pothole.com>
To: "Cherian, Sanjay" <Sanjay_Cherian@stercomm.com>
Cc: "'w3c-ietf-xmldsig@w3.org'" <w3c-ietf-xmldsig@w3.org>, "'galvin@eListX.com'" <galvin@eListX.com>, "'david@drummondgroup.com'" <david@drummondgroup.com>, "Damodaran, Suresh" <Suresh_Damodaran@stercomm.com>
Subject: Re: About whitespace in ds:SignedInfo

Hi,

There are several questions here.

On Transforming SignedInfo:

We deliberately made a general transform chain unavailable on SignedInfo, at least among the mandatory or recommended to implement algorithms, because it is a general Artificial Intelligence problem to figure out if such transforms are safe. Note that they could completely screw around with all your References, Signature Algorithm, and themselves. So they can trivially arrange for the post-transform version signature to always succeed or always fail and to look generally secure. So, for any application allowing Transforms over SignedInfo to be secure, you probably need a complete description of all allowed sets of Transforms... Anyway, to discourage this, we permitted only CanonicalizationMethod. But, of course, it is an algorithm so it can actually do anything. (I.e., you could define a CanonicalizationMethod that took a Transforms as an explicit parameter.)

On WhiteSpace:

As I see it, if intermediaries are going to generally reformat things to be pretty, then it is kind of hopeless to try to be secure. I believe there are actually three types of white space.

White space inside tags is truly insignificant. It isn't even handed to applications.

Then there is white space in the content of something with element only content that appears between content elements or before the first or after the last content element. This is defined by the XML spec as "insignificant", although only a validating parser can tell that. Nevertheless, such white space is required to be given to the application. Since the application can do anything with it, that it is flagged as "insignificant" has no security meaning. And if you have a non-validating parser or if the element has mixed content, then it doesn't even have this meaningless "insignificant" flag affixed to it.

Finally, there is actual content white space, which can also be changed if some intermediate node pretty printed <a>foo bar</a> as <a> foo bar </a> for example.

Your suggestion:

If you guys want to define a canonicalization function that deletes all pure white space text nodes and make support of it required for your application, you can do so. But I'm not sure of your logic in doing so but not dropping leading and trailing white space and/or changing internal runs of white space to one space or something else canonical. And I don't see how this can be safe and secure. You may know the schema for SignedInfo but it is harder to know the schema for every parameter of every Transform someone might use.

Donald

From: "Cherian, Sanjay" <Sanjay_Cherian@stercomm.com>
Message-ID: <40AC2C8FB855D411AE0200D0B7458B2B04D636BD@scidalmsg01.csg.stercomm.com>
Date: Thu, 20 Dec 2001 10:32:54 -0600

>Hi,
>
>This could likely be a problem that has been debated a lot in the past but I would appreciate if someone explained (or pointed to a previous explanation on the mailing list or elsewhere) what the current thinking is regarding this.
>
>The ebXML Messaging Services Specification ( http://www.ebxml.org/specs/ebMS.pdf ) uses XMLDSIG to sign portions of the ebXML message (an XML structure based on SOAP) and the ds:Signature element is embedded into the signed ebXML message.
>
>The signed ebXML message is subject to modification, passing through a sequence of intermediary SOAP and ebXML processors before reaching the ultimate recipient, which also validates the signature. One such modification that is difficult to predict is change in whitespace (indentation, CRLF characters) in the ebXML message. The ds:SignedInfo element is itself subject to such modification (being part of the ebXML message).
>
>A proposed modification to the ebXML Messaging Specification suggests this XSL transform in the ds:Reference element:
>
>  <Transform Algorithm="http://www.w3.org/TR/1999/REC-xslt-19991116">
>    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>      <xsl:strip-space elements='*'/>
>      <xsl:template match='node()|@*'>
>        <xsl:copy>
>          <xsl:apply-templates select='@*'/>
>          <xsl:apply-templates/>
>        </xsl:copy>
>      </xsl:template>
>    </xsl:stylesheet>
>  </Transform>
>
>With this transform, the signature has been made robust to changes in trivial whitespace in the ebXML message but outside the embedded ds:SignedInfo elements. However, because the ds:SignedInfo elements are not processed by any XSL transform (but by the algorithm specified in ds:CanonicalizationMethod, for which we are using http://www.w3.org/TR/2001/REC-xml-c14n-20010315 ), changes in trivial whitespace in ds:SignedInfo invalidate the signature.
>
>To give some context on how ebXML Messaging uses XMLDSIG, I have included a sample signature element (taken from an ebXML example in the ebXML specification), later in this email.
>
>The first guess would be that by using a 'smarter' canonicalization algorithm for the ds:SignedInfo element, the ds:SignedInfo element can also be made robust to changes in trivial whitespace. Rather than use the schema-unaware canonicalization algorithm ( http://www.w3.org/TR/2001/REC-xml-c14n-20010315 ), it would seem that a different canonicalization algorithm could be written which, being aware of the XMLDSIG schema, could do a better job of eliminating trivial whitespace.
>
>This 'smarter' CanonicalizationMethod algorithm might certainly not be smart enough to always know whether a particular whitespace character is significant or not. However, if such an algorithm was created that simply removed all whitespace where an element has textual content consisting entirely of whitespace (the kind xsl:strip-space would remove), people that don't have very sophisticated Transform elements (that could get mangled) would be able to benefit from it.
>
>In other words, a 'smarter' XMLDSIG schema-aware CanonicalizationMethod algorithm could be published with the caveat that changes in whitespace in descendants of ds:SignedInfo would become irrelevant. In situations where this might be a problem, the earlier schema-unaware canonicalization algorithm could be used instead.
>
>Just as an example, the stylesheet presented earlier, which is a parameter to the ds:Transform element, does not contain any whitespace that needs to be preserved and so a CanonicalizationMethod algorithm that removed all trivial whitespace would not hurt us but actually benefit us a lot.
>
>Here is a typical Signature element, as it is used by the ebXML Messaging Specification.
>
>  <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
>    <SignedInfo>
>      <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315" />
>      <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#dsa-sha1"/>
>      <Reference URI="">
>        <Transforms>
>          <Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
>          <Transform Algorithm="http://www.w3.org/TR/1999/REC-xpath-19991116">
>            <XPath>
>              not(ancestor-or-self::()[@SOAP:actor=
>                "urn:oasis:names:tc:ebxml-msg:actor:nextMSH"]
>              | ancestor-or-self::()[@SOAP:actor=
>                "http://schemas.xmlsoap.org/soap/actor/next"])
>            </XPath>
>          </Transform>
>          <Transform Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315" />
>        </Transforms>
>        <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
>        <DigestValue>...</DigestValue>
>      </Reference>
>      <Reference URI="cid://blahblahblah/">
>        <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
>        <DigestValue>...</DigestValue>
>      </Reference>
>    </SignedInfo>
>    <SignatureValue>...</SignatureValue>
>    <KeyInfo>...</KeyInfo>
>  </Signature>
>
>I was just trying to convey my thoughts on this matter. Thanks for taking the time to read this email.
>
>Regards,
>
>Sanjay J. Cherian
>Sterling Commerce
>Irving, TX
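For readers who want to experiment with the proposal and the objections to it, here is a rough Python sketch of a normalizer that deletes text nodes consisting only of white space, which is roughly what xsl:strip-space does. It is not a real CanonicalizationMethod, and the function names and sample documents are invented for illustration.

```python
from xml.dom import minidom

def strip_whitespace_only_text(node):
    """Recursively delete text nodes that consist solely of white space."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE:
            if child.data.strip() == "":
                node.removeChild(child)
        else:
            strip_whitespace_only_text(child)

def normalized(xml_text: str) -> str:
    """Parse, strip white-space-only text nodes, and serialize the result."""
    doc = minidom.parseString(xml_text)
    strip_whitespace_only_text(doc.documentElement)
    return doc.documentElement.toxml()

compact = "<a><b>foo bar</b></a>"
indented = "<a>\n  <b>foo bar</b>\n</a>"   # re-indented by an intermediary
padded = "<a><b> foo bar </b></a>"         # content white space was changed
mixed = "<p>one <b>two</b> <i>three</i></p>"

print(normalized(compact) == normalized(indented))  # indentation is absorbed
print(normalized(compact) == normalized(padded))    # padding is not
print(normalized(mixed))                            # the space between b and i is lost
```

The first comparison shows the benefit Sanjay is after. The other two show the objections raised in the reply: padding inside content still breaks the match, and in mixed content a white-space-only node can carry meaning.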

Having spoken to Donald Eastlake, I will try to relay what I think I learned. The short answer is what we expected: we need another transform that deals with CRLF in some "standard" way.

As I understand it, the XML DSIG group found and understood this issue. It is born out of the fact that in a DTD you can specify that whitespace is either significant or not. Thus, if you use an XPath utility to describe an XML document, you should get back information that indicates the presence of text nodes and whether or not those nodes are significant. In the absence of a DTD that indicates otherwise, all whitespace is considered significant. Regardless, it was in fact decided that an application should deal with this issue. The transform should be able to look just for text nodes that contain only whitespace and delete them. This is based on a 5 minute discussion, so your mileage may vary. :-)

I apologize that I will not be on Friday's call, so if this is unclear you can send email or I'll do my best to clarify on Monday's call.

Enjoy,

Jim

It was funny (at least to me). His first reaction was that the problem is that we are mixing technologies, i.e., MIME and XML. Although perhaps not a helpful observation, it is entirely accurate. He commented that XML DSIG and IOTP/TRADE (another IETF working group) considered this issue up front and explicitly chose to do everything in XML. In fact, this would be the ideal solution: ebXML Messaging should do everything in either XML or MIME. Mixing and matching brings these issues to the table.

Anyway, given that this is where we are, he had one additional suggestion we had not already considered: push all payloads into an XML document (you could use CDATA) so the only MIME header on the outside is application/xml.

I also proposed, and he agreed, that we could just duplicate the MIME headers from all the body parts inside one (or more) Object elements in the Signature element. We would then need to require that the receiver match the copies against the actual headers.

Just for completeness, he also observed that the Reference element has an optional Type attribute. This would serve to authenticate the MIME Content-Type header, but it would leave other headers exposed. So, it's an insufficient solution.

Guess I'll say all this on the conference call tomorrow.

Jim
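A rough Python sketch of the "duplicate the MIME headers inside an Object element" idea. The `MimePart` and `Header` element names and both helper functions are invented for illustration, since nothing in this discussion defines a layout. The sketch also assumes the Object itself would be covered by a Reference in SignedInfo, because otherwise the copied headers would be no better protected than the originals.

```python
import xml.etree.ElementTree as ET
from email.message import Message

DS = "http://www.w3.org/2000/09/xmldsig#"

def headers_object(parts):
    """Build a ds:Object listing each part's MIME headers (hypothetical layout)."""
    obj = ET.Element(f"{{{DS}}}Object")
    for part in parts:
        entry = ET.SubElement(obj, "MimePart", {"cid": part["Content-ID"]})
        for name, value in part.items():
            ET.SubElement(entry, "Header", {"name": name}).text = value
    return obj

def headers_match(obj, parts):
    """Compare the received MIME headers with the copy carried in the Object."""
    signed = {e.get("cid"): [(h.get("name"), h.text) for h in e.findall("Header")]
              for e in obj.findall("MimePart")}
    return all(signed.get(p["Content-ID"]) == p.items() for p in parts)

part = Message()
part["Content-ID"] = "<payload-1>"  # hypothetical identifier
part["Content-Type"] = "application/xml"
part.set_payload("<Order/>")

obj = headers_object([part])        # sender side: goes inside ds:Signature
ok_before = headers_match(obj, [part])

part.replace_header("Content-Type", "text/plain")  # header altered in transit
ok_after = headers_match(obj, [part])

print(ok_before, ok_after)
```

The matching step is the extra requirement mentioned above: a receiver has to compare the headers it actually received against the signed copy, since the signature alone only vouches for the copy.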