Subject: Re: [wss-comment] MIME header c14n question about the term "characterencoded"
After some more thought, I now think that double-quote and backslash
characters in quoted strings may not need to be modified at all
during canonicalization.
Here's my rationale:
RFC 822 states:
----------
3.4.1. QUOTING
Some characters are reserved for special interpretation, such
as delimiting lexical tokens. To permit use of these charac-
ters as uninterpreted data, a quoting mechanism is provided.
To quote a character, precede it with a backslash ("\").
This mechanism is not fully general. Characters may be quoted
only within a subset of the lexical constructs. In particu-
lar, quoting is limited to use within:
- quoted-string
- domain-literal
- comment
Within these constructs, quoting is REQUIRED for CR and "\"
and for the character(s) that delimit the token (e.g., "(" and
")" for a comment). However, quoting is PERMITTED for any
character.
-----------
For quoted characters other than double-quote and backslash, I
understand the rationale for canonicalizing them because quoting is
PERMITTED for any character. In other words, the header value
"hello"
can also be expressed as:
"\h\e\l\l\o"
but it still has the same semantic meaning, so clearly both of
these values need to be canonicalized to "hello".
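To make the equivalence concrete, here is a minimal sketch (not taken from the WSS spec) that decodes a quoted-string to its semantic value by dropping the surrounding double-quotes and the backslash of every quoted-pair. The function name quoted_string_value is hypothetical, and the sketch assumes the input is a single, already-unfolded quoted-string.

```python
def quoted_string_value(qs: str) -> str:
    """Return the semantic value of an RFC 822 quoted-string.

    Hypothetical helper for illustration only; assumes `qs` is one
    complete, unfolded quoted-string including its delimiters.
    """
    if len(qs) < 2 or qs[0] != '"' or qs[-1] != '"':
        raise ValueError("not a quoted-string")
    out = []
    i = 1
    end = len(qs) - 1  # index of the closing double-quote
    while i < end:
        ch = qs[i]
        if ch == "\\":
            # quoted-pair: the next character is taken literally
            i += 1
            if i >= end:
                raise ValueError("dangling backslash")
            out.append(qs[i])
        elif ch == '"':
            raise ValueError("unquoted double-quote inside quoted-string")
        else:
            out.append(ch)
        i += 1
    return "".join(out)


# Both spellings carry the same value.
print(quoted_string_value('"hello"'))
print(quoted_string_value(r'"\h\e\l\l\o"'))
```

Under these assumptions both calls yield the same five characters, which is why the two header values have to canonicalize to a single form.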
--------------
On the other hand, the header value:
" \"hello\" "
can only possibly be expressed in this one manner because quoting is
REQUIRED for backslash and double-quote in quoted strings. I don't think
there is any other MIME-valid syntax that can be used to represent this
header value (although I may be wrong).
One might think that encoded-words (per RFC 2047) could be used,
but RFC 2047 specifically says:
-----------
5. Use of encoded-words in message headers
... An 'encoded-word' MUST NOT appear within a 'quoted-string'.
----------
Perhaps step 11 should be changed to read:
Quoted characters other than double-quote and backslash ("\") in quoted
strings in structured MIME headers (e.g. Content-ID) MUST be unquoted.
Double-quote and backslash ("\") characters in quoted strings in structured
MIME headers MUST *remain quoted*.
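As a rough illustration of the wording proposed above, the following sketch unquotes every quoted-pair except \" and \\, which it leaves untouched. The function name canonicalize_quoted_string is hypothetical, the input is assumed to be a single unfolded quoted-string, and CR (which RFC 822 also requires to be quoted) is not treated specially here because the proposed text does not mention it.

```python
def canonicalize_quoted_string(qs: str) -> str:
    """Apply the proposed step 11 rule to one quoted-string.

    Hypothetical helper for illustration only. Quoted-pairs for
    double-quote and backslash remain quoted; all other quoted-pairs
    are replaced by the bare character.
    """
    if len(qs) < 2 or qs[0] != '"' or qs[-1] != '"':
        raise ValueError("not a quoted-string")
    out = ['"']
    i = 1
    end = len(qs) - 1  # index of the closing double-quote
    while i < end:
        ch = qs[i]
        if ch == "\\":
            if i + 1 >= end:
                raise ValueError("dangling backslash")
            nxt = qs[i + 1]
            if nxt in ('"', "\\"):
                out.append("\\" + nxt)  # quoting is REQUIRED: keep it
            else:
                out.append(nxt)         # quoting was optional: drop it
            i += 2
        else:
            out.append(ch)
            i += 1
    out.append('"')
    return "".join(out)


print(canonicalize_quoted_string(r'"\h\e\l\l\o"'))
print(canonicalize_quoted_string(r'" \"hello\" "'))
```

With this rule the first value collapses to "hello", while the second comes back byte-for-byte unchanged, which matches the claim that it has only one valid spelling.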
Can anyone give a reason why double-quote and backslash characters
in quoted strings would need to be modified at all during canonicalization?
Regards,
Yassir.
Yassir Elley/Cambridge/IBM@IBMUS
05/10/2007 12:10 PM