mqtt message

Subject: [OASIS Issue Tracker] (MQTT-216) Explore the downgrade of MUST level requirements to check UTF-8 encodings
From: OASIS Issues Tracker <workgroup_mailer@lists.oasis-open.org>
To: mqtt@lists.oasis-open.org
Date: Fri, 13 Feb 2015 14:55:26 +0000 (UTC)
    [ https://issues.oasis-open.org/browse/MQTT-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=57840#comment-57840 ] 

Ken Borgendale commented on MQTT-216:
-------------------------------------

The only UTF-8 strings that matter to performance are the topic name in PUBLISH and perhaps the topic filter in SUBSCRIBE.  The other strings are in CONNECT, which is done once per connection.

Besides checking for UTF-8 validity, the server is also required to check for the null character and wildcard characters (# and +) in the topic name of a PUBLISH, and to check the validity of the topic filter on a SUBSCRIBE.  There is a SHOULD for checking C0 control characters and noncharacters in these strings.
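To make the set of checks concrete, here is a minimal sketch in Python of validating a PUBLISH topic name.  It is not the code I measured and not taken from any particular server; the function name validate_topic_name and the check_should flag are made up for illustration.  It relies on Python's strict UTF-8 decoder to reject ill-formed input (including encoded surrogates), then scans for the null character, the wildcards, and (optionally, since that is only a SHOULD) C0 controls and noncharacters.

```python
def validate_topic_name(raw: bytes, check_should: bool = True) -> str:
    """Return the decoded topic name, or raise ValueError if a check fails.

    Hypothetical helper for illustration; names and error text are assumptions.
    """
    try:
        # Strict decoding rejects ill-formed UTF-8, including overlong forms
        # and encodings of surrogates U+D800..U+DFFF.
        topic = raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError("ill-formed UTF-8") from exc

    for ch in topic:
        cp = ord(ch)
        if cp == 0x0000:
            raise ValueError("null character U+0000")
        if ch in "#+":
            raise ValueError("wildcard character in topic name")
        if check_should:
            # SHOULD-level checks: C0 controls and Unicode noncharacters.
            if 0x0001 <= cp <= 0x001F:
                raise ValueError("C0 control character")
            if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
                raise ValueError("noncharacter")
    return topic
```

A server would close the Network Connection on the MUST-level failures; whether to do the same for the SHOULD-level ones is a policy choice, which is why they sit behind a flag here.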

In my simple testing of code to do these various checks on an Intel x86 1.7GHz processor using a single core, I am able to validate UTF-8 for 32-byte strings in 35ns (about 30 million per second).  The whole set of validations (UTF-8, C0, NonChar, and wildcard) takes me 48ns (about 21 million per second).  It seems unlikely that this represents any major portion of the processing time of a message.  Checking in the server is important to stop some forms of attacks.
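For anyone who wants to redo the arithmetic, the per-second figures are just the reciprocal of the per-string time.  A small sketch (the helper name per_second is made up for illustration):

```python
def per_second(ns_per_op: float) -> float:
    """Convert a per-operation cost in nanoseconds to operations per second."""
    return 1e9 / ns_per_op

# Figures from the measurements above, single core, 32-byte strings.
utf8_only = per_second(35)   # roughly 28.6 million per second
all_checks = per_second(48)  # roughly 20.8 million per second
```

Taken literally, 35ns works out closer to 29 million than 30 million per second; the numbers quoted above are rounded.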
 

> Explore the downgrade of MUST level requirements to check UTF-8 encodings
> -------------------------------------------------------------------------
>
>                 Key: MQTT-216
>                 URL: https://issues.oasis-open.org/browse/MQTT-216
>             Project: OASIS Message Queuing Telemetry Transport (MQTT) TC
>          Issue Type: Improvement
>          Components: futures
>    Affects Versions: 3.1.1
>            Reporter: Richard Coppen
>
> Following on from David Kemper's mqtt-comment on CSPRD02, the TC agreed to open this Jira to explore relaxing the MUST level requirements surrounding UTF-8 checking.
> Servers are required to scan the UTF-8 content of all strings in all packets. Since the strings themselves are bounded by a length prefix, is it possible that the following MUST level clauses could be downgraded to SHOULD / SHOULD NOT to promote lower-latency solutions?
> [MQTT-1.5.3-1] The character data in a UTF-8 encoded string MUST be well-formed UTF-8 as defined by the Unicode specification [Unicode] and restated in RFC 3629[RFC3629]. In particular this data MUST NOT include encodings of code points between U+D800 and U+DFFF. If a Server or Client receives a Control Packet containing ill-formed UTF-8 it MUST close the Network Connection.
> [MQTT-1.5.3-2] A UTF-8 encoded string MUST NOT include an encoding of the null character U+0000. If a receiver (Server or Client) receives a Control Packet containing U+0000 it MUST close the Network Connection.



--
This message was sent by Atlassian JIRA
(v6.2.2#6258)