OASIS Mailing List Archives
mqtt message



Subject: [OASIS Issue Tracker] Commented: (MQTT-2) UTF-8 for will messages


    [ http://tools.oasis-open.org/issues/browse/MQTT-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=33273#action_33273 ] 

Nick O'Leary commented on MQTT-2:
---------------------------------

This has been discussed before on the mqtt.org mailing list, and is documented on the wiki: http://mqtt.org/wiki/doku.php/will_message_utf8_support

In summary, the spec is misleading when it uses terms like 'UTF-8' and 'ASCII' to describe the encoding of the will message. The will message payload is just like any other message payload: it is a blob of bytes. Those bytes might contain a string in some encoding, but that is only of interest to the application sending or receiving the message.

The spec uses the term 'UTF-8 encoded' as lazy shorthand for "encoded with two bytes to represent the length of the payload, followed by the payload itself".

The sentence you cite, although well intentioned to prevent misunderstanding about which bytes get sent in the payload, in fact adds nothing but confusion and ought to be removed.
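
To illustrate the length-prefixed framing described above, here is a minimal Python sketch. The function names are hypothetical and not taken from the spec or any client library, and it assumes the two length bytes are a big-endian 16-bit integer (most significant byte first). It shows that the framing works for arbitrary bytes, including bytes that are neither ASCII nor valid UTF-8, and that stripping the prefix recovers exactly the bytes that would go into the PUBLISH payload.

```python
import struct


def encode_length_prefixed(payload: bytes) -> bytes:
    """Prefix payload with its length as two bytes (big-endian, assumed)."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too long for a two-byte length prefix")
    return struct.pack(">H", len(payload)) + payload


def decode_length_prefixed(field: bytes) -> bytes:
    """Strip the two length bytes and return only the payload bytes."""
    if len(field) < 2:
        raise ValueError("field is shorter than the length prefix")
    (length,) = struct.unpack(">H", field[:2])
    body = field[2:2 + length]
    if len(body) != length:
        raise ValueError("field is shorter than its declared length")
    return body


# An arbitrary blob: not ASCII and not valid UTF-8 either.
will_payload = bytes([0x00, 0xFF, 0xC3, 0x28])

# What would be carried in CONNECT: two length bytes, then the payload.
connect_field = encode_length_prefixed(will_payload)

# What would be carried in the will PUBLISH: the payload bytes only.
publish_payload = decode_length_prefixed(connect_field)
```

Nothing in this framing depends on the payload being text, which is the point: the length prefix describes how many bytes follow, not how to interpret them.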









> UTF-8 for will messages
> -----------------------
>
>                 Key: MQTT-2
>                 URL: http://tools.oasis-open.org/issues/browse/MQTT-2
>             Project: OASIS Message Queuing Telemetry Transport (MQTT) TC
>          Issue Type: Improvement
>    Affects Versions: 3.1.1
>            Reporter: Dominik Obermaier
>
> The current 3.1 specification states that the will message is encoded in UTF-8 in the CONNECT message but will be published in ASCII encoding by an MQTT broker. This is a major inconsistency in the specification since this is the only case where ASCII encoding is used.
> Here's the relevant citation from the specification: 
> "Although the Will Message is UTF-8 encoded in the CONNECT message, when it is published to the Will Topic only the bytes of the message are sent, not the first two length bytes. The message must therefore only consist of 7-bit ASCII characters."
> A payload for a PUBLISH can of course be any raw bytes, but in the case of the will message we should think about removing the inconsistency from the spec. I see two possibilities:
> 1. The will message in the CONNECT message is *not* UTF-8 encoded but ASCII encoded. 
> 2. The will message in the will PUBLISH is UTF-8. This would collide with the current spec because empty payloads are possible according to the 3.1 spec (in the case of UTF-8, IIRC, two length bytes have to be sent even with an empty message). 
> I would vote for option two because this would remove the inconsistency in the spec, and the will message is encoded in UTF-8 in the CONNECT message anyway. I don't think the overhead of the two length bytes in the case of an empty message is a serious problem. We could discuss whether, in the case of an empty payload (= empty UTF-8 string), broker implementations should remove the length bytes automatically to reduce the overhead in PUBLISH messages.
