Subject: [OASIS Issue Tracker] (MQTT-492) Validating the payload format
From: OASIS Issues Tracker <workgroup_mailer@lists.oasis-open.org>
To: mqtt@lists.oasis-open.org
Date: Thu, 24 Aug 2017 09:09:00 +0000 (UTC)
    [ https://issues.oasis-open.org/browse/MQTT-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=67180#comment-67180 ] 

Andrew Banks commented on MQTT-492:
-----------------------------------

The UTF-8 Encoded String defined in section 1.5.4 describes the data used in fields such as the TopicName or UserName;
it does not describe the payload in the publish packet. A publish packet whose payload is marked as UTF-8 Encoded Character Data
can have any valid UTF-8 data in the payload, including the null character U+0000.
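
To make the distinction concrete, here is a minimal Python sketch of the two checks. The function names are hypothetical, and the two-byte big-endian length prefix is my reading of section 1.5.4 rather than something stated in this thread. Python's strict "utf-8" codec is used as a stand-in for a well-formedness check.

```python
import struct


def decode_utf8_encoded_string(buf: bytes) -> str:
    """Decode a section 1.5.4 style field (hypothetical helper).

    Assumes a 2-byte big-endian length prefix followed by UTF-8 data.
    """
    if len(buf) < 2:
        raise ValueError("missing length prefix")
    (length,) = struct.unpack(">H", buf[:2])
    data = buf[2:2 + length]
    if len(data) != length:
        raise ValueError("field is shorter than its length prefix")
    try:
        text = data.decode("utf-8")  # strict mode rejects ill-formed sequences
    except UnicodeDecodeError as exc:
        raise ValueError("ill-formed UTF-8") from exc
    if "\x00" in text:
        raise ValueError("U+0000 is not allowed in a UTF-8 Encoded String")
    return text


def is_well_formed_utf8_payload(payload: bytes) -> bool:
    """Check a payload marked as UTF-8 Encoded Character Data (hypothetical helper).

    No length prefix is expected, and U+0000 is accepted.
    """
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

With these helpers, a field such as b"\x00\x03a\x00b" is rejected as a UTF-8 Encoded String, while the bytes b"a\x00b" are accepted as a payload.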

The server is not required to validate the payload because doing so is burdensome
and the receiver is going to validate the data anyway. However, the server is allowed to validate the payload data if it chooses.
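
As a rough illustration of validation being optional, a server could gate the check behind a configuration switch. This is only a sketch: the function name and the `validate` flag are hypothetical, and 0x00 is assumed to be the Success reason code. The 0x99 value is the "Payload format invalid" reason code quoted in the issue below.

```python
SUCCESS = 0x00                 # assumed Success reason code
PAYLOAD_FORMAT_INVALID = 0x99  # "Payload format invalid"


def publish_reason_code(payload: bytes,
                        payload_format_indicator: int,
                        validate: bool = False) -> int:
    """Return the reason code a server might send for a publish (hypothetical helper).

    The payload is only inspected when validation is switched on and the
    Payload Format Indicator is 1 (UTF-8 Encoded Character Data).
    """
    if validate and payload_format_indicator == 1:
        try:
            payload.decode("utf-8")  # strict mode rejects ill-formed sequences
        except UnicodeDecodeError:
            return PAYLOAD_FORMAT_INVALID
    return SUCCESS
```

A server that leaves `validate` off accepts every payload and leaves the check to the receiving client. One that turns it on rejects only ill-formed UTF-8 and still accepts U+0000.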

I believe the specification is fine as it stands, though we might want to underline the distinction between the UTF-8 string in section
1.5.4 and the publish payload. We might also want to explain why the server is not required to validate the correctness of the
publish payload.




> Validating the payload format
> -----------------------------
>
>                 Key: MQTT-492
>                 URL: https://issues.oasis-open.org/browse/MQTT-492
>             Project: OASIS Message Queuing Telemetry Transport (MQTT) TC
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 5, CSD01
>            Reporter: Peter Niblett
>            Assignee: Ken Borgendale
>            Priority: Minor
>             Fix For: 5
>
>
> Section 3.3.2.3.2 says "•	1 (0x01) Byte Indicates that the Payload is UTF-8 Encoded Character Data. The UTF-8 data in the Payload does not include a length prefix, nor is it subject to the restrictions described in section 1.5.4."
> There are only two mandatory restrictions in 1.5.4
> -  "The character data in a UTF-8 Encoded String MUST be well-formed UTF-8 as defined by the Unicode specification [Unicode] and restated in RFC 3629 [RFC3629]. " 
> -  "A UTF-8 Encoded String MUST NOT include an encoding of the null character U+0000."
> I could see you might want to relax the second requirement, but if the string is not required to conform to the first one then you could put any sequence of bytes in the payload. 
> At the end of the section it says "The receiver MAY validate that the Payload is of the format indicated, and if it is not send a PUBACK, PUBREC, or DISCONNECT with Reason Code of 0x99 (Payload format invalid) as described in section 4.13."
> However, if any sequence of bytes is permitted, how can it ever reject a payload? 
> Is the idea that the receiver can choose what kind of validation it performs? For example, receiver A could do no validation at all, receiver B could validate that it's well-formed UTF-8, and receiver C could require it to be well-formed and not include U+0000?


