Subject: [OASIS Issue Tracker] (MQTT-260) Add a CONNACK code of 'Try Another Server'
From: OASIS Issues Tracker <workgroup_mailer@lists.oasis-open.org>
To: mqtt@lists.oasis-open.org
Date: Wed, 6 Jul 2016 10:22:49 +0000 (UTC)
    [ https://issues.oasis-open.org/browse/MQTT-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=62860#comment-62860 ] 

Raphael Cohn edited comment on MQTT-260 at 7/6/16 10:21 AM:
------------------------------------------------------------

I think we're talking about a different scenario from the one I envisage this being appropriate for. In the scenario I have in mind, your 50 servers aren't a cluster and don't logically represent one broker - they are 50 brokers. That's what I was trying to get at when I tried to delineate brokers and servers. So in this scenario your client needs a list of servers to try for its client id. How it gets that list is up to you to code for; it doesn't and shouldn't be part of the protocol, because there are myriad ways it could do it. Perhaps even as crudely as GET https://brokerfarm1.ibm.com/where-do-i-connect-now?clientid=4556 (with suitable auth, of course).* Another way to look at it: if you have 50 brokers, you have complexity already, so you'll use a bespoke MQTT load balancer (admittedly, the MQTT proxy story is quite weak right now, but I see moves to change that). More importantly, with such a setup, one clearly has significant funds and administrative staff. And a lot of clients. Tens of millions, perhaps. So a load balancer is neither expensive, nor outside the competencies of the organisation, nor something unfamiliar. And it is probably needed for SSL termination in any event.

The only clients that an out-of-band protocol could inconvenience are very small, sub-Arduino-capable devices - there's a company in India using these to monitor power supplies to villages - which cannot have more than one TCP socket open at a time. However, if your server has just told you it's busy, then the socket should be free... And for these devices, you've probably burnt the list into firmware anyway - the amount of memory available is tiny - or you'll simply sleep and try again later (more likely; the network out to these locations is beyond woeful and the data being gathered is a few tens of bytes at most).

* Or perhaps one just uses a clock hash of the client id based on the number of brokers... (tongue firmly in cheek).
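
A minimal sketch of that footnote, reading "clock hash" as a hash of the client id reduced modulo the number of brokers (clock arithmetic). That reading is an assumption, and the function name `pick_broker`, the broker host names and the choice of SHA-256 are all hypothetical; nothing here is part of MQTT.

```python
import hashlib


def pick_broker(client_id: str, brokers: list[str]) -> str:
    """Map a client id onto one broker from a fixed, ordered list.

    Illustrative only: the hash function and the modulo reduction are
    assumptions, not anything defined by MQTT or by this issue.
    """
    if not brokers:
        raise ValueError("broker list must not be empty")
    # Use a stable hash; Python's built-in hash() is randomised per process.
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    # Reduce the first 8 bytes of the digest modulo the number of brokers.
    index = int.from_bytes(digest[:8], "big") % len(brokers)
    return brokers[index]


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    farm = [f"broker{n}.example.org" for n in range(50)]
    print(pick_broker("4556", farm))
```

The same client id always maps to the same broker while the list is unchanged, but adding or removing a broker remaps most clients, which may be one reason the suggestion is offered tongue in cheek.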

Ultimately, the best way to solve the client state problem at scale is to get the client to manage the state. But then that isn't MQTT any longer. I think what I propose can solve the vast majority of use cases, provides the greatest variety of options to implementors of clients, brokers, and solutions, and, by keeping the necessary knowledge out-of-band, doesn't handicap future use cases.


was (Author: raphcohn):
I think we're talking about a different scenario from the one I envisage this being appropriate for. In the scenario I have in mind, your 50 servers aren't a cluster and don't logically represent one broker - they are 50 brokers. That's what I was trying to get at when I tried to delineate brokers and servers. So in this scenario your client needs a list of servers to try for its client id. How it gets that list is up to you to code for; it doesn't and shouldn't be part of the protocol, because there are myriad ways it could do it. Perhaps even as crudely as GET https://brokerfarm1.ibm.com/where-do-i-connect-now?clientid=4556 (with suitable auth, of course).* Another way to look at it: if you have 50 brokers, you have complexity already, so you'll use a bespoke MQTT load balancer (admittedly, the MQTT proxy story is quite weak right now, but I see moves to change that). More importantly, with such a setup, you clearly have significant funds and administrative staff. And a lot of clients. Tens of millions, perhaps.

The only clients that an out-of-band protocol could inconvenience are very small, sub-Arduino-capable devices - there's a company in India using these to monitor power supplies to villages - which cannot have more than one TCP socket open at a time. However, if your server has just told you it's busy, then the socket should be free... And for these devices, you've probably burnt the list into firmware anyway - the amount of memory available is tiny - or you'll simply sleep and try again later (more likely; the network out to these locations is beyond woeful and the data being gathered is a few tens of bytes at most).

* Or perhaps one just uses a clock hash of the client id based on the number of brokers... (tongue firmly in cheek).

Ultimately, the best way to solve the client state problem at scale is to get the client to manage the state. But then that isn't MQTT any longer. I think what I propose can solve the vast majority of use cases, provides the greatest variety of options to implementors of clients, brokers, and solutions, and, by keeping the necessary knowledge out-of-band, doesn't handicap future use cases.

> Add a CONNACK code of 'Try Another Server'
> ------------------------------------------
>
>                 Key: MQTT-260
>                 URL: https://issues.oasis-open.org/browse/MQTT-260
>             Project: OASIS Message Queuing Telemetry Transport (MQTT) TC
>          Issue Type: Improvement
>          Components: futures
>    Affects Versions: 5
>            Reporter: Raphael Cohn
>            Assignee: Raphael Cohn
>            Priority: Critical
>
> If we add a CONNACK return code of 'Try Another Server', this makes it easier for overloaded servers to tell clients to redirect. This works in conjunction with MQTT-259, which advocates the use of DNS SRV records.
> Indeed, if we also added server-originated DISCONNECT packets with this return code, we could get clients to cleanly migrate to another server when a server is shut down for maintenance.
> Please note, I do not favour the server also reporting which new server to connect to. Therein lies the route to madness, as it means the current server has to know the state of all the others. That's intimate knowledge.
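
The client-side behaviour this proposal implies could be sketched roughly as follows, assuming the client already holds an ordered list of servers obtained out-of-band. The numeric value of `CONNACK_TRY_ANOTHER_SERVER` is a placeholder (the issue assigns none), and `connect_with_failover` and the `try_connect` callback are hypothetical names standing in for a real MQTT client's CONNECT/CONNACK exchange.

```python
from typing import Callable, Iterable, Optional

CONNACK_ACCEPTED = 0  # "Connection Accepted" in MQTT 3.1.1
# Placeholder value only: the issue proposes the code but assigns no number.
CONNACK_TRY_ANOTHER_SERVER = 0xFF


def connect_with_failover(
    servers: Iterable[str],
    try_connect: Callable[[str], int],
) -> Optional[str]:
    """Try each server in turn; return the one that accepted, else None.

    `try_connect` is a hypothetical callback that sends CONNECT to the
    given server and returns the CONNACK return code it received.
    """
    for server in servers:
        code = try_connect(server)
        if code == CONNACK_ACCEPTED:
            return server
        if code != CONNACK_TRY_ANOTHER_SERVER:
            # Any other refusal (bad credentials, etc.) is not a redirect.
            raise ConnectionError(f"{server} refused with CONNACK code {code}")
        # The server said only "try another"; it does not say which one,
        # so the client just moves on to the next entry in its own list.
    return None


if __name__ == "__main__":
    # Simulated responses: the first two servers are overloaded.
    replies = {"a.example.org": 0xFF, "b.example.org": 0xFF, "c.example.org": 0}
    print(connect_with_failover(replies, replies.__getitem__))
```

Note that the redirecting server never names a target, so it needs no knowledge of the other servers' state; the choice of next server stays entirely with the client.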


