In an earlier blog post, we wrote about using MQTT as a protocol for real-time communication. We started out with Mosquitto, an open-source MQTT broker, to relay communications from our devices to the server. While Mosquitto is lightweight and easy to set up, we began to run into its limitations as we scaled beyond 15k daily active publishers.
These are the main limitations we found along the way:
- No horizontal scaling: Mosquitto has no built-in support for horizontal scaling (via bridges or clusters), which limits how far it can scale. Various forums propose workarounds (please refer to the Stack Overflow discussion), but they don’t seem to be reliable.
- Single-threaded: Mosquitto runs as a single-threaded application, so it cannot take advantage of multi-core CPUs. This limits the maximum number of publishers on the system (please refer to the Discussion section of the MQTT broker benchmarking article).
- Downtime: In most cases we observed, restarting the Mosquitto broker took around 6 to 7 minutes.
In our search for an alternative to Mosquitto, we came across HiveMQ. Based on our study, we found it to be the most enterprise-ready solution available; however, it comes at an added cost. Our plan was to look for a scalable free or open-source alternative first and, failing that, to budget for HiveMQ.
During our interactions at the Fifth Elephant conference, we were introduced to VerneMQ (a high-performance, distributed MQTT message broker built in Erlang) and heard very good feedback about it (see the blog mentioning their experience of using VerneMQ for more than a year). We started reading about it and soon rolled it out to production. These were the key benefits of VerneMQ that helped us make the switch from Mosquitto:
- Clustering: VerneMQ scales horizontally and vertically on commodity hardware to support a high number of concurrent publishers and consumers. Subscribers can connect to any cluster node and receive messages from any other cluster node.
- Installation and support: VerneMQ has good documentation and a reasonably active community on GitHub. Pre-built Docker images are available for installation, and it provides easy monitoring integrations too (we are using Prometheus with Node Exporter and Grafana). Its website lists Microsoft and Volkswagen, among others, as featured users.
- Parallel processing: Because VerneMQ is written in Erlang, it can work in a multi-threaded manner. As discussed earlier, this lets it take advantage of multi-core processors.
Thanks to its easy installation and the way we had abstracted our broker from the rest of the architecture, switching to VerneMQ was a smooth and quick process for us. We immediately observed a significant decrease in our dropped-message rate (from 14% to 5% on a base of 10M messages/day) and were able to use clustering to achieve higher throughput. Following is the Grafana dashboard screenshot for the day-to-day metrics tracked for MQTT messages.
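To put those percentages in absolute terms, here is a quick back-of-the-envelope calculation. It assumes the daily volume is exactly 10M messages and that the percentages are whole numbers, so treat the result as an approximation rather than a measured figure.

```python
# Rough drop counts derived from the figures quoted in this post.
DAILY_MESSAGES = 10_000_000  # assumed exact; the post says "a base of 10M messages/day"


def dropped_per_day(drop_percent: int, total: int = DAILY_MESSAGES) -> int:
    """Messages dropped per day for a whole-number drop percentage."""
    return total * drop_percent // 100


before = dropped_per_day(14)  # with Mosquitto
after = dropped_per_day(5)    # with VerneMQ
print(before, after, before - after)
```

Under these assumptions, the switch means roughly 900,000 fewer dropped messages per day.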
Some Configuration Tips
- When a queue holds more than the configured number of messages at any given point, it drops messages in bulk. This can be controlled by tuning the “max_online_messages” parameter, which defaults to 1000. We have currently set it to 40,000, which covers the majority of our use cases. You can set it to -1 for no limit; however, the higher the value of max_online_messages, the higher the RAM requirement.
- When setting up a VerneMQ cluster, all subscribers tend to subscribe to the earliest available node and maintain a persistent connection with it. This increases the load on that particular node, and unless that node goes down, subscribers don’t switch to another one. To counter this, we configured the “Persistent Client Expiration” option.
- If your use case requires distributing messages to a set of subscribers on a shared subscription topic such that each message is received by only one subscriber, then configure the broker’s “shared_subscription_policy”. You can read more about how to configure it, along with examples, here.
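The three tips above can be sketched as a small script that renders a vernemq.conf fragment and builds a shared-subscription topic filter. This is only an illustration: the value 40000 comes from this post, while the expiration of "1d", the "random" policy, the group name "workers" and the topic "devices/+/location" are assumptions chosen for the example, and we assume the “Persistent Client Expiration” option maps to the persistent_client_expiration key.

```python
# Sketch only: renders a vernemq.conf fragment for the three settings above.
SETTINGS = {
    "max_online_messages": 40000,            # value quoted in this post; -1 means no limit
    "persistent_client_expiration": "1d",    # assumed value, tune for your clients
    "shared_subscription_policy": "random",  # assumed value
}


def render_conf(settings: dict) -> str:
    """Return 'key = value' lines in the style used by vernemq.conf."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


def shared_filter(group: str, topic: str) -> str:
    """Build a shared-subscription topic filter of the form $share/<group>/<topic>."""
    if not group or any(ch in group for ch in "/+#"):
        raise ValueError("group must be non-empty and contain no '/', '+' or '#'")
    return f"$share/{group}/{topic}"


print(render_conf(SETTINGS))
# "workers" and "devices/+/location" are hypothetical names for illustration.
print(shared_filter("workers", "devices/+/location"))
```

Subscribers that use the same `$share/<group>/...` filter form one group, and the broker hands each matching message to only one member of that group. Check the VerneMQ documentation for the accepted values of each setting before applying them.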
We’re happy to help if you have any questions; please reach out to us at email@example.com. Kudos to Robin Philip and Prakriti Kumari for their work in transitioning the system from Mosquitto to VerneMQ. We’re also hiring aggressively for multiple roles in our Technology team at Shadowfax. If this blog post gets you excited about the work we’re doing, please reach out to us at firstname.lastname@example.org.