Part 3: RabbitMQ Best Practice for High Availability - CloudAMQP (2024)

Many variables feed into the overall level of availability for your RabbitMQ setup. Part 3 of RabbitMQ Best Practice covers recommended setup and configuration options for maximum uptime. We will mention standard settings, changes, and plugins that can be used to achieve better availability.

We have been working with RabbitMQ for a long time, and we have probably seen way more configuration mistakes than anybody else. We know how to configure for optimal performance and how to get the most stable cluster. In this series, we share that knowledge! Please read part 1 for general best practice and dos and don'ts for RabbitMQ.

Make sure your queues stay short

To get optimal performance, make sure your queues stay as short as possible at all times. Longer queues impose more processing overhead. We recommend that queues always stay around 0 messages for optimal performance.
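One way to keep an eye on queue length from a client is a passive declare, which reports the number of ready messages without creating or changing the queue. The following is a minimal sketch, assuming a pika-style channel object; the function names and the threshold of 1000 are illustrative and not a CloudAMQP recommendation.

```python
def ready_message_count(channel, queue_name):
    """Return the number of ready messages in an existing queue.

    passive=True only inspects the queue; it never creates or alters it.
    `channel` is assumed to behave like a pika BlockingChannel.
    """
    result = channel.queue_declare(queue=queue_name, passive=True)
    return result.method.message_count


def is_backing_up(channel, queue_name, threshold=1000):
    """Report whether the queue holds more ready messages than `threshold`.

    The default threshold is a hypothetical value chosen for illustration.
    """
    return ready_message_count(channel, queue_name) > threshold
```

A monitoring job could call `is_backing_up` periodically and alert, or add consumers, when it returns `True`.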

Use the queue type Quorum Queues

Quorum Queues are a replicated queue type that provides high availability and data safety. CloudAMQP recommends the use of Quorum Queues.

The reasons you should switch to Quorum Queues, and the design flaws of classic mirrored queues, are described in the article.
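The queue type is chosen when the queue is declared, through the optional `x-queue-type` argument. The following is a minimal sketch, assuming the pika client library; the queue name `orders` and the connection URL are placeholders.

```python
def declare_quorum_queue(channel, queue_name):
    """Declare a durable quorum queue on a pika-style channel.

    The queue type is fixed at declaration time through the optional
    "x-queue-type" argument; quorum queues must also be durable.
    """
    return channel.queue_declare(
        queue=queue_name,
        durable=True,
        arguments={"x-queue-type": "quorum"},
    )


def main():
    # Assumption: pika is installed and a broker listens on this placeholder URL.
    import pika

    connection = pika.BlockingConnection(
        pika.URLParameters("amqp://guest:guest@localhost:5672/%2F")
    )
    try:
        declare_quorum_queue(connection.channel(), "orders")  # hypothetical name
    finally:
        connection.close()
```

Calling `main()` against a running broker creates the queue. Redeclaring an existing queue with a different type fails, so the type has to be decided before the queue is first created.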

Enable lazy queues

Update: Starting from RabbitMQ version 3.12, the queue mode will be disregarded, as classic queues will now exhibit similar behavior to lazy queues.

In RabbitMQ 3.6 and later, a feature called lazy queues was added. Lazy queues are queues where the messages are automatically stored to disk. Messages are only loaded into memory when they are needed. With lazy queues, the messages go straight to disk, so RAM usage is minimized, but the throughput time will be longer.

We have seen that lazy queues create a more stable cluster with predictable performance, which increases the availability of the server. Your messages will not get flushed to disk without warning, and you will not suddenly be hit by a performance drop. If you are sending a lot of messages at once (e.g. processing batch jobs), or if you think that your consumers will not keep up with the speed of the publishers all the time, we recommend that you enable lazy queues.
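On versions before 3.12, lazy mode can be requested per queue with the `x-queue-mode` argument, or for a group of queues with a policy whose definition sets `queue-mode`. The following is a minimal sketch, assuming a pika-style channel; the function names and the pattern passed in are illustrative.

```python
LAZY_QUEUE_ARGUMENTS = {"x-queue-mode": "lazy"}


def declare_lazy_queue(channel, queue_name):
    """Declare a durable classic queue in lazy mode on a pika-style channel."""
    return channel.queue_declare(
        queue=queue_name,
        durable=True,
        arguments=dict(LAZY_QUEUE_ARGUMENTS),  # copy so callers cannot mutate the constant
    )


def lazy_queue_policy(pattern):
    """Build a policy body that switches every matching queue to lazy mode.

    `pattern` is a regular expression matched against queue names.
    """
    return {
        "pattern": pattern,
        "definition": {"queue-mode": "lazy"},
        "apply-to": "queues",
    }
```

A policy is usually the more flexible choice, because it can be changed later without redeclaring the queues.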

Cluster setup (RabbitMQ HA with 3 or more nodes)

We refer to the collection of nodes as a cluster.

Availability can be enhanced if clients can find a replica of the data even in the presence of failures, that is, if the cluster remains accessible even when one of its nodes goes down.

CloudAMQP gives you the option to set up 3 or 5 node clusters. We have located each node in a different availability zone (AWS), and queues are automatically replicated (HA) between availability zones.

If a node fails, the client will automatically reconnect to another node in the cluster.
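When a client is given the addresses of several nodes itself, the failover behavior described above can be approximated on the client side. The following is a minimal sketch of that idea, not CloudAMQP's own mechanism; `connect_with_failover`, the `connect` callable and the node URLs are assumptions made for illustration.

```python
import random


def connect_with_failover(node_urls, connect, shuffle=True):
    """Try each cluster node until one accepts the connection.

    `connect` is any callable that takes a URL and returns a connection,
    raising an exception when the node is unreachable.
    """
    candidates = list(node_urls)
    if shuffle:
        # Spread clients across nodes instead of always hitting the first one.
        random.shuffle(candidates)

    last_error = None
    for url in candidates:
        try:
            return connect(url)
        except Exception as exc:  # client libraries raise their own error types
            last_error = exc
    raise ConnectionError(
        f"none of the {len(candidates)} nodes could be reached"
    ) from last_error
```

With pika, `connect` could be `lambda url: pika.BlockingConnection(pika.URLParameters(url))`. pika's `BlockingConnection` also accepts a sequence of connection parameters and tries them in order, which may be enough on its own.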

You can read more about setup options for different numbers of nodes in your cluster here, and about RabbitMQ clustering here.

Message queues are by default located on one single node, but they are visible and reachable from all nodes. To replicate queues across nodes in a cluster, see the documentation on high availability (HA).

Use persistent messages and durable queues

If you cannot afford to lose any messages, make sure that your queue is declared as "durable" and your messages are sent with delivery mode "persistent".

In order to avoid losing messages in the broker, you need to be prepared for broker restarts, broker hardware failure, or broker crashes. To ensure that messages and broker definitions survive restarts, we need to ensure that they are on disk. Messages, exchanges, and queues that are not durable and persistent will be lost during a broker restart.
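Both settings are applied by the client: durability when the queue is declared, persistence on every published message. The following is a minimal sketch, assuming the pika client library and a pika-style channel; the function names are illustrative.

```python
PERSISTENT = 2  # AMQP 0-9-1 delivery mode: 1 = transient, 2 = persistent


def declare_durable_queue(channel, queue_name):
    """Declare a queue whose definition survives a broker restart."""
    return channel.queue_declare(queue=queue_name, durable=True)


def publish_persistent(channel, queue_name, body):
    """Publish one message that the broker is asked to store on disk."""
    import pika  # third-party client, assumed to be installed

    channel.basic_publish(
        exchange="",             # the default exchange routes by queue name
        routing_key=queue_name,
        body=body,
        properties=pika.BasicProperties(delivery_mode=PERSISTENT),
    )
```

Both halves are needed: a persistent message in a non-durable queue is lost with the queue, and a transient message in a durable queue is lost on restart.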

Federation between clouds

We do not recommend clustering between clouds or regions, and therefore no plan spreads nodes across regions or datacenters. If a whole cloud region goes down, your CloudAMQP cluster will also go down - but it's not something that we have ever experienced. Cluster nodes are spread across availability zones within the same region.

You can protect the setup against a region-wide outage by setting up two clusters in different regions and using federation between them. Federation is one of the ways by which a software system can benefit from having multiple RabbitMQ brokers distributed on different machines. More information about federation can be found here: Federation - Migration, Exchange and Queue Federation
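Federation is configured on the downstream cluster in two steps: an upstream definition that points at the other cluster, and a policy that selects which exchanges to federate. The following is a minimal sketch that only builds the two requests for the RabbitMQ management HTTP API, assuming the federation plugin is enabled; the upstream name, policy name, pattern and URI passed in are placeholders.

```python
import json
from urllib.parse import quote


def federation_upstream_request(vhost, upstream_name, upstream_uri):
    """Build the management API request that defines a federation upstream."""
    path = "/api/parameters/federation-upstream/{}/{}".format(
        quote(vhost, safe=""), quote(upstream_name, safe="")
    )
    body = {"value": {"uri": upstream_uri}}
    return "PUT", path, json.dumps(body)


def federation_policy_request(vhost, policy_name, pattern):
    """Build the request for a policy that federates matching exchanges
    with every upstream defined in the vhost."""
    path = "/api/policies/{}/{}".format(
        quote(vhost, safe=""), quote(policy_name, safe="")
    )
    body = {
        "pattern": pattern,
        "definition": {"federation-upstream-set": "all"},
        "apply-to": "exchanges",
    }
    return "PUT", path, json.dumps(body)
```

The returned method, path and JSON body can be sent with any HTTP client using management credentials. The default vhost `/` is percent-encoded as `%2F` in the path.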

Do not enable HiPE

HiPE increases server throughput at the cost of increased startup time. When you enable HiPE, RabbitMQ is compiled at startup, which adds quite a lot to the startup time: 1-3 minutes. This might affect uptime during a server restart.

Do not set RabbitMQ Management statistics rate mode to detailed in production

Setting RabbitMQ Management statistics rate mode to detailed has a serious performance impact and should not be used in production.

Set limited use of priority queues

Each priority level uses an internal queue on the Erlang VM, which takes up some resources. In most use cases it is sufficient to have no more than 5 priority levels.
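The highest priority a queue supports is set at declaration time with the `x-max-priority` argument. The following is a minimal sketch, assuming a pika-style channel; the function names are illustrative, and the default of 5 simply follows the guidance above.

```python
MAX_PRIORITY = 5  # follows the "no more than 5" guidance above


def declare_priority_queue(channel, queue_name, max_priority=MAX_PRIORITY):
    """Declare a durable queue that supports priorities up to `max_priority`."""
    if not 1 <= max_priority <= 255:
        raise ValueError("x-max-priority must be between 1 and 255")
    return channel.queue_declare(
        queue=queue_name,
        durable=True,
        arguments={"x-max-priority": max_priority},
    )


def clamp_priority(priority, max_priority=MAX_PRIORITY):
    """Keep a message priority inside the range the queue supports."""
    return max(0, min(priority, max_priority))
```

Publishers then set the `priority` message property on each message, for example with `pika.BasicProperties(priority=clamp_priority(3))`.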

Guide - RabbitMQ Best Practice

Continue with part 4: 13 common RabbitMQ mistakes

Go back to part 2: RabbitMQ Best practice for High Performance

Author: Jonah Leffler
