Dead Letter Queue

A Dead Letter Queue is a Design Pattern where one moves messages to a dedicated Queue called “the Dead Letter Queue” if the message meets one or more Exception criteria.

Exception criteria

Queue does not exist

If the message is sent to a Queue that does not exist, it could be sent to the Dead Letter Queue.

This would have to be implemented on the Event Plane.

Queue length limit exceeded

Sometimes there is a limit to how many messages a Queue can hold. So if the Queue fills up because the messages are not being processed (or not being processed as fast as they are produced), then new messages cannot be placed in the Queue. To not lose those messages, they could be placed in the Dead Letter Queue instead.

This would have to be implemented on the Event Plane.
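
As a rough illustration of the overflow behaviour described above, the sketch below simulates it in memory with Python's standard `queue` module. The names `work_queue`, `dead_letter_queue` and `publish` are hypothetical, and a real Event Plane would do this inside the broker rather than in application code.

```python
import queue

# Hypothetical in-memory stand-ins for broker queues.
work_queue = queue.Queue(maxsize=2)   # bounded: holds at most 2 messages
dead_letter_queue = queue.Queue()     # unbounded for the purpose of this sketch


def publish(message):
    """Try the work queue first; divert to the Dead Letter Queue when it is full."""
    try:
        work_queue.put_nowait(message)
        return "queued"
    except queue.Full:
        # The queue length limit is exceeded, so keep the message elsewhere.
        dead_letter_queue.put_nowait(message)
        return "dead-lettered"


# Nothing consumes from work_queue here, so it fills up after two messages.
results = [publish(f"msg-{i}") for i in range(4)]
```

Here the first two messages fit in the Queue and the remaining two end up in the Dead Letter Queue instead of being lost.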

Message or Queue size limit exceeded

Sometimes there are restrictions to the size a message can have, or the maximum size a Queue can have. In either of those cases, if that limit is exceeded then new messages cannot be placed on the Queue. To not lose them, they could be placed in the Dead Letter Queue instead.

This would have to be implemented on the Event Plane. Some implementations, though, do not support messages above a certain size at all (AWS SQS, for example, has long capped messages at 256 KiB). If a message cannot be placed in the Queue because it is too big, it cannot be placed in the Dead Letter Queue either. To not lose that message, something other than a Queue, like a file system, has to serve as the Dead Letter Queue. This would have to be implemented in the Application Layer, though.
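
A minimal sketch of such an Application Layer fallback could look as follows. The function `send_or_park`, the `send` callback and the directory layout are assumptions made for illustration, and the 256 KiB constant is only an example limit.

```python
import tempfile
import uuid
from pathlib import Path

MAX_MESSAGE_BYTES = 256 * 1024  # assumed size limit for this sketch


def send_or_park(body: bytes, send, dead_letter_dir: Path) -> str:
    """Send the message if it fits; otherwise park it as a file on disk."""
    if len(body) <= MAX_MESSAGE_BYTES:
        send(body)
        return "sent"
    # Too big for any Queue, so the file system acts as the Dead Letter Queue.
    dead_letter_dir.mkdir(parents=True, exist_ok=True)
    path = dead_letter_dir / f"{uuid.uuid4().hex}.msg"
    path.write_bytes(body)
    return str(path)


sent = []  # hypothetical stand-in for the real Queue
dead_letter_dir = Path(tempfile.mkdtemp()) / "dead-letters"
small = send_or_park(b"x" * 10, sent.append, dead_letter_dir)
large = send_or_park(b"x" * (MAX_MESSAGE_BYTES + 1), sent.append, dead_letter_dir)
```

The small message goes to the Queue as usual, while the oversized one is written to a file whose path is returned to the caller.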

Message is rejected by another Queue exchange

Some Event Plane implementations, like RabbitMQ, support explicitly rejecting a message (which could be done as part of Exception Handling). The Event Plane could be configured to send those messages to the Dead Letter Queue instead of discarding them or putting them back in the Queue.

Message reaches a threshold read counter

A “Read Limit” is a feature that some Event Plane implementations provide to address the Racetrack Problem in the Middleware Layer instead of the Application Layer. If a message has been picked up from the Queue and put back a configured number of times (because an Exception occurred while processing it, for example), it will probably never get processed. Moving it to the Dead Letter Queue at that point avoids the Racetrack Problem.
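
The mechanism can be sketched with an in-memory simulation. The names `drain`, `MAX_RECEIVE_COUNT` and the envelope layout are hypothetical, and a real Event Plane would keep the counter itself.

```python
from collections import deque

MAX_RECEIVE_COUNT = 3  # assumed threshold for this sketch


def drain(messages, handler, max_receive_count=MAX_RECEIVE_COUNT):
    """Process messages; dead-letter any message received too often."""
    main_queue = deque({"body": m, "receive_count": 0} for m in messages)
    dead_letter_queue = []
    processed = []
    while main_queue:
        envelope = main_queue.popleft()
        envelope["receive_count"] += 1
        try:
            handler(envelope["body"])
            processed.append(envelope["body"])
        except Exception:
            if envelope["receive_count"] >= max_receive_count:
                # Threshold reached: stop retrying and park the message.
                dead_letter_queue.append(envelope)
            else:
                # Put the message back in the Queue for another attempt.
                main_queue.append(envelope)
    return processed, dead_letter_queue


def handler(body):
    if body == "poison":
        raise ValueError("cannot process")


processed, dead = drain(["a", "poison", "b"], handler)
```

The message that always fails is retried until its counter reaches the threshold and is then moved aside, so the loop ends instead of cycling forever.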

The message expires

Many Event Plane implementations support a time to live (TTL) for messages. This should always be applied when Request-Response is used, and could be applied when Publish-Subscribe is used for notifications.

One could possibly configure the Event Plane to move expired messages to a Dead Letter Queue. The real question is whether one should. I think the answer is usually “No”. After all, the TTL was deliberately set because after that time the message is no longer relevant. So if the expired TTL renders the message irrelevant, why retain it?

The message is not processed correctly

If a message fails to process correctly, because of an Exception for example, the default operation is to put the message back in the Queue. This, however, could easily result in the Racetrack Problem, where the message gets processed over and over again and fails every time.

Instead of putting the message back in the Queue, it could be put in the Dead Letter Queue instead. This, however, would have to be implemented in the Application Layer, as that is the lowest level that is aware that the message fails to be processed. It is somewhat tricky to implement though:

You are already in an Exception state, so the code that moves the message can itself fail.
If Guaranteed Delivery is one of the requirements, meeting that requirement is not trivial when removing a message from one Queue and putting it in the Dead Letter Queue. It will be even less trivial if Exactly Once is also a requirement.

In those cases the simpler solution might be the one described under “Message reaches a threshold read counter”.
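
If the Application Layer approach is chosen anyway, the order of operations matters. The sketch below copies the failed message to the Dead Letter Queue first and only then acknowledges it on the source Queue. `InMemoryQueue` and `consume_one` are hypothetical stand-ins, not a real broker API. This ordering aims at Guaranteed Delivery but can produce duplicates, so it does not give Exactly Once.

```python
import itertools


class InMemoryQueue:
    """Hypothetical stand-in for a broker queue with explicit acknowledgement."""

    def __init__(self):
        self.ready = []       # messages waiting to be received
        self.in_flight = {}   # received but not yet acknowledged
        self._receipts = itertools.count(1)

    def send(self, body):
        self.ready.append(body)

    def receive(self):
        receipt = next(self._receipts)
        self.in_flight[receipt] = self.ready.pop(0)
        return receipt, self.in_flight[receipt]

    def ack(self, receipt):
        del self.in_flight[receipt]


def consume_one(source, dead_letter_queue, handler):
    receipt, body = source.receive()
    try:
        handler(body)
    except Exception as error:
        # Copy first, acknowledge second. If this send raises, the ack below
        # is skipped and the message stays unacknowledged on the source.
        dead_letter_queue.send({"body": body, "error": repr(error)})
    source.ack(receipt)


def handler(body):
    if body == "bad":
        raise ValueError("boom")


source, dlq = InMemoryQueue(), InMemoryQueue()
source.send("ok")
source.send("bad")
consume_one(source, dlq, handler)
consume_one(source, dlq, handler)
```

If the process crashes between the send and the acknowledgement, the message exists in both Queues, which is why Exactly Once needs extra measures such as deduplication.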

Handling the Dead Letter Queue

Redirecting messages to a Dead Letter Queue is only useful if one fixes the issues that got the messages in the Dead Letter Queue in the first place, and then reprocesses them, or deliberately decides to discard them.

Especially if there is an Incident that causes messages to be directed to the Dead Letter Queue, there could be a lot of messages there that need to be processed. One should think carefully about how to implement that.

There are two different approaches to handling the Dead Letter Queue.

One Dead Letter Queue for all messages

The advantage of this approach is that one would have one place where failed messages end up. This means that one also only has to Monitor one Queue for failed messages.

Because every kind of failed message now ends up in one Queue, and because there can be a lot of them, Tooling is required to fix and reprocess messages efficiently. If one chooses this approach, one should pick an Event Plane implementation that provides adequate tools.
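
One thing that can make a shared Dead Letter Queue easier to work with is recording where each message came from and why it failed. The sketch below assumes a hypothetical envelope with `source_queue` and `reason` fields and uses them for a simple triage summary.

```python
from collections import Counter

dead_letter_queue = []  # hypothetical in-memory stand-in for the shared queue


def dead_letter(body, source_queue, reason):
    """Wrap the failed message with the metadata needed to triage it later."""
    dead_letter_queue.append(
        {"body": body, "source_queue": source_queue, "reason": reason}
    )


dead_letter("order-1", "orders", "max receives exceeded")
dead_letter("order-2", "orders", "max receives exceeded")
dead_letter("invoice-9", "invoices", "rejected")

# Group failures by origin and cause to see what needs fixing first.
summary = Counter((m["source_queue"], m["reason"]) for m in dead_letter_queue)
```

With this metadata in place, tooling can filter the shared Queue per source Queue or per failure reason before reprocessing.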

Multiple dedicated Dead Letter Queues

In this scenario every Queue that makes use of Dead Letter Queues gets its own dedicated Dead Letter Queue.

The advantage of this approach is that you always know what kind of message you’re dealing with, because each Dead Letter Queue contains only one kind of message. If the problem is fixed and the messages need to be reprocessed, that, too, is easy, because one knows what Queue the messages came from.
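
Reprocessing from a dedicated Dead Letter Queue can be sketched as follows. The `.dlq` naming convention, the `redrive` function and the dictionary of lists are assumptions for illustration only.

```python
DLQ_SUFFIX = ".dlq"  # assumed naming convention: "<queue>.dlq"


def redrive(queues, dlq_name):
    """Move every message from a dedicated Dead Letter Queue back to its source."""
    if not dlq_name.endswith(DLQ_SUFFIX):
        raise ValueError(f"{dlq_name!r} is not a Dead Letter Queue name")
    # The source Queue is known from the name alone.
    source_name = dlq_name[: -len(DLQ_SUFFIX)]
    moved = 0
    while queues[dlq_name]:
        queues[source_name].append(queues[dlq_name].pop(0))
        moved += 1
    return moved


# Hypothetical in-memory queues: one work queue and its Dead Letter Queue.
queues = {"orders": [], "orders.dlq": ["order-1", "order-2"]}
moved = redrive(queues, "orders.dlq")
```

Because the source Queue follows from the name of the Dead Letter Queue, no per-message metadata is needed to route the messages back.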

The disadvantage of this approach is that you will have as many Dead Letter Queues as there are Queues. That many Queues are harder to Monitor than just one Queue. Also, if there are many Dead Letter Queues and many different kinds of failed messages, one has to jump back and forth between a lot of Dead Letter Queues to resolve all issues.

Another thing to take into account is that not all Event Plane implementations allow for configuring which Queue to use as the Dead Letter Queue in different scenarios. As a result, the Dead Letter Queue Pattern may have to be implemented in the Application Layer instead of in the Middleware Layer.
