Command and Query Responsibility Segregation

Posted September 11, 2020 - 5 min read

In my previous post about microservices, we reviewed both monolithic and microservices architectures. Today we are going to discuss the CQRS (Command and Query Responsibility Segregation) pattern. We will start by understanding the problem CQRS is trying to solve. Then we will introduce the CQRS pattern and analyse its costs and benefits.

What is the problem CQRS is trying to solve?

To clearly understand the problem, we are going to consider a typical client-server application (monolith), which usually consists of 3 layers: an API layer (data access), a business layer (different services), and a data storage layer (database). Data is exchanged between the client and the server in a predefined format.


    ┌──────────────┐
    │DATABASE LAYER│
    └──────┬───────┘
           │
    ┌──────┴───────┐
    │BUSINESS LAYER│
    └──────┬───────┘
           │
      ┌────┴────┐
      │API LAYER│
      └───▲──┬──┘
  REQUEST │  │
          │  │ RESPONSE
        ┌─┴──▼─┐
        │CLIENT│
        └──────┘

As our application evolves over time, both the number of clients and the size of the data may grow. Consequently, the number of database queries (reads and writes) also increases.

Depending on the domain of our application, the number of read queries may be greater than the number of write queries (or vice versa). So we may end up in a situation where our database is busy all the time serving read queries, while write queries take longer to execute.

It becomes clear that one of the paths we need to take in order to scale up our application is to optimize database read and write queries.

There are many solutions to this problem, such as caching, sharding strategies, the master/slave pattern, message queuing, etc. But in this post we are going to focus on the CQRS model.

CQRS

The idea behind the CQRS model is to split the read operations and the write operations (data mutations) into different systems. In the world of CQRS, the read operations are called Queries and the write operations are called Commands.


                  BACKGROUND THREAD / MESSAGE QUEUE
    ───────────▲────────────────────────────────────┬─────────────
               │                                    │
               │                                    │
       ┌───────┴─────────┐                  ┌───────▼────────┐
       │  DATA STORAGE   │                  │  DATA STORAGE  │
       │(WRITE OPTIMIZED)│                  │(READ OPTIMIZED)│
       └───────┬─────────┘                  └───────┬────────┘
               │                                    │
               │                                    │
     ┌─────────┴───────────┐              ┌─────────┴───────────┐
     │     WRITE LAYER     │              │     READ LAYER      │
     │(APPLICATION SERVICE)│              │(APPLICATION SERVICE)│
     └─────────▲───────────┘              └─────────▲───────────┘
               │                                    │
            SEND                                  SEND
           COMMANDS                              QUERIES
               │               ┌──────┐   RETURN    │
               └────ACKS/──────►CLIENT◄────DATA─────┘
                   UNACKS      └──────┘

The read layer and the write layer store data in different storages, which are isolated from each other. Each storage may use its own data schema. For example, the read storage may use materialized views to optimize complex queries and avoid multiple joins, or may even use a document-oriented database. The write storage, on the other hand, may use a relational database to fulfill ACID requirements.
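
To make the split concrete, here is a minimal sketch in Python. All names (RenameUser, GetUserName, WriteService, ReadService) are hypothetical and chosen only for illustration, and plain dictionaries stand in for the two storages. The step that copies changes from the write storage to the read storage is left out on purpose.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RenameUser:
    """Command: asks the system to change state and returns no data."""
    user_id: int
    new_name: str


@dataclass(frozen=True)
class GetUserName:
    """Query: asks for data and must not change state."""
    user_id: int


class WriteService:
    """Handles commands against the write-side storage only."""

    def __init__(self, write_store: Dict[int, Dict[str, str]]) -> None:
        self._store = write_store

    def handle(self, command: RenameUser) -> bool:
        user = self._store.get(command.user_id)
        if user is None:
            return False  # "unack": the command is rejected
        user["name"] = command.new_name
        return True  # "ack": the command is accepted


class ReadService:
    """Handles queries against the read-side storage only."""

    def __init__(self, read_store: Dict[int, str]) -> None:
        self._store = read_store

    def handle(self, query: GetUserName) -> Optional[str]:
        return self._store.get(query.user_id)


# Two isolated storages, each with its own shape.
write_store = {1: {"name": "Ada"}}
read_store = {1: "Ada"}

writer = WriteService(write_store)
reader = ReadService(read_store)

ack = writer.handle(RenameUser(user_id=1, new_name="Grace"))
name_seen_by_clients = reader.handle(GetUserName(user_id=1))
```

In this sketch the command is acknowledged, yet the query still returns the old name, because nothing has propagated the change to the read storage yet.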

It is important to note that data in the read layer is eventually consistent and may not immediately reflect changes made in the write storage. Changes from the write layer become visible only after some delay.

In fact, whenever the write layer updates the database, it also publishes an event within the same transaction. The event is then processed asynchronously using a background thread or a message queue. No matter which option is used, it should guarantee the recovery of unprocessed events in case of failure. Once the event is processed, the read layer updates its data.
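
One common way to get this behaviour is to store the event in an "outbox" table in the same database transaction as the state change. The sketch below assumes this approach and uses Python's built-in sqlite3 module. The table names (accounts, outbox), the function names, and the dictionary used as the read storage are all hypothetical; a real system would run process_outbox in a background thread or replace it with a message queue consumer.

```python
import json
import sqlite3

# Write-side database (in memory, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT NOT NULL,
        processed INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO accounts (id, balance) VALUES (1, 100);
""")

# Read-side storage: a separate structure with its own shape.
read_view = {1: 100}


def deposit(account_id, amount):
    """Write side: change state and record the event in ONE transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, account_id),
        )
        row = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        if row is None:
            raise ValueError("unknown account")
        event = {"type": "BalanceChanged", "account_id": account_id, "balance": row[0]}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def process_outbox():
    """Worker: apply unprocessed events to the read side, oldest first."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE processed = 0 ORDER BY id"
    ).fetchall()
    for event_id, payload in rows:
        event = json.loads(payload)
        # The event carries the resulting balance, so applying it twice
        # after a crash leaves the read view in the same state.
        read_view[event["account_id"]] = event["balance"]
        with conn:
            conn.execute("UPDATE outbox SET processed = 1 WHERE id = ?", (event_id,))
    return len(rows)
```

Calling deposit(1, 50) leaves read_view unchanged until process_outbox() runs. If the worker crashes before marking an event as processed, the event stays in the outbox and is picked up on the next run, which is the recovery property described above.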

Separating the read and the write layers allows each layer to be scaled independently, depending on its load. For example, when the read layer encounters a higher load than the write layer, running multiple replicas of the read layer can increase query performance.

CQRS and Event Sourcing

Event sourcing

In a few words, event sourcing is a pattern of storing data as a sequence of events in an append-only storage. Each event represents an action in the system with a given payload. The event store acts as the single source of truth for different consumers. Each consumer maintains a materialized view by replaying past events to build the current state of an entity. The materialized views can be seen as a cache of the data.
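
The replay idea can be sketched in a few lines of Python. The shopping-cart domain, the event kinds (ItemAdded, ItemRemoved), and the class and function names are assumptions made for this example; an in-memory list stands in for the append-only storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    entity_id: str
    kind: str      # "ItemAdded" or "ItemRemoved" in this sketch
    payload: dict  # {"sku": str, "qty": int}


class EventStore:
    """Append-only log: events are added, never updated or deleted."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def read_all(self):
        return tuple(self._events)


def apply(state, event):
    """Return the new state produced by applying one event to the old state."""
    items = dict(state)
    sku, qty = event.payload["sku"], event.payload["qty"]
    if event.kind == "ItemAdded":
        items[sku] = items.get(sku, 0) + qty
    elif event.kind == "ItemRemoved":
        remaining = items.get(sku, 0) - qty
        if remaining > 0:
            items[sku] = remaining
        else:
            items.pop(sku, None)
    return items


def replay(store, entity_id):
    """Rebuild the current state of one entity from its past events."""
    state = {}
    for event in store.read_all():
        if event.entity_id == entity_id:
            state = apply(state, event)
    return state


store = EventStore()
store.append(Event("cart-1", "ItemAdded", {"sku": "book", "qty": 2}))
store.append(Event("cart-1", "ItemAdded", {"sku": "pen", "qty": 1}))
store.append(Event("cart-1", "ItemRemoved", {"sku": "book", "qty": 1}))
cart = replay(store, "cart-1")
```

The state is never stored directly here; it is derived from the log each time, which is why a materialized view can be thrown away and rebuilt.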

Combining CQRS with Event Sourcing

Often the CQRS pattern is used together with event sourcing. In that case, the data storage of the write layer becomes the event store, and the data in the read storage represents materialized views (denormalized views) that are highly tailored to match the UI requirements.
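
As a rough illustration of the combination, the Python sketch below lets the write side append events to a log while a projection on the read side turns them into a denormalized view shaped for a hypothetical order-list screen. The event types, the handle_add_item function, and the OrderSummaryProjection class are invented for this example, and a plain list stands in for the event store.

```python
def handle_add_item(event_log, order_id, price):
    """Write side: validate the command, then append an event to the log."""
    placed = any(
        e["type"] == "OrderPlaced" and e["data"]["order_id"] == order_id
        for e in event_log
    )
    if not placed:
        return False  # reject: the order does not exist
    event_log.append({"type": "ItemAdded", "data": {"order_id": order_id, "price": price}})
    return True


class OrderSummaryProjection:
    """Read side: one row per order, ready to display without joins."""

    def __init__(self):
        self.rows = {}     # order_id -> {"customer", "total", "status"}
        self.position = 0  # number of events already applied

    def catch_up(self, event_log):
        """Apply only the events that arrived since the last call."""
        for event in event_log[self.position:]:
            self._apply(event)
            self.position += 1

    def _apply(self, event):
        kind, data = event["type"], event["data"]
        if kind == "OrderPlaced":
            self.rows[data["order_id"]] = {
                "customer": data["customer"],
                "total": 0,
                "status": "open",
            }
        elif kind == "ItemAdded":
            self.rows[data["order_id"]]["total"] += data["price"]
        elif kind == "OrderShipped":
            self.rows[data["order_id"]]["status"] = "shipped"


event_log = [{"type": "OrderPlaced", "data": {"order_id": "o1", "customer": "Ada"}}]
handle_add_item(event_log, "o1", 30)

view = OrderSummaryProjection()
view.catch_up(event_log)
```

Because the view is derived entirely from the log, a new projection with a different shape can be added later and filled by calling catch_up on the same events from position 0.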

Considering the costs and benefits of using CQRS

When implemented properly, CQRS provides the following advantages:

  • Independent scaling of the read and write sides, for both data storage and application load.
  • Optimized read and write queries (storages).
  • Adherence to the single responsibility principle and separation of concerns.

However, when deciding to implement CQRS, it is important to consider:

  • The complexity that it introduces. It is not recommended to use CQRS with simple CRUD applications and applications with small to medium data size.
  • Data inconsistency issues. As noted before, the read storage is eventually consistent. If your application cannot tolerate reading data that does not reflect the current state, CQRS is not the way to go.

Conclusion

In this post we have reviewed CQRS and looked at several of its aspects to get familiar with the pattern.

CQRS can be an efficient solution for scaling out your application. When combined with event sourcing, it may provide more advantages, such as having multiple versions of your application models.

CQRS with event sourcing can also be useful in cases where the requirements of your application's business layer change often, giving you the possibility to rebuild the materialized views or to add new ones.