Event Sourcing

Instead of storing just the current state of the data in a domain, use an append-only store to record the full series of actions taken on that data. The store acts as the system of record and can be used to materialize the domain objects.
- Event Sourcing pattern

Comparison between CRUD Model and Event Sourcing

Consider a scenario in which a user takes the following actions when submitting a blog post.

1. Create a draft with title "Hello World" and content "Nice to meet you!".
2. Publish the draft.
3. Update the title to "Hello Universe".

In the traditional CRUD model, there is only one database record for the blog post. After all three actions, the record looks like this:

| title | content | is_published |
| --- | --- | --- |
| "Hello Universe" | "Nice to meet you!" | true |

In event sourcing, each action is translated into an event. After all three actions, there are three corresponding events in the database.

| stream_uuid | stream_version | event_type | event_data |
| --- | --- | --- | --- |
| "post:8dae5398-1914-45e3-957d-3d3b4fa23342" | 1 | "PostCreated" | {"title": "Hello World", "content": "Nice to meet you!"} |
| "post:8dae5398-1914-45e3-957d-3d3b4fa23342" | 2 | "PostPublished" | {} |
| "post:8dae5398-1914-45e3-957d-3d3b4fa23342" | 3 | "PostUpdated" | {"title": "Hello Universe"} |

Each blog post has a unique stream_uuid. The stream_version determines the order of the events within that stream.

To determine the current state of the blog post, we need to retrieve all the events from the database and apply them sequentially.

State after applying PostCreated.

{
  "uuid": "8dae5398-1914-45e3-957d-3d3b4fa23342",
  "title": "Hello World",
  "content": "Nice to meet you!",
  "is_published": false
}

State after applying PostPublished.

{
  "uuid": "8dae5398-1914-45e3-957d-3d3b4fa23342",
  "title": "Hello World",
  "content": "Nice to meet you!",
  "is_published": true
}

State after applying PostUpdated.

{
  "uuid": "8dae5398-1914-45e3-957d-3d3b4fa23342",
  "title": "Hello Universe",
  "content": "Nice to meet you!",
  "is_published": true
}
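
The replay described above can be sketched as a fold over the stream. This is a minimal illustration, not a reference implementation: the `Post` class, the `apply` and `load_post` functions, and the version numbers 1 to 3 are assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import Optional

STREAM = "post:8dae5398-1914-45e3-957d-3d3b4fa23342"

# Rows as (stream_uuid, stream_version, event_type, event_data).
# The version numbers 1..3 are an assumption for this sketch.
EVENTS = [
    (STREAM, 1, "PostCreated", {"title": "Hello World", "content": "Nice to meet you!"}),
    (STREAM, 2, "PostPublished", {}),
    (STREAM, 3, "PostUpdated", {"title": "Hello Universe"}),
]


@dataclass(frozen=True)
class Post:
    uuid: str
    title: str
    content: str
    is_published: bool = False


def apply(state: Optional[Post], stream_uuid: str, event_type: str, data: dict) -> Post:
    """Return the next state after one event; never mutates the old state."""
    if event_type == "PostCreated":
        uuid = stream_uuid.split(":", 1)[1]
        return Post(uuid=uuid, title=data["title"], content=data["content"])
    if state is None:
        raise ValueError(f"{event_type} arrived before PostCreated")
    if event_type == "PostPublished":
        return replace(state, is_published=True)
    if event_type == "PostUpdated":
        # Only the fields present in the event payload are overwritten.
        return replace(state, **data)
    raise ValueError(f"unknown event type: {event_type}")


def load_post(events) -> Post:
    """Rebuild the current state by applying events in stream_version order."""
    state = None
    for stream_uuid, _version, event_type, data in sorted(events, key=lambda e: e[1]):
        state = apply(state, stream_uuid, event_type, data)
    if state is None:
        raise ValueError("empty stream")
    return state
```

Calling `load_post(EVENTS)` yields the same final state as the last JSON document above. Keeping `apply` free of side effects means the same events always produce the same state, which is what makes replay safe to repeat.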

Benefits

Auditability: Because every change is recorded as an event, it is easy to tell what happened, when it happened, and who is accountable.
Analyzability: Different metrics or views can be generated by replaying the events.
Event-driven: Event sourcing is one way to achieve an event-driven architecture, and it brings that architecture's benefits, such as scalability, integration capability, and resilience.
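
As a rough illustration of auditability and analyzability, the sketch below replays a stream to derive two views that the CRUD record cannot provide. The `actor` and `recorded_at` fields, their values, and the function names are hypothetical; the event table earlier on this page does not include such metadata.

```python
from collections import Counter

# Hypothetical event rows: "actor" and "recorded_at" are assumed metadata
# fields for this sketch, with made-up values.
EVENTS = [
    {"stream_version": 1, "event_type": "PostCreated",
     "event_data": {"title": "Hello World", "content": "Nice to meet you!"},
     "actor": "alice", "recorded_at": "2024-01-01T09:00:00Z"},
    {"stream_version": 2, "event_type": "PostPublished", "event_data": {},
     "actor": "alice", "recorded_at": "2024-01-01T09:05:00Z"},
    {"stream_version": 3, "event_type": "PostUpdated",
     "event_data": {"title": "Hello Universe"},
     "actor": "bob", "recorded_at": "2024-01-02T14:30:00Z"},
]


def title_history(events):
    """Audit view: every title the post has had, with who set it and when."""
    history = []
    for event in sorted(events, key=lambda e: e["stream_version"]):
        title = event["event_data"].get("title")
        if title is not None:
            history.append((event["recorded_at"], event["actor"], title))
    return history


def events_per_actor(events):
    """Metric view: how many events each actor produced."""
    return Counter(event["actor"] for event in events)
```

Both views are computed from the same stored events, so a new view can be added later without changing how events are written.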

Considerations

Complexity: Compared to the CRUD model, event sourcing adds the complexity of tracking every change as a separate event.
Event Evolution: Events are append-only, so the schema of existing events cannot be changed. To accommodate new business requirements, new event schemas are introduced from time to time, while the old schemas still have to be supported.
Eventual Consistency: Event sourcing is usually implemented together with the CQRS pattern, in which the read store is often eventually consistent.
Entity Uniqueness: It is a common requirement that, for example, every user must have a unique email. An event-sourced system cannot easily enforce such uniqueness without adding further complexity.
De-identification: Data protection regulations like GDPR require the system to be able to "forget" user data upon request. Deleting or anonymizing the affected events needs careful design.
Storage: The number of events grows indefinitely, and so do the storage and computing requirements. Although storage gets cheaper over time, keeping all cold data in hot storage such as SSDs is still a concern.
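
One common way to cope with event evolution is to upgrade old events when they are read, often called upcasting. The sketch below is only an illustration under assumptions: the `schema_version` number, the `tags` field, and the function names are hypothetical and do not appear in the example events on this page.

```python
# Hypothetical schema change: version 2 of PostCreated adds a "tags" list.
# Stored version 1 events are never rewritten; they are upgraded on read.

def upcast_post_created_v1(data: dict) -> dict:
    """Turn a version 1 PostCreated payload into the version 2 shape."""
    return {**data, "tags": []}


# Maps (event_type, schema_version) to (next_version, upgrade_function).
UPCASTERS = {
    ("PostCreated", 1): (2, upcast_post_created_v1),
}


def upcast(event_type: str, schema_version: int, data: dict):
    """Apply upgrade steps until no newer schema version is registered."""
    while (event_type, schema_version) in UPCASTERS:
        schema_version, step = UPCASTERS[(event_type, schema_version)]
        data = step(data)
    return schema_version, data
```

With this approach the rest of the code only has to understand the latest schema, while the stored events stay untouched. The cost is that every old schema needs an upgrade step that is kept for as long as those events exist.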

Terminology

Designing an event-sourced system involves a fair amount of jargon.

Event Stream: An event stream is a group of related sequential events for an aggregate instance.
Aggregate: A domain-driven design (DDD) term for a container that holds the logic for a group of related entities. Think of it as a document in a document database. For example, a post aggregate can manage the post entity and its related comment entities.
Projection: A projection is a materialized view of the events. For example, a view_count projection can subscribe to the PostViewed event and produce the number of views per post.
Snapshot: A snapshot is a cache of the aggregate's internal state. Over time, an aggregate instance may be built from thousands of events, so computing the current state takes longer. If a snapshot exists, only the events created after it have to be applied.
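
The projection and snapshot ideas above can be sketched in a few lines. This is a simplified, in-memory illustration: the class and function names are assumptions, and a real system would query the store for only the events after the snapshot version instead of filtering a list.

```python
from collections import defaultdict


class ViewCountProjection:
    """Materialized view: number of PostViewed events per stream."""

    def __init__(self):
        self.counts = defaultdict(int)

    def handle(self, stream_uuid, event_type, event_data):
        # The projection ignores every event type it is not interested in.
        if event_type == "PostViewed":
            self.counts[stream_uuid] += 1


def apply(state, event_type, data):
    """Return the next post state after one event."""
    if event_type == "PostCreated":
        return {**data, "is_published": False}
    if event_type == "PostPublished":
        return {**state, "is_published": True}
    if event_type == "PostUpdated":
        return {**state, **data}
    return state  # events that do not change the post, such as PostViewed


def load(events, snapshot=None):
    """Rebuild state, starting from a snapshot when one exists.

    events:   list of (stream_version, event_type, event_data), in order
    snapshot: optional (stream_version, state) pair
    """
    version, state = snapshot if snapshot else (0, None)
    for event_version, event_type, data in events:
        if event_version > version:  # skip events already in the snapshot
            state = apply(state, event_type, data)
            version = event_version
    return version, state
```

The value returned by `load` has the same shape as the `snapshot` argument, so it can be stored and passed back in on the next load. Because a snapshot is only a cache, it can be discarded and rebuilt from the events at any time.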

Notes

Event sourcing is a good way to achieve both auditability and an event-driven architecture, but it requires careful planning; without it, the system takes considerable effort to maintain. Alternatively, consider the traditional CRUD model plus the outbox pattern to achieve the same goals.
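
For comparison, the CRUD plus outbox alternative can be sketched with SQLite from the Python standard library. The table layout and function names below are assumptions for illustration only; the key point is that the state change and the event record are written in the same transaction.

```python
import json
import sqlite3


def open_db():
    """Create an in-memory database with a CRUD table and an outbox table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts ("
        "uuid TEXT PRIMARY KEY, title TEXT, content TEXT, is_published INTEGER)"
    )
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, event_type TEXT, event_data TEXT)"
    )
    return conn


def update_title(conn, uuid, title):
    """Update the post and record the event atomically."""
    # "with conn" commits both statements together, or rolls both back on error.
    with conn:
        conn.execute("UPDATE posts SET title = ? WHERE uuid = ?", (title, uuid))
        conn.execute(
            "INSERT INTO outbox (event_type, event_data) VALUES (?, ?)",
            ("PostUpdated", json.dumps({"uuid": uuid, "title": title})),
        )
```

A separate relay process, not shown here, would read rows from the outbox table and publish them to other systems. The posts table remains the system of record, so the current state is available without replaying events.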
