Designing a Schema (Avro)

Schemas should be designed to match entities that exist within sub-domains. Avoid creating separate topics for individual C(R)UD (Create, Read/Retrieve, Update, Delete) events. Apply the following guidelines:

  • Create one topic for one data type (e.g. SeatAvailability, NationalStopPlace, ProductOffer, SalesOrder)
  • Set the key for the topic to be the id for the entity. Keys should always be provided, including for deletions.
  • Recommended: Include a version number (or timestamp) in the value, so that consumers can tell which state of an entity is newer

The reasoning for this is twofold:

  1. Keep it simple. Matching topic names to data types makes it easier to locate relevant data.
  2. Spreading events for a domain object across multiple topics quickly makes it unwieldy to retrieve and maintain the object's proper state, and it undermines helpful Kafka functionality (for example, per-key ordering and log compaction, which both operate within a single topic).

This pattern lets a consumer subscribe to a single topic for a given data type, knowing that any updates will be published to the same topic. This includes deletions, which are represented as null payloads.
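
As a rough illustration of the consumer side, the sketch below folds a keyed stream into the latest state per entity. It is not tied to any Kafka client library: the `Record` tuple, the `materialize` function, and the sample order ids are hypothetical names chosen for this example, and the records are plain Python dicts rather than decoded Avro values. It assumes each value carries a `version` field, as recommended above.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical record shape: (key, value). A value of None stands for a
# null payload, i.e. a deletion of the entity identified by the key.
Record = Tuple[str, Optional[dict]]


def materialize(records: Iterable[Record]) -> Dict[str, dict]:
    """Fold a keyed stream from one topic into the latest state per entity."""
    state: Dict[str, dict] = {}
    for key, value in records:
        if value is None:
            # Null payload: the entity was deleted, so drop it from the state.
            state.pop(key, None)
            continue
        current = state.get(key)
        # Ignore updates that are not newer than what we already hold.
        if current is not None and value["version"] <= current["version"]:
            continue
        state[key] = value
    return state


events = [
    ("order-1", {"orderid": "order-1", "version": 1}),
    ("order-2", {"orderid": "order-2", "version": 1}),
    ("order-1", {"orderid": "order-1", "version": 2}),
    ("order-2", None),  # deletion of order-2
]
orders = materialize(events)
```

Because creations, updates, and deletions all arrive on the same topic under the same key, the consumer needs only this one loop to keep its view current.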

{
  "namespace": "no.entur.sales.order",
  "type": "record",
  "name": "SalesOrder",
  "doc": "The state of the order made by a customer, null when deleted",
  "fields": [
    {
      "name": "orderid",
      "type": "string",
      "doc": "Unique id for each order in this system"
    }, {
      "name": "version",
      "type": "int",
      "doc": "Version of entity, 1 when created then 2,3,.. on updates"
    },
    ..
  ]
}
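
To show how the key, the version, and the null payload fit together for this schema, here is a minimal producer-side sketch. The helper names `upsert_event` and `delete_event` are hypothetical, only the two fields shown above are populated (the elided fields are left out), and JSON encoding is used as a stand-in for Avro serialization so the example runs with the standard library alone.

```python
import json
from typing import Optional, Tuple

# Hypothetical message shape: (key bytes, value bytes or None).
Message = Tuple[bytes, Optional[bytes]]


def upsert_event(orderid: str, version: int) -> Message:
    """Create or update: the key is the entity id, the value is its state."""
    value = {"orderid": orderid, "version": version}
    # JSON is a stand-in here; a real producer would serialize with Avro.
    return orderid.encode("utf-8"), json.dumps(value).encode("utf-8")


def delete_event(orderid: str) -> Message:
    """Delete: the key is still provided, the payload is null."""
    return orderid.encode("utf-8"), None


# One order's life cycle, all destined for the same topic.
lifecycle = [
    upsert_event("order-1", 1),  # created
    upsert_event("order-1", 2),  # updated
    delete_event("order-1"),     # deleted
]
```

Every message in the life cycle carries the same key, which is what lets consumers correlate the deletion with the earlier states of that order.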