Modern Data Architectures

A Modern Data Architecture (MDA) allows you to process real-time streaming events in addition to more traditional data pipelines.

StrataHive recommends two primary approaches to building an MDA for your organization, each with its own strengths and weaknesses. The first, the Lambda Architecture, has two components: batch processing and stream processing. The second, the Kappa Architecture, treats all data in your environment as a stream.

The Analytics Continuum

In the ever-changing world of data and analytics, it can be challenging to assess how your organization compares to the rest of the market and how to frame your data strategy. To help with that comparison, Gartner defined “The Analytics Continuum,” which lays out seven high-level tasks and plots them against analytics maturity and competitive advantage.


Lambda Data Architecture

The main advantage of implementing a Lambda-based MDA is that you can typically keep your existing batch ETL processes as the batch component. The exception is when your existing systems cannot handle the data throughput your organization is seeing.

A well-known weakness of the Lambda Data Architecture is that you must manage and maintain two separate systems, one batch and one streaming, to acquire data.
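The two components described above can be sketched in a few lines. This is a minimal, in-memory illustration and not a reference implementation: the names `Event`, `batch_view`, `SpeedView`, and `query`, and the idea of splitting events at a fixed `cutoff` timestamp, are assumptions made for the example. The batch component recomputes totals from historical events, the stream component keeps running totals for recent events, and a query merges the two.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: int      # event time in seconds (hypothetical field)
    key: str     # e.g. a customer or product id
    amount: int


def batch_view(events, cutoff):
    """Batch component: recompute totals from every event before the cutoff."""
    totals = Counter()
    for event in events:
        if event.ts < cutoff:
            totals[event.key] += event.amount
    return totals


class SpeedView:
    """Stream component: running totals for events at or after the cutoff."""

    def __init__(self, cutoff):
        self.cutoff = cutoff
        self.totals = Counter()

    def on_event(self, event):
        if event.ts >= self.cutoff:
            self.totals[event.key] += event.amount


def query(batch, speed, key):
    """Serve a result by merging the batch view with the real-time view."""
    return batch[key] + speed.totals[key]


history = [Event(10, "a", 5), Event(50, "b", 2)]
live = [Event(120, "a", 3), Event(130, "b", 4)]
cutoff = 100

batch = batch_view(history, cutoff)   # the existing batch ETL job plays this role
speed = SpeedView(cutoff)
for event in live:
    speed.on_event(event)             # applied as each event arrives

print(query(batch, speed, "a"))
```

Note that the aggregation logic appears twice, once in `batch_view` and once in `SpeedView.on_event`. Even in a toy example, the two code paths have to be kept in agreement, which is the maintenance cost the weakness above refers to.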


Kappa Data Architecture

The biggest advantage of a Kappa-based MDA is that it simplifies the Lambda architecture: streaming services become your only main source of data. This reduces the number of services and the amount of code your organization has to maintain. Treating every data point in your organization as a streaming event also lets you ‘time travel’ to any point in time and see the state of all data in your organization as it was then.

One downside of the Kappa Data Architecture is that correcting an error means re-processing events. However, access to affordable, elastic compute makes this a minor issue.
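Both ideas, ‘time travel’ and re-processing, come down to replaying an ordered event log. The sketch below is illustrative only and assumes an in-memory, append-only log; `EventLog`, `replay`, `latest_value`, and `running_total` are hypothetical names, and a production system would use a durable streaming platform instead of a Python list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    offset: int  # position in the log (hypothetical field)
    key: str
    value: int


class EventLog:
    """Append-only log that acts as the single source of data."""

    def __init__(self):
        self._events = []

    def append(self, key, value):
        event = Event(len(self._events), key, value)
        self._events.append(event)
        return event

    def read(self, up_to=None):
        """Yield events in order, stopping before offset `up_to` if given."""
        end = len(self._events) if up_to is None else up_to
        return iter(self._events[:end])


def replay(log, apply, up_to=None):
    """Rebuild state from scratch by applying every event in order."""
    state = {}
    for event in log.read(up_to):
        apply(state, event)
    return state


def latest_value(state, event):
    """Processing logic, version 1: keep the most recent value per key."""
    state[event.key] = event.value


def running_total(state, event):
    """Processing logic, version 2: sum the values per key."""
    state[event.key] = state.get(event.key, 0) + event.value


log = EventLog()
log.append("stock", 10)
log.append("price", 5)
log.append("stock", 7)

current = replay(log, latest_value)            # state as of now
earlier = replay(log, latest_value, up_to=2)   # 'time travel' to before the third event
fixed = replay(log, running_total)             # re-process the same events with new logic

print(current, earlier, fixed)
```

Because the log is never modified, replacing `latest_value` with `running_total` requires no second pipeline, only another pass over the same events. The cost of that pass is the re-processing downside noted above.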


Conclusion

Choosing the right Modern Data Architecture is an important step in crafting your organization’s data strategy. It involves carefully assessing your organization’s current-state architecture and planning for maximum flexibility to best serve the consumers of this data. Both Kappa and Lambda architectures can provide a strong foundation for building a broader data-oriented business.
