Kafka is used primarily to decouple services from each other and to reduce the pain of integration.
If you have 4 source systems and 6 target systems, you need to write 24 integrations. These systems may each use different:
languages
protocols (TCP, HTTP, REST, FTP, JDBC, etc.)
data formats (binary, CSV, JSON, etc.)
data schemas
Also, in a request-response architecture, a request to one service may call another service, which in turn calls yet another. If those services each update their own databases, it becomes hard to roll back a transaction that spans multiple machines, which makes error handling problematic.
Kafka allows you to decouple your source and target systems: every source and target system connects only to Kafka.
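As a minimal sketch of that decoupling, a source system can publish its events to a topic using the Java producer client. The topic name "orders" and the record contents below are made up for illustration; target systems would read from the same topic without the producer knowing anything about them.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class OrderProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The source system only knows about the "orders" topic (a hypothetical name),
            // not about any of the target systems that will eventually consume it.
            producer.send(new ProducerRecord<>("orders", "order-123", "{\"amount\": 42}"));
        }
    }
}
```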
To handle errors in a chain of service calls, the mindset of a Kafka message queue is that if part of the workflow is blocked because a service is down, the consumer can be written so that it resumes processing from where it left off once the service is back online. This is much harder to achieve with synchronous request-response styles such as REST, gRPC, or GraphQL.
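A rough sketch of that resume-on-restart behaviour with the Java consumer client: if auto-commit is disabled and offsets are committed only after records have been processed, a consumer that goes down simply picks up again from the last committed offset when it restarts. The group id, topic name, and process() helper below are hypothetical.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class OrderConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "billing-service");   // hypothetical consumer group
        props.put("enable.auto.commit", "false");   // commit only after successful processing
        props.put("auto.offset.reset", "earliest"); // on the very first run, start from the beginning
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record.value());  // hypothetical business logic
                }
                // Offsets are saved only after processing succeeds, so after a crash
                // or restart the consumer resumes from the last committed position.
                consumer.commitSync();
            }
        }
    }

    private static void process(String value) {
        System.out.println("processing " + value);
    }
}
```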
Apache Kafka was created at LinkedIn and is now an Apache project, with much of its ongoing development driven by the company Confluent.
The architecture is:
distributed
resilient
fault tolerant
It scales horizontally and performs well:
hundreds of brokers
millions of messages per second
real-time latency of less than 10 ms
Common use cases include:
Messaging System
Activity Tracking
Gathering metrics
Stream Processing (via the Kafka Streams API; see the sketch after this list)
Decoupling System Dependencies
Big data integration with Spark, Flink, Storm, and Hadoop
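As a hedged illustration of the Kafka Streams API mentioned above, the sketch below reads from a hypothetical "page-views" topic, keeps only product page views, and writes them to a second topic. The application id, topic names, and filter predicate are invented for the example.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import java.util.Properties;

public class ClickstreamFilter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "clickstream-filter"); // hypothetical app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> clicks = builder.stream("page-views"); // hypothetical input topic
        clicks.filter((user, page) -> page.startsWith("/product"))    // keep product views only
              .to("product-views");                                   // hypothetical output topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```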
Netflix uses Kafka to apply recommendations in real time
Uber uses Kafka to gather user, taxi and trip data in real time to compute and forecast demand and to compute surge pricing in real time
LinkedIn uses Kafka to prevent spam, collect user interactions to make better connection recommendations in real time.
Page Author: JD