zio / zio-kafka   2.9.0

Apache License 2.0 Website GitHub

A Kafka client for ZIO and ZIO Streams

Scala versions: 3.x 2.13

ZIO Kafka

ZIO Kafka is a Kafka client for ZIO. It provides a purely functional, streams-based interface to the Kafka client and integrates effortlessly with ZIO and ZIO Streams. Often zio-kafka programs have a higher throughput than programs that use the Java Kafka client directly (see section Performance below).

Production Ready CI Badge Sonatype Releases Sonatype Snapshots javadoc ZIO Kafka Scala Steward badge

Introduction

Apache Kafka is a distributed event streaming platform that acts as a distributed publish-subscribe messaging system. It enables us to build distributed streaming data pipelines and event-driven applications.

Kafka has a mature Java client for producing and consuming events, but it has a low-level API. ZIO Kafka is a ZIO native client for Apache Kafka. It has a high-level streaming API on top of the Java client. So we can produce and consume events using the declarative concurrency model of ZIO Streams.

Installation

In order to use this library, we need to add the following line in our build.sbt file:

libraryDependencies += "dev.zio" %% "zio-kafka"         % "2.9.0"
libraryDependencies += "dev.zio" %% "zio-kafka-testkit" % "2.9.0" % Test

Snapshots are available on Sonatype's snapshot repository https://oss.sonatype.org/content/repositories/snapshots. Browse here to find available versions.

For zio-kafka-testkit together with Scala 3, you also need to add the following to your build.sbt file:

excludeDependencies += "org.scala-lang.modules" % "scala-collection-compat_2.13"

Example

Let's write a simple Kafka producer and consumer using ZIO Kafka with ZIO Streams. Before everything, we need a running instance of Kafka. We can do that by saving the following docker-compose script in the docker-compose.yml file and run docker-compose up:

version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - 22181:2181
  
  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - 29092:29092
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

Now, we can run our ZIO Kafka Streaming application:

import zio._
import zio.kafka.consumer._
import zio.kafka.producer.{Producer, ProducerSettings}
import zio.kafka.serde._
import zio.stream.ZStream

object MainApp extends ZIOAppDefault {
  val producer: ZStream[Producer, Throwable, Nothing] =
    ZStream
      .repeatZIO(Random.nextIntBetween(0, Int.MaxValue))
      .schedule(Schedule.fixed(2.seconds))
      .mapZIO { random =>
        Producer.produce[Any, Long, String](
          topic = "random",
          key = random % 4,
          value = random.toString,
          keySerializer = Serde.long,
          valueSerializer = Serde.string
        )
      }
      .drain

  val consumer: ZStream[Consumer, Throwable, Nothing] =
    Consumer
      .plainStream(Subscription.topics("random"), Serde.long, Serde.string)
      .tap(r => Console.printLine(r.value))
      .map(_.offset)
      .aggregateAsync(Consumer.offsetBatches)
      .mapZIO(_.commit)
      .drain

  def producerLayer =
    ZLayer.scoped(
      Producer.make(
        settings = ProducerSettings(List("localhost:29092"))
      )
    )

  def consumerLayer =
    ZLayer.scoped(
      Consumer.make(
        ConsumerSettings(List("localhost:29092")).withGroupId("group")
      )
    )

  override def run =
    producer.merge(consumer)
      .runDrain
      .provide(producerLayer, consumerLayer)
}

Resources

Articles

Video

Example projects

  • Kafka BigQuery Express by Adevinta (November 2024) A production system to copy data from Kafka to BigQuery, safely and cost-effectively. (See also the video "Optimizing Data Transfer...".)
  • zio-kafka-showcase by Jorge Vásquez, Example project that demonstrates how to build Kafka based microservices with Scala and ZIO
  • zio-kafka-demo1 (December 2022), example consumer and producer using zio-kafka 2.0.5
  • zio-kafka-example-app by Ziverge (December 2020), example application using zio-kafka 0.8.0

Adopters

Here is a partial list of companies using zio-kafka in production.

Want to see your company here? Submit a PR!

Performance

By default, zio-kafka programs process partitions in parallel. The default java-kafka client does not provide parallel processing. Of course, there is some overhead in buffering records and distributing them to the fibers that need them. On 2024-11-23, we estimated that zio-kafka consumes faster than the java-kafka client when processing takes more than ~1.2ms per 1000 records. The precise time depends on many factors. Please see this article for more details.

If you do not care for the convenient ZStream based API that zio-kafka brings, and latency is of absolute importance, using the java based Kafka client directly is still the better choice.

Documentation

Learn more on the ZIO Kafka homepage!

Contributing

For the general guidelines, see ZIO contributor's guide.

Code of Conduct

See the Code of Conduct

Support

Come chat with us on Badge-Discord.

Credits

This library is heavily inspired and made possible by the research and implementation done in Alpakka Kafka, a library maintained by the Akka team and originally written as Reactive Kafka by SoftwareMill.

License

License

Copyright 2021-2024 Itamar Ravid and the zio-kafka contributors.