davideicardi / kaa

Avro schema registry with Kafka persistency (Kafka Avro4s Schema Registry)

kaa-schema-registry

(Kafka Avro4s Schema Registry)

Scala client library that provides an Avro schema registry with Kafka persistence and Avro4s serializer support. It lets you share Avro schemas across multiple applications without third-party software (it can replace Confluent Schema Registry). You can use this library in your Kafka client app without calling an external service for schema resolution.

For serialization, Avro single-object encoding is used to reduce record size: only a schema id (hash) is persisted within the record, not the full schema.
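
As a rough illustration of why records stay small, the Avro specification defines single-object encoding as a two-byte marker (C3 01), followed by the 8-byte little-endian fingerprint of the writer schema, followed by the Avro-encoded body. The sketch below only shows that framing. It is not Kaa's internal code, and the names SingleObjectFraming, frame and unframe are hypothetical.

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Illustrative framing helper; names are hypothetical, not part of Kaa.
object SingleObjectFraming {
  // Two-byte marker defined by the Avro specification for single-object encoding.
  val Marker: Array[Byte] = Array(0xC3.toByte, 0x01.toByte)
  val HeaderSize: Int = Marker.length + 8

  /** Prepends marker + little-endian 64-bit schema fingerprint to an Avro-encoded body. */
  def frame(fingerprint: Long, body: Array[Byte]): Array[Byte] = {
    val buf = ByteBuffer.allocate(HeaderSize + body.length).order(ByteOrder.LITTLE_ENDIAN)
    buf.put(Marker).putLong(fingerprint).put(body)
    buf.array()
  }

  /** Splits a framed record into (fingerprint, body); None if the marker is missing. */
  def unframe(bytes: Array[Byte]): Option[(Long, Array[Byte])] =
    if (bytes.length < HeaderSize || bytes(0) != Marker(0) || bytes(1) != Marker(1)) None
    else {
      val fingerprint =
        ByteBuffer.wrap(bytes, Marker.length, 8).order(ByteOrder.LITTLE_ENDIAN).getLong()
      Some((fingerprint, bytes.drop(HeaderSize)))
    }
}
```

Whatever the size of the schema, the per-record overhead in this layout is a fixed 10 bytes.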

Features

Kaa provides essentially 3 features:

  • com.davideicardi.kaa.KaaSchemaRegistry: a simple embeddable schema registry that reads and writes schemas to Kafka
  • com.davideicardi.kaa.avro.AvroSingleObjectSerializer: an Avro serializer/deserializer based on Avro4s that internally uses KaaSchemaRegistry
  • com.davideicardi.kaa.kafka.GenericSerde[T]: an implementation of Kafka's Serde[T] based on AvroSingleObjectSerializer, which can be used with Kafka Streams

During serialization, a hash of the schema is generated and the schema is stored in Kafka under that hash (key=hash, value=schema). During deserialization, the schema is retrieved from Kafka by its hash and used to decode the record. Internally, KaaSchemaRegistry runs a Kafka consumer that reads all schemas and caches them in memory.
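
The key=hash, value=schema idea can be modelled with a small in-memory store. This is only a toy sketch of the lookup flow: the class InMemorySchemaStore and its put/get methods are hypothetical and are not Kaa's API, the map stands in for both the Kafka topic and the in-memory cache, and the hash function is passed in by the caller.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical stand-in for the Kafka topic plus in-memory cache; not Kaa's API.
final class InMemorySchemaStore(hash: String => Long) {
  private val cache = TrieMap.empty[Long, String]

  /** Registers a schema (key = hash, value = schema) and returns its id. */
  def put(schemaJson: String): Long = {
    val id = hash(schemaJson)
    cache.putIfAbsent(id, schemaJson) // idempotent: the same schema always maps to the same key
    id
  }

  /** Looks up a schema by id, as a deserializer would before decoding a record. */
  def get(id: Long): Option[String] = cache.get(id)
}

object InMemorySchemaStoreExample {
  def main(args: Array[String]): Unit = {
    // Toy hash for illustration only; a real registry would use a proper schema fingerprint.
    val store = new InMemorySchemaStore(s => s.hashCode.toLong)
    val id = store.put("""{"type":"string"}""")
    println(store.get(id))
  }
}
```

Because the key is derived from the schema itself, writing the same schema twice is harmless, which fits well with a compacted topic.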

You can use com.davideicardi.kaa.KaaSchemaRegistryAdmin to create the Kafka schema topic programmatically. NOTE: if you create the topic manually, remember to set its cleanup policy to compact so that all schemas are retained.
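
If you prefer to create the topic yourself, one option is Kafka's own AdminClient. The following is a minimal sketch under stated assumptions: the topic name "my-schemas", the broker address localhost:9092, and the partition and replication values are placeholders, and the topic name must match whatever your registry is configured to use.

```scala
import java.util.{Collections, Properties}
import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig, NewTopic}
import org.apache.kafka.common.config.TopicConfig

object CreateSchemaTopic {
  /** Describes a topic whose cleanup policy is "compact". */
  def compactedTopic(name: String, partitions: Int, replication: Short): NewTopic =
    new NewTopic(name, partitions, replication).configs(
      Collections.singletonMap(
        TopicConfig.CLEANUP_POLICY_CONFIG, // "cleanup.policy"
        TopicConfig.CLEANUP_POLICY_COMPACT // "compact"
      )
    )

  def main(args: Array[String]): Unit = {
    val props = new Properties()
    // Assumed broker address; adjust for your environment.
    props.setProperty(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    val admin = AdminClient.create(props)
    try {
      // "my-schemas" is a hypothetical topic name used only for this example.
      val topic = compactedTopic("my-schemas", 1, 1.toShort)
      admin.createTopics(Collections.singleton(topic)).all().get()
    } finally admin.close()
  }
}
```

With the default delete policy, old schema records would eventually expire. Compaction keeps the latest record for every key, so every schema hash stays resolvable.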

Why

The main advantage of Kaa is that it doesn't require an external service to retrieve schemas: the library reads from and writes to Kafka directly. This simplifies the installation and configuration of client applications, and is especially useful for applications that already interact with Kafka.

Confluent Schema Registry, on the other hand, requires you to install and operate a dedicated service.

Prerequisites

Compiled with:

  • Scala 2.12, 2.13
  • Kafka 2.4
  • Avro4s 4.0

Usage

Official releases (published in Maven Central):

libraryDependencies += "com.davideicardi" %% "kaa" % "<version>"

Packages are also available on Sonatype, including snapshot versions:

externalResolvers += Resolver.sonatypeRepo("snapshots")
// externalResolvers += Resolver.sonatypeRepo("public") // for official releases

Using AvroSingleObjectSerializer:

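import com.davideicardi.kaa.{KaaSchemaRegistry, KaaSchemaRegistryAdmin}
import com.davideicardi.kaa.avro.AvroSingleObjectSerializer

// Kafka bootstrap servers (example value, adjust for your environment)
val brokers = "localhost:9092"

// create the schema topic if it does not exist yet
val admin = new KaaSchemaRegistryAdmin(brokers)
if (!admin.topicExists()) admin.createTopic()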

// create the schema registry
val schemaRegistry = new KaaSchemaRegistry(brokers)
try {
    // create the serializer
    val serializerV1 = new AvroSingleObjectSerializer[SuperheroV1](schemaRegistry)

    // serialize
    val bytesV1 = serializerV1.serialize(SuperheroV1("Spiderman"))

    // deserialize
    val result = serializerV1.deserialize(bytesV1)
    println(result)
} finally {
    schemaRegistry.shutdown()
}

case class SuperheroV1(name: String)

Contributing

Run unit tests:

sbt test

Run integration tests:

docker-compose up -d
sbt it:test
docker-compose down

Run example application:

docker-compose up -d
sbt sample/run
docker-compose down