PBDirect

Read/Write Scala objects directly to Protobuf with no .proto file definitions

Context

Protobuf is a fast and efficient way to serialize data. While .proto files are great for sharing schema definitions between components, it is sometimes much simpler and more straightforward to encode Scala objects directly, without a .proto schema definition file.

PBDirect aims to do just that: make it easier to serialize and deserialize Scala objects to and from Protobuf.

Setup

In order to use PBDirect you need to add the following line to your build.sbt:

libraryDependencies += "com.47deg" %% "pbdirect" % "0.5.0"

Dependencies

PBDirect depends on:

  • protobuf-java, the Protobuf Java library (maintained by Google)
  • shapeless, for the generation of type-class instances
  • cats, to deal with optional and repeated fields

Usage

In order to use PBDirect you need to import the following:

import pbdirect._

Example

Schema definition

PBDirect serializes case classes into Protobuf, so there is no need for a .proto schema definition file.

case class MyMessage(
  @pbIndex(1) id: Option[Int],
  @pbIndex(3) text: Option[String],
  @pbIndex(5) numbers: List[Int]
)

is equivalent to the following protobuf definition:

message MyMessage {
  int32  id              = 1;
  string text            = 3;
  repeated int32 numbers = 5;
}

Note that the @pbIndex annotation is optional. If it is not present, the field's position in the case class is used as its index. For example, an unannotated case class like:

case class MyMessage(
  id: Option[Int],
  text: Option[String],
  numbers: List[Int]
)

is equivalent to the following protobuf definition:

message MyMessage {
  int32  id              = 1;
  string text            = 2;
  repeated int32 numbers = 3;
}

Serialization

You only need to call the toPB method on your case class. This method is implicitly added with import pbdirect._.

val message = MyMessage(
  id = Some(123),
  text = Some("Hello"),
  numbers = List(1, 2, 3, 4)
)
val bytes = message.toPB
// bytes: Array(8, 123, 26, 5, 72, 101, 108, 108, 111, 40, 1, 40, 2, 40, 3, 40, 4)
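To make the byte array above easier to read, here is a minimal sketch in plain Scala (no pbdirect involved) that rebuilds it by hand from the standard Protobuf wire format, where each field starts with a key byte computed as `(field number << 3) | wire type`. The names `KeyBytes`, `key`, `Varint` and `LengthDelimited` are made up for this illustration, and the sketch assumes every value fits in a single varint byte (less than 128), which holds for this example.

```scala
object KeyBytes {
  val Varint = 0          // wire type for int32, bool, enums
  val LengthDelimited = 2 // wire type for strings, bytes, nested messages

  // A field key combines the field number (the @pbIndex value) and the wire type.
  def key(fieldNumber: Int, wireType: Int): Int = (fieldNumber << 3) | wireType

  // id = Some(123) at index 1: key, then the value as a one-byte varint.
  val idBytes: List[Int] = List(key(1, Varint), 123)

  // text = Some("Hello") at index 3: key, byte length, then the UTF-8 bytes.
  val textBytes: List[Int] =
    key(3, LengthDelimited) :: 5 :: "Hello".getBytes("UTF-8").map(_.toInt).toList

  // numbers = List(1, 2, 3, 4) at index 5, laid out as one key/value pair per element,
  // which is the layout shown in the byte array above.
  val numberBytes: List[Int] =
    List(1, 2, 3, 4).flatMap(n => List(key(5, Varint), n))

  val all: List[Int] = idBytes ++ textBytes ++ numberBytes

  def main(args: Array[String]): Unit =
    println(all.mkString(", "))
}
```

Reading the output this way also shows why indices 1, 3 and 5 turn into the key bytes 8, 26 and 40.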

Deserialization

Deserializing bytes into a case class is also straightforward. You only need to call the pbTo[A] method on the byte array containing the Protobuf-encoded data. This method is added implicitly to all Array[Byte] values by importing pbdirect._.

val bytes: Array[Byte] = Array[Byte](8, 123, 26, 5, 72, 101, 108, 108, 111, 40, 1, 40, 2, 40, 3, 40, 4)
val message = bytes.pbTo[MyMessage]
// message: MyMessage(Some(123),Some(Hello),List(1, 2, 3, 4))
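Putting the two steps together, a round trip might look like the sketch below. It only uses the `toPB` and `pbTo` methods and the `@pbIndex` annotation described above, and assumes pbdirect is on the classpath as described in Setup. `RoundTripMessage` and `RoundTrip` are names chosen for this illustration.

```scala
import pbdirect._

// Same shape as MyMessage above, renamed so it does not clash with it.
case class RoundTripMessage(
  @pbIndex(1) id: Option[Int],
  @pbIndex(3) text: Option[String],
  @pbIndex(5) numbers: List[Int]
)

object RoundTrip {
  val original: RoundTripMessage =
    RoundTripMessage(Some(123), Some("Hello"), List(1, 2, 3, 4))

  // Serialize, then immediately deserialize again.
  val bytes: Array[Byte] = original.toPB
  val decoded: RoundTripMessage = bytes.pbTo[RoundTripMessage]

  def main(args: Array[String]): Unit =
    // Case classes compare structurally, so a lossless round trip yields an equal value.
    assert(decoded == original)
}
```

A check like this is a cheap way to confirm that a new message type, or a custom format, encodes and decodes symmetrically.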

Extension

You might want to define your own formats for unsupported types. For example, to add a format that reads and writes java.time.Instant, you can do:

import java.time.Instant
import cats.syntax.invariant._

implicit val instantFormat: PBFormat[Instant] =
  PBFormat[Long].imap(Instant.ofEpochMilli)(_.toEpochMilli)

If you only need a reader, you can map over an existing PBScalarValueReader:

import java.time.Instant
import cats.syntax.functor._

implicit val instantReader: PBScalarValueReader[Instant] =
  PBScalarValueReader[Long].map(Instant.ofEpochMilli)

And if you only need a writer, you can contramap over an existing PBScalarValueWriter:

import java.time.Instant
import cats.syntax.contravariant._

implicit val instantWriter: PBScalarValueWriter[Instant] =
  PBScalarValueWriter[Long].contramap(_.toEpochMilli)
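If the difference between map, contramap and imap is unfamiliar, the following toy sketch may help. It does not use pbdirect or cats: `ToyFormat` is a hypothetical stand-in that holds one function per direction, and its `Long` encoding is a fixed 8-byte layout chosen for simplicity, not the Protobuf encoding. It only illustrates why a two-way format needs a function in each direction.

```scala
import java.nio.ByteBuffer
import java.time.Instant

// Hypothetical stand-in for a format: a writer and a reader bundled together.
final case class ToyFormat[A](write: A => Array[Byte], read: Array[Byte] => A) {
  // Reading needs A => B (what map provides), writing needs B => A (what contramap
  // provides). imap takes both, so the result can read and write B.
  def imap[B](f: A => B)(g: B => A): ToyFormat[B] =
    ToyFormat[B](b => write(g(b)), bytes => f(read(bytes)))
}

object ToyFormatDemo {
  // Toy Long format: 8 big-endian bytes (an assumption of this sketch only).
  val longFormat: ToyFormat[Long] = ToyFormat[Long](
    n => ByteBuffer.allocate(8).putLong(n).array(),
    bytes => ByteBuffer.wrap(bytes).getLong
  )

  // Same shape as the instantFormat definition above.
  val instantFormat: ToyFormat[Instant] =
    longFormat.imap(Instant.ofEpochMilli)(_.toEpochMilli)

  def main(args: Array[String]): Unit = {
    val t = Instant.ofEpochMilli(1577836800000L)
    assert(instantFormat.read(instantFormat.write(t)) == t)
  }
}
```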

Oneof fields

pbdirect supports protobuf oneof fields encoded as Shapeless Coproducts.

For example:

import shapeless.{:+:, CNil}

case class MyMessage(
  @pbIndex(1) number: Int,
  @pbIndex(2,3,4) coproduct: Option[Int :+: String :+: Boolean :+: CNil]
)

is equivalent to the following protobuf definition:

message MyMessage {
  int32 number = 1;
  oneof coproduct {
    int32 a  = 2;
    string b = 3;
    bool c   = 4;
  }
}

oneof fields with exactly two branches can also be encoded using Either. For example:

case class MyMessage(
  @pbIndex(1) number: Int,
  @pbIndex(2,3) either: Option[Either[String, Boolean]]
)

is equivalent to:

message MyMessage {
  int32 number = 1;
  oneof either {
    string b = 2;
    bool c   = 3;
  }
}

Support for oneof fields comes with a couple of restrictions:

  • oneof fields must have a @pbIndex annotation containing the indices of each of the sub-fields.
  • The type of a oneof field must be a Coproduct (or Either) wrapped in Option[_]. This lets pbdirect set the value to None when the field is absent from the message being read.
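As a sketch of how such a field might be populated and inspected, the example below uses shapeless's `Coproduct[C](value)` constructor and the `select[T]` method. `OneofExample`, `OneofUsage` and `Payload` are names invented for this illustration, and the sketch assumes pbdirect and shapeless are on the classpath.

```scala
import pbdirect._
import shapeless.{:+:, CNil, Coproduct}

// Illustrative message with one oneof field covering indices 2, 3 and 4.
case class OneofExample(
  @pbIndex(1) number: Int,
  @pbIndex(2, 3, 4) payload: Option[Int :+: String :+: Boolean :+: CNil]
)

object OneofUsage {
  type Payload = Int :+: String :+: Boolean :+: CNil

  // Coproduct[C](value) injects the value into the branch with the matching type.
  val withText: OneofExample = OneofExample(1, Some(Coproduct[Payload]("hello")))

  // None models a oneof with no branch set.
  val unset: OneofExample = OneofExample(2, None)

  // select[T] returns Some(value) only when that branch is the populated one.
  val text: Option[String] = withText.payload.flatMap(_.select[String])
  val flag: Option[Boolean] = withText.payload.flatMap(_.select[Boolean])

  def main(args: Array[String]): Unit = {
    assert(text.contains("hello"))
    assert(flag.isEmpty)
    // The message is serialized like any other case class.
    val bytes: Array[Byte] = withText.toPB
    println(bytes.length)
  }
}
```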

Default values and missing fields

When reading a protobuf message, pbdirect needs to handle missing fields by falling back to some default value. How it does this depends on the type of field.

The following table gives some examples of how pbdirect decodes missing fields:

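| Scala type                                 | Value given to missing field                                                      |
|--------------------------------------------|-----------------------------------------------------------------------------------|
| Int/Short/Byte/Long                        | 0                                                                                 |
| Double/Float                               | 0.0                                                                               |
| String                                     | ""                                                                                |
| Array[Byte]                                | empty array                                                                       |
| List[_]                                    | empty list                                                                        |
| Map[_, _]                                  | empty map                                                                         |
| Scala Enumeration or enumeratum IntEnum    | the entry with value 0                                                            |
| Option[_]                                  | None                                                                              |
| MyMessage                                  | an instance of MyMessage with all its fields set to their default values          |
| Int :+: String :+: CNil                    | (not supported)                                                                   |
| Option[Int :+: String :+: CNil]            | None                                                                              |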

If you have defined your own PBScalarValueReader by mapping over one of the built-in readers, a missing field decodes to the result of applying your mapping function to the default value of the underlying type.

For example, if your message looks like:

case class MyMessage(instant: Instant)

and you use the instantReader defined earlier, reading a message with the instant field missing would result in 1970-01-01T00:00:00Z, because the underlying Long defaults to 0 and Instant.ofEpochMilli(0) is the Unix epoch.
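The arithmetic behind that result can be checked with plain java.time, without pbdirect. The sketch below simply applies the same mapping function to the default `Long` value from the table; `MissingFieldDefault` and its members are illustrative names.

```scala
import java.time.Instant

object MissingFieldDefault {
  // Default given to a missing Long field, per the table above.
  val defaultLong: Long = 0L

  // A reader built with map(Instant.ofEpochMilli) applies that function to the default.
  val defaultInstant: Instant = Instant.ofEpochMilli(defaultLong)

  def main(args: Array[String]): Unit = {
    assert(defaultInstant == Instant.EPOCH)
    println(defaultInstant) // 1970-01-01T00:00:00Z
  }
}
```

If the epoch is not an acceptable default for your domain, wrapping the field in Option[_] lets you tell a missing field apart from a real value.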

Packed repeated fields

Primitive repeated fields (ints, floats, doubles, enums and booleans) are encoded using the protobuf packed encoding by default.

This behaviour can be overridden using the @pbUnpacked annotation:

case class UnpackedMessage(
  @pbUnpacked() ints: List[Int]
)
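To show what the two layouts look like on the wire, here is a small sketch in plain Scala based on the standard Protobuf encoding rules rather than on pbdirect's internals. `PackedVsUnpacked` and its methods are illustrative names, and the sketch assumes every element is between 0 and 127 so that each one is a single varint byte.

```scala
object PackedVsUnpacked {
  private val Varint = 0
  private val LengthDelimited = 2

  private def key(fieldNumber: Int, wireType: Int): Int =
    (fieldNumber << 3) | wireType

  // Unpacked: every element carries its own key byte.
  def unpacked(fieldNumber: Int, values: List[Int]): List[Int] =
    values.flatMap(v => List(key(fieldNumber, Varint), v))

  // Packed: one key, one byte-length prefix, then the elements back to back.
  // With one byte per element, the byte length equals the element count.
  def packed(fieldNumber: Int, values: List[Int]): List[Int] =
    key(fieldNumber, LengthDelimited) :: values.length :: values

  def main(args: Array[String]): Unit = {
    val numbers = List(1, 2, 3, 4)
    println(unpacked(5, numbers).mkString(", ")) // 40, 1, 40, 2, 40, 3, 40, 4
    println(packed(5, numbers).mkString(", "))   // 42, 4, 1, 2, 3, 4
  }
}
```

The unpacked output has the same layout as the numbers field in the Serialization example. The packed form saves one key byte per element after the first, at the cost of a length prefix.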

Fancy integer types (signed/unsigned/fixed-width)

You can tell pbdirect that an Int/Long field should be encoded in a special way by tagging its type with Signed, Unsigned or Fixed. For example:

import shapeless.tag.@@
import pbdirect.{Signed, Unsigned, Fixed}

case class IntsMessage(
  normalInt            : Int,
  signedInt            : Int @@ Signed,
  unsignedInt          : Int @@ Unsigned,
  fixedWidthInt        : Int @@ Fixed,
  fixedSignedWidthInt  : Int @@ (Signed with Fixed),
  normalLong           : Long,
  signedLong           : Long @@ Signed,
  unsignedLong         : Long @@ Unsigned,
  fixedWidthLong       : Long @@ Fixed,
  fixedSignedWidthLong : Long @@ (Signed with Fixed)
)

would correspond to the following Protobuf definition:

message IntsMessage {
  int32      normalInt              =   1;
  sint32     signedInt              =   2;
  uint32     unsignedInt            =   3;
  fixed32    fixedWidthInt          =   4;
  sfixed32   fixedSignedWidthInt    =   5;
  int64      normalLong             =   6;
  sint64     signedLong             =   7;
  uint64     unsignedLong           =   8;
  fixed64    fixedWidthLong         =   9;
  sfixed64   fixedSignedWidthLong   =   10;
}
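The practical difference between these Protobuf types is the number of bytes each one needs for a given value. The sketch below implements the standard varint, ZigZag and fixed-width encodings in plain Scala to make that visible. It is not pbdirect's implementation, and `IntegerEncodings` and its methods are names chosen for this illustration.

```scala
object IntegerEncodings {
  // Base-128 varint: 7 payload bits per byte, high bit set on all but the last byte.
  def varint(value: Long): List[Int] = {
    @annotation.tailrec
    def loop(v: Long, acc: List[Int]): List[Int] =
      if ((v & ~0x7FL) == 0) (v.toInt :: acc).reverse
      else loop(v >>> 7, ((v & 0x7F) | 0x80).toInt :: acc)
    loop(value, Nil)
  }

  // int32: negative values are sign-extended to 64 bits before varint encoding.
  def int32(n: Int): List[Int] = varint(n.toLong)

  // ZigZag interleaves negative and positive numbers: 0, -1, 1, -2 map to 0, 1, 2, 3.
  def zigZag32(n: Int): Int = (n << 1) ^ (n >> 31)

  // sint32: ZigZag first, then varint of the result read as an unsigned 32-bit value.
  def sint32(n: Int): List[Int] = varint(zigZag32(n) & 0xFFFFFFFFL)

  // fixed32 and sfixed32: always four bytes, little-endian.
  def fixed32(n: Int): List[Int] =
    List(0, 8, 16, 24).map(shift => (n >>> shift) & 0xFF)

  def main(args: Array[String]): Unit = {
    println(int32(-1).length)   // 10 bytes
    println(sint32(-1).length)  // 1 byte
    println(fixed32(-1).length) // 4 bytes
  }
}
```

As a rule of thumb, the signed variants suit fields that are often negative, while the fixed-width variants suit values that are usually large.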

You can also tag individual types inside coproducts, key and value types of maps and element types of lists:

case class AnotherIntsMessage(
  @pbIndex(1, 2) signedIntOrNormalInt  : Option[(Int @@ Signed) :+: Int :+: CNil],
  @pbIndex(3)    signedIntFixedLongMap : Map[Int @@ Signed, Long @@ Fixed],
  @pbIndex(4)    signedIntList         : List[Int @@ Signed]
)

Copyright

pbdirect is designed and developed by 47 Degrees

Copyright (C) 2019-2020 47 Degrees. http://47deg.com