Scala JSON Schema

SBT dependencies:

Main module:

libraryDependencies += "com.github.andyglow" %% "scala-jsonschema" % <version> // <-- required

Other libraries:

libraryDependencies ++= Seq(
  "com.github.andyglow" %% "scala-jsonschema-core" % <version>,              // <-- transitive
  "com.github.andyglow" %% "scala-jsonschema-macros" % <version> % Provided, // <-- transitive
  // json bridge. pick one
  "com.github.andyglow" %% "scala-jsonschema-play-json" % <version>,         // <-- optional
  "com.github.andyglow" %% "scala-jsonschema-spray-json" % <version>,        // <-- optional
  "com.github.andyglow" %% "scala-jsonschema-circe-json" % <version>,        // <-- optional
  "com.github.andyglow" %% "scala-jsonschema-json4s-json" % <version>,       // <-- optional
  "com.github.andyglow" %% "scala-jsonschema-ujson" % <version>,             // <-- optional
  // joda-time support
  "com.github.andyglow" %% "scala-jsonschema-joda-time" % <version>,         // <-- optional
  // cats support
  "com.github.andyglow" %% "scala-jsonschema-cats" % <version>,              // <-- optional
  // refined support
  "com.github.andyglow" %% "scala-jsonschema-refined" % <version>,           // <-- optional
  // enumeratum support
  "com.github.andyglow" %% "scala-jsonschema-enumeratum" % <version>,        // <-- optional
  // zero-dependency json and jsonschema parser
  "com.github.andyglow" %% "scala-jsonschema-parser" % <version>             // <-- optional
)

Generate JSON Schema from Scala classes

The goal of this library is to make JSON Schema generation work the way all popular JSON reading/writing libraries do. It is inspired by Coursera Autoschema but uses Scala macros instead of Java reflection.

Features

  • Supports JSON Schema draft-04, draft-06, draft-07, draft-09 (2019-09), draft-12 (2020-12)
  • Supports case classes
  • Supports value classes
  • Supports sealed trait enums
  • Supports sealed trait case classes
  • Supports recursive types
  • Supports scala.Enumeration
  • Treats scala.Option as optional fields
  • Treats fields with default values as optional as well
  • Any Iterable is treated as array
  • Pluggable Joda-Time Support
  • Pluggable Cats Support
  • Pluggable Refined Support
  • Pluggable Enumeratum Support
  • Supports generic data types
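
The optional-field rules from the list above can be illustrated with a minimal sketch. The Account model is hypothetical; only Json.schema is taken from the examples in this README. Going by the rules above, id should be the only entry under "required", since nickname is an Option and active has a default value.

```scala
import json._

object OptionalFieldsExample {
  // Hypothetical model, used only for illustration
  case class Account(
    id: String,               // plain field: expected under "required"
    nickname: Option[String], // Option: treated as optional
    active: Boolean = true)   // default value: treated as optional

  val accountSchema: json.Schema[Account] = Json.schema[Account]
}
```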

Types supported out of the box

  • Boolean
  • Numeric
    • Short
    • Int
    • Char
    • Double
    • Float
    • Long
    • BigInt
    • BigDecimal
  • String
  • Date Time
    • java.util.Date
    • java.sql.Timestamp
    • java.time.Instant
    • java.time.LocalDateTime
    • java.sql.Date
    • java.time.LocalDate
    • java.sql.Time
    • java.time.LocalTime
  • with JodaTime module imported
    • org.joda.time.Instant
    • org.joda.time.DateTime
    • org.joda.time.LocalDateTime
    • org.joda.time.LocalDate
    • org.joda.time.LocalTime
  • with Cats module imported
    • cats.data.NonEmptyList
    • cats.data.NonEmptyVector
    • cats.data.NonEmptySet
    • cats.data.NonEmptyChain
    • cats.data.NonEmptyMap
    • cats.data.NonEmptyStream (for scala 2.11, 2.12)
    • cats.data.NonEmptyLazyList (for scala 2.13)
    • cats.data.OneAnd
  • with Refined module imported you can refine original types with these
    • boolean
      • eu.timepit.refined.boolean.And
      • eu.timepit.refined.boolean.Or
      • eu.timepit.refined.boolean.Not
    • string
      • eu.timepit.refined.collection.Size
      • eu.timepit.refined.collection.MinSize
      • eu.timepit.refined.collection.MaxSize
      • eu.timepit.refined.collection.Empty
      • eu.timepit.refined.collection.NonEmpty
      • eu.timepit.refined.string.Uuid
      • eu.timepit.refined.string.Uri
      • eu.timepit.refined.string.Url
      • eu.timepit.refined.string.IPv4
      • eu.timepit.refined.string.IPv6
      • eu.timepit.refined.string.Xml
      • eu.timepit.refined.string.StartsWith
      • eu.timepit.refined.string.EndsWith
      • eu.timepit.refined.string.MatchesRegex
      • eu.timepit.refined.string.Trimmed
    • number
      • eu.timepit.refined.numeric.Positive
      • eu.timepit.refined.numeric.Negative
      • eu.timepit.refined.numeric.NonPositive
      • eu.timepit.refined.numeric.NonNegative
      • eu.timepit.refined.numeric.Greater
      • eu.timepit.refined.numeric.Less
      • eu.timepit.refined.numeric.GreaterEqual
      • eu.timepit.refined.numeric.LessEqual
      • eu.timepit.refined.numeric.Divisible
    • collection
      • eu.timepit.refined.collection.Size
      • eu.timepit.refined.collection.MinSize
      • eu.timepit.refined.collection.MaxSize
      • eu.timepit.refined.collection.Empty
      • eu.timepit.refined.collection.NonEmpty
  • with Enumeratum module enabled
    • enums based on EnumEntry/Enum
    • enums based on ValueEnumEntry/ValueEnum
  • Misc
    • java.util.UUID
    • java.net.URL
    • java.net.URI
  • Collections
    • String Map (eg. Map[String, T])
    • Int Map (eg. Map[Int, T])
    • Iterable[T]
  • Sealed Trait hierarchy of case objects (Enums)
  • Case Classes
    • default value
  • Sealed Trait hierarchy of case classes
  • Value Classes

Example

Suppose you have defined these data structures

sealed trait Gender

object Gender {

    case object Male extends Gender

    case object Female extends Gender
}

case class Company(name: String)

case class Car(name: String, manufacturer: Company)

case class Person(
    firstName: String,
    middleName: Option[String],
    lastName: String,
    gender: Gender,
    birthDay: java.time.LocalDateTime,
    company: Company,
    cars: Seq[Car])

Now you have several ways to specify your schema.

In-Lined

In simple terms, in-lined mode means there are no definitions. The type you use as the source for the schema is represented in the JSON Schema without reusable data blocks.

import json._

val personSchema: json.Schema[Person] = Json.schema[Person]

As a result you will receive this:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "middleName": {
      "type": "string"
    },
    "cars": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "properties": {
          "name": {
            "type": "string"
          },
          "manufacturer": {
            "type": "object",
            "additionalProperties": false,
            "properties": {
              "name": {
                "type": "string"
              }
            },
            "required": [
              "name"
            ]
          }
        },
        "required": [
          "name",
          "manufacturer"
        ]
      }
    },
    "company": {
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "name": {
          "type": "string"
        }
      },
      "required": [
        "name"
      ]
    },
    "lastName": {
      "type": "string"
    },
    "firstName": {
      "type": "string"
    },
    "birthDay": {
      "type": "string",
      "format": "date-time"
    },
    "gender": {
      "type": "string",
      "enum": [
        "Male",
        "Female"
      ]
    }
  },
  "required": [
    "company",
    "lastName",
    "birthDay",
    "gender",
    "firstName",
    "cars"
  ]
}

Regular

A schema generated in Regular mode will contain as many definitions as the separate schemas you provide. Let's take a look at some example code:

import json._

implicit val genderSchema: json.Schema[Gender] = Json.schema[Gender]

implicit val companySchema: json.Schema[Company] = Json.schema[Company]

implicit val carSchema: json.Schema[Car] = Json.schema[Car]

implicit val personSchema: json.Schema[Person] = Json.schema[Person]

Here, besides the Person schema, we defined gender, company and car schemas. The result then looks like this:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "middleName": {
      "type": "string"
    },
    "cars": {
      "type": "array",
      "items": {
        "$ref": "#/definitions/com.github.andyglow.jsonschema.ExampleMsg.Car"
      }
    },
    "company": {
      "$ref": "#/definitions/com.github.andyglow.jsonschema.ExampleMsg.Company"
    },
    "lastName": {
      "type": "string"
    },
    "firstName": {
      "type": "string"
    },
    "birthDay": {
      "type": "string",
      "format": "date-time"
    },
    "gender": {
      "$ref": "#/definitions/com.github.andyglow.jsonschema.ExampleMsg.Gender"
    }
  },
  "required": [
    "company",
    "lastName",
    "birthDay",
    "gender",
    "firstName",
    "cars"
  ],
  "definitions": {
    "com.github.andyglow.jsonschema.ExampleMsg.Company": {
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "name": {
          "type": "string"
        }
      },
      "required": [
        "name"
      ]
    },
    "com.github.andyglow.jsonschema.ExampleMsg.Car": {
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "name": {
          "type": "string"
        },
        "manufacturer": {
          "$ref": "#/definitions/com.github.andyglow.jsonschema.ExampleMsg.Company"
        }
      },
      "required": [
        "name",
        "manufacturer"
      ]
    },
    "com.github.andyglow.jsonschema.ExampleMsg.Gender": {
      "type": "string",
      "enum": [
        "Male",
        "Female"
      ]
    }
  }
}

Definitions / References

There are a couple of ways to specify the reference of a schema.

  1. It can be generated from the type name (including type args)
  2. You can specify it yourself. This is useful when you want to provide a couple of schemas for the same type but with different validation rules.

By default you would write

import json._

implicit val someStrSchema: json.Schema[String] = Json.schema[String]

implicit val someArrSchema: json.Schema[Array[String]] = Json.schema[Array[String]]

println(JsonFormatter.format(AsValue.schema(someArrSchema)))

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "array",
  "items": {
    "$ref": "#/definitions/java.lang.String"
  },
  "definitions": {
    "java.lang.String": {
      "type": "string"
    }
  }
}

See that java.lang.String?

To use a custom name, just apply it.

import json._

implicit val someStrSchema: json.Schema[String] = Json.schema[String].toDefinition("my-lovely-string")

implicit val someArrSchema: json.Schema[Array[String]] = Json.schema[Array[String]]

println(JsonFormatter.format(AsValue.schema(someArrSchema, json.schema.Version.Draft04())))

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "array",
  "items": {
    "$ref": "#/definitions/my-lovely-string"
  },
  "definitions": {
    "my-lovely-string": {
      "type": "string"
    }
  }
}

There is, though, one circumstance that will make you think twice before defining implicit val someStrSchema: json.Schema[String] = Json.schema[String]: it will influence all string fields and components of your schema. Say you want to use a plain string alongside a validated string for ID representation. As the library operates at compile time, it relies completely on type information, which leaves only one solution: represent special values as their own types.

Use Value Classes.

case class UserId(value: String) extends AnyVal

case class User(id: UserId, name: String)

Then you can do

import json._

implicit val userIdSchema: json.Schema[UserId] = Json.schema[UserId].toDefinition("userId")

implicit val userSchema: json.Schema[User] = Json.schema[User]

println(JsonFormatter.format(AsValue.schema(userSchema)))

and expect

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "id": {
      "$ref": "#/definitions/userId"
    },
    "name": {
      "type": "string"
    }
  },
  "required": [
    "id",
    "name"
  ],
  "definitions": {
    "userId": {
      "type": "string"
    }
  }
}

Validation

It is also possible to add specific validation rules to our schemas.

Available validations:

  • multipleOf
  • maximum
  • minimum
  • exclusiveMaximum
  • exclusiveMinimum
  • maxLength
  • minLength
  • pattern
  • maxItems
  • minItems
  • uniqueItems
  • maxProperties
  • minProperties

Example

import json._
import json.Validation._

implicit val vb = ValidationBound.mk[UserId, String]

implicit val userIdSchema: json.Schema[UserId] = Json.schema[UserId].toDefinition("userId") withValidation (
  `pattern` := "[a-f\\d]{16}"
)

The definition will then look like

{
  "userId": {
    "type": "string",
    "pattern": "[a-f\\d]{16}"
  }
}

Free objects

Sometimes you need to include a more relaxed structure, such as arbitrary JSON, in your models. In such cases you want your final schema to contain something like this:

{
  "type": "object",
  "additionalProperties": true
}

To get this, you can use Schema.object.Free, as in this Play-Json based example:

import play.api.libs.json._

// model
case class Payload(id: String, name: String, metadata: JsObject)

// metadata schema
implicit val metaSchema: json.Schema[JsObject] = json.Schema.`object`.Free[JsObject]()

// or alternatively define a metadata Predef in case you need this to not go to definition section of json-schema
// implicit val metaPredef: json.schema.Predef[JsObject] = json.schema.Predef(json.Schema.`object`.Free[JsObject]())

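// payload schema
// json.Json is fully qualified because play.api.libs.json._ also brings a Json object into scope
val payloadSchema: json.Schema[Payload] = json.Json.schema[Payload]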

There is also an API to make an object definition Free (and, vice versa, a Free definition Strict):

case class Person(name: String, age: Int)
val personSchema = Json.objectSchema[Person]()
val freePersonSchema = personSchema.free
val strictPersonSchema = freePersonSchema.strict

strictPersonSchema == personSchema // equal

Joda Time

Joda Time Support allows you to use joda-time classes within your models. Here is an example.

import com.github.andyglow.jsonschema.JodaTimeSupport._
import org.joda.time._

case class Event(id: String, timestamp: Instant)

val eventSchema: Schema[Event] = Json.schema[Event]

println(JsonFormatter.format(AsValue.schema(eventSchema)))

results in

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "id": {
      "type": "string"
    },
    "timestamp": {
      "$ref": "#/definitions/org.joda.time.Instant"
    }
  },
  "required": [
    "id",
    "timestamp"
  ],
  "definitions": {
    "org.joda.time.Instant": {
      "type": "string",
      "format": "date-time"
    }
  }
}

Cats

To enable integration with Cats we not only add it to the dependencies, we also need to import the integration package.

import com.github.andyglow.jsonschema.CatsSupport._

// TODO: provide examples
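
A minimal sketch of what using the Cats integration might look like, assuming the scala-jsonschema-cats module and cats-core are on the classpath. The Team model is hypothetical. The intent is for members to be described as a non-empty array; the exact JSON output is not reproduced here.

```scala
import cats.data.NonEmptyList
import com.github.andyglow.jsonschema.CatsSupport._
import json._

object CatsExample {
  // Hypothetical model: a team must have at least one member
  case class Team(name: String, members: NonEmptyList[String])

  val teamSchema: json.Schema[Team] = Json.schema[Team]
}
```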

Refined

For Refined types to be described accordingly we need, besides adding the integration to the dependency list, to import the integration package.

import com.github.andyglow.jsonschema.RefinedSupport._

// TODO: provide examples
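
A minimal sketch of a model refined with two of the predicates listed earlier, assuming the scala-jsonschema-refined module and the refined library are on the classpath. The Product model is hypothetical, and the constraints emitted for each predicate are not reproduced here.

```scala
import com.github.andyglow.jsonschema.RefinedSupport._
import eu.timepit.refined.api.Refined
import eu.timepit.refined.collection.NonEmpty
import eu.timepit.refined.numeric.Positive
import json._

object RefinedExample {
  // Hypothetical model with refined field types
  case class Product(
    name: String Refined NonEmpty,  // string that must not be empty
    quantity: Int Refined Positive) // number that must be greater than zero

  val productSchema: json.Schema[Product] = Json.schema[Product]
}
```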

Enumeratum

To stitch Enumeratum support in we need to add the corresponding integration to the dependencies, as well as import the integration package.

import com.github.andyglow.jsonschema.EnumeratumSupport._

// TODO: provide examples
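
A minimal sketch of an EnumEntry/Enum based enumeration used inside a model, assuming the scala-jsonschema-enumeratum module and Enumeratum itself are on the classpath. Color and Paint are hypothetical names.

```scala
import com.github.andyglow.jsonschema.EnumeratumSupport._
import enumeratum._
import json._

object EnumeratumExample {
  sealed trait Color extends EnumEntry

  object Color extends Enum[Color] {
    case object Red   extends Color
    case object Green extends Color
    case object Blue  extends Color

    // findValues is the Enumeratum macro that collects the case objects above
    val values = findValues
  }

  // Hypothetical model referencing the enum
  case class Paint(name: String, color: Color)

  val paintSchema: json.Schema[Paint] = Json.schema[Paint]
}
```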

Json Libraries

The library uses its own JSON model, com.github.andyglow.json.Value, to represent a JSON Schema as a JSON document. The project additionally contains several modules that connect it with the JSON library of your choice.

Currently supported:

  • Play Json
  • Spray Json
  • Circe
  • Json4s
  • uJson

Example usage: Play

import com.github.andyglow.jsonschema.AsPlay._
import json.schema.Version._
import play.api.libs.json._

case class Foo(name: String)

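// json.Json is fully qualified because play.api.libs.json._ also brings a Json object into scope
val fooSchema: JsValue = json.Json.schema[Foo].asPlay(Draft04())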

Example usage: Spray

import com.github.andyglow.jsonschema.AsSpray._
import json.schema.Version._
import spray.json._

case class Foo(name: String)

val fooSchema: JsValue = Json.schema[Foo].asSpray(Draft04())

Example usage: Circe

import com.github.andyglow.jsonschema.AsCirce._
import json.schema.Version._
import io.circe._

case class Foo(name: String)

// json.Json is fully qualified because io.circe._ also brings a Json type into scope
val fooSchema: Json = json.Json.schema[Foo].asCirce(Draft04())

Example usage: Json4s

import com.github.andyglow.jsonschema.AsJson4s._
import json.schema.Version._
import org.json4s.JsonAST._

case class Foo(name: String)

val fooSchema: JValue = Json.schema[Foo].asJson4s(Draft04())

Example usage: uJson

import com.github.andyglow.jsonschema.AsU._
import json.schema.Version._

case class Foo(name: String)

val fooSchema: ujson.Value = Json.schema[Foo].asU(Draft04())

Enumerations

A few words about enumeration support. Most of the time enumerations are just enumerations: we only need to know the allowed values. But sometimes we need more. Sometimes the values should carry extra information, such as titles and descriptions. The enum keyword of JSON Schema doesn't support this, unfortunately, but we can work around it: the macro can generate oneOf(const1, const2, const3, ...) instead of enum. For that you need to provide a special flag.

implicit val jsonSchemaFlags: Flag with Flag.EnumsAsOneOf = null

This flag should be in the implicit scope of the macro.

Example: say we have a Gender enum specified like this

sealed trait Gender
object Gender {
    case object Male extends Gender
    case object Female extends Gender
}

Usually Json.schema[Gender] returns something like this

{
  "type": "string",
  "enum": [
    "Male",
    "Female"
  ]
}

But after the flag is added, what we have is

{
  "oneOf": [
    { "const": "Male" },
    { "const": "Female" }
  ]
}

With this in place, we can add titles and descriptions to our models. For example, this model definition, with the EnumsAsOneOf flag enabled

  sealed trait Gender
  object Gender {
    @title("The Male") case object Male extends Gender
    /** The Female
      */
    case object Female extends Gender
  }

will produce schema such as

{
  "oneOf": [
    {
      "title": "The Male",
      "const": "Male"
    },
    {
      "description": "The Female",
      "const": "Female"
    }
  ]
}

For a better explanation of how to apply documentation tags to the model, please refer to the next chapter.

Documentation

By documentation we mean extra information carried along with the schema to improve its clarity. This is basically about support for 2 fields: title and description. There are 3 places where these fields may appear.

  • root model level
  • definition level
  • one-of / all-of / any-of level

There are 3 supported ways to maintain documented models.

  1. Annotations
  2. Config
  3. Scaladoc

Annotations

Scala-JsonSchema provides 2 annotations, @title and @description, that let you document a model; @description can also be applied to fields.

Example:

import json._
import json.schema._

@title("A Title")
@description("My perfect class")
case class Model(
    @description("A Param") a: String,
    @description("B Param") b: Int)

val schema = Json.objectSchema[Model]()

this, being translated into json, gets you

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "description": "My perfect class",
  "title": "A Title",
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "a": {
      "type": "string",
      "description": "A Param"
    },
    "b": {
      "type": "integer",
      "description": "B Param"
    }
  },
  "required": [
    "a",
    "b"
  ]
}

Config

Another approach that keeps your models concise but documented is to provide the documentation separately, as config.

Here is an example:

import json._

case class Model(a: String, b: Int)

val schema = Json.objectSchema[Model](
  "a" -> "A Param",
  "b" -> "B Param"
) .withDescription("My perfect class")
  .withTitle("A Title")

this, being translated into json, gets you the same result as the annotation-based approach

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "description": "My perfect class",
  "title": "A Title",
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "a": {
      "type": "string",
      "description": "A Param"
    },
    "b": {
      "type": "integer",
      "description": "B Param"
    }
  },
  "required": [
    "a",
    "b"
  ]
}

This approach also fits nicely when models are specified in a separate module or an external library.

Scaladoc

It is also possible to infer descriptions from scaladoc. This lets you reuse scaladoc that you might want to have anyway. This approach has its own drawbacks, though.

  • model classes must reside in the same module with schemas
  • it requires non-incremental build or full-rebuild to take effect

Example:

import json._

/** My perfect class
 * 
 * @param a A Param
 * @param b B Param
 */
case class Model(a: String, b: Int)
val schema = Json.objectSchema[Model]()

this, being translated into json, gets you

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "description": "My perfect class",
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "a": {
      "type": "string",
      "description": "A Param"
    },
    "b": {
      "type": "integer",
      "description": "B Param"
    }
  },
  "required": [
    "a",
    "b"
  ]
}

One small difference compared to the previous approaches is that you can't specify a title this way.

Combined approach

All 3 techniques can be used together. The only thing to keep in mind is that, to extract each kind of label, Scala-JsonSchema checks the sources in a certain order.

| Element                      | Order                            |
|------------------------------|----------------------------------|
| case class title             | Config -> Annotation -> Scaladoc |
| case class description       | Config -> Annotation -> Scaladoc |
| case class field description | Config -> Annotation -> Scaladoc |
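
The lookup order in the table can be pictured as a chain of Option fallbacks. The sketch below is only an illustration of that ordering, not the library's implementation: LabelSources and resolve are hypothetical names, and it assumes that the first source defining a value wins.

```scala
object LabelPrecedence {
  // Hypothetical container for the three possible sources of one label
  final case class LabelSources(
    config: Option[String],
    annotation: Option[String],
    scaladoc: Option[String])

  // First defined source wins: Config, then Annotation, then Scaladoc
  def resolve(sources: LabelSources): Option[String] =
    sources.config
      .orElse(sources.annotation)
      .orElse(sources.scaladoc)
}
```

For example, LabelPrecedence.resolve(LabelPrecedence.LabelSources(None, Some("from annotation"), Some("from scaladoc"))) evaluates to Some("from annotation"), because no config value is present.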

Annotations

| Annotation | Scope | Description                                    |
|------------|-------|------------------------------------------------|
| @readOnly  | Field | Adds "readOnly": true to property definition   |
| @writeOnly | Field | Adds "writeOnly": true to property definition  |
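
A minimal sketch of these field annotations in use. It assumes that @readOnly and @writeOnly live in the json.schema package alongside @title and @description; the Credentials model is hypothetical.

```scala
import json._
import json.schema._

object ReadWriteOnlyExample {
  // Hypothetical model
  case class Credentials(
    @readOnly id: String,        // should add "readOnly": true to the id property
    login: String,
    @writeOnly password: String) // should add "writeOnly": true to the password property

  val credentialsSchema: json.Schema[Credentials] = Json.schema[Credentials]
}
```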