Talk is cheap. Show me the code. (Linus Torvalds)

Introduction

One of the essential aspects of FP is immutable data structures, better known in FP jargon as values. Working with values, when possible, tends to produce code that is more readable, easier to maintain, and less prone to bugs. However, it sometimes comes at the cost of performance, because the copy-on-write approach is inefficient for large data structures. This is where persistent data structures come into play.
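
To make the idea concrete, here is a minimal sketch that uses only the Scala standard library, not json-values. It assumes nothing beyond the well-known behaviour of the immutable Vector and Map: an update returns a new value and the original stays as it was.

```scala
// Sketch using only the Scala standard library (not json-values).
// Scala's immutable Vector and Map are persistent: an update returns a new
// value and leaves the original untouched, without copying the whole structure.

val original: Vector[Int] = Vector.tabulate(1_000_000)(identity)

// Replaces one element; `original` is not modified.
val changed: Vector[Int] = original.updated(0, -1)

val config: Map[String, Int] = Map("retries" -> 3)
val tuned: Map[String, Int]  = config.updated("retries", 5)

@main def persistentDemo(): Unit =
  println(original(0))       // 0
  println(changed(0))        // -1
  println(config("retries")) // 3
  println(tuned("retries"))  // 5
```

Both versions remain usable after the update, which is what makes values safe to share between threads or actors.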

Why isn't there a persistent Json in Scala? That is the question I asked myself when I got into FP. Since I found no answer, I decided to implement one myself.

What to use json-values for and when to use it

  • You need to deal with Jsons, and you want to program following a functional style, using just functions and values.
  • For those architectures that work with JSON end-to-end, it's safe and efficient to have a persistent Json. Think of actors sending JSON messages to one another, for example.
  • You manipulate JSON all the time, and you'd like to do it with less ceremony. json-values is declarative and takes advantage of concepts from FP to define a powerful API.
  • Generating JSON to do Property-Based-Testing is child's play with json-values.
  • Generating specifications to validate JSON and parse strings or bytes very efficiently is a piece of cake.
  • Simplicity matters, and I'd argue that json-values is simple.
  • As Pat Helland said, Immutability Changes Everything!

Code wins arguments

JSON creation

import json.value.*

JsObj("name" -> JsStr("Rafael"),
      "languages" -> JsArray("Java", "Scala", "Kotlin"),
      "age" -> JsInt(1),
      "address" -> JsObj("street" -> JsStr("Elm Street"),
                         "coordinates" -> JsArray(3.32, 40.4)
                        )
     )

or using conversions:

import json.value.*
import json.value.Conversions.given

JsObj("name" -> "Rafael",
      "languages" -> JsArray("Java", "Scala", "Kotlin"),
      "age" -> 1,
      "address" -> JsObj("street" -> "Elm Street",
                         "coordinates" -> JsArray(3.32, 40.4)
                        )
     )

JSON validation

import json.value.spec.*

val spec = 
        JsObjSpec("name" ->  IsStr,
                  "languages" -> IsArrayOf(IsStr),
                  "age" -> IsInt,
                  "address" -> JsObjSpec("street" -> IsStr,
                                       "coordinates" ->  IsTuple(IsNumber,
                                                                 IsNumber
                                                                 )
                                      )
                 )
                 .withOptKeys("address")
    

You can customize your specs with predicates and operators:

import json.value.spec.*

val noneEmpty = IsStr(n => if n.nonEmpty then true else "empty name")
val ageSpec = IsInt(n => if n < 16 then "too young" else true)

val addressLocSpec = JsObjSpec("coordinates" ->  IsTuple(IsNumber,IsNumber))
val addressFullSpec =  
        JsObjSpec("street" -> noneEmpty,
                  "number" -> noneEmpty,
                  "zipcode" -> noneEmpty,
                  "country" -> noneEmpty
                 )     
val addressSpec = addressLocSpec or addressFullSpec

val spec = 
        JsObjSpec("name" ->  noneEmpty,
                  "languages" -> IsArrayOf(noneEmpty),
                  "age" -> ageSpec,
                  "address" -> addressSpec
                 )
                 .withOptKeys("address")
    
val errors:LazyList[(JsPath, Invalid)] = spec.validateAll(json)    

// Invalid:: (JsValue, SpecError)

As you can see, the predicates can return a string instead of a boolean to customize the error messages.
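
How can a predicate return either a boolean or a string? One possibility in Scala 3 is a union type. The sketch below is hypothetical and is not taken from the json-values sources: Verdict and check are names I made up for illustration, and the default message "invalid value" is an assumption.

```scala
// Hypothetical sketch, not the json-values implementation: a Scala 3 union
// type lets a predicate answer with either a Boolean or an error message.
type Verdict = Boolean | String

// Runs a predicate and normalises its answer into an Either.
def check[A](value: A, predicate: A => Verdict): Either[String, A] =
  predicate(value) match
    case message: String => Left(message) // custom error message
    case ok: Boolean =>
      if ok then Right(value) else Left("invalid value") // assumed default message

val nonEmpty: String => Verdict = s => if s.nonEmpty then true else "empty name"
val adult: Int => Verdict       = n => if n < 16 then "too young" else true

@main def verdictDemo(): Unit =
  println(check("Rafael", nonEmpty)) // Right(Rafael)
  println(check("", nonEmpty))       // Left(empty name)
  println(check(12, adult))          // Left(too young)
```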

JSON parsing

You can get a parser from a spec to parse a string or an array of bytes into a Json. Most json-schema implementations parse the whole Json and then validate it, which is very inefficient. json-values validates each element of the Json as soon as it is parsed. Failing fast is important too!

In addition, it uses the jsoniter-scala library to do the parsing, which is extremely fast and has a great API.

import json.value.JsObj     
import json.value.spec.parser.*

val parser:JsObjSpecParser = spec.parser

val json:JsObj = parser.parse("{}")

JSON generation

import json.value.gen.*
import json.value.*
import org.scalacheck.{Arbitrary, Gen}

val gen = 
      JsObjGen("name" -> Gen.alphaStr.map(JsStr),
               "languages" -> JsArrayGen.of(Gen.oneOf("scala", "java", "kotlin")).distinct,
               "age" -> Arbitrary.arbitrary[Int].map(JsInt),
               "address" -> JsObjGen("street" -> Gen.asciiStr.map(JsStr),
                                     "coordinates" -> TupleGen(Arbitrary.arbitrary[BigDecimal]
                                                                        .map(JsBigDec),
                                                               Arbitrary.arbitrary[BigDecimal]
                                                                        .map(JsBigDec)))
              )
        
                

or using conversions to avoid writing the map method:

import json.value.gen.*
import json.value.gen.Conversions.given
import org.scalacheck.{Arbitrary, Gen}

val gen = 
      JsObjGen("name" -> Gen.alphaStr,
               "languages" -> JsArrayGen.of(Gen.oneOf("scala", "java", "kotlin")).distinct,
               "age" -> Arbitrary.arbitrary[Int],
               "address" -> JsObjGen("street" -> Gen.asciiStr,
                                     "coordinates" -> TupleGen(Arbitrary.arbitrary[BigDecimal],
                                                               Arbitrary.arbitrary[BigDecimal]))
              )
        
                

When testing, it's important to generate valid and invalid data according to your specifications. Generators and specs can be used for this purpose:

import json.value.gen.*
import json.value.gen.Conversions.given
import org.scalacheck.{Arbitrary, Gen}

val gen = 
      JsObjGen("name" -> Gen.alphaStr,
               "languages" -> JsArrayGen.of(Gen.oneOf("scala", "java", "kotlin")).distinct,
               "age" -> Arbitrary.arbitrary[Int],
               "address" -> JsObjGen("street" -> Gen.asciiStr,
                                     "coordinates" -> TupleGen(Arbitrary.arbitrary[BigDecimal],
                                                               Arbitrary.arbitrary[BigDecimal]
                                                               )
                                     )
                                     .withOptKeys("street","coordinates")
                                     .withNullValues("street","coordinates")
               )
               .withOptKeys("name","languages","age","address")
               .withNullValues("name","languages","age","address")
               
val (validGen, invalidGen) = gen.partition(spec)  

            

JSON manipulation

Crafting safe and composable functions free of NullPointerException with optics is a piece of cake:

import json.value.*
import monocle.{Lens, Optional}

val nameLens:Lens[JsObj,String] = JsObj.lens.str("name")

val ageLens:Lens[JsObj,Int] = JsObj.lens.int("age")

val cityOpt:Optional[JsObj,String] = JsObj.optional.str(root / "address" / "city")

val latitudeOpt:Optional[JsObj,Double] = JsObj.optional.double(root / "address" / "coordinates" / "latitude")

//let's craft a function using lenses and optionals

val fn:Function[JsObj,JsObj]  = 
    ageLens.modify(_ + 1)
           .andThen(nameLens.modify(_.trim))
           .andThen(cityOpt.set("Paris"))
           .andThen(latitudeOpt.modify(lat => -lat))
           
         
val updated = fn(person)

No if-else expressions and no null checks. I'd say it's pretty expressive and concise. As you may notice, each field has an associated optic, and we create functions, like fn in the previous example, by putting them together (composition is key to handling complexity).

Filter, map and reduce were never so easy!

These functions traverse the whole Json recursively:

          
json.mapKeys(_.toLowerCase)
    .map(JsStr.prism.modify(_.trim))
    .filter(_.noneNull)
    .filterKeys(!_.startsWith("$"))
                    

JsPath

A JsPath represents the location of a specific value within a Json. It's a sequence of Position values, where a position is either a Key or an Index.

import json.value.JsPath

val a:JsPath = JsPath.root / "apple" / "carrot" / "melon"
val b:JsPath = JsPath.root / 0 / 1

val ahead:Position = a.head
ahead == Key("apple")

val atail:JsPath = a.tail
atail.head == Key("carrot")
atail.last == Key("melon")

val bhead:Position = b.head
bhead == Index(0)

//appending paths
val c:JsPath = a / b
c.head == Key("apple")
c.last == Index(1)

//prepending paths
val d:JsPath = a \ b
d.head == Index(0)
d.last == Key("melon")
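
If it helps to see the idea without the library, the behaviour above can be modelled with a plain vector of positions. This is a hypothetical sketch written for illustration only: MiniPath and Pos are made-up names, and the real JsPath may be implemented quite differently.

```scala
// Hypothetical model of a path, written for illustration only; the real
// JsPath in json-values may be implemented differently.
sealed trait Pos
final case class Key(name: String) extends Pos
final case class Index(i: Int) extends Pos

final case class MiniPath(positions: Vector[Pos]):
  def /(key: String): MiniPath     = MiniPath(positions :+ Key(key))
  def /(index: Int): MiniPath      = MiniPath(positions :+ Index(index))
  def /(other: MiniPath): MiniPath = MiniPath(positions ++ other.positions) // append
  def \(other: MiniPath): MiniPath = MiniPath(other.positions ++ positions) // prepend
  // head, last and tail assume a non-empty path in this sketch
  def head: Pos      = positions.head
  def last: Pos      = positions.last
  def tail: MiniPath = MiniPath(positions.tail)

object MiniPath:
  val root: MiniPath = MiniPath(Vector.empty)

@main def pathDemo(): Unit =
  val a = MiniPath.root / "apple" / "carrot" / "melon"
  val b = MiniPath.root / 0 / 1
  println((a / b).last) // Index(1)
  println((a \ b).head) // Index(0)
  println(a.tail.head)  // Key(carrot)
```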

JsValue

Every element in a Json is a subtype of JsValue. There is a specific type for each value described in json.org:

  • String
  • Number
  • Boolean
  • Null
  • JSON object
  • JSON array

There are five number specializations:

  • Integer
  • Long
  • Double
  • BigDecimal
  • BigInteger

json-values adds support for the Instant type. Instants are serialized into their string representation according to ISO-8601.

When it comes to the equals method, json-values is data-oriented: two Jsons are equal if they represent the same piece of information. For example, the following Jsons xs and ys have values with different primitive types, and their keys don't follow the same order:

val xs = JsObj("a" -> JsInt(1000),
               "b" -> JsBigDec(BigDecimal.valueOf(100_000_000_000_000L)),
               "c" -> JsInstant(Instant.parse("2022-05-25T14:27:37.353Z"))
              )

val ys = JsObj("b" -> JsBigInt(BigInteger.valueOf(100_000_000_000_000L)),
               "a" -> JsLong(1000L),
               "c" -> JsStr("2022-05-25T14:27:37.353Z")
              ) 

Nevertheless, since both objects represent the same piece of information:

{
  "a": 1000,
  "b": 100000000000000,
  "c": "2022-05-25T14:27:37.353Z"
}

It makes sense that both of them are equal objects and, therefore, have the same hashcode. The best way of exploring JsValue is by applying exhaustive pattern matching:

import json.value.*

val value: JsValue = ???

value match
  case primitive: JsPrimitive => primitive match

    case JsBool(b) => println("I'm a boolean")
    case JsNull => println("I'm null")
    case JsInstant(i) => println("I'm an instant")
    case JsStr(str) => println("I'm a string")
    
    case number: JsNumber => number match

      case JsInt(i) => println("I'm an integer")
      case JsDouble(d) => println("I'm a double")
      case JsLong(l) => println("I'm a long")
      case JsBigDec(bd) => println("I'm a big decimal")
      case JsBigInt(bi) => println("I'm a big integer")
  
  case json: Json[_] => json match

    case o: JsObj => println("I'm an object")
    case a: JsArray => println("I'm an array")
  
  case JsNothing => println("I'm a special type!")

The singleton JsNothing represents nothing. It's a convenient type that makes functions that return a JsValue total on their arguments. For example, the Json function

Json :: apply(path:JsPath):JsValue

is total because it returns a JsValue for every JsPath. If there is no element located at the given path, it returns JsNothing. On the other hand, inserting JsNothing at a path is like removing the element located at the path.
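
The two properties described here (a lookup that always returns something, and an insert that removes when given the "nothing" value) can be sketched with a tiny stand-alone model. Everything below is hypothetical: Node, Absent, lookup and put are illustrative names, not the json-values types or functions.

```scala
// Hypothetical mini model, for illustration only (not the json-values types).
sealed trait Node
final case class Text(value: String) extends Node
final case class Obj(fields: Map[String, Node]) extends Node
case object Absent extends Node // plays the role that JsNothing plays in the text

// Total lookup: every path yields a Node; a missing element yields Absent.
def lookup(node: Node, path: List[String]): Node = path match
  case Nil => node
  case key :: rest =>
    node match
      case Obj(fields) => fields.get(key).map(lookup(_, rest)).getOrElse(Absent)
      case _           => Absent

// Inserting Absent removes the key; inserting anything else sets it.
def put(obj: Obj, key: String, value: Node): Obj = value match
  case Absent => Obj(obj.fields - key)
  case other  => Obj(obj.fields.updated(key, other))

val doc = Obj(Map("a" -> Obj(Map("b" -> Text("hi")))))

@main def absentDemo(): Unit =
  println(lookup(doc, List("a", "b"))) // Text(hi)
  println(lookup(doc, List("a", "x"))) // Absent
  println(put(doc, "a", Absent))       // Obj(Map())
```

Because the lookup never throws and never returns null, callers can chain it freely and decide at the end what a missing value means.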

Creating Jsons

There are several ways of creating Jsons:

  • Using the apply methods of companion objects and passing in a map or a sequence of path/value pairs.
  • Parsing an array of bytes, a string, or an input stream. When possible, it's always better to work on a byte level. On the other hand, if the schema of the Json is known, the fastest way is to define a spec.
  • From an empty Json, and then using the API to insert new values.

Creating JsObjs

From a Map using the arrow notation:

import json.value.JsObj
import json.value.JsArray
import json.value.Conversions.given

val person = JsObj("type" -> "Person",
                   "age" -> 37,
                   "name" -> "Rafael",
                   "gender" -> "MALE",
                   "address" -> JsObj("location" -> JsArray(40.416775,
                                                            -3.703790
                                                           )
                                     ),
                   "book_ids" -> JsArray("00001",
                                         "00002"
                                        )
                   )

From a sequence of path/value pairs:

import json.value.Conversions.given

JsObj.pairs((root / "type","@Person"),
            (root / "age", 37),
            (root / "name", "Rafael"),
            (root / "gender", "MALE"),
            (root / "address" / "location" / 0, 40.416775),
            (root / "address" / "location" / 1, -3.703790),
            (root / "books_ids" / 0, "00001"),
            (root / "books_ids" / 1, "00002"),
           )

Parsing a string or an array of bytes when the schema of the Json is unknown:

val str:String = ??? 
val bytes:Array[Byte] = ??? 

val a:JsObj= JsObj.parse(str)
val b:JsObj= JsObj.parse(bytes)

Parsing a string or an array of bytes when the schema of the Json is known. We can create a spec to define the structure of the Json and then get a parser:

val spec:JsObjSpec = JsObjSpec("a" -> IsInt,
                               "b" -> IsStr,
                               "c" -> IsBool,
                               "d" -> JsObjSpec("e" -> IsLong,
                                                "f" -> IsTuple(IsNumber,IsNumber)
                                               ),
                               "e" -> IsArrayOf(IsStr)
                              )

val parser = spec.parser //reuse this object

From an empty Json and using the updated function of the API:

import json.value.Conversions.given

JsObj.empty.updated(root / "a" / "b" / 0, 1)
           .updated(root / "a" / "b" / 1, 2)
           .updated(root / "a" / "c", "hi")

Creating JsArrays

From a sequence of values:

import json.value.JsArray
import json.value.Conversions.given

JsArray(1,2,3)

JsArray("a","b","c")

JsArray("a", 1, JsObj("a" -> 1, "b" -> 2), JsNull, JsArray(0,1))

From a sequence of path/value pairs:

import json.value.JsArray
import json.value.Conversions.given

JsArray((root / 0, "a"),
        (root / 1, 1),
        (root / 2 / "a", 1),
        (root / 3, JsNull),
        (root / 4 / 0, 0),
        (root / 4 / 1, 1)
       )

Parsing a string or an array of bytes when the schema of the Json is unknown:

import json.value.JsArray

val str:String = ??? 
val bytes:Array[Byte] = ??? 

val a = JsArray.parse(str)
val b = JsArray.parse(bytes)

Parsing a string or an array of bytes when the schema of the Json is known. We can create a spec to define the structure of the Json array:

import json.value.spec.*
import json.value.spec.parser.*

val spec = IsTuple(IsStr,
                   IsInt,
                   JsObjSpec("a" -> IsStr),
                   IsStr.nullable,
                   IsArrayOf(IsInt)
                   )

val parser = spec.parser //reuse this object

val str:String = ??? 
val bytes:Array[Byte] = ??? 

val a = parser.parse(str)
val b = parser.parse(bytes)

From an empty array and using the appended and prepended functions of the API:

import json.value.*
import json.value.Conversions.given

JsArray.empty.appended("a")
             .appended("1")
             .appended(JsObj("a" -> 1, "b" -> true))
             .appended(JsNull)
             .appended(JsArray(0,1))

Putting data in and getting data out

There is one function to put data in a Json specifying a path and a value:

JsObj::   updated(path:JsPath, value:JsValue, padWith:JsValue = JsNull):JsObj
JsArray:: updated(path:JsPath, value:JsValue, padWith:JsValue = JsNull):JsArray

The updated function always inserts the value at the specified path, creating any needed container and padding arrays when necessary.

import json.value.Conversions.given
import json.value.*

// always true: if you insert a value, you'll get it back
json.updated(path, value)(path) == value 

JsObj.empty.updated(root / "a", 1) == JsObj("a" -> 1)
JsObj.empty.updated(root / "a" / "b", 1) == JsObj("a" -> JsObj("b" -> 1))

//padding array with empty strings
JsObj.empty.updated(root / "a" / 2, "z", padWith="") == JsObj("a" -> JsArray("","","z"))
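
The padding rule is easy to picture on a plain Vector. The helper below is an illustrative sketch built on the standard library only; updatedPadded is a made-up name and is not part of json-values.

```scala
// Illustrative helper, not part of json-values: writes `value` at `index`,
// first padding the vector with `padWith` if it is too short.
def updatedPadded[A](xs: Vector[A], index: Int, value: A, padWith: A): Vector[A] =
  require(index >= 0, "index must be non-negative")
  xs.padTo(index + 1, padWith).updated(index, value)

@main def paddingDemo(): Unit =
  println(updatedPadded(Vector.empty[String], 2, "z", "")) // Vector(, , z)
  println(updatedPadded(Vector(1, 2, 3), 1, 9, 0))         // Vector(1, 9, 3)
```

The same law as above holds for this sketch: whatever you write at an index, you read back from that index.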

New elements can be appended and prepended to a JsArray:

appended(value:JsValue):JsArray

prepended(value:JsValue):JsArray

appendedAll(xs:IterableOnce[JsValue]):JsArray

prependedAll(xs:IterableOnce[JsValue]):JsArray

Filter, map and reduce

The functions filter, filterKeys, map, mapKeys, and reduce traverse the whole Json recursively. All these functions are functors (they don't change the structure of the Json).

val json: JsObj = ???

val toLowerCase:String => String = _.toLowerCase

json mapKeys toLowerCase

json map JsStr.prism.modify(_.trim)

val isNotNull:JsPrimitive => Boolean = _.noneNull

json filter isNotNull

Flattening a Json

A Json can be seen as a set of (JsPath,JsValue) pairs. The flatten function returns a lazy list of pairs:

Json:: flatten:LazyList[(JsPath,JsValue)]

Returning a lazy list decouples the consumers from the producer. No matter the number of pairs that will be consumed, the implementation doesn't change.

Let's see an example:

val obj = JsObj("a" -> 1,
                "b" -> JsArray(1,"m", JsObj("c" -> true, "d" -> JsObj.empty))
               )

obj.flatten.foreach(println) // all the pairs are consumed

// (a, 1)
// (b / 0, 1)
// (b / 1, "m")
// (b / 2 / c, true)
// (b / 2 / d, {})
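
To show why a lazy list is a natural return type, here is a stand-alone sketch of a recursive flatten over a tiny tree model. It is hypothetical and does not use the json-values types: Tree, Leaf, ObjNode, ArrNode and flattenTree are illustrative names, and treating empty containers as leaves is an assumption based on the output shown above.

```scala
// Hypothetical mini model, for illustration only (not the json-values types).
sealed trait Tree
final case class Leaf(text: String) extends Tree
final case class ObjNode(fields: Vector[(String, Tree)]) extends Tree
final case class ArrNode(items: Vector[Tree]) extends Tree

type Path = Vector[String]

// Lazily yields one (path, value) pair per leaf. Empty containers are treated
// as leaves, mirroring the (b / 2 / d, {}) pair in the output above.
def flattenTree(tree: Tree, path: Path = Vector.empty): LazyList[(Path, Tree)] =
  tree match
    case ObjNode(fields) if fields.nonEmpty =>
      fields.to(LazyList).flatMap((key, child) => flattenTree(child, path :+ key))
    case ArrNode(items) if items.nonEmpty =>
      items.zipWithIndex.to(LazyList)
        .flatMap((child, i) => flattenTree(child, path :+ i.toString))
    case leafOrEmpty =>
      LazyList((path, leafOrEmpty))

val sample: Tree = ObjNode(Vector(
  "a" -> Leaf("1"),
  "b" -> ArrNode(Vector(
    Leaf("1"),
    Leaf("m"),
    ObjNode(Vector("c" -> Leaf("true"), "d" -> ObjNode(Vector.empty)))
  ))
))

@main def flattenDemo(): Unit =
  // The consumer decides how many pairs it wants; here, just the first two.
  flattenTree(sample).take(2).foreach { (p, v) =>
    val shown = p.mkString(" / ")
    println(s"$shown -> $v")
  }
```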

Specs

A Json spec defines the structure of a Json. Specs have attractive qualities like:

  • Easy to write. You can define Specs in the same way you define a raw Json.
  • Easy to compose. You glue specs together and create new ones easily.
  • Easy to extend. There are predefined specs that cover the most common scenarios. Nevertheless, you can create any imaginable spec from predicates.

Let's go straight to the point with an example:

import json.value.spec.*

val personSpec = JsObjSpec("@type" -> IsCons("Person"),
                           "age" -> IsInt,
                           "name" -> IsStr,
                           "gender" -> IsEnum("MALE","FEMALE"),
                           "address" -> JsObjSpec("location" -> IsTuple(IsNumber,
                                                                        IsNumber
                                                                        )
                                                 ),
                           "books_id" -> IsArrayOf(IsStr)
                          ).lenient

I think it's self-explanatory and, as mentioned, defining a spec is as simple as defining a Json. It's declarative and concise, with no ceremony at all.

There are a bunch of things we can do with a spec:

  • Validate a Json and get a stream with all the validation errors and their locations
    
val json = JsObj("a" -> 1,
                 "b" -> "hi", 
                 "c" -> JsArray(JsObj("d" -> "bye", 
                                      "e" -> 1)
                                )
                )

val spec = JsObjSpec("a" -> IsStr, 
                     "b" -> IsInt, 
                     "c" -> IsArrayOf(JsObjSpec("d" -> IsInstant, 
                                                "e" -> IsBool
                                               )
                                     )
                    )

val errors: LazyList[(JsPath, Invalid)] = spec validateAll json

errors foreach println

//output 

(a,Invalid(1,SpecError(STRING_EXPECTED)))
(b,Invalid(hi,SpecError(INT_EXPECTED)))
(c / 0 / d,Invalid(bye,SpecError(INSTANT_EXPECTED)))
(c / 0 / e,Invalid(1,SpecError(BOOLEAN_EXPECTED)))

  • Validate a Json to check whether it is valid or not (when you are not interested in the details of every possible error)

val result: Result = spec.validate(json)
result match 
   case Valid => println("valid json!")
   case Invalid(value, error) => println(s"the value $value doesn't conform to the spec: $error")

  • Get a parser to parse strings or bytes, as mentioned before

  • Filter a generator

val spec:JsObjSpec = ???
val gen:JsObjGen = ???

val (validDataGen, invalidDataGen) = gen.partition(spec)

or

val spec:JsObjSpec = ???
val gen:JsObjGen = ???

val xs = gen.retryUntil(spec)
val ys = gen.retryUntilNot(spec)

Reusing and composing specs is very straightforward. Spec composition is a good way of creating complex specs. You define little blocks and glue them together. Let's see an example:

val address = JsObjSpec("street" -> IsStr,
                        "number" -> IsInt,
                       )

val user = JsObjSpec("name" -> IsStr,
                     "id" -> IsStr
                    )

def userWithAddress = user concat JsObjSpec("address" -> address)

def userWithOptionalAddress = 
  (user concat JsObjSpec("address" -> address)).withOptKeys("address")
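
To see why gluing specs together is cheap, it can help to model a spec as plain data. The sketch below is hypothetical and much simpler than the real thing: MiniSpec is a made-up flat model (no nesting), and its concat, withOptKeys and validate only imitate the names used above, not the json-values implementation.

```scala
// Hypothetical mini model, for illustration only (not the json-values API):
// a spec is a map from key to predicate plus a set of optional keys.
final case class MiniSpec(
    fields: Map[String, Any => Boolean],
    optional: Set[String] = Set.empty
):
  // Gluing two specs together merges their fields and optional keys.
  def concat(other: MiniSpec): MiniSpec =
    MiniSpec(fields ++ other.fields, optional ++ other.optional)

  def withOptKeys(keys: String*): MiniSpec =
    copy(optional = optional ++ keys)

  // Returns one message per missing required key or rejected value.
  def validate(data: Map[String, Any]): List[String] =
    fields.toList.sortBy(_._1).flatMap { (key, isValid) =>
      data.get(key) match
        case None if optional(key) => Nil
        case None                  => List(s"$key: missing")
        case Some(v) if isValid(v) => Nil
        case Some(_)               => List(s"$key: invalid")
    }

val isStr: Any => Boolean = _.isInstanceOf[String]
val isInt: Any => Boolean = _.isInstanceOf[Int]

val userSpec    = MiniSpec(Map("name" -> isStr, "id" -> isStr))
val addressSpec = MiniSpec(Map("street" -> isStr, "number" -> isInt))

val combined         = userSpec.concat(addressSpec)
val combinedOptional = combined.withOptKeys("street", "number")

@main def specDemo(): Unit =
  val data = Map("name" -> "Ann", "id" -> "7")
  println(combined.validate(data))         // List(number: missing, street: missing)
  println(combinedOptional.validate(data)) // List()
```

Note how a misspelled key passed to withOptKeys would silently leave the real key required, which is worth keeping in mind when marking keys as optional.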

Generators

If you practice property-based testing and use ScalaCheck, you'll be able to design composable Json generators very quickly and naturally, as if you were writing out a Json.

Defining custom Json generators

Let's create a person generator:

import json.value.JsObj
import json.value.gen.Conversions.given
import json.value.gen.*
import org.scalacheck.Gen

def typeGen: Gen[String] = ???
def nameGen: Gen[String] = ???
def birthDateGen: Gen[String] = ???
def latitudeGen: Gen[Double] = ???
def longitudeGen: Gen[Double] = ???
def emailGen: Gen[String] = ???
def countryGen: Gen[String] = ???

def personGen:Gen[JsObj] = JsObjGen("@type" -> typeGen,
                                    "name" -> nameGen,
                                    "birth_date" -> birthDateGen,
                                    "email" -> emailGen,
                                    "gender" -> Gen.oneOf("Male",
                                                          "Female"
                                                         ),
                                    "address" -> JsObjGen("country" -> countryGen,
                                                          "location" -> TupleGen(latitudeGen,
                                                                                 longitudeGen
                                                                                )
                                                         )
                                   )

You can also create Json generators from pairs of JsPath and their generators:

JsObjGen.pairs((root / "@type" -> typeGen),
              (root / "name" -> nameGen),
              (root / "birth_date" -> birthDateGen),
              (root / "email" -> emailGen),
              (root / "gender" -> Gen.oneOf("Male","Female")),
              (root / "address" / "country" -> countryGen),
              (root / "address" / "location" / 0 -> latitudeGen),
              (root / "address" / "location" / 1 -> longitudeGen)
              )

A typical scenario is wanting some elements to be generated only some of the time.

There are two possible solutions:

  • Use the withOptKeys function to create a new generator where the specified keys are optional.

  • Using the special value JsNothing, you can customize the probability an element will be generated with. Remember that inserting JsNothing is like removing the element located at the path. Taking that into account, let's create a generator that produces Jsons without the key name with a probability of 50 percent:

def nameGen: Gen[JsStr] = ???

def optNameGen: Gen[JsValue] = Gen.oneOf(JsNothing,nameGen)

JsObjGen("name" -> optNameGen)

And we can change that probability using the ScalaCheck function Gen.frequency:

def nameGen: Gen[JsStr] = ???

def optNameGen: Gen[JsValue] = Gen.frequency((10,JsNothing),
                                             (90,nameGen)
                                             )

JsObjGen("name" ->  optNameGen)

Composing Json generators

Composing Json generators is key to handling complexity and avoiding repetition. There are two ways: inserting pairs into generators and joining generators:

def addressGen:Gen[JsObj] = JsObjGen("street" -> streetGen,
                                     "city" -> cityGen,
                                     "zip_code" -> zipCodeGen
                                    )

def addressWithLocationGen:Gen[JsObj] = 
            addressGen updated ("location", TupleGen(latitudeGen,longitudeGen))

                                                         

def namesGen = JsObjGen("family_name" -> familyNameGen,
                        "given_name" -> givenNameGen)

def contactGen = JsObjGen("email" -> emailGen,
                          "phone" -> phoneGen,
                          "twitter_handle" -> handleGen
                         )

val clientGen = namesGen concat contactGen concat addressWithLocationGen

Installation

The library is compatible with Scala 3.1.3 or greater.

libraryDependencies += "com.github.imrafaelmerino" %% "json-scala-values" % "5.2.1"

Disclaimer: I no longer maintain previous releases for Scala 2 and early versions of Dotty. Too much to handle for just one person...

Related projects

json-values was first developed in Java.