“Talk is cheap. Show me the code.” Linus Torvalds
- Introduction
- What to use json-values for and when to use it
- Code wins arguments
- JsPath
- JsValue
- Creating Jsons
- Putting data in and getting data out
- Filter, map and reduce
- Flattening a Json
- Specs
- Generators
- Installation
- Related projects
One of the essential aspects of FP is immutable data structures, better known in FP jargon as values. Working with values, when possible, leads to code that is more readable, easier to maintain, and less prone to bugs. However, this sometimes comes at the cost of performance, because the copy-on-write approach is inefficient for large data structures. This is where persistent data structures come into play.
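To make the idea of a persistent data structure concrete, here is a minimal sketch that uses only the persistent collections of the Scala standard library. Nothing in it is json-values API, and the names original and modified are just illustrative.

```scala
// Structural sharing with the standard library's persistent collections.
// Nothing here is json-values API; it only illustrates the idea.
val original: Map[String, Vector[Int]] =
  Map("a" -> Vector(1, 2, 3), "b" -> Vector(4, 5, 6))

// `updated` returns a new Map; `original` is left untouched.
val modified: Map[String, Vector[Int]] =
  original.updated("a", original("a").appended(4))

// The entry that was not modified is shared by reference, not copied.
val sharesUntouchedEntry: Boolean = original("b") eq modified("b")
```

Both versions remain usable after the update, and only the part that changed is rebuilt.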
Why isn't there a persistent Json in Scala? That's the question I asked myself when I got into FP. Since I found no answer, I decided to implement one myself.
- You need to deal with Jsons, and you want to program following a functional style, using just functions and values.
- For architectures that work with JSON end-to-end, it's safe and efficient to have a persistent Json. Think of actors sending JSON messages to one another, for example.
- You manipulate JSON all the time, and you'd like to do it with less ceremony. json-values is declarative and takes advantage of concepts from FP to define a powerful API.
- Generating JSON to do Property-Based-Testing is child's play with json-values.
- Generating specifications to validate JSON and parse strings or bytes very efficiently is a piece of cake.
- Simplicity matters, and I'd argue that json-values is simple.
- As Pat Helland said, Immutability Changes Everything!
import json.value.*
JsObj("name" -> JsStr("Rafael"),
"languages" -> JsArray("Java", "Scala", "Kotlin"),
"age" -> JsInt(1),
"address" -> JsObj("street" -> JsStr("Elm Street"),
"coordinates" -> JsArray(3.32, 40.4)
)
)
or using conversions:
import json.value.*
import json.value.Conversions.given
JsObj("name" -> "Rafael",
"languages" -> JsArray("Java", "Scala", "Kotlin"),
"age" -> 1,
"address" -> JsObj("street" -> "Elm Street",
"coordinates" -> JsArray(3.32, 40.4)
)
)
import json.value.spec.*
val spec =
JsObjSpec("name" -> IsStr,
"languages" -> IsArrayOf(IsStr),
"age" -> IsInt,
"address" -> JsObjSpec("street" -> IsStr,
"coordinates" -> IsTuple(IsNumber,
IsNumber
)
)
)
.withOptKeys("address")
You can customize your specs with predicates and operators:
import json.value.spec.*
val noneEmpty = IsStr(n => if n.nonEmpty then true else "empty name")
val ageSpec = IsInt(n => if n < 16 then "too young" else true)
val addressLocSpec = JsObjSpec("coordinates" -> IsTuple(IsNumber,IsNumber))
val addressFullSpec =
JsObjSpec("street" -> noneEmpty,
"number" -> noneEmpty,
"zipcode" -> noneEmpty,
"country" -> noneEmpty
)
val addressSpec = addressLocSpec or addressFullSpec
val spec =
JsObjSpec("name" -> noneEmpty,
"languages" -> IsArrayOf(noneEmpty),
"age" -> ageSpec,
"address" -> addressSpec
)
.withOptKeys("address")
val errors:LazyList[(JsPath, Invalid)] = spec.validateAll(json)
// Invalid:: (JsValue, SpecError)
As you can see, the predicates can return a string instead of a boolean to customize the error messages.
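A predicate that returns either a boolean or a message maps naturally onto a Scala 3 union type. The following is only a sketch of how such a result could be modeled and interpreted; Outcome and errorOf are hypothetical names, not part of json-values, and the library may well do it differently.

```scala
// Hypothetical model, not the library's source: a predicate result is either
// a Boolean or a custom error message.
type Outcome = Boolean | String

// Turns an Outcome into an optional error message (None means the value is valid).
def errorOf(outcome: Outcome, default: String = "invalid value"): Option[String] =
  outcome match
    case b: Boolean  => if b then None else Some(default)
    case msg: String => Some(msg)

// Same shape as the ageSpec predicate above.
val ageCheck: Int => Outcome = n => if n < 16 then "too young" else true

val tooYoung: Option[String] = errorOf(ageCheck(10)) // Some("too young")
val accepted: Option[String] = errorOf(ageCheck(30)) // None
```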
You can get a parser from a spec to parse a string or array of bytes into a Json. Most of the json-schema implementations parse the whole Json and then validate it, which is very inefficient. json-values validates each element of the Json as soon as it is parsed. Failing fast is important too!
It relies on the jsoniter-scala library to do the parsing, which is extremely fast and has a great API.
import json.value.JsObj
import json.value.spec.parser.*
val parser:JsObjSpecParser = spec.parser
val json:JsObj = parser.parse("{}")
import json.value.gen.*
import json.value.*
import org.scalacheck.{Arbitrary, Gen}
val gen =
JsObjGen("name" -> Gen.alphaStr.map(JsStr),
"languages" -> JsArrayGen.of(Gen.oneOf("scala", "java", "kotlin")).distinct,
"age" -> Arbitrary.arbitrary[Int].map(JsInt),
"address" -> JsObjGen("street" -> Gen.asciiStr.map(JsStr),
"coordinates" -> TupleGen(Arbitrary.arbitrary[BigDecimal]
.map(JsBigDec),
Arbitrary.arbitrary[BigDecimal]
.map(JsBigDec)))
)
or using conversions to avoid writing the map method:
import json.value.gen.*
import json.value.gen.Conversions.given
val gen =
JsObjGen(name -> Gen.alphaStr,
languages -> JsArrayGen.of(Gen.oneOf("scala", "java", "kotlin")).distinct,
age -> Arbitrary.arbitrary[Int],
address -> JsObjGen(street -> Gen.asciiStr,
coordinates -> TupleGen(Arbitrary.arbitrary[BigDecimal],
Arbitrary.arbitrary[BigDecimal]))
)
When testing, it's important to generate valid and invalid data according to your specifications. Generators and specs can be used for this purpose:
import json.value.gen.*
val gen =
JsObjGen("name" -> Gen.alphaStr,
"languages" -> JsArrayGen.of(Gen.oneOf("scala", "java", "kotlin")).distinct,
"age" -> Arbitrary.arbitrary[Int],
"address" -> JsObjGen("street" -> Gen.asciiStr,
"coordinates" -> TupleGen(Arbitrary.arbitrary[BigDecimal],
Arbitrary.arbitrary[BigDecimal]
)
)
.withOptKeys("street","coordinates")
.withNullValues("street","coordinates")
)
.withOptKeys("name","languages","age","address")
.withNullValues("name","languages","age","address")
val (validGen, invalidGen) = gen.partition(spec)
Crafting safe and composable functions free of NullPointerException with optics is a piece of cake:
import json.value.*
import monocle.{Lens ,Optional}
val nameLens:Lens[JsObj,String] = JsObj.lens.str("name")
val ageLens:Lens[JsObj,Int] = JsObj.lens.int("age")
val cityOpt:Optional[JsObj,String] = JsObj.optional.str(root / "address" / "city")
val latitudeOpt:Optional[JsObj,Double] = JsObj.optional.double(root / "address" / "coordinates" / "latitude")
//let's craft a function using lenses and optionals
val fn:Function[JsObj,JsObj] =
ageLens.modify(_ + 1)
.andThen(nameLens.modify(_.trim))
.andThen(cityOpt.set("Paris"))
.andThen(latitudeOpt.modify(lat => -lat))
val updated = fn(person)
Neither if-else expressions nor null checks. I'd say it's pretty expressive and concise. As you may notice, each field has an associated optic, and we create functions, like fn in the previous example, by putting them together (composition is key to handling complexity).
Filter, map and reduce were never so easy!
These functions traverse the whole Json recursively:
json.mapKeys(_.toLowerCase)
.map(JsStr.prism.modify(_.trim))
.filter(_.noneNull)
.filterKeys(!_.startsWith("$"))
A JsPath represents the location of a specific value within a Json. It's a sequence of Position, where each position is either a Key or an Index.
import json.value.* // JsPath, Position, Key and Index
val a:JsPath = JsPath.root / "apple" / "carrot" / "melon"
val b:JsPath = JsPath.root / 0 / 1
val ahead:Position = a.head
ahead == Key("apple")
val atail:JsPath = a.tail
atail.head == Key("carrot")
atail.last == Key("melon")
val bhead:Position = b.head
bhead == Index(0)
//appending paths
val c:JsPath = a / b
c.head == Key("apple")
c.last == Index(1)
//prepending paths
val d:JsPath = a \ b
d.head == Index(0)
d.last == Key("melon")
Every element in a Json is a subtype of JsValue. There is a specific type for each value described in json.org:
- String
- Number
- Boolean
- Null
- JSON object
- JSON array
There are five number specializations:
- Integer
- Long
- Double
- BigDecimal
- BigInteger
json-values adds support for the Instant type. Instants are serialized into their string representation according to ISO-8601.
When it comes to the equals method, json-values is data-oriented: two Jsons are equal if they represent the same piece of information. For example, the following Jsons xs and ys have values with different primitive types, and their keys don't follow the same order:
val xs = JsObj("a" -> JsInt(1000),
"b" -> JsBigDec(BigDecimal.valueOf(100_000_000_000_000L)),
"c" -> JsInstant(Instant.parse("2022-05-25T14:27:37.353Z"))
)
val ys = JsObj("b" -> JsBigInt(BigInteger.valueOf(100_000_000_000_000L)),
"a" -> JsLong(1000L),
"c" -> JsStr("2022-05-25T14:27:37.353Z")
)
Nevertheless, since both objects represent the same piece of information:
{
"a": 1000,
"b": 100000000000000,
"c": "2022-05-25T14:27:37.353Z"
}
It makes sense that both of them are equal objects and, therefore, have the same hashcode. The best way to explore JsValue is with an exhaustive pattern match:
import json.value.*
val value: JsValue = ???
value match
case primitive: JsPrimitive => primitive match
case JsBool(b) => println("I'm a boolean")
case JsNull => println("I'm null")
case JsInstant(i) => println("I'm an instant")
case JsStr(str) => println("I'm a string")
case number: JsNumber => number match
case JsInt(i) => println("I'm an integer")
case JsDouble(d) => println("I'm a double")
case JsLong(l) => println("I'm a long")
case JsBigDec(bd) => println("I'm a big decimal")
case JsBigInt(bi) => println("I'm a big integer")
case json: Json[_] => json match
case o: JsObj => println("I'm an object")
case a: JsArray => println("I'm an array")
case JsNothing => println("I'm a special type!")
The singleton JsNothing represents nothing. It's a convenient type that makes functions that return a JsValue total on their arguments. For example, the Json function
Json :: apply(path:JsPath):JsValue
is total because it returns a JsValue for every JsPath. If there is no element located at the given path, it returns JsNothing. On the other hand, inserting JsNothing at a path is like removing the element located at the path.
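The sketch below shows the idea of a total lookup on a deliberately tiny model. Value, Absent, and get are hypothetical names made up for the illustration (Absent plays the role JsNothing plays in the text); this is not the library's implementation.

```scala
// Hypothetical miniature model; the names below are illustrative only.
enum Value:
  case Str(s: String)
  case Num(n: Int)
  case Obj(fields: Map[String, Value])
  case Absent // plays the role that JsNothing plays in the text

import Value.*

// Total lookup: every path yields a Value, so callers never get null or an exception.
def get(value: Value, path: List[String]): Value =
  path match
    case Nil => value
    case key :: rest =>
      value match
        case Obj(fields) =>
          fields.get(key) match
            case Some(child) => get(child, rest)
            case None        => Absent
        case _ => Absent

val doc: Value = Obj(Map("address" -> Obj(Map("city" -> Str("Madrid")))))

val city: Value    = get(doc, List("address", "city")) // Str("Madrid")
val missing: Value = get(doc, List("address", "zip"))  // Absent
```

Because the result is always a Value, lookups compose without Option unwrapping or null checks; the caller decides what Absent means.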
There are several ways of creating Jsons:
- Using the apply methods of companion objects and passing in a map or a sequence of path/value pairs.
- Parsing an array of bytes, a string, or an input stream. When possible, it's always better to work on a byte level. On the other hand, if the schema of the Json is known, the fastest way is to define a spec.
- From an empty Json, and then using the API to insert new values.
From a Map using the arrow notation:
import json.value.JsObj
import json.value.JsArray
import json.value.Conversions.given
val person = JsObj("type" -> "Person",
"age" -> 37,
"name" -> "Rafael",
"gender" -> "MALE",
"address" -> JsObj("location" -> JsArray(40.416775,
-3.703790
)
),
"book_ids" -> JsArray("00001",
"00002"
)
)
From a sequence of path/value pairs:
import json.value.Conversions.given
JsObj.pairs((root / "type", "Person"),
(root / "age", 37),
(root / "name", "Rafael"),
(root / "gender", "MALE"),
(root / "address" / "location" / 0, 40.416775),
(root / "address" / "location" / 1, -3.703790),
(root / "book_ids" / 0, "00001"),
(root / "book_ids" / 1, "00002")
)
Parsing a string or an array of bytes when the schema of the Json is unknown:
val str:String = ???
val bytes:Array[Byte] = ???
val a:JsObj= JsObj.parse(str)
val b:JsObj= JsObj.parse(bytes)
Parsing a string or an array of bytes when the schema of the Json is known. We can create a spec to define the structure of the Json and then get a parser:
val spec:JsObjSpec = JsObjSpec("a" -> IsInt,
"b" -> IsStr,
"c" -> IsBool,
"d" -> JsObjSpec("e" -> IsLong,
"f" -> IsTuple(IsNumber,IsNumber)
),
"e" -> IsArrayOf(IsStr)
)
val parser = spec.parser //reuse this object
From an empty Json and using the updated function of the API:
import json.value.Conversions.given
JsObj.empty.updated(root / "a" / "b" / 0, 1)
.updated(root / "a" / "b" / 1, 2)
.updated(root / "a" / "c", "hi")
From a sequence of values:
import json.value.JsArray
import json.value.Conversions.given
JsArray(1,2,3)
JsArray("a","b","c")
JsArray("a", 1, JsObj("a" -> 1, "b" -> 2), JsNull, JsArray(0,1))
From a sequence of path/value pairs:
import json.value.JsArray
import json.value.Conversions.given
JsArray((root / 0, "a"),
(root / 1, 1),
(root / 2 / "a", 1),
(root / 3, JsNull),
(root / 4 / 0, 0),
(root / 4 / 1, 1)
)
Parsing a string or an array of bytes when the schema of the Json is unknown:
import json.value.JsArray
val str:String = ???
val bytes:Array[Byte] = ???
val a = JsArray.parse(str)
val b = JsArray.parse(bytes)
Parsing a string or an array of bytes when the schema of the Json is known. We can create a spec to define the structure of the Json array:
import json.value.spec.*
import json.value.spec.parser.*
val spec = IsTuple(IsStr,
IsInt,
JsObjSpec("a" -> IsStr),
IsStr.nullable,
IsArrayOf(IsInt)
)
val parser = spec.parser //reuse this object
val str:String = ???
val bytes:Array[Byte] = ???
val a = parser.parse(str)
val b = parser.parse(bytes)
From an empty array and using the appended and prepended functions of the API:
import json.value.*
import json.value.Conversions.given
JsArray.empty.appended("a")
.appended("1")
.appended(JsObj("a" -> 1, "b" -> true))
.appended(JsNull)
.appended(JsArray(0,1))
There is one function to put data in a Json specifying a path and a value:
JsObj:: updated(path:JsPath, value:JsValue, padWith:JsValue = JsNull):JsObj
JsArray:: updated(path:JsPath, value:JsValue, padWith:JsValue = JsNull):JsArray
The updated function always inserts the value at the specified path, creating any needed container and padding arrays when necessary.
import json.value.Conversions.given
import json.value.*
// always true: if you insert a value, you'll get it back
json.updated(path, value)(path) == value
JsObj.empty.updated(root / "a", 1) == JsObj("a" -> 1)
JsObj.empty.updated(root / "a" / "b", 1) == JsObj("a" -> JsObj("b" -> 1))
//padding array with empty strings
JsObj.empty.updated(root / "a" / 2, "z", padWith="") == JsObj("a" -> JsArray("","","z"))
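To illustrate what padding means, here is a small sketch on a plain Vector. The helper updatedPadded is hypothetical and assumes a non-negative index; it only mirrors the behaviour described above and is not json-values code.

```scala
// Illustrative helper on a plain Vector, not json-values code.
// Writes `value` at `index` (assumed >= 0), filling any gap before it with `padWith`.
def updatedPadded[A](xs: Vector[A], index: Int, value: A, padWith: A): Vector[A] =
  if index < xs.length then xs.updated(index, value)
  else xs.padTo(index, padWith).appended(value)

val padded: Vector[String] =
  updatedPadded(Vector.empty[String], 2, "z", padWith = "") // Vector("", "", "z")

val overwritten: Vector[String] =
  updatedPadded(Vector("a", "b"), 0, "x", padWith = "") // Vector("x", "b")
```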
New elements can be appended and prepended to a JsArray:
appended(value:JsValue):JsArray
prepended(value:JsValue):JsArray
appendedAll(xs:IterableOnce[JsValue]):JsArray
prependedAll(xs:IterableOnce[JsValue]):JsArray
The functions filter, filterKeys, map, mapKeys, and reduce traverse the whole Json recursively. The mapping functions behave like functors: they transform keys or values without changing the structure of the Json.
val json: JsObj = ???
val toLowerCase:String => String = _.toLowerCase
json mapKeys toLowerCase
json map JsStr.prism.modify(_.trim)
val isNotNull:JsPrimitive => Boolean = _.noneNull
json filter isNotNull
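For readers curious about what "traversing recursively without changing the structure" looks like, here is a minimal sketch on a toy tree. Tree and this mapKeys are hypothetical stand-ins written for the illustration, not the library's types or implementation.

```scala
// Hypothetical miniature model; not the library's types.
enum Tree:
  case Leaf(value: String)
  case Obj(fields: Map[String, Tree])
  case Arr(items: Vector[Tree])

import Tree.*

// Applies `f` to every key at every depth; the shape of the tree is preserved.
def mapKeys(tree: Tree, f: String => String): Tree =
  tree match
    case Obj(fields) => Obj(fields.map((k, v) => f(k) -> mapKeys(v, f)))
    case Arr(items)  => Arr(items.map(mapKeys(_, f)))
    case leaf        => leaf

val input: Tree =
  Obj(Map("NAME" -> Leaf("Rafael"),
          "LANGS" -> Arr(Vector(Obj(Map("ID" -> Leaf("scala")))))))

// Keys nested inside arrays and objects are lower-cased too.
val lowered: Tree = mapKeys(input, _.toLowerCase)
```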
A Json can be seen as a set of (JsPath,JsValue) pairs. The flatten function returns a lazy list of pairs:
Json:: flatten:LazyList[(JsPath,JsValue)]
Returning a lazy list decouples the consumers from the producer. No matter the number of pairs that will be consumed, the implementation doesn't change.
Here is an example:
val obj = JsObj("a" -> 1,
"b" -> JsArray(1,"m", JsObj("c" -> true, "d" -> JsObj.empty))
)
obj.flatten.foreach(println) // all the pairs are consumed
// (a, 1)
// (b / 0, 1)
// (b / 1, "m")
// (b / 2 / c, true)
// (b / 2 / d, {})
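The same behaviour can be sketched on a toy tree to show why a lazy list is a good fit. Node and this flatten are hypothetical and written only for the illustration; paths are rendered as plain strings here, whereas the library returns JsPath values.

```scala
// Hypothetical miniature model of a Json tree; not the library's types.
enum Node:
  case Leaf(value: String)
  case Obj(fields: List[(String, Node)])
  case Arr(items: List[Node])

import Node.*

// Lazily enumerates (path, value) pairs; empty containers are emitted as they are.
def flatten(node: Node, path: List[String] = Nil): LazyList[(String, Node)] =
  node match
    case Obj(fields) if fields.nonEmpty =>
      fields.to(LazyList).flatMap((key, child) => flatten(child, path :+ key))
    case Arr(items) if items.nonEmpty =>
      items.to(LazyList).zipWithIndex
        .flatMap((child, i) => flatten(child, path :+ i.toString))
    case other => LazyList((path.mkString(" / "), other))

// Same shape as the example above: {"a":1,"b":[1,"m",{"c":true,"d":{}}]}
val tree: Node =
  Obj(List("a" -> Leaf("1"),
           "b" -> Arr(List(Leaf("1"), Leaf("m"),
                           Obj(List("c" -> Leaf("true"), "d" -> Obj(Nil)))))))

val paths: List[String] = flatten(tree).map(_._1).toList
// List("a", "b / 0", "b / 1", "b / 2 / c", "b / 2 / d")

// A consumer that needs only the first pair does not force the rest.
val firstPath: String = flatten(tree).head._1 // "a"
```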
A Json spec defines the structure of a Json. Specs have attractive qualities like:
- Easy to write. You can define Specs in the same way you define a raw Json.
- Easy to compose. You glue specs together and create new ones easily.
- Easy to extend. There are predefined specs that cover the most common scenarios. Nevertheless, you can create any imaginable spec from predicates.
Let's go straight to the point with an example:
import json.value.spec.*
val personSpec = JsObjSpec("@type" -> IsCons("Person"),
"age" -> IsInt,
"name" -> IsStr,
"gender" -> IsEnum("MALE","FEMALE"),
"address" -> JsObjSpec("location" -> IsTuple(IsNumber,
IsNumber
)
),
"books_id" -> IsArrayOf(IsStr)
).lenient
I think it's self-explanatory and, as mentioned above, defining a spec is as simple as defining a Json. It's declarative and concise, with no ceremony at all.
There are a bunch of things we can do with a spec:
- Validate a Json and get a stream with all the validation errors and their locations
val json = JsObj("a" -> 1,
"b" -> "hi",
"c" -> JsArray(JsObj("d" -> "bye",
"e" -> 1)
)
)
val spec = JsObjSpec("a" -> IsStr,
"b" -> IsInt,
"c" -> IsArrayOf(JsObjSpec("d" -> IsInstant,
"e" -> IsBool)
)
)
val errors: LazyList[(JsPath, Invalid)] = spec validateAll json
errors foreach println
// output
// (a,Invalid(1,SpecError(STRING_EXPECTED)))
// (b,Invalid(hi,SpecError(INT_EXPECTED)))
// (c / 0 / d,Invalid(bye,SpecError(INSTANT_EXPECTED)))
// (c / 0 / e,Invalid(1,SpecError(BOOLEAN_EXPECTED)))
- Validate a Json to check whether it is valid or not (not interested in the detail about all the possible errors)
val result: Result = spec.validate(json)
result match
case Valid => println("valid json!")
case Invalid(value, error) => println(s"the value $value doesn't conform to the spec: $error")
- Get a parser to parse strings or bytes, as mentioned before
- Filter a generator
val spec:JsObjSpec = ???
val gen:JsObjGen = ???
val (validDataGen, invalidDataGen) = gen.partition(spec)
or
val spec:JsObjSpec = ???
val gen:JsObjGen = ???
val xs = gen.retryUntil(spec)
val ys = gen.retryUntilNot(spec)
Reusing and composing specs is very straightforward. Spec composition is a good way of creating complex specs: you define little blocks and glue them together. Here is an example:
val address = JsObjSpec("street" -> IsStr,
"number" -> IsInt,
)
val user = JsObjSpec("name" -> IsStr,
"id" -> IsStr
)
def userWithAddress = user concat JsObjSpec("address" -> address)
def userWithOptionalAddress =
(user concat JsObjSpec("address" -> address)).withOptKeys("address")
If you practice property-based testing and use ScalaCheck, you'll be able to design composable Json generators very quickly and naturally, as if you were writing out a Json.
Let's create a person generator:
import json.value.JsObj
import json.value.gen.Conversions.given
import json.value.gen.*
import org.scalacheck.Gen
def typeGen: Gen[String] = ???
def nameGen: Gen[String] = ???
def birthDateGen: Gen[String] = ???
def latitudeGen: Gen[Double] = ???
def longitudeGen: Gen[Double] = ???
def emailGen: Gen[String] = ???
def countryGen: Gen[String] = ???
def personGen:Gen[JsObj] = JsObjGen("@type" -> typeGen,
"name" -> nameGen,
"birth_date" -> birthDateGen,
"email" -> emailGen,
"gender" -> Gen.oneOf("Male",
"Female"
),
"address" -> JsObjGen("country" -> countryGen,
"location" -> TupleGen(latitudeGen,
longitudeGen
)
)
)
You can also create Json generators from pairs of JsPath and their generators:
JsObjGen.pairs((root / "@type" -> typeGen),
(root / "name" -> nameGen),
(root / "birth_date" -> birthDateGen),
(root / "email" -> emailGen),
(root / "gender" -> Gen.oneOf("Male","Female")),
(root / "address" / "country" -> countryGen),
(root / "address" / "location" / 0 -> latitudeGen),
(root / "address" / "location" / 1 -> longitudeGen)
)
A typical scenario is when we want some elements not to be always generated.
There are two possible solutions:
- Use the withOptKeys function to create a new generator where the specified keys are optional.
- Using the special value JsNothing, you can customize the probability an element will be generated with. Remember that inserting JsNothing is like removing the element located at the path. Taking that into account, let's create a generator that produces Jsons without the key name with a probability of 50 percent:
def nameGen: Gen[JsStr] = ???
def optNameGen: Gen[JsValue] = Gen.oneOf(Gen.const(JsNothing), nameGen)
JsObjGen("name" -> optNameGen)
And we can change that probability using the ScalaCheck function Gen.frequency:
def nameGen: Gen[JsStr] = ???
def optNameGen: Gen[JsValue] = Gen.frequency((10,JsNothing),
(90,nameGen)
)
JsObjGen("name" -> optNameGen)
Composing Json generators is key to handling complexity and avoiding repetition. There are two ways: inserting pairs into generators and joining generators:
def addressGen:Gen[JsObj] = JsObjGen("street" -> streetGen,
"city" -> cityGen,
"zip_code" -> zipCodeGen
)
def addressWithLocationGen:Gen[JsObj] =
addressGen updated ("location", TupleGen(latitudeGen,longitudeGen))
def namesGen = JsObjGen("family_name" -> familyNameGen,
"given_name" -> givenNameGen)
def contactGen = JsObjGen("email" -> emailGen,
"phone" -> phoneGen,
"twitter_handle" -> handleGen
)
val clientGen = namesGen concat contactGen concat addressWithLocationGen
The library is compatible with Scala 3.1.3 or greater:
libraryDependencies += "com.github.imrafaelmerino" %% "json-scala-values" % "5.2.1"
Disclaimer: I no longer maintain previous releases for Scala 2 and early versions of Dotty. Too much to handle for just one person...
json-values was first developed in Java.