azapen6 / okkam-json

Yet another JSON library in Scala, that equips Akka Streams with a JSON parsing flow. Its data model is suitable for Scala's matching framework.

Okkam JSON

Okkam JSON is a stream-oriented JSON parser for Scala, built on Akka Streams, that turns the incoming bytes of JSONs into hierarchically modeled objects.

Okkam JSON is designed to consume JSON streams. A JSON stream is a possibly infinite sequence of JSONs that can include whitespace and newlines at arbitrary positions.

For this purpose, Akka provides only a rudimentary JSON framing, which merely cuts the stream into individual JSONs so that an external JSON parser can process them. That is quite insufficient.

A primary motivation of Okkam JSON is to furnish Akka HTTP with JSON framing and parsing. To that end, the Okkam JSON parser works in one pass, i.e. the bytes of the processed stream, which may be infinite, are read only once from head to tail.

Of course, Okkam JSON can also parse a single JSON. This document contains many examples of parsing a single JSON to explain value extraction.

A guide to parsing JSON streams, with examples, begins at the Consuming JSON Streams section.

Installation

Add the following to your project's build.sbt:

libraryDependencies += "com.github.azapen6" %% "okkam-json" % s"0.2.1-a${akkaVersion}"

The last fragment specifies the Akka version which Okkam JSON depends on. Okkam JSON is available for each of Akka v.{2.5.17, 2.5.18, 2.5.19}.

If you want to use Okkam JSON to consume streams from web apps that require OAuth (or Basic) authentication and authorization, such as Twitter, another library, Okkam HTTP, offers "Akka HTTP with OAuth" and integrates with Okkam JSON in the same style. Okkam JSON and Okkam HTTP are independent of each other, so you can also do this without Okkam HTTP. Some examples in this document use Okkam HTTP. If you use Okkam HTTP together with Okkam JSON, also add the following to your project's build.sbt:

libraryDependencies += "com.github.azapen6" %% "okkam-http" % s"0.2.3-a${akkaVersion}-h${akkaHTTPVersion}"

If you would rather use raw Akka HTTP than Okkam HTTP, add the following:

lazy val akkaVersion = "2.5.19"

libraryDependencies ++= Seq(
  "com.typesafe.akka" %% "akka-actor" % akkaVersion,
  "com.typesafe.akka" %% "akka-stream" % akkaVersion,
)

Documentation

Scaladoc

Parsing JSON

The first example uses no Akka Streams API explicitly, except in the termination block. The common skeleton is as follows:

import okkam.json._
import OJsonValue._

import scala.io.StdIn
import scala.util.{Success, Failure}

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl._

object JsonExample {

  implicit val system = ActorSystem()
  implicit val materializer = ActorMaterializer()
  implicit val executionContext = system.dispatcher

  def main(args: Array[String]): Unit = {
    try {
      // put your code here

      StdIn.readLine()
    } finally {
      materializer.shutdown
      system.terminate
    }
  }
}

Our first task is to process the following simple JSON:

{
  "name": "Agu",
  "age": 30
}

For this task, the OJsonParser.parseJson(str: String) method is suitable. Put the following code into the main method:

val jsonStr = """
{
  "name": "Agu",
  "age": 30
}
"""

val jsonFuture = OJsonParser.parseJson(jsonStr)

jsonFuture onComplete {
  case Success(json) =>
    import OJsonValue._ // To use the extraction DSL.

    val name = json / "name" >> ~[String]
    val age = json / "age" >> ~[Int]

    println(s"name: ${name}, next age: ${age + 1}")

  case Failure(t) => throw t
}

Then run it from your sbt shell. The output will be

name: Agu, next age: 31

In the code shown above, json / "name" returns a sub-JSON, that is, the value part of the property tagged with "name". Since the input JSON has a property

"name": "Agu"

the sub-JSON is simply the string value "Agu".

Then the operator >> ~[String] extracts the value as an instance of String.

The hierarchical notation is useful for processing nested JSONs. Consider the following deeply nested JSON:

{
  "nest1": {
    "nest2": {
      "nest3": {
        "name": "Agu",
        "age": 30
      }
    }
  }
}

You can get the value of "name" as follows:

val name = json / "nest1" / "nest2" / "nest3" / "name" >> ~[String]

I think this is intuitive and easy to trace.

You can also get the value of "age" as follows:

val age = json / "nest1" / "nest2" / "nest3" / "age" >> ~[Int]

Writing json / "nest1" / "nest2" / "nest3" twice is redundant. The duplication can be removed by introducing a sub-JSON variable:

val nest3 = json / "nest1" / "nest2" / "nest3"

Then, you can get both values as follows:

val name = nest3 / "name" >> ~[String]
val age = nest3 / "age" >> ~[Int]

Extraction by Implicit Conversions

Implicit conversions are useful for extracting a lot of fields from JSONs. To use implicit conversions, import okkam.json.ImplicitConversions._. The first example can be written as follows:

jsonFuture onComplete {
  case Success(json) =>
    import okkam.json.ImplicitConversions._

    val name: String = json / "name"
    val age: Int = json / "age"

    println(s"name: ${name}, next age: ${age + 1}")

  case Failure(t) => throw t
}

or

jsonFuture onComplete {
  case Success(json) =>
    import okkam.json.ImplicitConversions._

    val name = json / "name" : String
    val age = json / "age" : Int

    println(s"name: ${name}, next age: ${age + 1}")

  case Failure(t) => throw t
}

Implicit conversions are favorable when you give extracted values to methods or classes, especially in the case that a case class has a lot of fields. Here I give a couple of examples:

Provided that a method f is defined as

def f(name: String, age: Int) = s"name: ${name}, next age: ${age + 1}"

You can pass extracted values for the arguments directly:

jsonFuture onComplete {
  case Success(json) =>
    import okkam.json.ImplicitConversions._

    val str = f(json / "name", json / "age")

    println(str)

  case Failure(t) => throw t
}

You can construct a case class that keeps values of a specific JSON, for example:

case class Profile(name: String, age: Int)

You can give extracted values to the fields directly:

jsonFuture onComplete {
  case Success(json) =>
    import okkam.json.ImplicitConversions._

    val profile = Profile(json / "name", json / "age")

    println(profile)

  case Failure(t) => throw t
}

Pretty Printing

Okkam JSON provides a pretty printing method OJsonValue.toPretty that returns a pretty string.

I show a simple example below:

val jsonStr = """{"array1":[{"nest1":{"int1":123},"str1":"hello"},{"nest2":{"array2":[4,5,6],"int2":789}}],"str2": "world"}"""

val jsonFuture = OJsonParser.parseJson(jsonStr)

jsonFuture onComplete {
  case Success(json) =>
    println(json.toPretty())

  case Failure(t) => throw t
}

The output will be

{
  "array1": [
    {
      "nest1": {
        "int1": 123
      },
      "str1": "hello"
    },
    {
      "nest2": {
        "array2": [
          4,
          5,
          6
        ],
        "int2": 789
      }
    }
  ],
  "str2": "world"
}

If you prefer an indent width of 4, json.toPretty(4) returns the desired string.

JSON Syntax

Okkam JSON strictly obeys the syntax specification of JSON.

When the parser (or the lexer) rejects the input JSON because of a syntax error, it throws java.lang.IllegalArgumentException with the line at which the error is detected. For example, the following bad JSON causes the exception:

{
  "name": "Agu",
  "age": 30,
}

Okkam JSON does not accept a comma that is not followed by an element (a trailing comma), because the JSON syntax specification prohibits it.

Optional Extraction

The first example shows a simple way to extract values from a single JSON. It assumes that, for every property, we know both its name and the type (e.g. string, integer, array, etc.) of its value. If we do not have full knowledge of the JSON to be processed, two problems can arise: a required property may be missing, or the type of a value may be incompatible with the expected type.

Scala's Option pattern gives one solution for this problem.

Extraction of Strings

Let me go back to the first example:

val jsonStr = """
{
  "name": "Agu",
  "age": 30
}
"""

val jsonFuture = OJsonParser.parseJson(jsonStr)

jsonFuture onComplete {
  case Success(json) =>
    import OJsonValue._

    // some value extraction

  case Failure(t) => throw t
}

If you add the following into the Success block and run it:

val sex = json / "sex" >> ~[String]

the exception NoSuchElementException with the message "<lost at sex>" will be thrown because there is no element tagged with "sex". I will explain later how to find out which element is missing.

To use extraction via Option, replace >> with >>?:

val name = json / "name" >>? ~[String] match {
  case Some(s) => s // matches here
  case None => "unknown" // unreachable
}
println("name: " + name)

The output will be name: Agu because the property "name" exists and its type is String.

If there exists no matching property of the given name, it matches None. For example, if you attempt to get the value tagged with "sex":

val sex = json / "sex" >>? ~[String] match {
  case Some(s) => s // does not match
  case None => "unknown" // matches here
}
println("sex: " + sex)

the output will be

sex: unknown
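
When the fallback is a plain default value, the Some/None match can be shortened with Option.getOrElse. The sketch below shows the idiom on a standard-library Option only. The Map named profile and the helper field are hypothetical stand-ins for a parsed JSON, with Map.get playing the role that >>? plays above.

```scala
object OptionDefaultSketch {
  // Hypothetical stand-in for a parsed JSON that has only string properties.
  val profile: Map[String, String] = Map("name" -> "Agu")

  // Equivalent to matching on `Some(s) => s` and `None => "unknown"`.
  def field(key: String): String = profile.get(key).getOrElse("unknown")

  def main(args: Array[String]): Unit = {
    println("name: " + field("name"))
    println("sex: " + field("sex"))
  }
}
```

Keep the explicit match when the two branches need to do more than pick a value, for example when the missing case should be logged.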

Optional Extraction by Implicit Conversions

Implicit conversions provide a useful way of optional extraction. You can do it in the same way as normal implicit extraction, except that you ask for an Option.

The previous example can be written as follows:

import okkam.json.ImplicitConversions._

val sexOption: Option[String] = json / "sex" 

val sex = sexOption match {
  case Some(s) => s // does not match
  case None => "unknown" // matches here
}
println("sex: " + sex)

or

import okkam.json.ImplicitConversions._

val sexOption = json / "sex" : Option[String]

val sex = sexOption match {
  case Some(s) => s // does not match
  case None => "unknown" // matches here
}
println("sex: " + sex)

As with normal extraction, implicit conversions are very useful for passing extracted values to methods or classes. Here is an example that constructs a case class:

case class Profile(name: String, age: Int, sex: Option[String])

There is no difference between normal and optional extractions:

jsonFuture onComplete {
  case Success(json) =>
    import okkam.json.ImplicitConversions._

    val profile = Profile(json / "name", json / "age", json / "sex")

  case Failure(t) => throw t
}

Extraction of Integers

Similar to OJsonValue.String_, the case class OJsonValue.Int_ holds a single integer value. First, a normal example:

val jsonStr = """{ "num": 3 }"""

val jsonFuture = OJsonParser.parseJson(jsonStr)

jsonFuture onComplete {
  case Success(json) =>
    val intVal = json / "num" >> ~[Int]

    println(intVal) // 3

  case Failure(t) => throw t
}

An integer value can be arbitrarily large. To hold numbers of arbitrary length, OJsonValue.Int_ takes a type parameter for its underlying value: Int (32-bit signed), Long (64-bit signed) or BigInteger (arbitrary length).

The parser selects the type of the underlying value automatically when it translates a lexical number into a number value, choosing a type that can hold the value.
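
The selection rule can be pictured with a small, self-contained sketch. It is not Okkam JSON's implementation: the helper narrowest is hypothetical, and it uses Scala's BigInt where the library is documented to use BigInteger. It only illustrates the idea of picking the narrowest type that can hold an integer literal.

```scala
object IntegerWidthSketch {
  // Hypothetical helper, not part of Okkam JSON.
  // Returns the value as Int, Long or BigInt, whichever is the
  // narrowest type that can represent the decimal literal.
  def narrowest(literal: String): Any = {
    val big = BigInt(literal)
    if (big.isValidInt) big.toInt
    else if (big.isValidLong) big.toLong
    else big
  }

  def main(args: Array[String]): Unit = {
    // One literal per range: fits Int, fits Long only, exceeds Long.
    Seq("6", "2147483648", "9223372036854775808").foreach { lit =>
      println(s"$lit -> ${narrowest(lit).getClass.getName}")
    }
  }
}
```

The return type is Any because the three result types have no more specific common supertype, which is the same reason the matched value has the static type Any in the pattern matching examples later in this document.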

As a result, extraction of an integer value is somewhat different from that of a string. The >> operator can be used in the same way as for strings, but it can be troublesome when the size of the integer is not known in advance.

For example, the following extraction by the >> operator causes an error:

val jsonStr = """{ "num": 2147483648 }""" // Int.MaxValue + 1

val jsonFuture = OJsonParser.parseJson(jsonStr)

jsonFuture onComplete {
  case Success(json) =>
    val intVal = json / "num" >> ~[Int] // `ClassCastException` will be thrown
    println(intVal)

  case Failure(t) => throw t
}

With extraction via Option, the type check and the extraction are done at the same time. Replace the Success block with the following:

json / "num" >>? ~[Int] match {
  case Some(i) => println(s"class: ${i.getClass.getName} value: $i")
  case None => println("Out of Int range")
}

The output will be Out of Int range.

Well then, replace the code with the following:

json / "num" >>? ~[Long] match {
  case Some(i) => println(s"class: ${i.getClass.getName} value: $i")
  case None => println("Out of Long range")
}

The output will be class: long value: 2147483648.

If you replace the value of the property "num" with an integer which does not exceed the Int range, for example,

val jsonStr = """{ "num": 6 }"""

the same extraction code will result in the output class: int value: 6.

You must take boundary conditions into account when you apply arithmetic operations. For example, the following code causes overflow:

val jsonStr = """{ "num": 2147483647 }""" // Int.MaxValue

val jsonFuture = OJsonParser.parseJson(jsonStr)

jsonFuture onComplete {
  case Success(json) =>
    json / "num" >>? ~[Int] match {
      case Some(i) => println("next: " + (i + 1))
      case None => println("Out of Int range")
    }

  case Failure(t) => throw t
}

The output will be next: -2147483648. It is clear that overflow occurs.
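
If silent wrap-around is not acceptable, the arithmetic itself has to be guarded. The following sketch is independent of Okkam JSON and uses only the standard library. It shows two common options: widening to Long before adding, and java.lang.Math.addExact, which throws ArithmeticException on overflow. The helper names nextWidened and nextChecked are made up for this illustration.

```scala
import scala.util.Try

object OverflowSketch {
  // Widen before adding, so the sum is computed in 64 bits.
  def nextWidened(i: Int): Long = i.toLong + 1L

  // Math.addExact throws ArithmeticException on Int overflow;
  // Try(...).toOption turns that failure into None.
  def nextChecked(i: Int): Option[Int] = Try(Math.addExact(i, 1)).toOption

  def main(args: Array[String]): Unit = {
    println("widened: " + nextWidened(Int.MaxValue))
    println("checked: " + nextChecked(Int.MaxValue))
    println("checked: " + nextChecked(30))
  }
}
```

Widening is the simpler choice when a Long result is acceptable downstream; the checked variant keeps the Int type and makes the overflow case explicit.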

Extraction of Integers by Implicit Conversions

Okkam JSON also provides extraction of integers by implicit conversions. You can do it in a similar way to extraction of strings. Conversion from the underlying value type is forced. Here I give only a trivial example:

case class Data(
  intValue: Int,
  longValue: Long,
  intOption: Option[Int]
)
val jsonStr =
"""
{
  "int_val": 123,
  "long_val": 1000000000000,
  "int_option": 456
}
"""
// int_option is possibly missing or larger than Int.MaxValue.

val jsonFuture = OJsonParser.parseJson(jsonStr)

jsonFuture onComplete {
  case Success(json) =>
    import okkam.json.ImplicitConversions._

    val data = Data(
      intValue = json / "int_val",
      longValue = json / "long_val",
      intOption = json / "int_option"
    )

  case Failure(t) => throw t
}

Extraction of Arrays

In JSON, arrays can contain any kinds of JSON values, which may be a jumble of values of different types. For example, the following expression is a valid (but unusual) JSON array:

[1, 2, "3", 3.5, [4, [5, ["6", [7]]]], { "x": 8, "y": 9 }, true, null]

The outermost array has 8 elements, i.e. the length of this array is exactly 8.

An empty array [] is also a valid array whose length is exactly 0.

In Okkam JSON, the class OJsonArray implements JSON arrays as IndexedSeq[OJsonValue]. The trait provides common operations for arrays, such as map, foreach, filter, mkString etc. For example, we can parse the example array above and print each element with its type name as follows:

val jsonStr = """[1, 2, "3", 3.5, [4, [5, ["6", [7]]]], { "x": 8, "y": 9 }, true, null]"""

val jsonFuture = OJsonParser.parseJson(jsonStr)

jsonFuture onComplete {
  case Success(json) =>
    val array0 = json >> ~[OJsonArray]

    println(s"length: ${array0.length}, class: ${array0.getClass.getName}")
    array0 foreach { e => println(s"value: $e, class: ${e.getClass.getName}") }

  case Failure(t) => throw t
}

The output will be

length: 8, class: okkam.json.OJsonArray
value: 1, class: okkam.json.OJsonValue$Int_
value: 2, class: okkam.json.OJsonValue$Int_
value: 3, class: okkam.json.OJsonValue$String_
value: 3.5, class: okkam.json.OJsonValue$Float_
value: [4, [5, [6, [7]]]], class: okkam.json.OJsonArray
value: { "x": 8, "y": 9 }, class: okkam.json.OJsonObject
value: true, class: okkam.json.OJsonValue$Bool_
value: null, class: okkam.json.OJsonValue$Null_$

The length of the example array is 8 as expected. The actual class of each value is not the subject here. (See the Type Hierarchy section.)

As far as arrays are concerned, the important case is an array that contains values of the same type; for example, [1, 2, 3] contains integers only. For such uniform arrays, it is useful to map the extraction of each element over the array.

Okkam JSON provides direct extraction operators >>> (equivalent to extractArray method) and >>>? (equivalent to extractArrayOption) from OJsonValue to IndexedSeq[<type>]. The difference between >>> and >>>? is similar to that of >> and >>?.

I show a simple example below:

val jsonStr = """[1, 2, 3]"""

val jsonFuture = OJsonParser.parseJson(jsonStr)

jsonFuture onComplete {
  case Success(json) =>
    val array0 = json >>> ~[Int]

    println(s"length: ${array0.length}, class: ${array0.getClass.getName}")
    array0 foreach { e => println(s"value: $e, class: ${e.getClass.getName}") }

  case Failure(t) => throw t
}

The output will be

length: 3, class: scala.collection.immutable.Vector
value: 1, class: int
value: 2, class: int
value: 3, class: int

Of course, you can use map for extraction as

json >> ~[OJsonArray] map { _ >> ~[Int] }

This looks somewhat indirect, but it lets you apply a conversion to each extracted value.

Extraction of JSON Object

A JSON object (the term is more or less confusing) is an associative array that contains key-value pairs, called "properties". We have already seen examples of JSON objects above, for example,

{
  "name": "Agu",
  "age": 30
}

is a single JSON object.

Similar to arrays, JSON objects can be nested and also can be included in arrays. For example, the following expression is a valid JSON object:

{
  "array1": [
    { "nest1": {
        "int1": 123
      },
      "str1": "hello"
    },
    { "nest2": {
        "array2": [4, 5, 6],
        "int2": 789
      }
    }
  ],
  "str2": "world"
}

Unlike mixed arrays, complicated JSON objects like this one do occur in practice.

As seen before, you can access a value located in a JSON object by using the / (sub-JSON) operator, if you know its structure. The complete value extraction from the last JSON object is as follows:

val array1 = json / "array1" >> ~[OJsonArray]

val nest1 = array1(0) / "nest1"
val int1 = nest1 / "int1" >> ~[Int]
val str1 = array1(0) / "str1" >> ~[String]

val nest2 = array1(1) / "nest2"
val array2 = nest2 / "array2" >>> ~[Int]
val int2 = nest2 / "int2" >> ~[Int]

val str2 = json / "str2" >> ~[String]

println("int1: " + int1)
println("str1: " + str1)
println("array2: " + array2)
println("int2: " + int2)
println("str2: " + str2)

In Okkam JSON, the class OJsonObject implements JSON objects as Map[String, OJsonValue].

For example, you can get names (corresponding keys of Map) of properties as follows:

val jsonObj = json >> ~[OJsonObject]
jsonObj.keys foreach println

Type Hierarchy

In the code shown above, json and json / "sex" are instances of the type OJsonValue; moreover, all parsed JSON values are instances of OJsonValue.

OJsonValue is the root type of parsed values in Okkam JSON. It defines the / operator (equivalent to the subJson method), the >> operator (equivalent to the extract method), the >>? operator (equivalent to the extractOption method) and the toPretty method.

The type hierarchy of OJsonValue is as follows (each of .<Type> stands for OJsonValue.<Type>):

OJsonValue
    |
    |---- OJsonObject // JSON object e.g. { "key": "value" }
    |
    |---- OJsonArray // JSON array e.g. [1, 2, 3]
    |
    |---- .Lost // matches when a required element is missing.
    |
    |---- .Strict
              |
              |---- .Null_ // null
              |
              |---- .Bool_ // true or false
              |         |
              |         |---- .True_ // true
              |         |
              |         |---- .False_ // false
              |
              |---- .String_ // string value
              |
              |---- .Number // number value
                        |
                        |---- .Int_ // integer value
                        |
                        |---- .Float_ // floating-point value

Extraction via Pattern Matching

Extraction of String

You can extract values of JSONs via Scala's pattern matching along the type hierarchy of OJsonValue. For example, assume that we want to extract values of the following JSON:

{
  "str_val": "This is a string.",
  "int_val": 123,
  "long_val": 2147483648,
  "bool_true": true,
  "bool_false": false,
  "null_val": null,
  "array": [1, 2, 3]
}

If we want to extract the value of "str_val", the following matching works:

val jsonFuture = OJsonParser.parseJson(jsonStr) // the JSON above

jsonFuture onComplete {
  case Success(json) =>
    json / "str_val" match {
      case String_(s) => println("String: " + s)
      case Lost(_) => println("No match")
      case _ => throw new Exception("Unreachable")
    }

  case Failure(t) => throw t
}

The output will be

String: This is a string.

Extraction of Boolean Values

A Boolean value is either true or false. The case class Bool_ matches both of them, in a similar way to String_:

json / "bool_true" match {
  case Bool_(b) => println("Boolean: " + b) // matches here
  case _ => println("Unreachable")
}

The output will be

Boolean: true

Since a Boolean value is one of two constants, it can be tested directly as follows:

json / "bool_false" match {
  case True_() => println("Boolean: true")
  case False_() => println("Boolean: false") // matches here
  case _ => println("Unreachable")
}

Note that the empty parentheses () following True_ and False_ cannot be omitted.

Extraction of null

It is important that pattern matching can distinguish null from other types because null is compatible with any type. For example, the following avoids matching null with other types:

json / "null_val" match {
  case String_(s) => println("String: " + s)
  case Int_(i) => println(s"Int: $i, class: ${i.getClass.getName}")
  case Float_(f) => println(s"Float: $f, class: ${f.getClass.getName}")
  case True_() => println("Boolean: true")
  case False_() => println("Boolean: false")
  case Null_() => println("Null: null") // matches here
  case _ => println("No match")
}

The empty parentheses () following Null_ cannot be omitted either, just like the two above.

The extraction operators convert null into any required type. For example, both of the following extractions result in exactly 0 without any exception:

val i = json / "null_val" >> ~[Int]
println(s"value: $i, class: ${i.getClass.getName}")
json / "null_val" >>? ~[Int] match {
  case Some(i) => println(s"value: $i, class: ${i.getClass.getName}")
  case None => println("no match") // unreachable
}

If a certain JSON has a property which holds either a number value or null, and null must be distinguished from the number 0, both of the extractions above will give an undesirable result.

To deal with the case, the following works:

json / "null_val" match {
  case Int_(i) => println(s"Int: $i, class: ${i.getClass.getName}")
  case Null_() => println("Null: null") // matches here
  case _ => println("No match")
}
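
The same three-way distinction (number, null, missing) can be carried into your own data model by mapping each outcome to a different value. The sketch below is self-contained and deliberately does not use Okkam JSON: MiniJson, IntV, NullV and LostV are hypothetical stand-ins for the kinds of values matched above, so the shape of the match is the point, not the type names.

```scala
object NullVersusZeroSketch {
  // Hypothetical stand-ins, not Okkam JSON types.
  sealed trait MiniJson
  final case class IntV(value: Long) extends MiniJson
  case object NullV extends MiniJson // explicit JSON null
  case object LostV extends MiniJson // property absent

  // Right(Some(n)): a number, Right(None): explicit null, Left: property missing.
  def nullableInt(v: MiniJson): Either[String, Option[Long]] = v match {
    case IntV(n) => Right(Some(n))
    case NullV   => Right(None)
    case LostV   => Left("property is missing")
  }

  def main(args: Array[String]): Unit = {
    // 0 and null stay distinguishable, unlike with a plain numeric extraction.
    Seq(IntV(0), NullV, LostV).foreach(v => println(s"$v -> ${nullableInt(v)}"))
  }
}
```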

Extraction of Number

The last example includes a case that matches and extracts an integer value:

case Int_(i) => println(s"Int: $i, class: ${i.getClass.getName}")

If you replace "null_val" with "int_val", the output will be

Int: 123, class: java.lang.Integer

If you replace "null_val" with "long_val", the output will be

Int: 2147483648, class: java.lang.Long

Since the actual types are incompatible, the static type of i in the case turns out to be Any, so the following will cause a compilation error:

case Int_(i) =>
  val intVal: Int = i
  println(i)

As described above, you can extract an Int value with the >> operator:

val intVal = json / "int_val" >> ~[Int]
println(intVal)

but it causes an exception when the value exceeds the Int range:

val intVal = json / "long_val" >> ~[Int] // `ClassCastException` is thrown
println(intVal)

Pattern matching gives simple solutions to handle this problem.

One is to separate cases in actual types as follows:

json / "int_val" match {
  case Int_(i: Int) => println("Int: " + i) // matches here
  case Int_(i: Long) => println("Long: " + i)
  case Null_() => println("Null: null")
  case _ => println("No match")
}
json / "long_val" match {
  case Int_(i: Int) => println("Int: " + i)
  case Int_(i: Long) => println("Long: " + i) // matches here
  case Null_() => println("Null: null")
  case _ => println("No match")
}

If the value exceeds the Long range, it does not match the first two cases and results in no match.

Since an integer value that does not exceed the Int range is translated into an Int value, you cannot omit the first case even if you need a Long value. The next approach is suitable for handling this case.

Another way is to extract the value by using >> after it matches Int_. This can be written as follows:

json / "int_val" match {
  case i: Int_[_] =>
    val intVal = i >> ~[Int]
    println("Int: " + intVal)
  case Null_() => println("Null: null")
  case _ => println("No match")
}

The wildcard [_] of Int_[_] is mandatory because Int_ takes a type parameter.

An advantage of this approach is that the type of the underlying value does not influence the top-level matching. You can therefore extract any Long value in one case.

json / "long_val" match {
  case i: Int_[_] =>
    val longVal = i >> ~[Long]
    println("Long: " + longVal)
  case Null_() => println("Null: null")
  case _ => println("No match")
}
json / "int_val" match {
  case i: Int_[_] =>
    val longVal = i >> ~[Long]
    println("Long: " + longVal)
  case Null_() => println("Null: null")
  case _ => println("No match")
}

Reading JSON from file

Assume that you have a file whose name is example1.json and whose content is the following single JSON:

{
  "name": "Hiyori",
  "age": 4
}

To read JSON from the file, the OJsonParser.parseJson method that takes a java.nio.file.Path (as returned by Paths.get) is suitable. The charset is assumed to be UTF-8. No other charset is available in the current version.

I show an example below, that is similar to the first one:

import java.nio.file.Paths

val jsonFuture = OJsonParser.parseJson(Paths.get("example1.json"))

jsonFuture onComplete {
  case Success(json) =>
    import OJsonValue._ // To use the extraction DSL.

    val name = json / "name" >> ~[String]
    val age = json / "age" >> ~[Int]

    println(s"name: ${name}, next age: ${age + 1}")

  case Failure(t) => throw t
}

The output will be

name: Hiyori, next age: 5

Getting information from Twitter

The second example gets tweets from a user timeline on Twitter. It requires your application's Consumer Key and Secret and Access Token and Secret (or a Bearer token for app-only auth).

import okkam.json._
import okkam.json.OJsonValue._

import okkam.http._
import OHttpUrl._

import scala.io.StdIn
import scala.concurrent.Future
import scala.util.{Success, Failure}

import akka.http.scaladsl.model._
import akka.stream.scaladsl._

object TwitterExample {

  val osys = OHttpSystem()
  import osys._

  val twitter = OHttpClient(
    OAuth1.KeyPair(
      "ABC...", // Consumer Key
      "DEF..."), // Consumer Secret
    OAuth1.TokenPair(
      "GHI...", // Access Token
      "JKL...")) // Access Token Secret

  def processJson(ent: HttpEntity) = {
    val parseFuture =
      OJsonParser.parseJson(ent.dataBytes) map { json =>
        val tweets = json >> ~[OJsonArray]

        tweets foreach { status =>
          val text = status / "text" >> ~[String]
          println("text: " + text)
        }

        json
      }

    parseFuture onComplete {
      case Success(_) => println("Success in parsing JSON!")
      case Failure(t) => throw t
    }
  }

  def getTweets = {
    val request = OHttpRequest.GET(
      https"api.twitter.com/1.1/statuses/user_timeline.json?count=3"
    )

    val entityFuture =
      twitter.makeRequestWithCallback(request) { res =>
        if (res.status == StatusCodes.OK)
          res.entity
        else
          throw new RuntimeException("HTTP Error: " + res.status)
      }

    entityFuture onComplete {
      case Success(ent) => processJson(ent)
      case Failure(t) => println(t)
    }
  }

  def main(args: Array[String]): Unit = {
    try {
      getTweets

      StdIn.readLine()
    } finally {
      osys.shutdown
    }
  }

}

Consuming JSON Streams

Okkam JSON is designed to consume JSON streams. A JSON stream is a possibly infinite sequence of JSONs that can include whitespace and newlines at arbitrary positions.

Okkam JSON integrates JSON framing and parsing in one pass, i.e. the bytes of the processed stream are read only once.
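
To make the term "framing" concrete, here is a minimal one-pass framing sketch in plain Scala. It is not Okkam JSON's implementation, and it does no parsing at all: the hypothetical helper frames only tracks nesting depth and string literals so that it can cut a concatenation of top-level JSON objects and arrays into separate strings, visiting each character exactly once. Top-level scalars and malformed input are outside its scope.

```scala
object FramingSketch {
  // Hypothetical helper: returns one string per top-level object or array.
  def frames(input: String): Vector[String] = {
    val out = Vector.newBuilder[String]
    val current = new StringBuilder
    var depth = 0        // nesting depth of {} and []
    var inString = false // currently inside a JSON string literal?
    var escaped = false  // previous character was a backslash inside a string?

    for (c <- input) {
      // Characters between frames (depth 0) are whitespace and are dropped.
      if (depth > 0) current += c
      if (inString) {
        if (escaped) escaped = false
        else if (c == '\\') escaped = true
        else if (c == '"') inString = false
      } else c match {
        case '"' => inString = true
        case '{' | '[' =>
          if (depth == 0) current += c // first character of a new frame
          depth += 1
        case '}' | ']' if depth > 0 =>
          depth -= 1
          if (depth == 0) { out += current.result(); current.clear() }
        case _ => ()
      }
    }
    out.result()
  }

  def main(args: Array[String]): Unit = {
    val stream = "{ \"first\": 1 }\n{ \"second\":\n  { \"count\": 2 } }\n\n[3, \"}\"]\n"
    frames(stream).zipWithIndex.foreach { case (f, i) => println(s"#${i + 1}: $f") }
  }
}
```

Brackets inside string literals must not change the depth, which is why the sketch tracks inString and escaped. A combined framer and parser can build the value tree during the same scan instead of handing each frame to a second parser.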

The OJsonParser.parseJsonForeach methods provide a basic function that parses each JSON one by one and passes it to a callback as an OJsonValue.

Here is a simple and complete example in which OJsonParser.parseJsonForeach parses a sequence of three JSONs and pretty-prints each of them:

import okkam.json._
import OJsonValue._

import scala.io.StdIn
import scala.util.{Success, Failure}

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl._

object JsonStreamExample {

  implicit val system = ActorSystem()
  implicit val materializer = ActorMaterializer()
  implicit val executionContext = system.dispatcher

  def main(args: Array[String]): Unit = {
    try {

      val jsonStream = """
    { "first": 1 }
    { "second":
      { "count": 2
       } }

      { "third": {
        "next": "fourth"} }


"""

      var number = 1
      val completionFuture =
        OJsonParser.parseJsonForeach(jsonStream) { json => // callback takes one `OJsonValue`
          println(s"#${number}:\n${json.toPretty()}\n")
          number += 1
        }

      completionFuture onComplete {
        case Success(_) => println("Parsing stream has successfully finished.")
        case Failure(t) => throw t
      }

      StdIn.readLine()
    } finally {
      materializer.shutdown
      system.terminate
    }
  }
}

Parsing JSON Stream from Twitter

Okkam JSON and Okkam HTTP are designed to be suitable for consuming the Twitter Streaming API.

Although User Streams and Site Streams are going to be closed, a few streaming APIs remain available. POST statuses/filter is one of them.

Here, we show an example that prints the texts of tweets collected by statuses/filter. It includes one advanced use of Okkam HTTP: the request returns an HttpEntity before the entire body has arrived, instead of an OHttpResponse.

The following example simply shows the texts of tweets that include some of the keywords given as arguments:

import okkam.json._
import okkam.json.OJsonValue._

import okkam.http._
import OHttpUrl._

import scala.io.StdIn
import scala.concurrent.Future
import scala.util.{Success, Failure}

import akka.http.scaladsl.model._
import akka.stream.scaladsl._

object TwitterStreamExample {

  val osys = OHttpSystem()
  import osys._

  val twitter = OHttpClient(
    OAuth1.KeyPair(
      "ABC...", // Consumer Key
      "DEF..."), // Consumer Secret
    OAuth1.TokenPair(
      "GHI...", // Access Token
      "JKL...")) // Access Token Secret

  def processJsonStream(ent: HttpEntity) = {
    val parseFuture =
      OJsonParser.parseJsonForeach(ent.dataBytes) { json =>
        val text = json / "text" >> ~[String]
        println(text)
      }

    parseFuture onComplete {
      case Success(_) => println("\nComplete!")
      case Failure(t) => throw t
    }
  }

  def filterTweets(keywords: Seq[String]) = {
    val request = OHttpRequest.POST(
      https"stream.twitter.com/1.1/statuses/filter.json".withQuery(Seq(
        "track" -> keywords.mkString(",")
      ))
    )

    val entityFuture =
      twitter.makeRequestWithCallback(request) { res =>
        if (res.status == StatusCodes.OK)
          res.entity
        else
          throw new RuntimeException("HTTP Error: " + res.status)
      }

    entityFuture onComplete {
      case Success(ent) => processJsonStream(ent)
      case Failure(t) => println(t)
    }

  }

  def main(args: Array[String]): Unit = {
    try {
      filterTweets(args)

      StdIn.readLine()
    } finally {
      osys.shutdown
    }
  }

}

Since the initial bytes of a Twitter stream do not arrive immediately after the response, the connection can be lost because of an unwanted timeout. Timeouts therefore have to be set manually to avoid this kind of problem.

If you want to collect a large or unbounded number of tweets, you need to change the max-content-length parameter to a sufficiently large number, e.g. Long.MaxValue.

Here is an example of manual settings:

import scala.concurrent.duration._ // for 30.seconds and Duration.Inf

val defaultSettings = osys.settings
val newSettings = defaultSettings.copy(
  timeouts = defaultSettings.timeouts.copy(
    connecting = 30.seconds,
    idle = Duration.Inf,
    receivingBody = Duration.Inf
  ),
  maxContentLength = Long.MaxValue
)

These settings are applied by passing newSettings to the makeRequestWithCallback method as follows:

twitter.makeRequestWithCallback(request, Some(newSettings.toConnectionPoolSettings)) { res =>

If you want only a limited number of tweets, use OJsonParserFlow directly and combine it with the take method, in the Akka Streams way. Replace parseFuture in processJsonStream with the following:

val parseFuture =
  ent.dataBytes.via(OJsonParserFlow()).take(numberOfTweets).runForeach { json =>
    val text = json / "text" >> ~[String]
    println(text)
  }