Okkam JSON
Okkam JSON is a stream-oriented JSON parser for Scala, based on Akka Streams, that turns incoming JSON bytes into hierarchically modeled objects.
Okkam JSON is designed to consume JSON streams. A JSON stream is a possibly infinite sequence of JSONs which can include whitespace and newlines at arbitrary positions.
Akka itself provides only basic JSON framing, which cuts the stream into individual JSONs and leaves the actual parsing to an external JSON parser. That is insufficient for this purpose.
A primary motivation of Okkam JSON is to furnish Akka HTTP with JSON framing and parsing. To that end, the Okkam JSON parser works in one pass, i.e. the bytes of the (possibly infinite) stream are read only once, from head to tail.
Of course, Okkam JSON can also parse a single JSON. This document contains many examples of parsing a single JSON to explain value extraction.
A guide to parsing JSON streams, with examples, starts at the Consuming JSON Streams section.
Installation
Add the following to your project's build.sbt:
libraryDependencies += "com.github.azapen6" %% "okkam-json" % s"0.2.1-a${akkaVersion}"
The last fragment specifies the Akka version on which Okkam JSON depends. Okkam JSON is available for Akka 2.5.17, 2.5.18 and 2.5.19.
If you want to use Okkam JSON to consume streams from web apps that require OAuth (or Basic) authentication and authorization, such as Twitter, the companion library Okkam HTTP offers "Akka HTTP with OAuth" and integrates naturally with Okkam JSON. Okkam JSON and Okkam HTTP are independent of each other, so you can use Okkam JSON without Okkam HTTP. Some examples in this document use Okkam HTTP. If you use Okkam HTTP together with Okkam JSON, also add the following to your project's build.sbt:
libraryDependencies += "com.github.azapen6" %% "okkam-http" % s"0.2.3-a${akkaVersion}-h${akkaHTTPVersion}"
If you would rather use raw Akka HTTP than Okkam HTTP, add the following:
lazy val akkaVersion = "2.5.19"
libraryDependencies ++= Seq(
"com.typesafe.akka" %% "akka-actor" % akkaVersion,
"com.typesafe.akka" %% "akka-stream" % akkaVersion,
)
Documentation
Parsing JSON
The first example uses no Akka Streams API explicitly, except in the termination block. The common skeleton is as follows:
import okkam.json._
import OJsonValue._
import scala.io.StdIn
import scala.util.{Success, Failure}
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl._
object JsonExample {
implicit val system = ActorSystem()
implicit val materializer = ActorMaterializer()
implicit val executionContext = system.dispatcher
def main(args: Array[String]): Unit = {
try {
// put your code here
StdIn.readLine()
} finally {
materializer.shutdown
system.terminate
}
}
}
Our first task is to process the following simple JSON:
{
"name": "Agu",
"age": 30
}
For this task, the OJsonParser.parseJson(str: String) method is suitable. Put the following code into the main method:
val jsonStr = """
{
"name": "Agu",
"age": 30
}
"""
val jsonFuture = OJsonParser.parseJson(jsonStr)
jsonFuture onComplete {
case Success(json) =>
import OJsonValue._ // To use the extraction DSL.
val name = json / "name" >> ~[String]
val age = json / "age" >> ~[Int]
println(s"name: ${name}, next age: ${age + 1}")
case Failure(t) => throw t
}
Then run it from the sbt shell. The output will be
name: Agu, next age: 31
In the code shown above, json / "name" returns a sub-JSON, namely the value part of the property tagged with "name". Since the input JSON has the property
"name": "Agu"
the sub-JSON is simply the string value "Agu".
Then the operator >> ~[String] extracts the value as an instance of String.
The hierarchical notation is useful for processing nested JSONs. Consider the following deeply nested JSON:
{
"nest1": {
"nest2": {
"nest3": {
"name": "Agu",
"age": 30
}
}
}
}
You can get the value of "name" as follows:
val name = json / "nest1" / "nest2" / "nest3" / "name" >> ~[String]
I think this notation is intuitive and easy to trace.
You can also get the value of "age" as follows:
val age = json / "nest1" / "nest2" / "nest3" / "age" >> ~[Int]
Writing json / "nest1" / "nest2" / "nest3" twice is redundant. The duplication can be removed by introducing a sub-JSON variable:
val nest3 = json / "nest1" / "nest2" / "nest3"
Then, you can get both values as follows:
val name = nest3 / "name" >> ~[String]
val age = nest3 / "age" >> ~[Int]
Extraction by Implicit Conversions
Implicit conversions are useful for extracting many fields from JSONs. To use them, import okkam.json.ImplicitConversions._. The first example can be written as follows:
jsonFuture onComplete {
case Success(json) =>
import okkam.json.ImplicitConversions._
val name: String = json / "name"
val age: Int = json / "age"
println(s"name: ${name}, next age: ${age + 1}")
case Failure(t) => throw t
}
or
jsonFuture onComplete {
case Success(json) =>
import okkam.json.ImplicitConversions._
val name = json / "name" : String
val age = json / "age" : Int
println(s"name: ${name}, next age: ${age + 1}")
case Failure(t) => throw t
}
Implicit conversions are convenient when you pass extracted values to methods or classes, especially when a case class has many fields. Here I give a couple of examples.
Suppose that a method f is defined as
def f(name: String, age: Int) = s"name: ${name}, next age: ${age + 1}"
You can pass extracted values for the arguments directly:
jsonFuture onComplete {
case Success(json) =>
import okkam.json.ImplicitConversions._
val str = f(json / "name", json / "age")
println(str)
case Failure(t) => throw t
}
You can construct a case class that keeps values of a specific JSON, for example:
case class Profile(name: String, age: Int)
You can give extracted values to the fields directly:
jsonFuture onComplete {
case Success(json) =>
import okkam.json.ImplicitConversions._
val profile = Profile(json / "name", json / "age")
println(profile)
case Failure(t) => throw t
}
Pretty Printing
Okkam JSON provides a pretty-printing method, OJsonValue.toPretty, that returns a formatted string.
I show a simple example below:
val jsonStr = """{"array1":[{"nest1":{"int1":123},"str1":"hello"},{"nest2":{"array2":[4,5,6],"int2":789}}],"str2": "world"}"""
val jsonFuture = OJsonParser.parseJson(jsonStr)
jsonFuture onComplete {
case Success(json) =>
println(json.toPretty())
case Failure(t) => throw t
}
The output will be
{
  "array1": [
    {
      "nest1": {
        "int1": 123
      },
      "str1": "hello"
    },
    {
      "nest2": {
        "array2": [
          4,
          5,
          6
        ],
        "int2": 789
      }
    }
  ],
  "str2": "world"
}
If you prefer an indent width of 4, json.toPretty(4) returns the desired string.
JSON Syntax
Okkam JSON strictly obeys the syntax specification of JSON.
When the parser (or the lexer) rejects the input JSON because of a syntax error, it throws java.lang.IllegalArgumentException with the line at which the error is detected. For example, the following bad JSON causes the exception:
{
"name": "Agu",
"age": 30,
}
Okkam JSON does not accept a trailing comma, i.e. a comma with no following element, because the JSON syntax specification prohibits it.
Optional Extraction
The first example shows a simple way to extract values from a single JSON. It assumes that, for every property, we know both its name and the type (e.g. string, integer, array) of its value. If we do not have full knowledge of the JSON to be processed, two things can go wrong: a required property may be missing, or the type of a value may be incompatible with the expected type.
Scala's Option pattern gives one solution to this problem.
Extraction of Strings
Let us go back to the first example:
val jsonStr = """
{
"name": "Agu",
"age": 30
}
"""
val jsonFuture = OJsonParser.parseJson(jsonStr)
jsonFuture onComplete {
case Success(json) =>
import OJsonValue._
// some value extraction
case Failure(t) => throw t
}
If you add the following into the Success block and run it:
val sex = json / "sex" >> ~[String]
the exception NoSuchElementException with "<lost at sex>" will be thrown because there is no element tagged with "sex". I will explain later how to find out which element is missing.
To use extraction via Option, replace >> with >>?:
val name = json / "name" >>? ~[String] match {
case Some(s) => s // matches here
case None => "unknown" // unreachable
}
println("name: " + name)
The output will be name: Agu because the property "name" exists and its type is String.
If there is no property with the given name, the result matches None. For example, if you attempt to get the value tagged with "sex":
val sex = json / "sex" >>? ~[String] match {
case Some(s) => s // does not match
case None => "unknown" // matches here
}
println("sex: " + sex)
the output will be
sex: unknown
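Because >>? yields an ordinary Scala Option (the Some/None match above relies on that), the standard Option combinators should apply to its result as well. The following is a minimal sketch that uses a plain Map as a stand-in for the parsed JSON, so it runs without Okkam JSON. The names OptionDefaultSketch and fieldOrUnknown are hypothetical and not part of the library.

```scala
object OptionDefaultSketch {
  // Stand-in for a parsed JSON object; NOT an Okkam JSON type.
  val fields: Map[String, String] = Map("name" -> "Agu")

  // Map#get returns Option[String], playing the role of `json / key >>? ~[String]`.
  // getOrElse replaces the explicit Some/None match with a default value.
  def fieldOrUnknown(key: String): String =
    fields.get(key).getOrElse("unknown")

  def main(args: Array[String]): Unit = {
    println("name: " + fieldOrUnknown("name")) // name: Agu
    println("sex: " + fieldOrUnknown("sex"))   // sex: unknown
  }
}
```

With Okkam JSON itself, the same idea would presumably read (json / "sex" >>? ~[String]).getOrElse("unknown"), assuming the operator's result is a plain Option[String].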
Optional Extraction by Implicit Conversions
Implicit conversions provide a convenient way of optional extraction. It works in the same way as normal implicit extraction, except that you request an Option.
The previous example can be written as follows:
import okkam.json.ImplicitConversions._
val sexOption: Option[String] = json / "sex"
val sex = sexOption match {
case Some(s) => s // does not match
case None => "unknown" // matches here
}
println("sex: " + sex)
or
import okkam.json.ImplicitConversions._
val sexOption = json / "sex" : Option[String]
val sex = sexOption match {
case Some(s) => s // does not match
case None => "unknown" // matches here
}
println("sex: " + sex)
As with normal extraction, implicit conversions are very useful for passing the extracted values to methods or classes. Here I give an example that constructs a case class:
case class Profile(name: String, age: Int, sex: Option[String])
There is no difference between normal and optional extractions:
jsonFuture onComplete {
case Success(json) =>
import okkam.json.ImplicitConversions._
val profile = Profile(json / "name", json / "age", json / "sex")
case Failure(t) => throw t
}
Extraction of Integers
Similar to OJsonValue.String_, the case class OJsonValue.Int_ bears a single integer value. First, I give a normal example:
val jsonStr = """{ "num": 3 }"""
val jsonFuture = OJsonParser.parseJson(jsonStr)
jsonFuture onComplete {
case Success(json) =>
val intVal = json / "num" >> ~[Int]
println(intVal) // 3
case Failure(t) => throw t
}
An integer value can be arbitrarily large. To bear numbers of arbitrary length, OJsonValue.Int_ takes a type parameter for its underlying value: Int (32 bits signed), Long (64 bits signed) or BigInteger (arbitrary length).
The parser selects the type of the underlying value automatically when it translates a lexical number into a number value, so that the chosen type can hold the value.
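To make the selection rule concrete, here is a minimal sketch in plain Scala that picks the narrowest of Int, Long and BigInteger for an integer literal. It only illustrates the rule described above; the helper narrowest and the object NarrowestIntegerSketch are hypothetical, and this is not claimed to be how Okkam JSON implements it.

```scala
object NarrowestIntegerSketch {
  // Hypothetical helper, not part of Okkam JSON: return the literal as the
  // narrowest JVM integer type that can hold it.
  def narrowest(literal: String): Any = {
    val n = BigInt(literal)
    if (n.isValidInt) n.toInt        // fits in 32 bits signed
    else if (n.isValidLong) n.toLong // fits in 64 bits signed
    else n.bigInteger                // java.math.BigInteger, arbitrary length
  }

  def main(args: Array[String]): Unit = {
    // Values typed as Any are boxed, so getClass reports the wrapper classes.
    println(narrowest("3").getClass.getName)                   // java.lang.Integer
    println(narrowest("2147483648").getClass.getName)          // java.lang.Long
    println(narrowest("9223372036854775808").getClass.getName) // java.math.BigInteger
  }
}
```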
Consequently, extracting an integer value is somewhat different from extracting a string. The >> operator can be used in the same way as for strings, but it becomes troublesome when the integer value is unbounded.
For example, extraction by the >> operator causes an error when the value does not fit the requested type:
val jsonStr = """{ "num": 2147483648 }""" // Int.MaxValue + 1
val jsonFuture = OJsonParser.parseJson(jsonStr)
jsonFuture onComplete {
case Success(json) =>
val intVal = json / "num" >> ~[Int] // `ClassCastException` will be thrown
println(intVal)
case Failure(t) => throw t
}
By extraction using Option, type check and extraction are done simultaneously. Replace the Success block with the following:
json / "num" >>? ~[Int] match {
case Some(i) => println(s"class: ${i.getClass.getName}, value: $i")
case None => println("Out of Int range")
}
The output will be Out of Int range.
Next, replace the code with the following:
json / "num" >>? ~[Long] match {
case Some(i) => println(s"class: ${i.getClass.getName}, value: $i")
case None => println("Out of Long range")
}
The output will be class: long, value: 2147483648.
If you replace the value of the property "num" with an integer which does not exceed the Int range, for example,
val jsonStr = """{ "num": 6 }"""
the same extraction code will result in the output class: int, value: 6.
You must take boundary conditions into account when you apply arithmetic operations. For example, the following code causes an overflow:
val jsonStr = """{ "num": 2147483647 }""" // Int.MaxValue
val jsonFuture = OJsonParser.parseJson(jsonStr)
jsonFuture onComplete {
case Success(json) =>
json / "num" >>? ~[Int] match {
case Some(i) => println("next: " + (i + 1))
case None => println("Out of Int range")
}
case Failure(t) => throw t
}
The output will be next: -2147483648. It is clear that overflow occurs.
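Two common ways to guard against this wrap-around in plain Scala are to widen to Long before the arithmetic, or to use java.lang.Math.addExact, which throws instead of wrapping. The following sketch is independent of Okkam JSON; the names OverflowSketch, nextWide and nextChecked are made up for illustration.

```scala
import scala.util.Try

object OverflowSketch {
  // Widening to Long before adding keeps the result exact for every Int input.
  def nextWide(i: Int): Long = i.toLong + 1L

  // Math.addExact throws ArithmeticException on overflow; Try turns that into None.
  def nextChecked(i: Int): Option[Int] = Try(Math.addExact(i, 1)).toOption

  def main(args: Array[String]): Unit = {
    val max = Int.MaxValue
    println(max + 1)           // wraps around: -2147483648
    println(nextWide(max))     // 2147483648
    println(nextChecked(max))  // None
    println(nextChecked(30))   // Some(31)
  }
}
```

Widening is the simpler choice when a Long result is acceptable, while the checked variant keeps the Int type and makes the failure explicit.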
Extraction of Integers by Implicit Conversions
Okkam JSON also provides extraction of integers by implicit conversions. It works in a similar way to extraction of strings. The conversion from the underlying value type is forced. Here I give only a trivial example:
case class Data(
intValue: Int,
longValue: Long,
intOption: Option[Int]
)
val jsonStr =
"""
{
"int_val": 123,
"long_val": 1000000000000,
"int_option": 456
}
"""
// int_option is possibly missing or larger than Int.MaxValue.
val jsonFuture = OJsonParser.parseJson(jsonStr)
jsonFuture onComplete {
case Success(json) =>
import okkam.json.ImplicitConversions._
val data = Data(
intValue = json / "int_val",
longValue = json / "long_val",
intOption = json / "int_option"
)
case Failure(t) => throw t
}
Extraction of Arrays
In JSON, arrays can contain any kinds of JSON values, which may be a jumble of values of different types. For example, the following expression is a valid (but unusual) JSON array:
[1, 2, "3", 3.5, [4, [5, ["6", [7]]]], { "x": 8, "y": 9 }, true, null]
The outermost array has 8 elements, i.e. the length of this array is exactly 8.
An empty array [] is also a valid array whose length is exactly 0.
In Okkam JSON, the class OJsonArray implements JSON arrays as IndexedSeq[OJsonValue]. The trait provides common operations for arrays, such as map, foreach, filter, mkString etc. For example, we can parse the example array above and print each element with its type name as follows:
val jsonStr = """[1, 2, "3", 3.5, [4, [5, ["6", [7]]]], { "x": 8, "y": 9 }, true, null]"""
val jsonFuture = OJsonParser.parseJson(jsonStr)
jsonFuture onComplete {
case Success(json) =>
val array0 = json >> ~[OJsonArray]
println(s"length: ${array0.length}, class: ${array0.getClass.getName}")
array0 foreach { e => println(s"value: $e, class: ${e.getClass.getName}") }
case Failure(t) => throw t
}
The output will be
length: 8, class: okkam.json.OJsonArray
value: 1, class: okkam.json.OJsonValue$Int_
value: 2, class: okkam.json.OJsonValue$Int_
value: 3, class: okkam.json.OJsonValue$String_
value: 3.5, class: okkam.json.OJsonValue$Float_
value: [4, [5, [6, [7]]]], class: okkam.json.OJsonArray
value: { "x": 8, "y": 9 }, class: okkam.json.OJsonObject
value: true, class: okkam.json.OJsonValue$Bool_
value: null, class: okkam.json.OJsonValue$Null_$
The length of the example array is 8 as expected. The actual class of each value is not the subject here. (See the Type Hierarchy section.)
As far as arrays are concerned, the important case is an array whose values all have the same type, for example [1, 2, 3] contains integers only. For such uniform arrays, it is useful to map the extraction of each element over the array.
Okkam JSON provides direct extraction operators >>> (equivalent to the extractArray method) and >>>? (equivalent to extractArrayOption) from OJsonValue to IndexedSeq[<type>]. The difference between >>> and >>>? is similar to that between >> and >>?.
I show a simple example below:
val jsonStr = """[1, 2, 3]"""
val jsonFuture = OJsonParser.parseJson(jsonStr)
jsonFuture onComplete {
case Success(json) =>
val array0 = json >>> ~[Int]
println(s"length: ${array0.length}, class: ${array0.getClass.getName}")
array0 foreach { e => println(s"value: $e, class: ${e.getClass.getName}") }
case Failure(t) => throw t
}
The output will be
length: 3, class: scala.collection.immutable.Vector
value: 1, class: int
value: 2, class: int
value: 3, class: int
Of course, you can use map for extraction as
json >> ~[OJsonArray] map { _ >> ~[Int] }
This looks somewhat indirect, but it lets you apply a conversion to each extracted value.
Extraction of JSON Object
A JSON object (the term is more or less confusing) is an associative array that contains key-value pairs, called "properties". We have already seen examples of JSON objects above, for example,
{
"name": "Agu",
"age": 30
}
is a single JSON object.
Similar to arrays, JSON objects can be nested and also can be included in arrays. For example, the following expression is a valid JSON object:
{
"array1": [
{ "nest1": {
"int1": 123
},
"str1": "hello"
},
{ "nest2": {
"array2": [4, 5, 6],
"int2": 789
}
}
],
"str2": "world"
}
Unlike mixed arrays, complicated JSON objects like this example do occur in practice.
As seen before, you can access a value located in a JSON object by using the / (sub-JSON) operator, if you know its structure. Complete value extraction from the last JSON object is as follows:
val array1 = json / "array1" >> ~[OJsonArray]
val nest1 = array1(0) / "nest1"
val int1 = nest1 / "int1" >> ~[Int]
val str1 = array1(0) / "str1" >> ~[String]
val nest2 = array1(1) / "nest2"
val array2 = nest2 / "array2" >>> ~[Int]
val int2 = nest2 / "int2" >> ~[Int]
val str2 = json / "str2" >> ~[String]
println("int1: " + int1)
println("str1: " + str1)
println("array2: " + array2)
println("int2: " + int2)
println("str2: " + str2)
In Okkam JSON, the class OJsonObject implements JSON objects as Map[String, OJsonValue].
For example, you can get the names of properties (the corresponding keys of the Map) as follows:
val jsonObj = json >> ~[OJsonObject]
jsonObj.keys foreach println
Type Hierarchy
In the code shown above, json and json / "sex" are instances of the type OJsonValue. Moreover, all parsed JSON values are instances of OJsonValue.
OJsonValue is the root type of parsed values in Okkam JSON. It defines the / operator (equivalent to the subJson method), the >> operator (equivalent to the extract method), the >>? operator (equivalent to the extractOption method) and the toPretty method.
The type hierarchy of OJsonValue is as follows (each .<Type> stands for OJsonValue.<Type>):
OJsonValue
  |
  |---- OJsonObject // JSON object e.g. { "key": "value" }
  |
  |---- OJsonArray // JSON array e.g. [1, 2, 3]
  |
  |---- .Lost // matches when a required element is missing.
  |
  |---- .Strict
          |
          |---- .Null_ // null
          |
          |---- .Bool_ // true or false
          |       |
          |       |---- .True_ // true
          |       |
          |       |---- .False_ // false
          |
          |---- .String_ // string value
          |
          |---- .Number // number value
                  |
                  |---- .Int_ // integer value
                  |
                  |---- .Float_ // floating-point value
Extraction via Pattern Matching
Extraction of String
You can extract values of JSONs via Scala's pattern matching along the type hierarchy of OJsonValue. For example, assume that we want to extract values of the following JSON:
{
"str_val": "This is a string.",
"int_val": 123,
"long_val": 2147483648,
"bool_true": true,
"bool_false": false,
"null_val": null,
"array": [1, 2, 3]
}
If we want to extract the value of "str_val", the following matching works:
val jsonFuture = OJsonParser.parseJson(jsonStr) // the JSON above
jsonFuture onComplete {
case Success(json) =>
json / "str_val" match {
case String_(s) => println("String: " + s)
case Lost(_) => println("No match")
case _ => throw new Exception("Unreachable")
}
case Failure(t) => throw t
}
The output will be
String: This is a string.
Extraction of Boolean Values
A boolean value is either true or false. The case class Bool_ matches both of them, in a similar way to String_:
json / "bool_true" match {
case Bool_(b) => println("Boolean: " + b) // matches here
case _ => println("Unreachable")
}
The output will be
Boolean: true
Since a boolean value is one of two constants, it can be tested directly as follows:
json / "bool_false" match {
case True_() => println("Boolean: true")
case False_() => println("Boolean: false") // matches here
case _ => println("Unreachable")
}
Note that the empty parentheses () following True_ and False_ cannot be omitted.
Extraction of null
It is important that pattern matching can distinguish null from other types because null is compatible with any type. For example, the following avoids matching null with other types:
json / "null_val" match {
case String_(s) => println("String: " + s)
case Int_(i) => println(s"Int: $i, class: ${i.getClass.getName}")
case Float_(f) => println(s"Float: $f, class: ${f.getClass.getName}")
case True_() => println("Boolean: true")
case False_() => println("Boolean: false")
case Null_() => println("Null: null") // matches here
case _ => println("No match")
}
The empty parentheses () following Null_ cannot be omitted either, just like the two above.
Extraction operators convert null into any required type. For example, both of the following extractions result in exactly 0 without any exception:
val i = json / "null_val" >> ~[Int]
println(s"value: $i, class: ${i.getClass.getName}")
json / "null_val" >>? ~[Int] match {
case Some(i) => println(s"value: $i, class: ${i.getClass.getName}")
case None => println("no match") // unreachable
}
If a JSON has a property which bears either a number value or null, and null must be distinguished from the number 0, both of the extractions above give an undesirable result.
To deal with this case, the following works:
json / "null_val" match {
case Int_(i) => println(s"Int: $i, class: ${i.getClass.getName}")
case Null_() => println("Null: null") // matches here
case _ => println("No match")
}
Extraction of Number
The last example includes a case that matches and extracts an integer value:
case Int_(i) => println(s"Int: $i, class: ${i.getClass.getName}")
If you replace "null_val" with "int_val", the output will be
Int: 123, class: java.lang.Integer
If you replace "null_val" with "long_val", the output will be
Int: 2147483648, class: java.lang.Long
Since the actual types are incompatible, the static type of i in the case turns out to be Any, so the following will cause a compilation error:
case Int_(i) =>
val intVal: Int = i
println(i)
As described above, you can extract an Int value with the >> operator:
val intVal = json / "int_val" >> ~[Int]
println(intVal)
but it throws an exception when the value exceeds the Int range:
val intVal = json / "long_val" >> ~[Int] // `ClassCastException` is thrown
println(intVal)
Pattern matching gives simple solutions to handle this problem.
One is to separate cases in actual types as follows:
json / "int_val" match {
case Int_(i: Int) => println("Int: " + i) // matches here
case Int_(i: Long) => println("Long: " + i)
case Null_() => println("Null: null")
case _ => println("No match")
}
json / "long_val" match {
case Int_(i: Int) => println("Int: " + i)
case Int_(i: Long) => println("Long: " + i) // matches here
case Null_() => println("Null: null")
case _ => println("No match")
}
If the value exceeds the Long range, it does not match the first two cases and results in no match.
Since an integer value which does not exceed the Int range is translated into an Int value, you cannot omit the first case even if you need a Long value. The second approach, described below, is suitable for handling this case.
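The reason the Int case cannot be omitted in the first approach is that Scala type patterns never widen: a boxed Int does not match a Long pattern. The sketch below shows this with plain values typed as Any instead of Int_, so it runs without Okkam JSON. The names WideningMatchSketch, longOnly and asLong are hypothetical.

```scala
object WideningMatchSketch {
  // Only a Long pattern: small values boxed as Int fall through to None.
  def longOnly(v: Any): Option[Long] = v match {
    case l: Long => Some(l)
    case _       => None
  }

  // Both patterns: the Int case widens explicitly with toLong.
  def asLong(v: Any): Option[Long] = v match {
    case i: Int  => Some(i.toLong)
    case l: Long => Some(l)
    case _       => None
  }

  def main(args: Array[String]): Unit = {
    println(longOnly(6))         // None
    println(asLong(6))           // Some(6)
    println(asLong(2147483648L)) // Some(2147483648)
  }
}
```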
Another way is to extract the value by using >> after it matches Int_. This can be written as follows:
json / "int_val" match {
case i: Int_[_] =>
val intVal = i >> ~[Int]
println("Int: " + intVal)
case Null_() => println("Null: null")
case _ => println("No match")
}
The wildcard [_] of Int_[_] is mandatory because Int_ takes a type parameter.
An advantage of this approach is that the type of the underlying value does not influence the top-level matching. You can therefore extract any Long value in one case.
json / "long_val" match {
case i: Int_[_] =>
val longVal = i >> ~[Long]
println("Long: " + longVal)
case Null_() => println("Null: null")
case _ => println("No match")
}
json / "int_val" match {
case i: Int_[_] =>
val longVal = i >> ~[Long]
println("Long: " + longVal)
case Null_() => println("Null: null")
case _ => println("No match")
}
Reading JSON from file
Assume that you have a file whose name is example1.json and whose content is the following single JSON:
{
"name": "Hiyori",
"age": 4
}
For reading JSON from the file, the OJsonParser.parseJson(java.nio.file.Path) method is suitable. The charset is assumed to be UTF-8. No other charset is available in the current version.
I show an example below, which is similar to the first one:
import java.nio.file.Paths
val jsonFuture = OJsonParser.parseJson(Paths.get("example1.json"))
jsonFuture onComplete {
case Success(json) =>
import OJsonValue._ // To use the extraction DSL.
val name = json / "name" >> ~[String]
val age = json / "age" >> ~[Int]
println(s"name: ${name}, next age: ${age + 1}")
case Failure(t) => throw t
}
The output will be
name: Hiyori, next age: 5
Getting information from Twitter
The second example gets user information from Twitter. It requires your application's Consumer Key and Secret together with an Access Token and Secret (or a Bearer token for app-only auth).
import okkam.json._
import okkam.json.OJsonValue._
import okkam.http._
import OHttpUrl._
import scala.io.StdIn
import scala.concurrent.Future
import scala.util.{Success, Failure}
import akka.http.scaladsl.model._
import akka.stream.scaladsl._
object TwitterExample {
val osys = OHttpSystem()
import osys._
val twitter = OHttpClient(
OAuth1.KeyPair(
"ABC...", // Consumer Key
"DEF..."), // Consumer Secret
OAuth1.TokenPair(
"GHI...", // Access Token
"JKL...")) // Access Token Secret
def processJson(ent: HttpEntity) = {
val parseFuture =
OJsonParser.parseJson(ent.dataBytes) map { json =>
val tweets = json >> ~[OJsonArray]
tweets foreach { status =>
val text = status / "text" >> ~[String]
println("text: " + text)
}
json
}
parseFuture onComplete {
case Success(_) => println("Success in parsing JSON!")
case Failure(t) => throw t
}
}
def getTweets = {
val request = OHttpRequest.GET(
https"api.twitter.com/1.1/statuses/user_timeline.json?count=3"
)
val entityFuture =
twitter.makeRequestWithCallback(request) { res =>
if (res.status == StatusCodes.OK)
res.entity
else
throw new RuntimeException("HTTP Error: " + res.status)
}
entityFuture onComplete {
case Success(ent) => processJson(ent)
case Failure(t) => println(t)
}
}
def main(args: Array[String]): Unit = {
try {
getTweets
StdIn.readLine()
} finally {
osys.shutdown
}
}
}
Consuming JSON Streams
Okkam JSON is designed to consume JSON streams. A JSON stream is a possibly infinite sequence of JSONs which can include whitespace and newlines at arbitrary positions.
Okkam JSON integrates JSON framing and parsing in one pass, i.e. the bytes of the processed stream are read only once.
The OJsonParser.parseJsonForeach methods provide the basic function: they parse each JSON one by one and pass it to a callback as an OJsonValue.
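To make the notion of framing concrete, here is a minimal sketch in plain Scala that splits a string of concatenated JSON objects or arrays into individual documents by tracking nesting depth. It is an illustration only: it assumes every top-level value is an object or an array, it does not validate anything, it works on a complete string rather than on streamed bytes, and it is not Okkam JSON's algorithm. The names FramingSketch and frames are hypothetical.

```scala
object FramingSketch {
  // Cut the input into top-level JSON objects/arrays by counting brackets.
  // Brackets inside string literals are ignored, and escapes are honored.
  def frames(input: String): Vector[String] = {
    val out = Vector.newBuilder[String]
    val cur = new StringBuilder
    var depth = 0
    var inString = false
    var escaped = false
    for (c <- input) {
      if (inString) {
        cur.append(c)
        if (escaped) escaped = false
        else if (c == '\\') escaped = true
        else if (c == '"') inString = false
      } else c match {
        case '"' =>
          inString = true
          cur.append(c)
        case '{' | '[' =>
          depth += 1
          cur.append(c)
        case '}' | ']' =>
          depth -= 1
          cur.append(c)
          if (depth == 0) { out += cur.toString; cur.clear() } // one document done
        case _ if depth > 0 => cur.append(c) // keep whitespace inside a document
        case _ => ()                         // drop whitespace between documents
      }
    }
    out.result()
  }

  def main(args: Array[String]): Unit =
    frames("{ \"first\": 1 }\n{ \"second\":\n{ \"count\": 2\n} }").foreach(println)
}
```

A one-pass parser in the sense described above would build the value tree during this same scan, instead of handing each cut-out string to a second parser.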
I show a simple and complete example in which OJsonParser.parseJsonForeach parses a sequence of three JSONs and simply pretty-prints them:
import okkam.json._
import OJsonValue._
import scala.io.StdIn
import scala.util.{Success, Failure}
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl._
object JsonStreamExample {
implicit val system = ActorSystem()
implicit val materializer = ActorMaterializer()
implicit val executionContext = system.dispatcher
def main(args: Array[String]): Unit = {
try {
val jsonStream = """
{ "first": 1 }
{ "second":
{ "count": 2
} }
{ "third": {
"next": "fourth"} }
"""
var number = 1
val completionFuture =
OJsonParser.parseJsonForeach(jsonStream) { json => // callback takes one `OJsonValue`
println(s"#${number}:\n${json.toPretty()}\n")
number += 1
}
completionFuture onComplete {
case Success(_) => println("Parsing stream has successfully finished.")
case Failure(t) => throw t
}
StdIn.readLine()
} finally {
materializer.shutdown
system.terminate
}
}
}
Parsing JSON Stream from Twitter
Okkam JSON and Okkam HTTP are designed to be suitable for consuming Twitter's streaming API.
Although User and Site Streams are going to be closed, a few streaming APIs remain available. POST statuses/filter is one of them.
Here, we show an example that prints the texts of tweets collected by statuses/filter. It includes one advanced use of Okkam HTTP, which returns an HttpEntity without the entire body instead of an OHttpResponse.
The following example simply shows texts of tweets which include some keywords of the given arguments:
import okkam.json._
import okkam.json.OJsonValue._
import okkam.http._
import OHttpUrl._
import scala.io.StdIn
import scala.concurrent.Future
import scala.util.{Success, Failure}
import akka.http.scaladsl.model._
import akka.stream.scaladsl._
object TwitterStreamExample {
val osys = OHttpSystem()
import osys._
val twitter = OHttpClient(
OAuth1.KeyPair(
"ABC...", // Consumer Key
"DEF..."), // Consumer Secret
OAuth1.TokenPair(
"GHI...", // Access Token
"JKL...")) // Access Token Secret
def processJsonStream(ent: HttpEntity) = {
val parseFuture =
OJsonParser.parseJsonForeach(ent.dataBytes) { json =>
val text = json / "text" >> ~[String]
println(text)
}
parseFuture onComplete {
case Success(_) => println("\nComplete!")
case Failure(t) => throw t
}
}
def filterTweets(keywords: Seq[String]) = {
val request = OHttpRequest.POST(
https"stream.twitter.com/1.1/statuses/filter.json".withQuery(Seq(
"track" -> keywords.mkString(",")
))
)
val entityFuture =
twitter.makeRequestWithCallback(request) { res =>
if (res.status == StatusCodes.OK)
res.entity
else
throw new RuntimeException("HTTP Error: " + res.status)
}
entityFuture onComplete {
case Success(ent) => processJsonStream(ent)
case Failure(t) => println(t)
}
}
def main(args: Array[String]): Unit = {
try {
filterTweets(args)
StdIn.readLine()
} finally {
osys.shutdown
}
}
}
Since the initial bytes of a Twitter stream do not arrive immediately after the response, the connection can be lost because of an unwanted timeout. Timeouts must therefore be set manually to avoid this kind of problem.
If you want to collect a large or unbounded number of tweets, you need to change the max-content-length parameter to a sufficiently large number, e.g. Long.MaxValue.
Here is an example of manual settings:
import scala.concurrent.duration._ // provides 30.seconds and Duration.Inf
val defaultSettings = osys.settings
val newSettings = defaultSettings.copy(
timeouts = defaultSettings.timeouts.copy(
connecting = 30.seconds,
idle = Duration.Inf,
receivingBody = Duration.Inf,
),
maxContentLength = Long.MaxValue
)
These settings are applied by passing newSettings to the makeRequestWithCallback method as follows:
twitter.makeRequestWithCallback(request, Some(newSettings.toConnectionPoolSettings)) { res =>
If you want only a limited number of tweets, use OJsonParserFlow directly and combine it with the take method, in the Akka Streams way.
Replace parseFuture in processJsonStream with the following:
val parseFuture =
ent.dataBytes.via(OJsonParserFlow()).take(numberOfTweets).runForeach { json =>
val text = json / "text" >> ~[String]
println(text)
}