Spark-SequoiaDB is a library that lets users read data from and write data to SequoiaDB collections with Spark SQL.

SequoiaDB is a document-oriented NoSQL database and provides a JSON storage model. Spark is a fast and general-purpose cluster computing system.

The Spark-SequoiaDB library integrates SequoiaDB with Spark, giving users a system that combines the advantages of a schema-less, dynamically indexed storage model with those of a Spark cluster.


This library requires Spark 2.0.0, Scala 2.11.8+, and sequoiadb-driver-2.6.

Using the library

You can link against this library by putting the following lines in your program:


You can also download and build the project yourself:

git clone
mvn clean install

spark-sequoiadb is built with Scala 2.11 by default. To select the Scala 2.11 Maven profile explicitly, use the following command:

mvn -Pscala-2.11 package

Or use the following command to build for all Scala versions:

sbt/sbt "+ package"

You can load the library into spark-shell by using the --jars command-line option:
$ bin/spark-shell --jars /Users/sequoiadb/spark-sequoiadb/lib/sequoiadb-driver-2.6.0.jar,/Users/sequoiadb/spark-sequoiadb/target/spark-sequoiadb_2.11-2.6.jar …
Welcome to

      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.0

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80)
Type in expressions to have them evaluated.
Type :help for more information.
…
scala> sqlContext.sql("CREATE TEMPORARY TABLE foo ( hello string, rangekey int, key1 int ) using com.sequoiadb.spark OPTIONS ( host 'localhost:11810', collectionspace 'mycs', collection 'mycl' )")
scala> sqlContext.sql("select * from foo").foreach(println)
scala> sqlContext.sql("CREATE TEMPORARY TABLE jsontable ( hello string, rangekey int, key1 int ) using org.apache.spark.sql.json.DefaultSource options ( path '/Users/sequoiadb/temp/test.json' )")
scala> sqlContext.sql("select * from jsontable").foreach(println)

[this is a new hello message,310,-10]

scala> sqlContext.sql("insert into table foo select * from jsontable")

scala> sqlContext.sql("select * from foo").foreach(println)


[this is a new hello message,310,-10]
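
The session above defines the table through SQL DDL. Spark resolves a `USING <source> OPTIONS (...)` clause and a `read.format("<source>").options(...)` call through the same data source mechanism, so the collection can most likely also be read through the DataFrame API. The following is a minimal sketch under that assumption. The object name `SequoiadbReadSketch` and the view name `foo_view` are hypothetical; the format name, option keys, and column list are copied from the session above. It has not been verified against a live SequoiaDB cluster.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical object name; not part of spark-sequoiadb.
object SequoiadbReadSketch {

  // Same keys and values as the OPTIONS clause in the session above.
  val sdbOptions: Map[String, String] = Map(
    "host"            -> "localhost:11810",
    "collectionspace" -> "mycs",
    "collection"      -> "mycl"
  )

  // Same columns as the CREATE TEMPORARY TABLE statement above.
  val schema: StructType = StructType(Seq(
    StructField("hello", StringType),
    StructField("rangekey", IntegerType),
    StructField("key1", IntegerType)
  ))

  def main(args: Array[String]): Unit = {
    // Assumes the master URL and the --jars list are supplied by
    // spark-submit, as with spark-shell above.
    val spark = SparkSession.builder()
      .appName("sequoiadb-read-sketch")
      .getOrCreate()

    // Assumption: the data source accepts a user-supplied schema here,
    // as it does in the DDL form.
    val foo = spark.read
      .format("com.sequoiadb.spark")
      .schema(schema)
      .options(sdbOptions)
      .load()

    // Register a view so the DataFrame can also be queried with SQL.
    foo.createOrReplaceTempView("foo_view")
    spark.sql("select * from foo_view").show()

    spark.stop()
  }
}
```

Passing the schema explicitly mirrors the column list in the DDL and avoids relying on schema inference, which the session above does not demonstrate. Writes are left to the `insert into table` statement shown above, since that is the only write path the session confirms.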



