A library that converts between nested DataSets and flattened DataFrames
Spark data source for Salesforce
A Play module for running Livy jobs, which run code on a remote Spark session.
Google BigQuery support for Spark, SQL, and DataFrames
CSV data source for Spark SQL and DataFrames
A library you can include in your Spark job to validate the counters and perform operations on success. The goal is Scala/Java/Python support.
A Cluster Computing System for Processing Large-Scale Spatial Data
Data Quality Monitoring Tool
Spark data source for Workday
A Variant Caller, Distributed. Apache 2 licensed.
The Lucius REST API based on Spark-Jobserver
Big Spatial Data Processing using Spark
External commands in Java and Scala for ADAM: Genomic Data System. Apache 2 licensed.
Quasar Analytics is a general-purpose compiler for translating data processing and analytics over semi-structured data into efficient plans that run 100% in the target infrastructure.
ECS connector for Apache Spark
Spark Library for Bulk Loading into Cassandra
Use standard Scala collections to unit test your Spark code.