API for switching between the Spark execution engine and a fast local implementation based on Scala collections.
Spark-based implementation of the Topological Mapper algorithm
Ensemble Learning for Apache Spark 🌲
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Native Spark OSM PBF data source
An independent MapR-DB Connector for Apache Spark that fully utilizes MapR-DB secondary indexes
DataStax Spark Cassandra Connector
Extended datasource implementation for Spark/Hadoop on Aliyun E-MapReduce.
Integrating SMILE and Spark
Google Spreadsheets datasource for SparkSQL and DataFrames
Building Annoy Index on Apache Spark
Deriving Spark DataFrame schemas from case classes
Writing application logic for Spark jobs that can be unit-tested without a SparkContext
A framework for writing Spark 2.x applications in a pretty way
Apache Spark Data Source for ROOT File Format
Executable Apache Spark Tools: Format Converter & SQL Processor
Java implementation of probabilistic data structures.
MLeap: Deploy Spark Pipelines to Production
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.