Mirror of Apache Spark
Mirror of Apache Flink
Avro support for Spark SQL and DataFrames
Spark library for easy MongoDB access
GeoTrellis is a geographic data processing engine for high performance applications.
Scala client for Amazon Kinesis. Also provides the ability to write to Kinesis from Apache Spark or Spark Streaming.
An efficient updatable key-value store for Apache Spark
SnappyData - The Spark Database. Stream, Transact, Analyze, Predict in one cluster
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark and Parquet. Apache 2 licensed.
REST job server for Apache Spark
Sparkling Water provides H2O functionality inside a Spark cluster
CSV data source for Spark SQL and DataFrames
The missing Matplotlib for Scala + Spark
Large-scale event processing with Akka Persistence and Apache Spark
Simplifying robust end-to-end machine learning on Apache Spark.
Base classes to use when writing tests with Spark
Redshift data source for Spark