Apache Spark - A unified analytics engine for large-scale data processing
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
DataStax Spark Cassandra Connector
The Programming Language Designed For Big Data and AI
MLeap: Deploy Spark Pipelines to Production
Base classes to use when writing tests with Spark
Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster
GeoMesa is a suite of tools for working with big geo-spatial data in a distributed fashion.
Easy access to big things. Library for Apache Spark extending and improving its capabilities
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
An open-source toolkit for large-scale genomic analysis
A library for querying Binlog with Apache Spark Structured Streaming, for Spark SQL, DataFrames and [MLSQL](https://www.mlsql.tech).
MLeap allows for easily putting Spark ML pipelines into production
A library based on Delta Lake for Spark and MLSQL
Showcase for IoT Platform Blog
A library for SQL optimization and rewriting, including materialized view rewriting
Spark Structured Streaming State Tools
An ongoing effort to enable data exchange between Java/Scala and Python. PyJava uses Apache Arrow as the data exchange format.
Kafka offset committer for Structured Streaming queries