Apache Spark - A unified analytics engine for large-scale data processing
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
DataStax Spark Cassandra Connector
MLeap: Deploy ML Pipelines to Production
Base classes to use when writing tests with Spark
The Programming Language Designed For Big Data and AI
GeoMesa is a suite of tools for working with big geo-spatial data in a distributed fashion.
Project SnappyData - memory-optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster
An open-source toolkit for large-scale genomic analysis
Easy access to big things. Library for Apache Spark extending and improving its capabilities
Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.
A library for querying Binlog with Apache Spark Structured Streaming, for Spark SQL, DataFrames and [MLSQL](https://www.mlsql.tech).
Connectors for Delta Lake
MLeap allows for easily putting Spark ML pipelines into production
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Showcase for IoT Platform Blog
A library based on Delta for Spark and MLSQL
Framework to quickly build and maintain Smart Data Lakes
A library for SQL optimization and rewriting, including materialized view rewriting
An Extensible Data Skipping Framework