Spark package to "plug" holes in data using SQL-based rules ⚡️ 🔌
A tool for monitoring and tuning Spark jobs for efficiency.
An API for switching between the Spark execution engine and a fast local implementation based on Scala collections.
Apache Spark Data Source for ROOT File Format
An RPC framework leveraging the Spark RPC module
MLeap: Deploy Spark Pipelines to Production
A Scala feature transformation library for data science and machine learning
Flexible, powerful, simple Scala client library for InfluxDB
Spark RDD with Lucene's query capabilities
Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.
Deriving Spark DataFrame schemas from case classes
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
A Spark datasource for the HadoopOffice library
GeoTrellis for PySpark
Scala-Spark port of pyLDAvis (https://github.com/bmabey/pyLDAvis) for visualising Apache Spark LDA topic models
Spark RDD for reading from and writing to HBase
Generate a Scala case class from a Spark DataFrame schema
Data Quality Monitoring Tool
sbt plugin for spark-submit
Write your Spark data to Kafka seamlessly