A Play module for running Livy jobs, which run code on a remote Spark session.
A refreshing treatment for all quality control ailments. Apache 2 licensed.
API for reading and writing data via various file transfer protocols from Apache Spark.
The Lucius REST API, based on Spark-Jobserver
Data Quality Monitoring Tool
Adds the ability to display images and DataFrame data in a notebook.
Native Spark OSM PBF data source
A library for reading public web news results from Bing Custom Search using Spark Streaming.
DIS SDK for SparkStreaming
A library for reading public search results from Reddit using Spark Streaming.
A Scala-based Spark publish/subscribe NATS connector
Apache Spark Extensions
Export Spark ML SparseVectors as a NumPy CSR matrix
Redshift data source for Spark
Provides the DebeziumTransform stage
Provides the CassandraExtract, CassandraExecute, and CassandraLoad stages
Library for computing tables (tabulations and cross-tabulations) and histogram data in a format amenable to plotting