A simple Spark ETL framework that just works 🍺
Basic framework utilities to quickly start writing production-ready Apache Spark applications
Low-level helpers for Apache Spark libraries and tests
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.
Bagging Estimator for Apache Spark ML
Scala-Spark port of https://github.com/bmabey/pyLDAvis for Apache Spark LDA Topic Modelling Visualisation
Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.
A Scala-based Spark Publish/Subscribe NATS Connector
An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.
Schema registry for CSV, TSV, JSON, Avro and Parquet schemas. Supports schema inference and a GraphQL API.
Write your Spark data to Kafka seamlessly
Microsoft Machine Learning for Apache Spark
Spark RDD to read from and write to HBase
An open source framework for building data analytic applications.
SANSA RDF Library
SANSA Query Layer
A Scala wrapper for Deeplearning4j, inspired by Keras. Scala + DL + Spark + GPUs
C4E, a Scala or Spark library for local and distributed clustering.
The connector uses the Spark SQL Data Source API to read data from Google BigQuery.