Impatient fork of Ammonite
General Vectorization Library for Machine Learning Tools
Scala-based DSLink implementation for Apache Spark
Default Divolte Avro schema, packaged for use as an external dependency
Amazon SQS queue receiver for Spark
A library for reading social data from Instagram using Spark Streaming.
ETL Library for Machine Learning - data pipelines, data munging and wrangling
NetFlow data source for Spark SQL and DataFrames
Spark extension for processing large-scale 3D data sets, such as astrophysical or high-energy physics data.
JSON schema parser for Apache Spark
Spark-SequoiaDB is a library that lets users read data from and write data to SequoiaDB collections with Spark SQL.
Spark NetSuite Connector
PageRank in Spark
Data model generator based on Scala case classes
A prototype native MongoDB connector for Apache Spark, using Spark's external datasource API
SparkMeasure is a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.
A Spark datasource for the HadoopCryptoLedger library
Apache Spark test helper functions with pretty error messages