An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Redshift data source for Apache Spark
REST job server for Apache Spark
SANSA RDF Library
Serializers (input and output) for the phone call-related models
Mirror of Apache Livy (Incubating)
Spark package to "plug" holes in data using SQL-based rules ⚡️ 🔌
Generation of a few sample data sets. For instance, a feature set derived from CDR
Library for building data products
A Cluster Computing System for Processing Large-Scale Spatial Data
A general Inference API based on two of the most popular Big Data processing engines: Apache Spark and Apache Flink
The Programming Language Designed For Big Data and AI
Provides KafkaExtract, KafkaLoad and KafkaCommitExecute stages
Calliope is a library integrating Cassandra with the Spark framework.
Collection of Spark SQL helpers: UDF, UDAF, …
A framework for writing Spark 2.x applications in a pretty way